
AI interventions in undergraduate health professions education show positive effects on satisfaction, confidence, and knowledge

Key Takeaway
Consider AI tools for health professions education cautiously due to low-certainty evidence on surrogate outcomes.

This is a systematic review and meta-analysis of randomized controlled trials (RCTs) examining artificial intelligence (AI) interventions in undergraduate health professions education. The population comprised undergraduate health professions students, with a total sample size of 4,911 participants across the 66 included trials. The setting was undergraduate health professions education programs. The interventions spanned various AI technologies, subcategorized by technology type and educational function, including LLM-based personalized learning aids, LLM content generators, NLP chatbots, and non-LLM adaptive learning platforms. The comparator was standard educational interventions. No standardized protocol or exposure details for the AI interventions were reported.

No single primary outcome was designated in the meta-analysis. Key secondary outcomes included satisfaction, confidence, and theoretical knowledge, all reported for LLM-based personalized learning aids. For satisfaction (7 studies; 430 participants), the pooled standardized mean difference (SMD) was 0.93 (95% CI 0.40 to 1.46; I² = 74%). For confidence (7 studies; 609 participants), the SMD was 0.91 (95% CI 0.54 to 1.29; I² = 64%). For theoretical knowledge (12 studies; 955 participants), the SMD was 0.53 (95% CI 0.13 to 0.94; I² = 86%). All three pooled estimates favored the AI interventions, with confidence intervals excluding the null.
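As a sanity check on the reported intervals, the standard error implied by each 95% CI can be recovered arithmetically (SE = CI width / 3.92), which also yields the z-statistic for each pooled SMD. This is a minimal sketch using only the figures reported in the review; the helper name `se_from_ci` is our own.

```python
# Recover the standard error implied by a symmetric 95% confidence interval,
# then the z-statistic for each pooled SMD: z = SMD / SE.
def se_from_ci(lower: float, upper: float) -> float:
    """Standard error implied by a 95% CI: width divided by 2 * 1.96."""
    return (upper - lower) / (2 * 1.96)

# (pooled SMD, CI lower bound, CI upper bound) as reported in the review
outcomes = {
    "satisfaction": (0.93, 0.40, 1.46),
    "confidence": (0.91, 0.54, 1.29),
    "theoretical knowledge": (0.53, 0.13, 0.94),
}

for name, (smd, lo, hi) in outcomes.items():
    se = se_from_ci(lo, hi)
    print(f"{name}: SE ~ {se:.3f}, z ~ {smd / se:.2f}")
```

All three z-statistics exceed 1.96, consistent with the confidence intervals excluding zero; this says nothing about the low certainty or high heterogeneity of the underlying evidence.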

Safety and tolerability findings were not reported. The review did not include data on adverse events, serious adverse events, discontinuations, or overall tolerability of the AI interventions.

These results should be interpreted cautiously in the context of prior research on educational technology. The current meta-analysis synthesizes evidence from 66 trials, but the certainty of the evidence is low to very low and the effects are inconsistent. Prior studies in this area have often been small or shared the methodological limitations identified here.

Key methodological limitations include a high risk of bias in most studies, primarily due to poor allocation concealment and blinding. The certainty of evidence ranged from low to very low. There was substantial heterogeneity and wide confidence intervals, and prediction intervals frequently crossed the null. No studies assessed Kirkpatrick levels 3 or 4, which relate to behavior change and clinical outcomes. The effects are inconsistent and not reliably reproducible.
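The point about prediction intervals crossing the null can be made concrete. A 95% prediction interval widens the pooled CI by the between-study variance (tau²): PI ≈ SMD ± t · sqrt(tau² + SE²). The review does not report tau, so the value below (tau = 0.6) is an assumed illustration chosen to reflect the high heterogeneity for theoretical knowledge (I² = 86%); only the SMD, CI, and study count come from the review.

```python
import math

# Illustrative only: tau (between-study SD) is NOT reported in the review.
# tau = 0.6 is an assumed value reflecting high heterogeneity (I^2 = 86%).
smd = 0.53                            # pooled SMD for theoretical knowledge
se = (0.94 - 0.13) / (2 * 1.96)       # SE implied by the reported 95% CI
k = 12                                # number of studies
tau = 0.6                             # assumed between-study SD
t_crit = 2.228                        # 97.5% t quantile, df = k - 2 = 10

half_width = t_crit * math.sqrt(tau**2 + se**2)
pi = (smd - half_width, smd + half_width)
print(f"95% prediction interval ~ ({pi[0]:.2f}, {pi[1]:.2f})")
```

Even with a clearly positive pooled estimate, a prediction interval this wide spans negative values: a future study in a new setting could plausibly find no effect, which is exactly the uncertainty the review flags.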

The clinical implication is that AI applications in undergraduate health professions education should be used cautiously and on a trial basis. The findings suggest potential benefits for student satisfaction, confidence, and theoretical knowledge, but these are surrogate outcomes. Clinicians and educators should not infer causation, overstate effects given the high risk of bias, or assume improved clinical outcomes from these surrogate measures.

Key questions remain unanswered. It is not reported whether AI interventions affect real-world clinical performance or patient outcomes. The long-term durability of the observed effects is unknown. Future research should address the high risk of bias, improve study design, and assess higher-level Kirkpatrick outcomes.

Study Details

Study type: Meta-analysis
Sample size: n = 4,911
Evidence: Level 1
Published: May 2026
Original Abstract
BACKGROUND: Health professions education faces increasing challenges from rising health care complexity, pedagogical shifts, constrained curricular space, and rapidly expanding knowledge and technological advances. While artificial intelligence (AI) shows promise for transforming health professions education, evidence of its effectiveness remains unclear. OBJECTIVE: This study synthesized evidence from randomized controlled trials (RCTs) on the effectiveness of AI in undergraduate health professions education. METHODS: We included RCTs, randomized crossover trials, and cluster RCTs comparing AI against standard educational interventions at the undergraduate level. We excluded quasi-experimental studies and those without clear AI components. We searched PubMed, Cochrane, Embase, Educational Resources Information Center, and Web of Science up to January 26, 2026. Outcomes were categorized according to Kirkpatrick levels; risk of bias was assessed using the Risk Of Bias Instrument for Use in Systematic Reviews for Randomised Controlled Trials tool; random-effects meta-analysis was conducted in RevMan (Cochrane); and certainty of evidence was rated using the Grading of Recommendations, Assessment, Development, and Evaluation approach. AI interventions were subcategorized by technology type and educational functions, yielding 13 subcategories. RESULTS: Of 39,783 records identified, 66 RCTs (N=4911 participants; 2020-2026) were included. Subcategorized analyses across 7 outcome domains yielded 48 comparisons. Most studies had high risk of bias, mainly due to poor allocation concealment and blinding, and certainty of evidence ranged from low to very low.
Large language model (LLM)-based personalized learning aids comprised the largest evidence base and showed positive effects for satisfaction (standardized mean difference [SMD] 0.93, 95% CI 0.40-1.46; 7 studies; 430 participants; I²=74%), confidence (SMD 0.91, 95% CI 0.54-1.29; 7 studies; 609 participants; I²=64%), and theoretical knowledge (SMD 0.53, 95% CI 0.13-0.94; 12 studies; 955 participants; I²=86%), all with very low certainty. Other AI subtypes, including LLM content generators, natural language processing (NLP) chatbots, and non-LLM adaptive learning platforms, showed generally favorable point estimates but substantial heterogeneity and wide CIs that often included no effect. Prediction intervals frequently crossed the null, indicating uncertainty across educational settings. No studies assessed Kirkpatrick levels 3 or 4. CONCLUSIONS: This review synthesized RCT evidence on AI in undergraduate health professions education by technology type and function, incorporating evidence certainty. Despite the large number of included studies, evidence remains insufficient to inform educational practice. Some AI interventions may improve some learning outcomes, but effects are inconsistent and not reliably reproducible. High risk of bias, heterogeneity, imprecision, and absence of higher-level outcomes limit conclusions. AI applications should therefore be used cautiously and on a trial basis.