AI-augmented tutoring systems show statistically significant improvement in expert-rated surgical skill scores

AI Tutoring Shows Potential for Surgical Skill Training

AI-generated summary of the cited source, checked by automated accuracy review.

Key Takeaway
AI tutoring shows a statistically significant improvement in OSATS scores, but the small effect size limits its clinical significance.

This meta-analysis synthesized data from 268 participants to compare AI-augmented tutoring systems against expert instruction for surgical skill acquisition. The analysis focused on performance scores (ICEMS) and skill acquisition metrics (OSATS), as well as secondary outcomes like cognitive load.

The study found a statistically significant improvement in expert-rated OSATS scores for the AI tutoring group (MD 0.20; 95% CI, 0.01 to 0.39). However, no significant difference was observed in ICEMS scores. Notably, the AI group experienced significantly higher extraneous cognitive load (MD 0.23; p = 0.01) compared to the expert instruction group.
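As a rough sanity check on the OSATS result, the reported interval can be converted back into a standard error and an approximate p-value. The sketch below assumes the interval was built with a normal approximation (estimate ± 1.96 × SE), which the abstract does not state. The function names are illustrative and do not come from the source.

```python
import math

# Two-sided 95% critical value of the standard normal distribution.
Z_95 = 1.959964


def se_from_ci(lower, upper, z_crit=Z_95):
    """Standard error implied by a symmetric normal-approximation CI (assumption)."""
    return (upper - lower) / (2 * z_crit)


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


# Values reported for the OSATS comparison: MD 0.20; 95% CI, 0.01 to 0.39.
md, ci_lower, ci_upper = 0.20, 0.01, 0.39

se = se_from_ci(ci_lower, ci_upper)
z = md / se
p = two_sided_p(z)

print(f"implied SE = {se:.3f}, z = {z:.2f}, approximate p = {p:.3f}")
```

Under these assumptions the implied p-value lands just below 0.05, which is what an interval whose lower bound sits barely above zero would suggest. It says nothing about whether a 0.20-point difference matters in practice.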

Several limitations were identified, including low certainty of evidence for OSATS scores and a reliance on a single high-risk-of-bias study for that specific finding. Furthermore, the small OSATS advantage of 0.20 is of uncertain clinical significance as it falls below common competency cut-points. These findings suggest AI tutoring systems may offer comparable effectiveness to expert instruction in simulated environments, but they do not currently support replacing human instructors.

Researchers looked at how learners pick up surgical skills under two different methods: instruction from human experts and guidance from AI-augmented tutoring systems. The analysis pooled 268 participants from four studies, all practicing these skills in a simulated setting.

The results showed that the group using AI tutoring had slightly higher scores on one specific skill assessment compared to those with expert instructors. However, both groups performed similarly on other performance measures. One important finding was that students using the AI system reported a higher amount of mental effort during their training.

The evidence for these findings is not very strong: the small advantage for the AI group was driven by a single study with a high risk of bias. The improvement may not be large enough to change how surgical training is currently taught. These results suggest that while AI can be a helpful tool, it is not currently recommended as a replacement for human instructors.

What this means for you:
AI tutoring shows some promise for skill training but current evidence is too limited to replace expert instruction.

Common questions

Is AI tutoring as effective as a human teacher?

The study found that AI-augmented tutoring systems were comparable to expert instruction for learning surgical skills. While the AI group showed a small improvement in one specific scoring category, the overall results suggest that AI is currently seen as a tool rather than a replacement for human experts.

Does using an AI system make the training harder?

Yes, the study found that students who used the AI-augmented tutoring systems reported significantly higher levels of extraneous cognitive load. Extraneous cognitive load is the mental effort spent on how the training is delivered rather than on the skill itself, so these students had to work harder to process the instruction than those taught by human experts.

Can surgeons use AI instead of humans for training?

The study does not support replacing human instructors with AI systems at this time. Because the evidence is limited and the improvements in some areas were small, current findings suggest that AI should be used as a supplement to, rather than a substitute for, expert instruction.

Study Details

Study type: Meta-analysis
Sample size: n = 268
Evidence: Level 1
Published: Jun 2026
Original Abstract
BACKGROUND: Surgical training suffers from a global deficit; 5 billion people lack access to safe surgery, with an estimated 143 million additional procedures needed annually. Traditional surgical education, constrained by the apprenticeship model, faces critical limitations in standardization and scalability, particularly in low- and middle-income countries where expert mentors are scarce. AI-augmented tutoring systems represent a potentially transformative solution. This systematic review and meta-analysis were conducted to address that evidence gap.

METHODS: Following PRISMA 2020 guidelines, we systematically searched major databases for trials comparing AI tutoring with expert instruction. Primary outcomes were performance (Intelligent Continuous Expertise Monitoring System [ICEMS] score) and skill acquisition (Objective Structured Assessment of Technical Skills [OSATS] score). Cognitive load was a secondary outcome, measured using the Mental Effort Scale (MES) and Cognitive Load Index (CLI).

RESULTS: Our search yielded 40 studies for narrative synthesis. Four studies (3 RCTs and 1 pilot prospective study), encompassing a total of 268 participants, were included in the meta-analysis. AI tutoring showed a small, statistically significant improvement in expert-rated OSATS scores (MD 0.20; 95% CI, 0.01 to 0.39) with no heterogeneity (I² = 0%) but with low certainty. The AI group reported a significantly higher extraneous cognitive load (MD 0.23; p = 0.01). No significant difference was found in ICEMS scores.

CONCLUSION: AI tutoring systems demonstrated comparable effectiveness to expert instructors in simulated surgical skill acquisition. The small OSATS advantage (0.20 points) is of uncertain clinical significance, falling below commonly published competency cut-points for meaningful change on global rating scales, and is based on low-certainty evidence driven by a single high-risk-of-bias study. AI tutoring imposed a higher extraneous cognitive load. These findings do not support replacing human instructors with AI. Instead, the evidence supports a hybrid model, though this itself requires rigorous empirical validation.