Systematic review finds LLMs show promise for ophthalmic education in primary care but lack real-world validation

AI shows promise for eye care training but needs real-world testing

AI-generated summary of the cited source, checked by automated accuracy review.

Key Takeaway
Consider LLMs as potential cognitive apprenticeship tools in ophthalmic education, but recognize the current lack of real-world validation and significant safety risks.

This systematic review evaluated studies of large language models (LLMs) in ophthalmic education, training, assessment, and primary care–relevant clinical support. The evidence base is dominated by vignette-based benchmarks, comparative scoring studies, and evaluations in limited-sample or controlled settings. Prospective real-world validation using learner transfer, clinical behavior change, workflow impact, or patient outcomes remains scarce.

LLM applications identified include serving as 'cognitive apprenticeship' partners for clinical reasoning, with specific uses in triage and reasoning drills, virtual patient interviewing, and support for structured referrals, documentation, and patient education. Most studies benchmarked LLM outputs against expert consensus or guidelines, though scoring rubrics and reference standards varied widely. Some reports noted that adding clinical photographs to prompts could reduce model accuracy.

Key limitations include significant heterogeneity across studies, rapid model iteration creating reproducibility challenges, multimodal instability, and safety risks such as hallucination, bias, and automation bias. Adverse events, tolerability, and discontinuation data were not reported. The review concludes that with clearly defined task scopes and robust safeguards, LLMs may improve the accessibility and efficiency of primary care ophthalmic education, but should augment rather than replace expert judgment.

Researchers reviewed studies about using AI language models to help train primary care doctors in eye care. These studies examined whether AI could help with education, clinical reasoning practice, and patient support tasks. The review focused on research involving primary care physicians, though specific study sizes weren't reported.

The main finding was that AI models could act as practice partners for doctors, helping with skills like patient triage reasoning, virtual interviews, and creating educational materials. However, nearly all the evidence comes from controlled tests using written medical scenarios, not from real clinics with actual patients. Some reports even noted that adding real patient photos to the AI's analysis sometimes made it less accurate.

There are important reasons to be cautious. The studies varied widely in how they tested the AI, making results hard to compare. The AI technology changes very quickly, so today's findings might not apply tomorrow. Significant safety risks exist, including the AI 'hallucinating' or making up incorrect medical information, showing bias, or causing doctors to rely on it too much without checking its work.

Readers should understand this research shows early potential, not proven effectiveness. For now, these AI tools might one day help make eye care training more accessible, but they should only support—not replace—a doctor's expert judgment. Much more testing in real-world medical settings is needed before we know how safe and useful they truly are.

What this means for you:
AI may help train doctors in eye care, but it's still experimental and needs real-world testing for safety.

Study Details

Study type: Meta analysis
Evidence: Level 1
Published: Mar 2026
Original Abstract
Background: Primary care physicians (PCPs) are a critical first contact for eye-health screening, risk stratification, and referral, yet ophthalmology training and point-of-care support in primary care remain insufficient. Recent advances in generative artificial intelligence (generative AI), particularly large language models (LLMs), may help address these gaps through conversational, scenario-based learning and structured feedback. However, the educational effectiveness, reproducibility, and safety boundaries of LLM-enabled tools in primary care ophthalmology remain unclear.

Methods: We conducted a systematic review of studies evaluating or applying LLMs in ophthalmic education, training, assessment, or primary care–relevant clinical support. PubMed, Web of Science Core Collection, and Scopus were searched from January 1, 2020 to December 31, 2025 using combined terms related to LLMs/generative AI, ophthalmology, and education or assessment. Citation chaining was also performed to reduce omission. Two reviewers independently screened records and extracted data.

Results: The evidence base is dominated by vignette-based benchmarks, comparative scoring studies, and evaluations conducted in limited-sample or controlled settings; prospective real-world validation using learner transfer, clinical behavior change, workflow impact, or patient outcomes remains scarce. Across studies, LLMs can serve as “cognitive apprenticeship” partners by externalizing clinical reasoning and enabling repeated practice in key-feature extraction, differential diagnosis, risk stratification, and referral-threshold decisions. Applications include triage/reasoning drills, virtual patient interviewing, and support for structured referrals, documentation, and patient education, often strengthened by retrieval-augmented generation. Most studies benchmarked outputs against expert consensus or guidelines, but scoring rubrics and reference standards varied widely, limiting cross-study comparability. Some reports noted that adding clinical photographs could reduce accuracy, suggesting current multimodal models are better suited for history-based reasoning than fine-grained image interpretation. Limitations include heterogeneity, rapid model iteration, reproducibility challenges, multimodal instability, and safety risks such as hallucination, bias, and automation bias. Commonly recommended safeguards include retrieval grounding, source attribution, red-flag checklists, and human-in-the-loop review.

Conclusion: With clearly defined task scopes and robust safeguards, LLMs may improve the accessibility and efficiency of primary care ophthalmic education, but should augment rather than replace expert judgment. Future work should prioritize pragmatic multicenter trials, mixed-method implementation studies, and standardized cross-lingual evaluations to define safe and effective implementation pathways.