Review of a multimodal deep learning model for esophageal squamous cell carcinoma diagnosis
This is a review and synthesis of a model development and validation study in esophageal squamous cell carcinoma. It covers the performance of MUMA-EDx, a multimodal deep learning model that integrates magnifying endoscopy and endoscopic ultrasound (EUS) imaging. The authors report findings from a retrospective cohort of 358 patients and a prospective cohort of 122 patients. For tumor discrimination, the area under the curve (AUC) was 0.94 (95% CI 0.92-0.96) in retrospective validation, and the patient-level AUC was 1.00 (95% CI 1.00-1.00) in prospective testing. For multiclass classification of invasion depth, the AUC was 0.95 (95% CI 0.88-0.99) in the retrospective cohort and 0.80 (95% CI 0.67-0.87) in the prospective cohort. The model was compared with single-modality models and with novice- and expert-level diagnostic performance. The authors note limitations, including the need for external validation and for assessment in real-world clinical settings. Practice relevance is limited for now: the model is not ready for routine clinical use without further evidence.