Review of multimodal deep learning model for esophageal squamous cell carcinoma diagnosis

Digestive Endoscopy: official journal of the Japan Gastroenterological Endoscopy Society
Published May 1, 2026 · Medically reviewed May 1, 2026
Study authors: Yu Chu-Ting, Wang Ting-Lu, Gao Ye, Wu Zhi-Han, Chen Ying-Zhou, Shi Lei, Liu Biao, Zhang Hui, Xu Hong…
By Dr. Lars van Dijk, PhD · Surgical, Procedural & Diagnostic

Key Takeaway

Consider the multimodal model's high diagnostic performance but recognize the need for further validation before clinical adoption.

This is a summary of a model development and validation study for esophageal squamous cell carcinoma (ESCC). It covers the performance of a multimodal deep learning model (MUMA-EDx) that combines deep learning analysis of magnifying endoscopy and endoscopic ultrasonography (EUS) images. The model was developed and validated on a retrospective cohort of 358 patients and then tested on a prospective cohort of 122 patients. For tumor discrimination, it reached an AUC of 0.94 (95% CI 0.92-0.96) in retrospective validation and a patient-level AUC of 1.00 (95% CI 1.00-1.00) in prospective testing. For multiclass invasion depth classification, the retrospective AUC was 0.95 (95% CI 0.88-0.99) and the prospective AUC was 0.80 (95% CI 0.67-0.87). The multimodal model outperformed single-modality models; on invasion depth classification it exceeded novice endoscopists and was comparable to experts. The authors note limitations, including the need for external validation and assessment in real-world clinical settings. Practice relevance is limited for now: the model is not ready for routine clinical use without further evidence.
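As a rough illustration of what a patient-level AUC measures, the sketch below averages per-image scores within each patient and computes the AUC as the share of positive/negative pairs that are ranked correctly (ties count half). The averaging rule, the function names, and the toy scores are assumptions made for illustration; the abstract does not state how image-level outputs were aggregated per patient.

```python
from collections import defaultdict


def patient_level_scores(image_scores):
    """Average image-level scores per patient (assumed aggregation rule)."""
    grouped = defaultdict(list)
    for patient_id, score in image_scores:
        grouped[patient_id].append(score)
    return {pid: sum(s) / len(s) for pid, s in grouped.items()}


def auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly; ties count 0.5."""
    pos = [scores[k] for k in scores if labels[k] == 1]
    neg = [scores[k] for k in scores if labels[k] == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (made up, not from the study): (patient_id, model score for one image)
toy_images = [
    ("A", 0.9), ("A", 0.7),
    ("B", 0.6), ("B", 0.8),
    ("C", 0.2), ("C", 0.4),
    ("D", 0.75), ("D", 0.1),
]
toy_labels = {"A": 1, "B": 1, "C": 0, "D": 0}  # 1 = tumor, 0 = no tumor

# Patient-level AUC: one averaged score per patient.
toy_patient_scores = patient_level_scores(toy_images)
toy_patient_auc = auc(toy_patient_scores, toy_labels)

# Image-level AUC on the same data, treating every image as its own case.
image_scores_by_idx = {i: s for i, (_, s) in enumerate(toy_images)}
image_labels_by_idx = {i: toy_labels[pid] for i, (pid, _) in enumerate(toy_images)}
toy_image_auc = auc(image_scores_by_idx, image_labels_by_idx)
```

In this toy data the image-level AUC is 0.875 while the patient-level AUC is 1.0, because one high-scoring image from a negative patient is averaged away. This is one reason patient-level and image-level figures should not be compared directly.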

Study Details

Sample size: n = 358

Evidence: Level 5

Published: May 2026

PMID: 41468904

Original Abstract

BACKGROUND: Early detection of esophageal squamous cell carcinoma (ESCC) is critical for optimizing patient outcomes. Magnifying endoscopy and endoscopic ultrasonography (EUS) serve as established diagnostic modalities. The multimodal ultrasound and magnifying endoscopic algorithm for early ESCC diagnostics (MUMA-EDx) integrates deep learning-based magnifying endoscopy and EUS imaging to improve early-stage ESCC identification and invasion depth assessment.

METHODS: Model development and internal validation used a retrospective dataset; external validation used a prospective cohort. MUMA-EDx developed two TResNet_m-based classifiers (magnifying endoscopy/EUS) followed by feature-level fusion. Model performance was evaluated using area under the receiver operating characteristic curve (AUROC), accuracy, sensitivity, specificity, positive predictive value, and negative predictive value.

RESULTS: MUMA-EDx was developed and validated using a retrospective dataset comprising 358 patients (18 420 images) and subsequently tested prospectively on an independent cohort of 122 patients (8711 images). The feature-level multimodal approach significantly outperformed single-modality models. For tumor discrimination, the model achieved an AUC of 0.94 (95%CI 0.92-0.96) in retrospective validation and a perfect patient-level AUC of 1.00 (95%CI 1.00-1.00) in prospective testing. For the more complex task of multiclass invasion depth classification, it achieved a retrospective AUC of 0.95 (95%CI 0.88-0.99), which remained strong at 0.80 (95%CI 0.67-0.87) in the prospective cohort. In a comparative study on invasion depth classification, MUMA-EDx's performance exceeded that of novice endoscopists and was comparable to expert-level diagnostics.

CONCLUSION: MUMA-EDx demonstrably delivers exceptional early ESCC detection and robust invasion depth classification, achieving performance comparable to expert endoscopists and is poised to significantly enhance diagnostic precision and patient outcomes.
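
The Methods mention two TResNet_m-based classifiers followed by feature-level fusion. A minimal sketch of that general idea follows, assuming fusion means concatenating the two branches' feature vectors before a single linear classification head. The abstract does not specify the fusion operator, the feature sizes, the head, or the number of invasion depth classes, so every name, weight, and number below is hypothetical, and the image encoders themselves are replaced by short stand-in lists.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_features(me_features, eus_features):
    """Feature-level fusion, assumed here to be plain concatenation."""
    return list(me_features) + list(eus_features)


def classify(fused, weights, biases):
    """One linear layer over the fused vector, then softmax over classes."""
    if any(len(row) != len(fused) for row in weights):
        raise ValueError("each weight row must match the fused feature length")
    logits = [sum(w * x for w, x in zip(row, fused)) + b
              for row, b in zip(weights, biases)]
    return softmax(logits)


# Hypothetical stand-ins for encoder outputs (real feature vectors are far larger).
me_features = [0.2, 1.0]    # magnifying endoscopy branch
eus_features = [0.5, -0.3]  # EUS branch

# Hypothetical weights for a 3-class head; the class count is an assumption.
weights = [
    [1.0, 0.0, 0.5, 0.0],
    [0.0, 1.0, 0.0, 0.5],
    [0.5, 0.5, 0.5, 0.5],
]
biases = [0.0, 0.0, 0.0]

fused = fuse_features(me_features, eus_features)
probs = classify(fused, weights, biases)
predicted_class = max(range(len(probs)), key=probs.__getitem__)
```

The point of the design is that a single head sees both modalities' features at once. A decision-level alternative would instead combine the two classifiers' output probabilities, for example by averaging them.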
