Deep learning AI models for lung nodule malignancy classification show 88% sensitivity and 75% specificity

Artificial intelligence helps identify lung cancer in chest CT scans

Radiology. Artificial intelligence
Published July 4, 2026
Study authors: Asmara Oke Dimas, Steenhuis Eline G M, de Jong Kim, Joseph Anita, Timmer Marjan, Tenda Eric Daniel, …
Editorial oversight: Dr. Amelia Tan, PhD · Internal Medicine & Chronic Disease

AI-generated summary of the cited source, checked by automated accuracy review.

Key Takeaway

Externally tested deep learning models show high sensitivity but only moderate specificity for classifying lung nodules as malignant on chest CT.

This meta-analysis evaluated the diagnostic accuracy of externally tested, deep learning-based artificial intelligence models for classifying lung nodules on chest CT. The analysis included 7454 nodules across 21 studies; most studies involved Asian populations (17 of 21) and nonscreening populations (16 of 21).

The primary findings indicate that these AI models achieve a sensitivity of 88% and a specificity of 75%. The reported Area Under the Receiver Operating Characteristic Curve (AUC) was 0.89, with a Positive Likelihood Ratio of 3.55 and a Negative Likelihood Ratio of 0.16. A Diagnostic Odds Ratio (DOR) of 22.4 was observed.
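The likelihood ratios and the diagnostic odds ratio are closely tied to sensitivity and specificity. The sketch below is a minimal illustration of those textbook relationships, not the authors' bivariate random-effects computation; the function name `summary_ratios` is hypothetical. Because the inputs are rounded pooled point estimates, the results land near, but not exactly on, the reported 3.55, 0.16, and 22.4.

```python
def summary_ratios(sensitivity: float, specificity: float) -> dict:
    """Derive likelihood ratios and the diagnostic odds ratio.

    Both inputs are fractions between 0 and 1 (hypothetical helper,
    not part of the cited study).
    """
    lr_pos = sensitivity / (1.0 - specificity)   # LR+ = sens / (1 - spec)
    lr_neg = (1.0 - sensitivity) / specificity   # LR- = (1 - sens) / spec
    dor = lr_pos / lr_neg                        # DOR = LR+ / LR-
    return {"lr_pos": lr_pos, "lr_neg": lr_neg, "dor": dor}


# Rounded pooled estimates from the meta-analysis: 88% and 75%.
ratios = summary_ratios(0.88, 0.75)
print({name: round(value, 2) for name, value in ratios.items()})
# prints {'lr_pos': 3.52, 'lr_neg': 0.16, 'dor': 22.0}
```

The small gaps from the published figures are expected, since the pooled ratios come from the fitted model rather than from the rounded percentages.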

Model architecture was associated with specificity: studies reporting two-dimensional or three-dimensional convolutional neural network (CNN) architectures showed higher specificity (82%-83%) than studies that did not report an architecture (58%). However, the authors noted significant limitations, including high heterogeneity (above 90%), inconsistent reporting across studies, and high risk of bias in patient selection (five studies) and index testing (two studies).

Clinically, these AI models demonstrate high sensitivity but only moderate specificity. This suggests they may be useful as rule-out tools for malignancy in lung nodules, though the high heterogeneity limits the certainty of these findings.
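To see why a negative likelihood ratio of 0.16 points toward a rule-out role, one can convert a pre-test probability of malignancy into a post-test probability after a negative AI result. The sketch below applies the standard odds form of Bayes' theorem. The function name `post_test_probability` and the choice of pre-test values are assumptions made for illustration: 7.4% is the 12-month malignancy rate for new solid nodules cited in this summary, and 5.7% and 91.5% are the lowest and highest cancer prevalences among the included studies. It ignores the heterogeneity and uncertainty around the pooled estimate.

```python
def post_test_probability(pre_test_prob: float, likelihood_ratio: float) -> float:
    """Update a probability with a likelihood ratio via odds (hypothetical helper)."""
    pre_odds = pre_test_prob / (1.0 - pre_test_prob)   # probability -> odds
    post_odds = pre_odds * likelihood_ratio            # Bayes' theorem in odds form
    return post_odds / (1.0 + post_odds)               # odds -> probability


NEGATIVE_LR = 0.16  # pooled negative likelihood ratio reported in the meta-analysis

# Illustrative pre-test probabilities (assumed for this example, see the text above).
for pre in (0.057, 0.074, 0.915):
    post = post_test_probability(pre, NEGATIVE_LR)
    print(f"pre-test {pre:.1%} -> post-test after a negative result {post:.1%}")
```

Under these assumptions, a negative result brings a low pre-test probability down to roughly 1%, but leaves the probability above 60% when cancer is very common in the tested group. The rule-out value therefore depends heavily on the clinical setting.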

How this fits prior evidence

This meta-analysis addresses a gap in quantifying the diagnostic accuracy of deep learning AI for lung nodule classification. It provides specific performance metrics that complement existing knowledge on lung cancer screening, which identifies new solid nodules with a 7.4% malignancy rate at 12 months. While the current finding highlights high sensitivity (88%) and moderate specificity (75%), it does not replace established clinical thresholds or risk assessments.

When a patient gets a chest CT scan and a small spot, or nodule, is found, it creates immediate anxiety. Doctors must figure out if that spot is harmless or a sign of lung cancer. This study looked at how artificial intelligence (AI) models perform when trying to tell the difference between benign and malignant nodules.

The researchers analyzed 7,454 nodules across 21 studies. They found that these AI models have high sensitivity, meaning they are good at catching potential cancer cases. However, they were only moderately accurate at recognizing non-cancerous spots. The study showed a sensitivity of 88% and a specificity of 75%.

While these tools show promise as a way to help rule out cancer, there are important caveats. The data come from a wide variety of studies with inconsistent reporting and high variation between results. Because of this, the AI is not a final diagnostic tool but rather one piece of information for doctors to use when making decisions.

What this means for you:

AI models show high sensitivity for finding lung cancer in scans but are only moderately accurate at recognizing nodules that are not cancer.

Common questions

How accurate is the AI at finding lung cancer?

The study found that these artificial intelligence models had a sensitivity of 88% for identifying malignancy. This means they are quite good at catching potential cases. However, their specificity was lower at 75%, which means about one in four benign nodules was still flagged as possibly cancerous.
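As a rough illustration of what an 88% sensitivity and a 75% specificity mean in practice, the sketch below applies them to a hypothetical group of 1,000 nodules. The 10% share of malignant nodules and the helper name `expected_counts` are assumptions chosen only for this example, not figures from the study.

```python
def expected_counts(n_nodules: int, prevalence: float,
                    sensitivity: float, specificity: float) -> dict:
    """Expected classification counts for a hypothetical group of nodules."""
    malignant = round(n_nodules * prevalence)
    benign = n_nodules - malignant
    true_pos = round(malignant * sensitivity)   # cancers the model flags
    false_neg = malignant - true_pos            # cancers the model misses
    true_neg = round(benign * specificity)      # benign nodules correctly cleared
    false_pos = benign - true_neg               # benign nodules flagged anyway
    return {
        "malignant": malignant, "benign": benign,
        "true_pos": true_pos, "false_neg": false_neg,
        "true_neg": true_neg, "false_pos": false_pos,
    }


# Assumed example: 1,000 nodules, 10% malignant, pooled 88% / 75% accuracy.
counts = expected_counts(1000, 0.10, 0.88, 0.75)
print(counts)
```

Under these assumed numbers, 88 of the 100 cancers would be flagged and 12 missed, while 225 of the 900 benign nodules would be flagged even though they are harmless.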

Does the type of AI model make a difference?

The study looked at model architecture. Studies that reported using convolutional neural network (CNN) architectures had higher specificity, reaching 82% to 83%, compared with 58% in studies that did not report which architecture was used.

Can doctors rely on AI as the final word for a diagnosis?

No, these AI models are not definitive diagnostic tools. Because the study showed high variation in results and some risks of bias in how patients were chosen, they should be used as one part of the clinical picture rather than a final answer.

Study Details

Study type: Meta-analysis

Evidence: Level 1

Follow-up: 24.0 mo

Published: Jul 2026

PMID: 42233760

Original Abstract

Purpose: To evaluate the pooled diagnostic accuracy of externally tested artificial intelligence (AI) models for malignancy classification of lung nodules at chest CT.

Materials and Methods: A systematic search of PubMed, Embase, Web of Science, Cumulative Index to Nursing and Allied Health Literature, and the Cochrane Library was performed in March 2023 and updated on January 12, 2025, to identify studies evaluating AI models for malignancy classification of lung nodules at chest CT using pathology and/or at least 2-year follow-up as reference standards. Risk of bias was assessed using the Quality Assessment of Diagnostic Accuracy Studies-2 tool, and pooled sensitivity and specificity were estimated using bivariate random-effects models.

Results: Twenty-one studies including 7454 nodules were analyzed, with lung cancer prevalence ranging from 5.7% (17 of 297) to 91.5% (214 of 234). All models were based on deep learning; 17 of the 21 studies (81%) involved Asian populations, 16 (76%) used nonscreening populations, 14 (67%) reported two-dimensional or three-dimensional convolutional neural network (CNN) architectures, and eight (38%) specified predefined malignancy thresholds. High risk of bias was identified in five studies for patient selection and in two for index testing. Pooled sensitivity was 88%, pooled specificity was 75%, positive likelihood ratio was 3.55, negative likelihood ratio was 0.16, area under the receiver operating characteristic curve was 0.89, and the diagnostic odds ratio was 22.4. Heterogeneity was high (I² > 90%). Model architecture was associated with specificity, with higher values in studies reporting two-dimensional or three-dimensional CNNs compared with those without reported architecture (82%-83% vs 58%, P = .03; meta-regression P = .02); other subgroup analyses showed no evidence of differences.

Conclusion: Externally tested AI models demonstrated high sensitivity but moderate specificity for malignancy classification of lung nodules at chest CT, supporting a potential role in rule-out strategies. However, substantial heterogeneity, inconsistent reporting, and risk of bias limit interpretation.

Keywords: Lung Cancer, Artificial Intelligence, Malignancy Classification, External Validation, Meta-Analysis

© RSNA, 2026

See also commentary by Bressem and Kim in this issue.
