
Machine learning models predict noninvasive support failure in acute respiratory failure with moderate accuracy

Key Takeaway
Treat machine learning models for predicting noninvasive respiratory support failure as investigational, because the certainty of evidence is very low.

A systematic review and meta-analysis evaluated machine learning-based prediction models for forecasting failure of noninvasive respiratory support in adults with acute respiratory failure. Failure was primarily defined as endotracheal intubation. The analysis included 14 cohort studies comprising 34,500 patients; specific study settings and comparators were not reported. The primary outcome was discriminative performance, measured by the area under the receiver operating characteristic curve (AUC).

The main finding was a descriptive pooled AUC of 0.84 (95% CI, 0.78–0.89), indicating moderate discriminatory ability for predicting noninvasive support failure. Subgroup analyses found no statistically significant differences by validation strategy or by type of noninvasive respiratory support. Safety and tolerability data for the models were not reported in the meta-analysis.

Key limitations severely constrain interpretation. The evidence exhibited extreme statistical heterogeneity (I² = 99.5%), had wide prediction intervals, and all included studies were rated at high risk of bias. The authors concluded the certainty of evidence is very low.
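
To make the heterogeneity statistic concrete, here is a minimal Python sketch of how I² is conventionally derived from Cochran's Q. The function name and the toy inputs are illustrative assumptions; they are not data from the review, and the review's own software and settings are not described here.

```python
def i_squared(effects, variances):
    """Return (Q, I2) for study effects with known within-study variances.

    I2 = max(0, (Q - df) / Q): the share of total variability attributed
    to between-study differences rather than to sampling error.
    """
    # Inverse-variance weights: more precise studies count for more.
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, i2


# Hypothetical logit-scale effects, NOT values from the review.
q, i2 = i_squared([0.0, 1.0], [0.01, 0.01])
print(f"Q = {q:.1f}, I2 = {i2:.1%}")
```

Under this definition, an I² of 99.5% corresponds to a Q roughly 200 times its degrees of freedom, which is why a single pooled estimate says little about how a model would perform in any one setting.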

Due to these substantial limitations and the associative nature of the data from cohort studies, the review authors explicitly state the findings preclude clinical implementation. The models represent an area of research interest but require rigorous external validation and testing in prospective studies before any clinical application can be considered.

Study Details

Study type: Meta-analysis
Evidence: Level 1
Published: Apr 2026
Original Abstract
Background: Early identification of noninvasive respiratory support (NIRS) failure in acute respiratory failure (ARF) is clinically relevant, as delayed intubation is associated with worse outcomes. Machine learning-based prediction models have been proposed to support escalation decisions, but their performance and reliability remain uncertain.

Objective: To systematically evaluate the discriminative performance of machine learning-based models for predicting NIRS failure in adults with ARF.

Methods: We conducted a systematic review and meta-analysis following PRISMA 2020 guidelines and registered the protocol in PROSPERO (CRD420251167330). PubMed, Web of Science, and Scopus were searched from January 2010 to the final search date. Cohort studies developing or validating machine learning models to predict NIRS failure, primarily defined as endotracheal intubation, were included. Discrimination was assessed using the area under the receiver operating characteristic curve (AUC). Logit-transformed AUCs were synthesized using random-effects models with restricted maximum likelihood estimation and Hartung–Knapp confidence intervals. Risk of bias and certainty of evidence were assessed using PROBAST-AI and GRADE, respectively.

Results: Fourteen cohort studies comprising 34,500 patients were included. The descriptive pooled AUC was 0.84 (95% CI, 0.78–0.89) with extreme heterogeneity (I² = 99.5%) and wide prediction intervals. Subgroup analyses showed no statistically significant differences by validation strategy or type of noninvasive respiratory support. All studies were rated at high risk of bias, and the certainty of evidence was very low.

Conclusions: Machine learning-based models demonstrate moderate discrimination; however, extreme heterogeneity, high risk of bias, and very low certainty of evidence preclude clinical implementation.

Registration: https://www.crd.york.ac.uk/PROSPERO/view/CRD420251167330.
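
The pooling step described in the abstract can be sketched in Python as follows. This is a simplified illustration, not the authors' analysis: it swaps in the closed-form DerSimonian-Laird estimator for restricted maximum likelihood and a normal-approximation interval for the Hartung–Knapp adjustment, so that it runs with the standard library alone. The function names, the assumption that each study supplies a standard error on the logit scale, and the example inputs are all hypothetical.

```python
import math
from statistics import NormalDist


def logit(p):
    return math.log(p / (1.0 - p))


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def pool_logit_auc(aucs, ses_logit, level=0.95):
    """Random-effects pooling of AUCs on the logit scale.

    Simplifications (assumptions, not the review's method):
    DerSimonian-Laird tau^2 instead of REML, and a normal-based
    interval instead of Hartung-Knapp.
    Returns (pooled_auc, ci_low, ci_high, tau2).
    """
    y = [logit(a) for a in aucs]          # transform to unbounded scale
    v = [s ** 2 for s in ses_logit]       # within-study variances
    w = [1.0 / vi for vi in v]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1.0 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    # Back-transform so the estimate and interval stay inside (0, 1).
    return expit(mu), expit(mu - z * se), expit(mu + z * se), tau2


# Hypothetical study-level inputs, NOT data from the review.
auc, lo, hi, tau2 = pool_logit_auc([0.75, 0.85, 0.90], [0.1, 0.1, 0.1])
print(f"pooled AUC = {auc:.2f} ({lo:.2f} to {hi:.2f}), tau^2 = {tau2:.3f}")
```

Working on the logit scale keeps the back-transformed confidence limits between 0 and 1, which a direct average of AUCs would not guarantee. A Hartung–Knapp interval, as used in the review, would generally be wider than the normal-based interval shown here when there are few studies.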