This multicenter retrospective cohort study developed and validated an interpretable machine learning model to predict sepsis in patients with laboratory-confirmed multidrug-resistant Pseudomonas aeruginosa (MDR-PA) infections. The analysis included 2,001 patients from two major medical centers: 1,182 in the derivation cohort and 819 from an independent center in the external validation cohort. Seven machine learning algorithms were evaluated, and a Random Forest model performed best. The authors position the model as addressing the limitations of general sepsis scores in this pathogen-specific population; the abstract does not report a head-to-head comparison against those scores.
The primary outcome was the development of sepsis, which occurred in approximately 7% of patients across cohorts. The Random Forest model achieved an AUC of 1.000 in the SMOTE-balanced training set, 0.837 in the internal validation set, and 0.816 in the external validation set. Feature selection, which combined LASSO regression with support vector machine-recursive feature elimination (SVM-RFE), identified six predictors: calcium level, chronic obstructive pulmonary disease (COPD), red blood cell distribution width-standard deviation (RDW-SD), intra-abdominal infection, invasive catheters, and prior antibiotic exposure. SHAP analysis ranked COPD and calcium level as the largest contributors to predicted sepsis risk.
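To make the AUC figures above concrete, the following is a minimal sketch of how an AUC can be computed from predicted risks. It uses the pairwise (Mann-Whitney) definition: the share of positive/negative patient pairs in which the patient who developed sepsis received the higher score. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data or code from the study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    labels: 1 = developed sepsis, 0 = did not (hypothetical encoding).
    scores: predicted risk for each patient. Ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example only: six hypothetical patients, two of whom developed sepsis.
labels = [0, 0, 0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.70, 0.90]
print(roc_auc(labels, scores))  # 7 of 8 pairs ranked correctly -> 0.875
```

Under this definition, an AUC of 1.000 means every sepsis case in that dataset was scored above every non-case, while 0.816 means roughly 82% of such pairs were ordered correctly.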
Safety and tolerability data were not reported, and the abstract does not list study limitations. The lack of specificity of current prediction tools for drug-resistant organisms is the motivation for the study, not a limitation of it. The authors describe the model and its accompanying web-based calculator as a precise, visualizable decision-support system for optimizing early intervention strategies. However, the retrospective design, the two-center cohort, and the gap between the training AUC (1.000) and the validation AUCs (0.837 and 0.816) suggest that these findings should be viewed as preliminary evidence requiring prospective validation before routine clinical adoption.
Original Abstract
Background: Multidrug-resistant Pseudomonas aeruginosa (MDR-PA) infections present a critical healthcare challenge, often progressing to sepsis with high mortality. Current prediction tools lack specificity for drug-resistant organisms, hindering the early identification of high-risk patients. This study aimed to develop and validate an interpretable machine learning (ML) model to predict sepsis development in patients with MDR-PA infections.

Methods: We conducted a multicenter retrospective study analyzing 2,001 patients with laboratory-confirmed MDR-PA infections from two major medical centers between January 2019 and May 2025. The derivation cohort included 1,182 patients, while 819 patients from an independent center served as the external validation cohort. Feature selection was performed using a hybrid approach combining LASSO regression and support vector machine-recursive feature elimination (SVM-RFE). Seven ML algorithms were evaluated, with model interpretability enhanced via SHapley Additive exPlanations (SHAP). A web-based calculator was subsequently developed to facilitate clinical implementation.

Results: The sepsis incidence was approximately 7% across cohorts. Feature selection identified six key predictors: calcium level, chronic obstructive pulmonary disease (COPD), red blood cell distribution width-standard deviation (RDW-SD), intra-abdominal infection, invasive catheters, and prior antibiotic exposure. The Random Forest model demonstrated superior performance, achieving an AUC of 1.000 in the SMOTE-balanced training set, 0.837 in internal validation, and 0.816 in external validation. SHAP analysis highlighted COPD and calcium levels as the most significant contributors to sepsis risk.

Conclusions: This study presents the first interpretable ML model specifically tailored for predicting sepsis onset in patients with MDR-PA infections. By addressing the limitations of general sepsis scores, our validated model and accompanying web-based tool provide clinicians with a precise, visualizable decision-support system to optimize early intervention strategies.
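The Results mention a SMOTE-balanced training set. With sepsis in only about 7% of patients, SMOTE adds synthetic minority-class rows by interpolating between an existing minority sample and one of its nearest minority neighbours. The sketch below illustrates that interpolation step only. The abstract does not give the SMOTE settings used, so the function name `smote_sketch`, the neighbour count `k`, the seed, and the toy points are all assumptions made for illustration.

```python
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points from minority-class feature tuples.

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        u = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(a + u * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


# Toy minority class with two hypothetical, already-scaled features.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_points = smote_sketch(minority, n_new=5)
print(len(new_points))  # 5
```

Because the synthetic rows are built from the training data itself, the abstract ties the AUC of 1.000 to that resampled training set, and the internal and external validation AUCs are reported separately.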