A single-center retrospective cohort study evaluated a machine learning model for early prediction of carbapenem-resistant Klebsiella pneumoniae (CRKP) infection risk in 401 ICU patients with culture-confirmed K. pneumoniae infections. The model used LASSO regression for feature selection and XGBoost for classification, and its performance was compared against six other machine learning algorithms.
The authors reported that, in a validation set of 120 patients randomly held out from the same cohort, the XGBoost model achieved an AUC of 0.852 (95% CI: 0.745–0.959) and a Brier score of 0.088. Decision curve analysis indicated net clinical benefit across a wide range of risk thresholds, suggesting potential utility for guiding empirical therapy and supporting antimicrobial stewardship in the ICU.
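Net clinical benefit is the quantity plotted in a decision curve analysis. As a rough illustration of what it measures, the sketch below implements the standard net benefit formula, TP/n - (FP/n) × pt/(1 - pt), where pt is the risk threshold at which a clinician would act. The function names and the patient counts in the usage lines are hypothetical and are not taken from the study; only the cohort prevalence (63/401) comes from the abstract.

```python
def net_benefit(tp: int, fp: int, n: int, threshold: float) -> float:
    """Net benefit of acting on patients flagged at a given risk threshold.

    True positives count as benefit; false positives are penalized by the
    odds of the threshold, pt / (1 - pt).
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    return tp / n - (fp / n) * (threshold / (1.0 - threshold))


def treat_all_net_benefit(prevalence: float, threshold: float) -> float:
    """Net benefit of the 'treat everyone' reference strategy."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    return prevalence - (1.0 - prevalence) * (threshold / (1.0 - threshold))


# Hypothetical counts for illustration only (not reported by the study):
# 100 patients, 12 true positives and 10 false positives at a 20% threshold.
model_nb = net_benefit(tp=12, fp=10, n=100, threshold=0.20)

# Reference strategy at the cohort prevalence reported in the abstract.
treat_all_nb = treat_all_net_benefit(prevalence=63 / 401, threshold=0.20)

print(f"model: {model_nb:.3f}, treat all: {treat_all_nb:.3f}")
```

A model is useful at a given threshold when its net benefit exceeds both the treat-all value and zero, which is the net benefit of treating no one.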
Key limitations noted by the authors include the single-center retrospective design and the need for future multicenter prospective studies for validation. The study does not report clinical outcomes such as mortality or treatment success, and the model is intended for early risk prediction prior to susceptibility reports, not for definitive diagnosis.
In clinical practice, the model shows promise for risk stratification, but it should be applied cautiously until it has been validated further. Because the study is observational, the predictors it identifies are associated with carbapenem resistance risk but cannot be interpreted as causes.
Original Abstract
Background: Carbapenem-resistant Klebsiella pneumoniae (CRKP) infections in intensive care units are associated with poor outcomes. The delay in obtaining culture-based susceptibility results often forces clinicians to choose between under-treatment and overtreatment with empirical antibiotics. A reliable early risk assessment using only standard clinical data could help address this challenge.

Methods: This single-center retrospective cohort study included 401 ICU patients with culture-confirmed K. pneumoniae infections (January 2022 to January 2025). Patients were randomly allocated to training (n = 281) and validation (n = 120) sets. Predictors extracted from electronic health records comprised demographics, severity scores (APACHE II, SOFA), comorbidities, invasive procedures, inflammatory markers, specimen type, history of multidrug-resistant (MDR) infection, and antibiotic exposure within the preceding 90 days. Feature selection was performed using Least Absolute Shrinkage and Selection Operator (LASSO) regression on the training set. The selected features were used to develop an XGBoost model, whose performance was compared against six other machine learning algorithms (logistic regression, random forest, etc.). Model discrimination was evaluated using the area under the receiver operating characteristic curve (AUC), calibration with Brier scores and calibration curves, and clinical utility with decision curve analysis. SHapley Additive exPlanations (SHAP) values were employed to interpret the model.

Results: CRKP isolates accounted for 15.7% (63/401) of cases. LASSO regression identified nine predictors: procalcitonin (PCT), specimen type, prior MDR infection, prior carbapenem exposure, history of stroke, APACHE II score, white blood cell count, age, and hemoglobin. In the independent validation set, the XGBoost model achieved an AUC of 0.852 (95% CI: 0.745–0.959), with a sensitivity of 0.737, specificity of 0.891, accuracy of 0.867, and an F1-score of 0.636. The model demonstrated good calibration (Brier score: 0.088) and provided a net clinical benefit across a wide range of risk thresholds. SHAP analysis highlighted PCT, specimen source (blood), and prior resistance-related exposures as the most influential predictors.

Conclusion: The integration of LASSO feature selection with the XGBoost algorithm, utilizing only routine clinical data, generates a reliable early-warning model for CRKP infection risk prior to the availability of susceptibility reports. This tool shows promising discriminative ability and calibration, offering potential to guide empirical therapy and support antimicrobial stewardship. Future multicenter prospective studies are warranted to validate its generalizability and real-world clinical impact.
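To show how the reported sensitivity, specificity, accuracy, and F1-score relate to one another, the sketch below computes all four from a single confusion matrix. The abstract does not report the underlying counts. The counts used here (14 true positives, 5 false negatives, 90 true negatives, 11 false positives among the 120 validation patients) are an assumption, chosen only because they reproduce the published figures to three decimals. Other details, such as the classification threshold, are unknown.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Compute standard binary classification metrics from raw counts."""
    sensitivity = tp / (tp + fn)           # share of true CRKP cases flagged
    specificity = tn / (tn + fp)           # share of non-CRKP cases cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    precision = tp / (tp + fp)             # positive predictive value
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "precision": precision,
        "f1": f1,
    }


# ASSUMED counts, not reported in the abstract: one confusion matrix that is
# consistent with the published validation-set metrics (n = 120).
metrics = classification_metrics(tp=14, fn=5, tn=90, fp=11)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these assumed counts, precision would be 0.560, meaning a little over half of the patients flagged as high risk would actually carry CRKP. That figure follows only from the assumption above and is not a result reported by the authors.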