Machine Learning with Social Determinants Improves Readmission Prediction

Key Takeaway
Integrating SDOH data into ML models may improve 30-day readmission prediction, but clinical benefit remains unproven.

This retrospective cohort study analyzed 3,018 adult patients discharged from a large academic medical center to evaluate whether machine learning models that integrate clinical data with social determinants of health (SDOH) could improve 30-day hospital readmission prediction. The primary outcome was predictive performance, measured by ROC-AUC, PR-AUC, and F1-score.
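For readers less familiar with the three metrics, the sketch below shows one common way to compute each from true labels and predicted risk scores. It is an illustration only, not the study's code: the toy labels and scores are invented, PR-AUC is approximated by average precision (which assumes no tied scores), and the 0.5 cutoff for F1 is an assumption, since the abstract does not state the threshold used.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """PR-AUC estimate: mean precision at the rank of each positive (assumes distinct scores)."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive label")
    return total / hits


def f1_at_threshold(y_true: Sequence[int], scores: Sequence[float],
                    threshold: float = 0.5) -> float:
    """F1-score after flagging patients whose score is at or above the (assumed) threshold."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Invented toy data: 1 = readmitted within 30 days, 0 = not readmitted.
labels = [1, 0, 1, 0, 0, 1, 0, 0]
risk = [0.9, 0.8, 0.7, 0.4, 0.3, 0.6, 0.2, 0.1]

print("ROC-AUC:", round(roc_auc(labels, risk), 3))
print("PR-AUC (average precision):", round(average_precision(labels, risk), 3))
print("F1 at 0.5:", round(f1_at_threshold(labels, risk), 3))
```

ROC-AUC and PR-AUC summarize ranking quality across all thresholds, while F1 depends on the chosen cutoff, so the three numbers can diverge when readmissions are uncommon.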

The XGBoost model achieved the best performance on the test set, with a ROC-AUC of 0.79 (95% CI 0.75–0.82) and a PR-AUC of 0.71. Ensemble models outperformed logistic regression, though specific effect sizes were not reported. Key predictive features included prior admissions, comorbidity burden, neighborhood socioeconomic status, and household composition.
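The abstract reports a 95% confidence interval for ROC-AUC but does not say how it was derived. One widely used approach is the percentile bootstrap, sketched below on synthetic data. Everything here is hypothetical: the function names, the simulated labels and scores, the number of resamples, and the seed are illustrative choices, and the study may have used a different method.

```python
import random
from typing import List, Sequence, Tuple


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(y_true: Sequence[int], scores: Sequence[float],
                     n_boot: int = 500, alpha: float = 0.05,
                     seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap interval for ROC-AUC (resampling patients with replacement)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats: List[float] = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUC is undefined when a resample contains only one class
        stats.append(roc_auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Synthetic test set (not the study's data): about 30% positives, noisy scores.
data_rng = random.Random(42)
labels = [1 if data_rng.random() < 0.3 else 0 for _ in range(120)]
risk = [min(1.0, max(0.0, 0.35 + 0.25 * y + data_rng.gauss(0, 0.2))) for y in labels]

point = roc_auc(labels, risk)
low, high = bootstrap_auc_ci(labels, risk)
print(f"ROC-AUC {point:.2f} (95% CI {low:.2f} to {high:.2f})")
```

The interval width shrinks as the test set grows, which is worth keeping in mind when comparing models whose intervals overlap.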

Safety and tolerability outcomes do not apply, as this was a prediction modeling study. The abstract does not report limitations, but the study is retrospective and drew on a single academic medical center, and it does not establish causation or demonstrate improvement in patient outcomes or costs.

For clinicians, these findings suggest that incorporating SDOH data into predictive models may enhance identification of high-risk patients. However, the clinical utility of such models requires prospective validation and evidence of improved outcomes before adoption.

Study Details

Study type: Cohort
Evidence: Level 3
Published: Apr 2026
Original Abstract
Hospital readmissions remain a major challenge for healthcare systems, contributing to higher costs and worse patient outcomes. Although most prediction models rely primarily on clinical data, integrating social determinants of health (SDOH) may improve risk assessment. However, the use of machine learning (ML) to combine clinical and SDOH data for readmission prediction remains limited.

The objective was to develop and compare machine learning models for predicting 30-day hospital readmission by integrating clinical and SDOH data.

We conducted a retrospective cohort study of 3,018 adult patients discharged from a large academic medical center between January 2022 and December 2023. Clinical variables were extracted from electronic health records and linked, through geocoded residential addresses, to area-level SDOH indicators from publicly available census data, including neighborhood deprivation, median income, and educational attainment. Six tabular ML models were trained and evaluated, including Logistic Regression, Random Forest, XGBoost, LightGBM, CatBoost, and Support Vector Machine. Model performance was assessed using the area under the receiver operating characteristic curve (ROC-AUC), precision-recall AUC (PR-AUC), and F1-score. SHapley Additive exPlanations (SHAP) were used to assess feature importance.

Ensemble models outperformed Logistic Regression, with XGBoost achieving the best performance on the test set (ROC-AUC 0.79, 95% CI 0.75–0.82; PR-AUC 0.71). In addition to key clinical variables such as prior admissions and comorbidity burden, SDOH features including neighborhood socioeconomic status and household composition were among the most important predictors.

Integrating clinical and SDOH data into ML models improved prediction of 30-day hospital readmission. These findings support moving beyond clinical-only models and suggest that SDOH-informed prediction may help identify high-risk patients earlier and guide more targeted care management.