This prospective observational study included 578 pregnant women attending the obstetric clinic of a tertiary hospital in Central South China. The primary outcome was excessive gestational weight gain (EGWG), with weight gain recorded until delivery. The researchers trained four machine-learning models (logistic regression, LightGBM, XGBoost, and random forest) on predictors that included body image in pregnancy, protective motivation for gestational weight management, parity, dietary habits, and moderate-intensity physical activity time.
The LightGBM model outperformed the other three models. In the test cohort it reached an accuracy of 88.6%, sensitivity of 87.5%, specificity of 89.9%, a kappa statistic of 0.770, and an AUC of 0.926 (95% CI: 0.889–0.962). Absolute numbers for outcomes were not reported.
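For readers less familiar with these metrics, the following is a minimal Python sketch of how accuracy, sensitivity, specificity, and Cohen's kappa are derived from a 2x2 confusion matrix. The counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not the study's data, which the abstract does not report. The names `ConfusionMatrix` and `binary_metrics` are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    tp: int  # predicted EGWG, observed EGWG
    fn: int  # predicted no EGWG, observed EGWG
    fp: int  # predicted EGWG, observed no EGWG
    tn: int  # predicted no EGWG, observed no EGWG


def binary_metrics(cm: ConfusionMatrix) -> dict:
    n = cm.tp + cm.fn + cm.fp + cm.tn
    accuracy = (cm.tp + cm.tn) / n
    sensitivity = cm.tp / (cm.tp + cm.fn)  # true positive rate
    specificity = cm.tn / (cm.tn + cm.fp)  # true negative rate
    # Agreement expected by chance, from the marginal totals.
    p_yes = ((cm.tp + cm.fp) / n) * ((cm.tp + cm.fn) / n)
    p_no = ((cm.tn + cm.fn) / n) * ((cm.tn + cm.fp) / n)
    expected = p_yes + p_no
    # Kappa rescales observed agreement relative to chance agreement.
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


# Hypothetical counts for illustration only (not from the study).
example = ConfusionMatrix(tp=40, fn=10, fp=5, tn=45)
metrics = binary_metrics(example)
```

Kappa is lower than raw accuracy because it discounts the agreement a classifier would achieve by chance given the class balance.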
Safety and tolerability data, including adverse events and discontinuations, were not reported. The study did not report funding or conflicts of interest. Key limitations include the observational design, which precludes causal inference, and the absence of any reported certainty-of-evidence assessment or comparative clinical outcomes.
The study team developed an online calculator for clinicians based on these models. While the predictive performance appears robust, the evidence remains observational and requires validation in randomized settings before altering clinical practice regarding weight management interventions.
Original Abstract
Objective: To establish and validate interpretable machine-learning (ML) models for early assessment and identification of Chinese women at risk of excessive gestational weight gain (EGWG).

Methods: We performed a prospective observational study with pregnant women whose gestational age averaged 19 weeks or less. These women attended the obstetric clinic of a tertiary hospital in Central South China between January and June 2023, and again from April to May 2024. Women completed standardized questionnaires, and their gestational weight gain (GWG) was recorded until delivery. We conducted feature selection by applying the Boruta algorithm together with the least absolute shrinkage and selection operator (LASSO) algorithm. We used four ML models: logistic regression (LR), light gradient boosting machine (LightGBM), extreme gradient boosting (XGBoost), and random forest (RF), optimizing their hyperparameters by grid search and 5-fold cross-validation. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC), confusion matrix, kappa statistic, calibration curves, and decision curve analysis. To enhance model interpretability, the SHapley Additive exPlanations (SHAP) framework was applied to quantify and rank the contribution of individual predictors. Based on the key predictive features, a web-based interactive calculator was developed using Python and the Flask micro-framework to facilitate clinical application.

Results: We enrolled 578 pregnant women in all. The combined use of the Boruta and LASSO algorithms identified ten key predictors. The LightGBM model showed superior predictive performance, with an accuracy of 88.6%, sensitivity of 87.5%, specificity of 89.9%, kappa statistic of 0.770, and AUC of 0.926 (95% CI: 0.889–0.962) in the test cohort. The SHAP analysis indicated that body image in pregnancy, protective motivation for gestational weight management, parity, the weekly frequency of consuming sugar-sweetened beverages, desserts, and Western-style fast food, and moderate-intensity physical activity time were the major determinants of model prediction. An online calculator was developed and made available for clinicians at: http://39.103.64.176/.

Conclusions: We established an interpretable ML model for predicting the risk of EGWG. The LightGBM model exhibited higher predictive accuracy and may serve as a powerful tool for the early detection and individualized management of EGWG risk among Chinese pregnant women.
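The AUC and its confidence interval can be illustrated with a short Python sketch. The abstract does not state how the 95% CI was computed; the percentile bootstrap below is one common approach and is an assumption here, not the authors' method. The function names `auc_rank` and `bootstrap_auc_ci` and the toy labels and scores are hypothetical.

```python
import random


def auc_rank(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (an assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain only one class
        estimates.append(auc_rank(ys, [scores[i] for i in idx]))
    if not estimates:
        raise ValueError("no resample contained both classes")
    estimates.sort()
    lo = estimates[int((alpha / 2) * len(estimates))]
    hi = estimates[min(len(estimates) - 1, int((1 - alpha / 2) * len(estimates)))]
    return lo, hi


# Toy data for illustration only (not from the study).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc_rank(toy_labels, toy_scores)
toy_lo, toy_hi = bootstrap_auc_ci(toy_labels, toy_scores, n_boot=200)
```

With only four toy cases the interval is very wide; in a test cohort of realistic size, the same procedure yields a much narrower range.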