This single-center, retrospective study was conducted at Zhongshan Hospital, Xiamen University, between January 2022 and January 2025. The population consisted of 82 NVAF-related stroke cases and 164 matched non-stroke controls, all with CHA2DS2-VA scores ≤1. Three nested predictor sets (Model A: clinical and routine laboratory variables; Model B: Model A plus cardiac biomarkers; Model C: Model B plus echocardiographic parameters) were each fitted with logistic regression, random forest, and XGBoost algorithms to predict stroke risk.
Stroke cases were older and had a higher prevalence of comorbidities such as heart failure and hypertension. Compared with controls, stroke cases had higher E/e' ratios and elevated NT-proBNP, CRP, and white blood cell counts. XGBoost (XGB) with the Model C predictor set achieved an AUC of 0.905 (95% CI: 0.877–0.933) in the training set and 0.906 (95% CI: 0.826–0.985) in the test set. SHAP analysis identified NT-proBNP as the most influential feature; elevated NT-proBNP and E/e' were associated with increased predicted risk, while higher LVEF was linked to decreased risk.
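To make the reported discrimination metrics concrete, here is a minimal Python sketch of how an AUC and the threshold-based metrics named in the abstract (sensitivity, specificity, precision, F1) can be computed from predicted risks. It is an illustration only, not the study's code: the function names `auc_score` and `threshold_metrics`, the 0.5 threshold, and the toy labels and scores are all assumptions, and the study's confidence intervals are not reproduced here.

```python
from typing import Dict, Sequence


def auc_score(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random case outscores a random control.

    A tie between a case and a control counts as half a win.
    """
    cases = [s for y, s in zip(y_true, y_score) if y == 1]
    controls = [s for y, s in zip(y_true, y_score) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def threshold_metrics(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Sensitivity, specificity, precision, and F1 at a fixed risk threshold."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Toy data (hypothetical): 3 stroke cases (label 1) and 5 controls (label 0).
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]

print(auc_score(toy_labels, toy_scores))
print(threshold_metrics(toy_labels, toy_scores))
```

In the toy data, two of the three cases and four of the five controls are classified correctly at the 0.5 threshold, so sensitivity is 2/3 and specificity is 0.8. Any clinically useful threshold would have to be chosen and validated separately.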
Safety and tolerability data, including adverse events and discontinuations, were not reported. Key limitations include the single-center design and retrospective nature of the data, which preclude causal conclusions. The study population was restricted to low- to moderate-risk patients based on CHA2DS2-VA scores.
For practice, ML models incorporating cardiac biomarkers and echocardiographic parameters may improve stroke risk stratification in low- to moderate-risk NVAF patients and support personalized anticoagulation strategies.
Original Abstract
Background: Non-valvular atrial fibrillation (NVAF) patients with CHA₂DS₂-VA ≤1 face uncertainty in stroke risk assessment, particularly in Asian populations. Machine-learning (ML) models offer improved accuracy for individualized risk prediction.

Methods: This single-center, retrospective study at Zhongshan Hospital, Xiamen University (January 2022–January 2025) included 82 NVAF-related stroke cases and 164 matched non-stroke controls with CHA₂DS₂-VA ≤1. Data encompassed demographics, comorbidities, laboratory markers, and echocardiographic parameters. Following a stratified train-test split (80%:20%), feature selection used univariable logistic regression and least absolute shrinkage and selection operator (LASSO) regression, retaining the intersection of variables selected by both methods. Three nested predictor sets were defined: Model A (clinical and routine laboratory variables), Model B (Model A plus cardiac biomarkers), and Model C (Model B plus echocardiographic parameters). ML algorithms [logistic regression (LR), random forest (RF), XGBoost (XGB)] underwent nested cross-validation for hyperparameter tuning. Performance was evaluated by area under the receiver operating characteristic curve (AUC), sensitivity, specificity, precision, and F1 score. Shapley additive explanations (SHAP) were applied to the best-performing ML algorithm in the test set to evaluate the contributions of individual features.

Results: Stroke cases were older and had a higher E/e′ ratio and elevated N-terminal pro-B-type natriuretic peptide (NT-proBNP), C-reactive protein (CRP), and white blood cell count (all P ≤ 0.01). Heart failure, hypertension, and age 65–74 years were more prevalent in stroke cases (all P ≤ 0.05). Feature selection yielded seven predictors: age, CRP, E/e′ ratio, left ventricular ejection fraction (LVEF), NT-proBNP, triglycerides, and white blood cell count. In the training set, XGB employing Model C achieved an AUC of 0.905 (95% CI: 0.877–0.933). In the test set, XGB employing Model C yielded an AUC of 0.906 (95% CI: 0.826–0.985). SHAP analysis identified NT-proBNP as the most influential feature, with elevated NT-proBNP and E/e′ levels associated with increased predicted risk and higher LVEF linked to decreased risk.

Conclusions: ML models incorporating cardiac biomarkers and echocardiographic parameters improve stroke risk stratification in low- to moderate-risk NVAF patients, supporting personalized anticoagulation strategies.
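The stratified 80%:20% split and the three nested predictor sets described in the Methods can be sketched in plain Python as follows. This is a hedged illustration rather than the authors' pipeline: the function `stratified_split`, the seed, the rounding rule, and the variable names are hypothetical. The assignment of the seven selected predictors to the clinical/laboratory, cardiac-biomarker, and echocardiographic groups is an assumption inferred from the abstract's wording, not something the abstract states.

```python
import random
from typing import Dict, List, Sequence, Tuple


def stratified_split(
    labels: Sequence[int], test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Return (train_idx, test_idx), sampling each class separately so the
    case/control ratio is approximately preserved in both parts."""
    rng = random.Random(seed)
    train_idx: List[int] = []
    test_idx: List[int] = []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Assumed rounding rule; other implementations may round differently.
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Assumed grouping of the seven selected predictors (hypothetical names).
CLINICAL_LAB = ["age", "crp", "triglycerides", "wbc"]
CARDIAC_BIOMARKERS = ["nt_probnp"]
ECHO = ["e_over_e_prime", "lvef"]

# Nested predictor sets: each model extends the previous one.
PREDICTOR_SETS: Dict[str, List[str]] = {
    "Model A": CLINICAL_LAB,
    "Model B": CLINICAL_LAB + CARDIAC_BIOMARKERS,
    "Model C": CLINICAL_LAB + CARDIAC_BIOMARKERS + ECHO,
}

# 82 stroke cases (label 1) and 164 matched controls (label 0), as in the study.
labels = [1] * 82 + [0] * 164
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
for name, features in PREDICTOR_SETS.items():
    print(name, features)
```

With 82 cases and 164 controls, this rounding rule places 16 cases and 33 controls in the test split. The abstract does not report the study's actual split counts, so these numbers describe the sketch only.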