Systematic review of AI/ML methods in myocardial infarction biomarker prediction highlights validation gaps

Key Takeaway
AI models for myocardial infarction often report strong discrimination, but independent external validation and transparency are frequently missing.

This systematic review and proof-of-concept case study evaluates artificial intelligence and machine-learning (AI/ML) methods applied to cardiac biomarkers in patients with myocardial infarction (MI). The review identified 120 eligible studies, and the proof-of-concept dataset comprised 152 patients. The review covers prediction and prognostic modelling, recurring methodological limitations, and the performance of a leakage-aware modelling workflow. Most studies (109 of 120, 90.8%) used multimodal inputs combining biomarkers with clinical or functional variables, and 89 of 120 (74.2%) focused on prediction or prognostic modelling. Logistic or regularized regression appeared in 76 of 120 studies (63.3%) and Random Forest in 69 of 120 (57.5%). Area under the receiver operating characteristic curve (ROC-AUC) was reported in 114 of 120 studies (95.0%), whereas independent external validation was reported in only 44 of 120 (36.7%). In the proof-of-concept task, a binary STEMI vs NSTEMI classification used as a methodological proxy rather than a clinical endpoint, three predefined feature sets were compared. The FULL variant achieved near-perfect discrimination, with a ROC-AUC of 0.9988 (95% CI 0.9925-1.000). The CLINICAL variant showed modest performance, with a ROC-AUC of 0.6025 (0.4463-0.7450), an interval that includes the chance level of 0.5. The BIOMARKERS variant yielded strong discrimination with low dimensionality, achieving a ROC-AUC of 0.9300 (0.8537-0.9863).
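
The ROC-AUC values and intervals above can be made concrete with a small sketch. The summary does not say how the 95% confidence intervals were derived, so the percentile bootstrap used here is an assumption, and the function names (`roc_auc`, `bootstrap_ci`) and the example data are hypothetical, not taken from the study.

```python
import random

def roc_auc(labels, scores):
    """ROC-AUC as P(score of a random positive > score of a random negative); ties count half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for ROC-AUC (assumed method, resampling cases with replacement)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contains a single class, so ROC-AUC is undefined
        stats.append(roc_auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int(alpha / 2 * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Made-up labels and predicted scores, for illustration only (not study data).
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.45, 0.60, 0.90, 0.70]

auc = roc_auc(labels, scores)
lo, hi = bootstrap_ci(labels, scores)
print(f"ROC-AUC {auc:.4f} (95% CI {lo:.4f}-{hi:.4f})")
```

With small samples the interval is typically wide, which is one reason a point estimate alone says little about whether a model beats chance.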

Study Details

Study type: Systematic review
Sample size: n = 152 (proof-of-concept dataset; 120 studies reviewed)
Evidence: Level 1
Published: Apr 2026
Original Abstract
Aims: To systematically evaluate how artificial intelligence and machine-learning (AI/ML) methods are applied to cardiac biomarkers after myocardial infarction (MI), identify recurring methodological limitations, and operationalize a leakage-aware modelling workflow in a proof-of-concept post-MI dataset using a controlled proxy classification task.

Methods and results: A PRISMA 2020-compliant systematic review of studies published between 2015 and 2025 identified 120 eligible studies from 1,389 records. Most studies used multimodal inputs combining biomarkers with clinical or functional variables (109/120, 90.8%) and focused on prediction or prognostic modelling (89/120, 74.2%). Logistic or regularized regression (76/120, 63.3%) and Random Forest (69/120, 57.5%) were the most frequently used approaches. Internal validation predominated, whereas independent external validation was reported in only 44/120 studies (36.7%). Area under the receiver operating characteristic curve (ROC-AUC) was reported in 114/120 studies (95.0%), while calibration analyses and decision-curve analysis remained limited. Formal explainability methods were used inconsistently, and public code availability was uncommon.

To translate these observations into a practical framework, we implemented a leakage-aware machine-learning workflow in a proof-of-concept dataset of 152 patients with MI and 117 variables. The analytical task was defined as a binary classification problem (STEMI vs NSTEMI), used intentionally as a methodological proxy rather than a clinically relevant prognostic endpoint. Three predefined feature-set variants were benchmarked using nested cross-validation. The FULL variant achieved near-perfect discrimination [ROC-AUC 0.9988 (95% CI 0.9925-1.000)], the CLINICAL variant showed modest performance [0.6025 (0.4463-0.7450)], and the BIOMARKERS variant yielded strong discrimination with low dimensionality [0.9300 (0.8537-0.9863)]. Permutation-based falsification testing reduced performance towards chance level, supporting the procedural integrity of the workflow.

Conclusions: AI/ML research on cardiac biomarkers after MI is expanding rapidly but remains limited by heterogeneous methodology, insufficient external validation, incomplete interpretability, and weak reproducibility practices. A leakage-aware framework integrating explicit feature governance, nested validation, calibration assessment, robustness analyses, and falsification testing may improve the credibility and translational relevance of biomarker-based cardiovascular AI studies. However, the proof-of-concept case study is intended as a methodological demonstration and does not represent prognostic modelling of post-MI outcomes.

Translational Perspective: AI models using cardiac biomarkers after MI often report strong discrimination, but their clinical value is undermined by limited external validation, incomplete calibration assessment, and poor transparency. Our systematic review identifies these recurrent weaknesses, while the proof-of-concept case study demonstrates how a leakage-aware workflow can distinguish clinically plausible signals from structurally inflated performance under controlled analytical conditions. The use of a proxy classification task highlights methodological behavior rather than clinical prognosis, underscoring the need for future studies to validate such frameworks on clinically meaningful post-MI outcomes. Integrating explicit feature governance, nested validation, calibration, decision-analytic assessment, and falsification testing may help move biomarker-based cardiovascular AI from promising retrospective performance towards more reproducible and clinically trustworthy prediction models.
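
The abstract names nested cross-validation and permutation-based falsification testing but does not publish code, so the following is only a minimal sketch of how those two ideas fit together, not the authors' workflow. Everything in it is an assumption made for illustration: the synthetic two-feature data, the k-nearest-neighbour classifier standing in for the models actually used, accuracy standing in for ROC-AUC, and all function names.

```python
import random
from statistics import mean, pstdev

def stratified_folds(rows, y, k, rng):
    """Split row indices into k folds, dealing each class out in turn."""
    folds = [[] for _ in range(k)]
    for label in (0, 1):
        members = [i for i in rows if y[i] == label]
        rng.shuffle(members)
        for pos, i in enumerate(members):
            folds[pos % k].append(i)
    return folds

def without(folds, held_out):
    """All row indices except those in fold number `held_out`."""
    return [i for f, fold in enumerate(folds) if f != held_out for i in fold]

def knn_accuracy(X, y, train, test, k):
    """Accuracy of a k-nearest-neighbour vote; scaling is fitted on training rows only."""
    cols = list(zip(*(X[i] for i in train)))
    mu = [mean(c) for c in cols]
    sd = [pstdev(c) or 1.0 for c in cols]
    Z = [[(v - m) / s for v, m, s in zip(row, mu, sd)] for row in X]
    correct = 0
    for t in test:
        nearest = sorted(train, key=lambda i: sum((a - b) ** 2 for a, b in zip(Z[i], Z[t])))[:k]
        pred = 1 if 2 * sum(y[i] for i in nearest) > len(nearest) else 0
        correct += pred == y[t]
    return correct / len(test)

def nested_cv(X, y, k_grid=(1, 3, 5), outer_k=5, inner_k=3, seed=0):
    """Outer folds estimate performance; inner folds (training rows only) pick k."""
    rng = random.Random(seed)
    outer = stratified_folds(range(len(y)), y, outer_k, rng)
    scores = []
    for f, test in enumerate(outer):
        train = without(outer, f)
        inner = stratified_folds(train, y, inner_k, rng)

        def inner_score(k):
            return mean(knn_accuracy(X, y, without(inner, g), val, k)
                        for g, val in enumerate(inner))

        best_k = max(k_grid, key=inner_score)  # the outer test fold is never seen here
        scores.append(knn_accuracy(X, y, train, test, best_k))
    return mean(scores)

def permutation_check(X, y, n_perm=20, seed=1):
    """Rerun the whole pipeline on shuffled labels; scores should fall towards chance."""
    rng = random.Random(seed)
    observed = nested_cv(X, y)
    null = []
    for _ in range(n_perm):
        shuffled = list(y)
        rng.shuffle(shuffled)
        null.append(nested_cv(X, shuffled))
    p_value = (1 + sum(s >= observed for s in null)) / (n_perm + 1)
    return observed, mean(null), p_value

def make_data(n=60, seed=42):
    """Synthetic data: one informative feature and one pure-noise feature."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n)]
    X = [[rng.gauss(3.0 * label, 1.0), rng.gauss(0.0, 1.0)] for label in y]
    return X, y

X, y = make_data()
observed, null_mean, p_value = permutation_check(X, y)
print(f"observed={observed:.3f}  permuted mean={null_mean:.3f}  p={p_value:.3f}")
```

The design point is that every data-dependent step (scaling, hyperparameter choice) sees only training rows. If a pipeline still scored well above chance on shuffled labels, that would suggest information is leaking from the test folds.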