Machine learning framework predicts CD4 and CD8 counts in people living with HIV


Frontiers in Medicine · Published April 22, 2026 · Medically reviewed April 25, 2026

Study authors: Juan Jin, Tingting Li, Jie Chen, Huanhuan Ba, Yuan Zhang, Jiajia Li, Jinling Yin, Huanqing Liu, Kang…

By Dr. Ji-eun Park, MD · Brain, Mind & Pain

Key Takeaway

Consider this predictive model for immune markers in HIV, but note it requires external validation before clinical implementation.

This retrospective cohort study developed an ensemble machine learning framework to predict longitudinal CD4+ count, CD8+ count, and CD4/CD8 ratio in people living with HIV. The model was trained and tested on a real-world dataset of 5,436 patients, with an independent test set of 1,088 patients.

The model was a heterogeneous stacking ensemble that combined four tree-based algorithms (XGBoost, LightGBM, Random Forest, and Gradient Boosting) with a Ridge regression meta-learner. The comparator was a baseline Robust Transformer model. For CD4+ count prediction in the test set (n=1,088), the ensemble achieved an R² of 0.768 and a mean absolute error (MAE) of 74.8 cells/μL, a relative improvement in R² of 66.4% over the baseline.
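As a rough illustration of how a stacking ensemble of this kind fits together, the sketch below uses scikit-learn's `StackingRegressor` with a Ridge meta-learner. It is not the authors' pipeline: the data are synthetic, only two tree-based base learners are used in place of the study's four (XGBoost and LightGBM are left out so the example needs scikit-learn alone), and the hyperparameters and the 80/20 split are assumptions chosen for brevity.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): NOT the study cohort.
X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Base learners: two tree-based models standing in for the study's four.
base_learners = [
    ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
    ("gb", GradientBoostingRegressor(n_estimators=50, random_state=0)),
]

# The Ridge meta-learner is fit on out-of-fold predictions of the base
# learners (cv=5), so it never sees predictions made on a model's own
# training rows.
stack = StackingRegressor(
    estimators=base_learners,
    final_estimator=Ridge(alpha=1.0),
    cv=5,
)
stack.fit(X_train, y_train)

# Evaluate on the held-out split with the two metrics the study reports.
pred = stack.predict(X_test)
r2 = r2_score(y_test, pred)
mae = mean_absolute_error(y_test, pred)
print(f"R^2 = {r2:.3f}, MAE = {mae:.2f}")
```

The meta-learner only sees the base models' predictions, which is what lets a simple linear model weigh several non-linear learners against each other.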

For CD8+ count prediction (n=1,088), the model achieved an R² of 0.636 and an MAE of 300.5 cells/μL, with a relative improvement in R² of 128.6% compared to the baseline. For CD4/CD8 ratio prediction (n=1,088), the model achieved an R² of 0.131 and an MAE of 0.137.
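The relative-improvement figures can be unpacked with a little arithmetic. The sketch below assumes the usual definition, (new − baseline) / baseline, which the summary does not state explicitly, and back-calculates the baseline R² values that the reported percentages would imply. The function names are hypothetical, and the implied baselines are derived here, not reported in the study.

```python
def relative_improvement_pct(new: float, baseline: float) -> float:
    """Relative improvement of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


def implied_baseline(new: float, improvement_pct: float) -> float:
    """Invert the formula above: the baseline implied by a reported gain."""
    return new / (1.0 + improvement_pct / 100.0)


# Reported ensemble R^2 values and relative improvements from the summary.
cd4_baseline = implied_baseline(0.768, 66.4)   # roughly 0.46 (derived)
cd8_baseline = implied_baseline(0.636, 128.6)  # roughly 0.28 (derived)

print(f"Implied baseline R^2, CD4+: {cd4_baseline:.3f}")
print(f"Implied baseline R^2, CD8+: {cd8_baseline:.3f}")
```

Under that assumed definition, the large percentage for CD8+ partly reflects a low starting point: a gain from about 0.28 to 0.636 more than doubles R² while still leaving over a third of the variance unexplained.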

Safety and tolerability data were not reported. The model used only demographic and clinical features and explicitly excluded baseline CD4+/CD8+ counts, a choice the authors made to prevent data leakage; it is noted here as a key limitation. For practice, the authors present the framework as a robust and clinically applicable tool for forecasting multi-dimensional immune reconstitution in HIV care, though causal claims regarding immune reconstitution are not supported.

Study Details

Study type: Cohort

Evidence: Level 3

Published: Apr 2026

Original Abstract

Accurate prediction of long-term CD4+ T-cell recovery trajectories in people living with HIV on antiretroviral therapy (ART) is a crucial unmet need for personalized monitoring and treatment optimization. Traditional statistical models have limited ability to capture the complex, non-linear relationships inherent in longitudinal clinical data. We developed a heterogeneous stacking ensemble framework to predict longitudinal CD4+ count, CD8+ count, and CD4/CD8 ratio. The model integrates four tree-based algorithms (XGBoost, LightGBM, Random Forest, and Gradient Boosting) with a Ridge regression meta-learner. It was trained and tested on a retrospective cohort of 5,436 patients who initiated ART between 2016 and 2025, using only demographic and clinical features while explicitly excluding baseline CD4+/CD8+ counts to prevent data leakage. On an independent test set (n=1,088), the ensemble achieved an R² of 0.768 (MAE: 74.8 cells/μL) for CD4+ count, 0.636 (MAE: 300.5 cells/μL) for CD8+ count, and 0.131 (MAE: 0.137) for the CD4/CD8 ratio. This represents a relative improvement in R² of 66.4% for CD4+ and 128.6% for CD8+ predictions compared to a baseline Robust Transformer model. The model accurately replicated the statistical distributions of observed outcomes and demonstrated stable learning dynamics without overfitting. Our ensemble learning framework provides a robust and clinically applicable tool for forecasting multi-dimensional immune reconstitution in HIV care. By synthesizing diverse algorithmic perspectives without relying on baseline immunology, it offers a foundation for data-driven clinical decision support to personalize long-term treatment monitoring.
