
Machine learning analysis identifies associations between multi-chemical exposure clusters and hypertension prevalence

Key Takeaway
Membership in the high combustion multi-chemical exposure cluster was associated with higher odds of prevalent hypertension.

This machine learning analysis utilized NHANES 2017-2018 data from 2,979 participants to evaluate multi-chemical exposure clusters. The study focused on 25 urinary biomarkers, including 6 PAH and 19 VOC metabolites, to identify distinct exposure profiles and their associations with hypertension and high total cholesterol prevalence in a nationally representative US adult population.

Results indicated that the high combustion cluster, which includes an estimated 5.1 million US adults, demonstrated a 39.3% prevalence of hypertension (95% CI 37.2-41.4%), compared to 28.7% in the low exposure reference group (95% CI 21.9-35.5%). After demographic adjustment, membership in the high combustion cluster was independently associated with 38.4% higher odds of prevalent hypertension (OR 1.38). The prediction model achieved a cross-validated AUC of 0.849 (SD 0.017).
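The reported figures mix two scales, prevalence and odds, which are easy to conflate. The sketch below works through the arithmetic using only the numbers quoted above. The helper names (`odds`, `wald_se`) are illustrative, and the standard-error step assumes the intervals are symmetric Wald intervals with z = 1.96, which the source does not state.

```python
def odds(p):
    """Convert a probability (prevalence) to odds."""
    return p / (1.0 - p)


def wald_se(lower, upper, z=1.96):
    """Back out a standard error from a symmetric CI.

    Assumption: the interval is estimate +/- z * SE (not stated in the source).
    """
    return (upper - lower) / (2.0 * z)


# Weighted hypertension prevalence quoted in the summary
p_high = 0.393  # high combustion cluster
p_low = 0.287   # low exposure reference group

# Ratio of prevalences (a risk-ratio-style comparison)
prevalence_ratio = p_high / p_low

# Crude (unadjusted) odds ratio implied by the two prevalences
crude_or = odds(p_high) / odds(p_low)

# Adjusted figure from the study: 38.4% higher odds corresponds to OR = 1.384
adjusted_or = 1.384
pct_higher_odds = (adjusted_or - 1.0) * 100.0

# Approximate precision of each prevalence estimate
se_high = wald_se(0.372, 0.414)
se_low = wald_se(0.219, 0.355)

print(f"prevalence ratio: {prevalence_ratio:.2f}")
print(f"crude OR: {crude_or:.2f}, adjusted OR: {adjusted_or:.2f}")
print(f"SE high: {se_high:.3f}, SE low: {se_low:.3f}")
```

Under these assumptions the crude odds ratio implied by the two prevalences (about 1.61) is larger than the adjusted odds ratio of 1.38, which would be consistent with demographic factors accounting for part of the raw gap. The reference group's interval is also roughly three times wider than the high combustion cluster's, so the comparison group is the less precisely estimated of the two.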

The authors suggest that multi-chemical exposome profiling can identify cardiovascularly distinct subpopulations, supporting the use of multi-chemical approaches over single-pollutant analyses for risk stratification. However, these findings are based on a cross-sectional analysis of NHANES data and report associations rather than direct causation.

Study Details

Sample size: n = 2,979 (High combustion cluster: n = 370)
Evidence: Level 5
Published: Apr 2026
Original Abstract
Background: Polycyclic aromatic hydrocarbons (PAHs) and volatile organic compounds (VOCs) are combustion-derived pollutants linked to cardiovascular disease. Prior NHANES analyses have evaluated these chemicals individually, failing to capture the correlated co-exposure structures that characterize real-world environmental burden. In this study, we applied an unsupervised machine learning pipeline to urinary biomarker data to identify multi-chemical exposure clusters and quantify their differential cardiovascular risk profiles in a nationally representative US sample.

Methods: We analyzed 2,979 participants from NHANES 2017-2018, representing an estimated 36.8 million US adults after complex survey weighting. Twenty-five urinary biomarkers (6 PAH, 19 VOC metabolites) were log-transformed, imputed using Multivariate Imputation by Chained Equations (MICE), and standardized. Uniform Manifold Approximation and Projection (UMAP) was used for dimensionality reduction, followed by Gaussian Mixture Model (GMM) clustering. Survey-weighted prevalence estimates with 95% confidence intervals (CIs) were calculated for hypertension and high total cholesterol within each cluster. Weighted multivariable logistic regression was used to estimate odds ratios (ORs) for hypertension, adjusting for age, sex, race/ethnicity, and income.

Results: Four exposure clusters were identified with a mean assignment probability of 0.948. The High combustion cluster (n=370; estimated 5.1 million US adults) exhibited the highest multi-chemical burden and a weighted hypertension prevalence of 39.3% (95% CI 37.2-41.4%), compared to 28.7% (95% CI 21.9-35.5%) in the Low exposure reference group. After demographic adjustment, High combustion cluster membership was independently associated with 38.4% higher odds of prevalent hypertension (OR 1.38). The prediction model achieved a cross-validated area under the receiver operating characteristic curve (AUC) of 0.849 (SD 0.017). Non-Hispanic Black participants constituted approximately 40% of the High combustion cluster, exceeding their representation in lower-risk clusters.

Conclusions: Multi-chemical exposome profiling identifies four cardiovascularly distinct subpopulations in the US adult population. Membership in the High combustion exposure cluster was associated with higher odds of prevalent hypertension and disproportionately affected Non-Hispanic Black participants. These findings support the use of multi-chemical approaches over single-pollutant analyses and highlight the relevance of environmental exposure patterns for policy-making and targeted cardiovascular risk stratification.
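
The preprocessing and clustering steps named in the abstract can be sketched roughly as follows. This is an illustration on synthetic data, not the authors' code: scikit-learn's `IterativeImputer` stands in for MICE, PCA is swapped in for UMAP so the example needs only NumPy and scikit-learn, and the function name `cluster_exposures`, the two-dimensional embedding, and the 5% missingness rate are all assumptions.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables IterativeImputer)
from sklearn.impute import IterativeImputer
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture


def cluster_exposures(conc, n_clusters=4, n_dims=2, seed=0):
    """Log-transform, impute, standardize, embed, and cluster biomarker data.

    conc: array of shape (participants, biomarkers); NaN marks missing values.
    Returns hard cluster labels and per-cluster assignment probabilities.
    """
    logged = np.log(conc)  # missing values stay NaN
    # Chained-equations style imputation (a MICE-like stand-in)
    imputed = IterativeImputer(random_state=seed).fit_transform(logged)
    scaled = StandardScaler().fit_transform(imputed)
    # Assumption: PCA replaces UMAP here to avoid the umap-learn dependency
    embedded = PCA(n_components=n_dims, random_state=seed).fit_transform(scaled)
    gmm = GaussianMixture(n_components=n_clusters, random_state=seed).fit(embedded)
    probs = gmm.predict_proba(embedded)
    return probs.argmax(axis=1), probs


# Synthetic stand-in for 25 urinary biomarkers (not NHANES data)
rng = np.random.default_rng(0)
conc = rng.lognormal(mean=0.0, sigma=1.0, size=(300, 25))
conc[rng.random(conc.shape) < 0.05] = np.nan  # hypothetical 5% missingness

labels, probs = cluster_exposures(conc)

# Mean of each participant's highest cluster probability, analogous to the
# "mean assignment probability" reported in the abstract
mean_assignment_probability = probs.max(axis=1).mean()
print(f"mean assignment probability: {mean_assignment_probability:.3f}")
```

Because the synthetic data has no real cluster structure, the assignment probability printed here says nothing about the study's 0.948. The sketch only shows how such a number is derived from a fitted mixture model.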