AI Overestimates Behavioral Health Risks Compared to Human Advice in Experimental Study
In two controlled experiments with 60 gender-balanced participants and 30 GPT-4 samples, this study examined how advice source influences judgments of behavioral health risk. Participants received risk assessments for behavioral health scenarios from either an AI algorithm or a human peer group, and judgments were compared across the two advice sources. The primary finding was that the AI systematically overestimated health risks relative to human advisors, though specific effect sizes and statistical measures were not reported. In lower-threat conditions, participants preferred human advice over AI advice; this preference disappeared in higher-threat scenarios. Notably, participants who accepted advice from an AI that disagreed with their initial judgment updated their beliefs more than those who followed human advice.
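The paper does not specify how belief updating was quantified. A common metric in advice-taking research is the Weight of Advice (WOA): the fraction of the gap between a judge's initial estimate and the advisor's estimate that the judge closes after seeing the advice. The sketch below illustrates that metric only; the study's actual measure, the 0-100 risk scale, and the example values are assumptions.

```python
def weight_of_advice(initial: float, advice: float, final: float) -> float:
    """Weight of Advice (WOA): the fraction of the gap between a judge's
    initial estimate and the advisor's estimate that the judge closes
    after seeing the advice. 0 means the advice was ignored; 1 means it
    was fully adopted."""
    if advice == initial:
        raise ValueError("WOA is undefined when advice equals the initial estimate")
    return (final - initial) / (advice - initial)

# Hypothetical example on a 0-100 risk scale: a participant initially
# rates a scenario at 40, the AI advises 70, and the participant revises
# to 61, closing 70% of the gap toward the AI's rating.
print(weight_of_advice(initial=40, advice=70, final=61))  # 0.7
```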
Safety and tolerability data were not reported in this experimental study. The research focused on judgment and preference measures rather than clinical outcomes or adverse events associated with AI use in healthcare settings.
Key limitations include the experimental nature of the study, which measured perceptions and judgments in controlled tasks rather than real-world health decisions. The AI component (GPT-4) was used to generate risk ratings in a research context, not as a clinical decision aid. The findings relate to judgment processes and source preference, not diagnostic accuracy or patient management outcomes.
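For readers unfamiliar with how such ratings are generated, the sketch below shows one plausible way to draw repeated risk ratings from GPT-4 through the OpenAI chat completions API. The prompt wording, rating scale, temperature, and response parsing are all assumptions for illustration; the study's actual elicitation procedure is not reported.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical prompt: the study's actual wording and scale are not reported.
prompt = (
    "On a scale from 0 (no risk) to 100 (extreme risk), rate the "
    "behavioral health risk in the following scenario. Reply with a "
    "single integer only.\n\nScenario: <scenario text>"
)

# Draw 30 independent samples, mirroring the study's 30 GPT-4 samples.
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}],
    n=30,
    temperature=1.0,  # assumed; sampling settings are not reported
)

# Assumes each completion is a bare integer, per the prompt's instruction.
ratings = [int(choice.message.content.strip()) for choice in response.choices]
print(f"mean AI risk rating: {sum(ratings) / len(ratings):.1f}")
```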
For practice, this research offers preliminary insights for the design of AI-based health decision support systems, suggesting that users may weigh AI risk assessments differently from human advice. However, the study did not test AI in clinical practice or measure actual health outcomes, so applications to patient care remain speculative. Clinicians should treat these findings as early experimental evidence about judgment processes rather than guidance for clinical AI implementation.