CBT Chatbots Show Moderate Depression Relief But High Bias Limits Confidence In Anxiety Findings


Journal of Medical Internet Research

Published May 8, 2026 · Medically reviewed May 12, 2026

Study authors: Gong Bingyan, Yao Nisha, Xie Hangxin, Huang Chuncheng, Kishimoto Tomoko, Berenbaum Howard, Mu Wentin…

By Dr. Ji-eun Park, MD · Brain, Mind & Pain

Key Takeaway

CBT chatbots show moderate depression relief but high bias limits confidence in anxiety findings.

A meta-analysis of 29 randomized controlled trials evaluated CBT-oriented psychological chatbots for adults with depressive and/or anxiety symptoms. The chatbots produced a moderate reduction in depressive symptoms at postintervention (g=-0.55, 95% CI -0.70 to -0.40). Anxiety symptoms showed only a small reduction at postintervention (g=-0.26), and the effect was no longer statistically significant at follow-up (g=-0.19, 95% CI -0.43 to 0.04).
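To make the effect sizes above concrete, here is a minimal Python sketch of how Hedges' g is commonly computed from two-group summary statistics. The trial numbers are hypothetical, the sign convention (chatbot minus control, so negative values mean fewer symptoms) is an assumption consistent with the negative values reported, and the small/moderate/large cutoffs are Cohen's conventional benchmarks rather than thresholds stated by the study authors.

```python
import math


def hedges_g(mean_tx, sd_tx, n_tx, mean_ctrl, sd_ctrl, n_ctrl):
    """Standardized mean difference with the usual small-sample correction."""
    df = n_tx + n_ctrl - 2
    # Pooled standard deviation across the two arms.
    pooled_sd = math.sqrt(
        ((n_tx - 1) * sd_tx ** 2 + (n_ctrl - 1) * sd_ctrl ** 2) / df
    )
    cohens_d = (mean_tx - mean_ctrl) / pooled_sd
    # Common approximation of the Hedges correction factor J.
    correction = 1 - 3 / (4 * df - 1)
    return cohens_d * correction


def magnitude_label(g):
    """Label an effect size using Cohen's conventional benchmarks (assumed)."""
    size = abs(g)
    if size < 0.2:
        return "negligible"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "moderate"
    return "large"


# Hypothetical trial: symptom scores at postintervention, lower is better.
g = hedges_g(mean_tx=10.0, sd_tx=5.0, n_tx=50,
             mean_ctrl=12.5, sd_ctrl=5.0, n_ctrl=50)
print(round(g, 3), magnitude_label(g))

# Pooled estimates reported in the abstract.
print(magnitude_label(-0.55))  # depression, postintervention
print(magnitude_label(-0.26))  # anxiety, postintervention
```

Under these benchmarks the reported g=-0.55 falls in the moderate band and g=-0.26 in the small band, matching the wording of the abstract.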

User engagement was adequate and met digital intervention standards, and satisfaction ratings were generally favorable. Despite these positive user metrics, the certainty of the evidence was rated very low to low, driven mainly by high risk of bias and substantial heterogeneity across the included trials.
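Heterogeneity of this kind is usually summarized with the I² statistic, which the review used. The following is a minimal sketch of the standard calculation from Cochran's Q, assuming inverse-variance weights and entirely hypothetical trial-level inputs; the review's own trial data are not reproduced here.

```python
def heterogeneity(effects, variances):
    """Return (pooled effect, Cochran's Q, I-squared in percent)."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * g for w, g in zip(weights, effects)) / sum(weights)
    # Q: weighted squared deviations of each trial from the pooled effect.
    q = sum(w * (g - pooled) ** 2 for w, g in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: share of total variation beyond what chance would explain.
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, q, i_squared


# Hypothetical trials that disagree strongly with each other.
pooled, q, i2 = heterogeneity(effects=[-0.9, -0.1], variances=[0.01, 0.01])
print(round(pooled, 2), round(q, 1), round(i2, 1))

# Hypothetical trials that agree: no excess variation.
print(heterogeneity(effects=[-0.5, -0.5, -0.5], variances=[0.02, 0.03, 0.04])[2])
```

A high I² means the trials differ from each other by more than sampling error alone would predict, which is one reason a single pooled estimate deserves caution.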

Technical limitations and repetitive interaction patterns still reduce acceptability and need to be addressed. CBT chatbots have potential as scalable, low-barrier first-line tools, but clinicians should interpret the findings with caution. The current data support short-term relief, particularly for comorbid anxiety profiles, but rigorous trials are needed to confirm long-term benefits and safety.

Study Details

Study type: Meta-analysis

Evidence: Level 1

Published: May 2026

PMID: 42101333

Original Abstract

BACKGROUND: Cognitive behavioral therapy (CBT) is the most examined psychotherapy for depression and anxiety, but delivery faces significant barriers such as limited access, cost, and time constraints. CBT-oriented psychological chatbots offer a promising means of addressing these challenges. Yet, their overall efficacy, user engagement, and acceptability have not been systematically synthesized.

OBJECTIVE: This study aimed to evaluate the efficacy, user engagement, and acceptability of CBT-oriented chatbots for adults with depressive and/or anxiety symptoms.

METHODS: A systematic search of 9 databases, including PubMed, Cochrane Central Register of Controlled Trials, Embase, Web of Science, PsycINFO, CINAHL, China National Knowledge Infrastructure, WanFang, and VIP Databases, was conducted from inception to February 2026. Eligibility criteria included randomized controlled trials comparing CBT-oriented chatbots with control groups in adults with depressive and/or anxiety symptoms. Risk of bias (ROB) was assessed using the Cochrane ROB tool. Random-effects meta-analyses (Hartung-Knapp-Sidik-Jonkman adjustment) calculated pooled effect sizes (Hedges g), 95% CIs, and 95% prediction intervals (PIs). Heterogeneity was evaluated using the I² statistic, and Galbraith plots were used to identify outliers for subsequent sensitivity analyses. Subgroup and meta-regression analyses examined potential moderators. The certainty of evidence was evaluated using the GRADE (Grading of Recommendations Assessment, Development, and Evaluation) approach. Data on user engagement and acceptability were extracted and synthesized using narrative and quantitative methods where available.

RESULTS: Twenty-nine eligible randomized controlled trials were included. CBT-oriented psychological chatbots produced a moderate reduction in depressive symptoms at postintervention (g=-0.55, 95% CI -0.70 to -0.40, 95% PI -1.23 to 0.13) and a small reduction in anxiety symptoms (g=-0.26, 95% CI -0.37 to -0.14, 95% PI -0.67 to 0.15). At follow-up, effects were small for depression (g=-0.32, 95% CI -0.55 to -0.09, 95% PI -0.93 to 0.29) and nonsignificant for anxiety (g=-0.19, 95% CI -0.43 to 0.04, 95% PI -0.84 to 0.46). Subgroup and meta-regression analyses revealed that anxiety outcomes were significantly moderated by clinical profiles (showing distinct advantages for comorbid symptoms) and the proportion of female participants. The CBT-oriented chatbots received an adequate level of engagement that complied with digital intervention standards. Although user satisfaction ratings were generally favorable, technical limitations and repetitive interaction patterns remain to be addressed to enhance overall acceptability. Regarding the limitations of evidence, the overall certainty was rated as very low to low, predominantly driven by high ROB and substantial heterogeneity.

CONCLUSIONS: This study innovatively isolates CBT-oriented chatbots from broader digital interventions, providing a precise, methodology-driven evaluation of theoretically grounded therapeutics. This review brings critical evidence to the field that these tools yield significant short-term relief, particularly for comorbid anxiety profiles. In the real world, CBT chatbots offer profound potential as scalable, low-barrier first-line tools. To sustain engagement, future developments must evolve from rigid rule-based scripts toward adaptive, large language model-driven architectures while ensuring clinical safety.
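The abstract reports both confidence intervals (uncertainty about the average effect) and prediction intervals (the range in which the effect in a new, comparable setting is expected to fall). As a minimal illustration, the Python sketch below takes the four pooled results exactly as reported in the abstract and checks which intervals exclude zero. The dictionary layout and the helper name `excludes_zero` are illustrative choices, not part of the study.

```python
# Pooled results as reported in the abstract: (lower bound, upper bound).
reported = {
    "depression, postintervention": {"g": -0.55, "ci": (-0.70, -0.40), "pi": (-1.23, 0.13)},
    "anxiety, postintervention": {"g": -0.26, "ci": (-0.37, -0.14), "pi": (-0.67, 0.15)},
    "depression, follow-up": {"g": -0.32, "ci": (-0.55, -0.09), "pi": (-0.93, 0.29)},
    "anxiety, follow-up": {"g": -0.19, "ci": (-0.43, 0.04), "pi": (-0.84, 0.46)},
}


def excludes_zero(interval):
    """True when the whole interval lies on one side of zero."""
    lower, upper = interval
    return lower > 0 or upper < 0


for outcome, result in reported.items():
    print(
        f"{outcome}: g={result['g']}, "
        f"CI excludes 0: {excludes_zero(result['ci'])}, "
        f"PI excludes 0: {excludes_zero(result['pi'])}"
    )
```

Three of the four confidence intervals exclude zero, but every prediction interval crosses it, which is consistent with the substantial heterogeneity and the very low to low certainty rating described in the abstract.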