Machine learning models improved suicide risk detection accuracy in asynchronous text therapy clients

New AI models improved suicide risk detection in text therapy messages

AI-generated summary of the cited source, checked by automated accuracy review.

Key Takeaway
Model performance for risk detection improved, but clinical validation and safety outcomes remain unreported.

This research focused on developing machine learning models for real-time suicide risk detection using text from clients in asynchronous text therapy within a digital mental health setting. The specific study design and sample size were not reported in the available data. The intervention involved training models to classify risk levels as 'no risk,' 'moderate,' or 'severe,' comparing new iterations against a previously published model.

The primary outcome was model performance, measured by weighted F1 score. The final multiclass model, designated version 3.0, achieved a weighted F1 score of 0.85. This result represented an improvement over the previous model, though the size of the improvement, p-values, and confidence intervals were not reported. No secondary outcomes were detailed in the findings.
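
For readers unfamiliar with the metric, a weighted F1 score is the average of the per-class F1 scores, with each class weighted by how many true examples it has. The sketch below shows that calculation for the three risk tiers. The labels in `y_true` and `y_pred` are invented toy data, not study data, and the function name `weighted_f1` is our own; nothing here reproduces the authors' evaluation code.

```python
from collections import Counter

# The three risk tiers named in the study.
LABELS = ("no risk", "moderate", "severe")


def weighted_f1(y_true, y_pred, labels=LABELS):
    """Return the support-weighted mean of per-class F1 scores."""
    support = Counter(y_true)  # number of true examples per class
    total = len(y_true)
    score = 0.0
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0  # F1 = 2TP / (2TP + FP + FN)
        score += f1 * support[label] / total   # weight by class frequency
    return score


# Toy data, invented for illustration only (NOT from the study).
y_true = ["no risk"] * 6 + ["moderate"] * 3 + ["severe"] * 1
y_pred = ["no risk"] * 5 + ["moderate"] * 3 + ["severe"] * 2

print(round(weighted_f1(y_true, y_pred), 3))
```

Because the weights follow class frequency, a high weighted F1 can be driven largely by the most common class (typically "no risk"), so per-class scores for the rarer tiers are worth checking separately when they are available.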

Safety and tolerability data were not reported, as adverse events, serious adverse events, discontinuations, and general tolerability metrics were absent from the study results. Additionally, no follow-up data regarding long-term model stability or clinical outcomes were provided. The study did not report specific limitations, funding sources, or conflicts of interest.

In practice, these models could help providers prioritize urgent cases, supporting more accurate and timely intervention. However, because this is a model development study without clinical validation or reported impact on patient outcomes, the findings should be interpreted as technical performance metrics rather than evidence of improved patient care or safety.
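
As a rough illustration of how a tiered output could support prioritization, the sketch below sorts a queue of flagged messages so the most urgent tier comes first. It is a hypothetical example, not the study's system: the `FlaggedMessage` fields, the `TIER_PRIORITY` ordering, and the tie-break on waiting time are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumed ordering: a lower number means more urgent.
TIER_PRIORITY = {"severe": 0, "moderate": 1, "no risk": 2}


@dataclass
class FlaggedMessage:
    client_id: str        # hypothetical identifier
    tier: str             # one of the three tiers named in the study
    minutes_waiting: int  # hypothetical field used only as a tie-break


def triage(messages):
    """Order messages by tier urgency, then by longest wait within a tier."""
    return sorted(
        messages,
        key=lambda m: (TIER_PRIORITY[m.tier], -m.minutes_waiting),
    )


# Toy queue, invented for illustration only.
queue = [
    FlaggedMessage("c1", "moderate", 30),
    FlaggedMessage("c2", "severe", 5),
    FlaggedMessage("c3", "no risk", 120),
    FlaggedMessage("c4", "severe", 45),
]

for message in triage(queue):
    print(message.client_id, message.tier, message.minutes_waiting)
```

Whether ordering of this kind actually improves response times or outcomes is exactly what the study leaves unmeasured.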

This research focused on creating machine learning models to identify suicide risk from text messages in asynchronous therapy. The team tested several versions to see how well they could classify risk levels as no risk, moderate, or severe. The final model, version 3.0, achieved a weighted F1 score of 0.85, which was an improvement over the team's previously published model.

The study took place in a digital mental health setting using text from therapy clients. No safety data or adverse events were reported because the work was about model development, not clinical trials. The main reason to be careful is that this study did not measure whether using these models actually helps patients or changes real-world outcomes.

Readers should understand that while the models performed well on paper, they have not yet been proven to improve care. These tools may help providers prioritize urgent cases in the future, but more research is needed to confirm their safety and effectiveness in real clinical practice.

What this means for you:
New AI models showed improved detection performance, but clinical impact remains unproven.

Study Details

Evidence: Level 5
Published: Mar 2026

Original Abstract
The goal of this work was to leverage a large corpus of text-based psychotherapy data to create novel machine learning algorithms that can identify suicide risk in asynchronous text therapy. Advances in the field of natural language processing and machine learning have allowed us to include novel data sources as well as use encoding models that can represent context. Our models utilize advanced natural language processing techniques, including fine-tuned transformer models like RoBERTa, to classify risk. Subsequent model versions incorporated non-text data such as demographic features and census-derived social determinants of health to improve equitable and culturally responsive risk assessment, as well as multiclass models that can identify tiered levels of risk. All new models demonstrated significant improvements over our previous model. Our final version, a multiclass model, provides a tiered system that classifies risk as "no risk," "moderate," or "severe" (weighted F1 of 0.85). This tiered approach enhances clinical utility by allowing providers to quickly prioritize the most urgent cases, ensuring a more accurate and timely intervention for clients in need.

Author Summary

Suicide is a major public health concern, and traditional methods for assessing risk in clinical settings have serious limitations, often failing to capture risk in real time. To address this, we developed a series of new machine learning models to automatically and accurately detect suicide risk from the text of therapy messages. By training these models on a large, unique dataset of de-identified clinical transcripts, we were able to move beyond simple keyword spotting to a more contextual interpretation of a client's language. The resulting models showed vast improvements over our previously published model. This is critical both for catching as much risk as possible and for reducing "alert fatigue" for our clinicians by reducing the number of false alarms raised by the model.

Furthermore, our final model, v3.0, introduced a tiered system that classifies text as "no risk," "moderate," or "severe." This allows clinicians to quickly prioritize the most urgent cases, ensuring a more accurate, equitable, and timely intervention for the clients who need it most.