
Deep Learning Model Outperforms Guidelines for Heart Failure Prognosis

AI-generated summary of the cited source, checked by automated accuracy review.

Key Takeaway
Consider that a deep learning model using routine echo variables may improve prognostic and diagnostic accuracy for heart failure compared to current guidelines.

This cohort study evaluated a deep learning model based on routinely acquired echocardiographic variables for assessing diastolic function and left ventricular filling pressures. The primary analysis included 5450 participants from the Atherosclerosis Risk in Communities (ARIC) cohort, with additional invasive hemodynamic validation cohorts from the United States (n=83) and Japan (n=130).

In the ARIC cohort, the deep learning model demonstrated superior prognostic performance compared with the 2016 and 2025 ASE/EACVI guidelines (C-index: 0.676 vs. 0.638 and 0.602, both p<0.001). Among participants with preserved ejection fraction, the model also outperformed both guidelines (C-index: 0.660 vs. 0.628 and 0.590, both p<0.001) and the H2FPEF score (C-index: 0.660 vs. 0.607, p<0.001).
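For readers unfamiliar with the C-index, the sketch below shows one common way it can be computed (Harrell's concordance index) on made-up data. The function name, the toy follow-up times, and the risk scores are illustrative assumptions; the source does not say which C-index variant or software the authors used.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored follow-up data.

    times       -- follow-up time for each person
    events      -- 1 if the outcome occurred at that time, 0 if censored
    risk_scores -- model output, higher meaning higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored person cannot anchor a comparable pair
        for j in range(n):
            # Pair is comparable when person i had the event before
            # person j's follow-up ended.
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # higher risk assigned to earlier event
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical, not from the study)
toy_times = [2, 4, 6, 8, 10]
toy_events = [1, 1, 0, 1, 0]
toy_scores = [0.9, 0.4, 0.5, 0.7, 0.1]
print(round(concordance_index(toy_times, toy_events, toy_scores), 3))  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which puts the reported values of roughly 0.60 to 0.68 in context.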

In the US hemodynamic validation cohort, the deep learning model showed higher diagnostic performance than the 2025 guidelines (AUC: 0.879 vs. 0.822, p=0.041) but similar performance to the 2016 guidelines (AUC: 0.879 vs. 0.812, p=0.138). In the Japanese validation cohort, the model outperformed both the 2016 guidelines (AUC: 0.816 vs. 0.634, p<0.05) and the 2025 guidelines (AUC: 0.816 vs. 0.694, p<0.05).
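The AUC values above can be read as the probability that a randomly chosen patient with elevated filling pressure receives a higher score than a randomly chosen patient without it. A minimal sketch of that pairwise definition follows; the function name and the toy labels and scores are assumptions for illustration, not data or code from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels -- 1 for elevated filling pressure, 0 for normal
    scores -- model output, higher meaning more likely elevated
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0   # positive case ranked above negative case
            elif p == q:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Toy data (hypothetical, not from the study)
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.3, 0.7, 0.2, 0.1]
print(round(auc(toy_labels, toy_scores), 3))  # 0.889
```

As with the C-index, 0.5 means no better than chance and 1.0 means every affected patient is ranked above every unaffected one.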

Safety and tolerability were not reported. Limitations include the observational nature of the study and the need for external validation in diverse clinical settings. The model potentially offers a scalable alternative for assessing diastolic function, but further research is needed before clinical implementation.

Maria, 68, felt tired all the time. Her heart tests came back “normal.” But something was off. She couldn’t climb stairs without gasping. Her doctor said her heart pumped fine — so what was wrong?

She may have a common but hidden form of heart failure. It affects millions. Their hearts pump normally but can’t relax properly. This is called heart failure with preserved ejection fraction, or HFpEF. And it’s hard to catch early.

Doctors rely on complex rules to find it. These rules use ultrasound images of the heart, called echocardiograms. But many patients don’t get all the needed tests. Some clinics lack the tools. Others miss subtle signs.

Now, a powerful new tool could change that.

AI sees what humans miss

It’s not a new drug or device. It’s artificial intelligence — a computer trained to read heart scans better than current guidelines.

For years, doctors have followed step-by-step rules from expert groups like the American Society of Echocardiography. These rules combine measurements like blood flow speed and heart chamber size. But they’re hard to apply consistently.

The AI model uses the same basic data: measurements taken during routine echocardiograms. But it finds patterns that are hard for people to spot. Think of it like a traffic camera that doesn’t just count cars. It predicts jams by spotting tiny shifts in speed, spacing, and timing.

This AI acts like a smart assistant that has studied thousands of heart scans. It learns how small changes in heart motion, valve flow, and chamber stiffness combine to signal trouble — even when each single number looks okay.

It’s not guessing. It’s connecting dots across data points most doctors don’t have time — or tools — to analyze together.

The study tested the AI on over 5,400 people from the long-running ARIC study. It also checked results against two smaller groups where doctors directly measured heart pressure using invasive tubes — the gold standard.

Participants had heart ultrasounds. The AI reviewed the data. Then researchers waited — some for over a decade — to see who developed heart failure or died.

The AI predicted risk better than both sets of guidelines it was compared against.

In the big group, the AI scored 0.676 on a ranking measure called the C-index, where 0.5 is no better than a coin flip and 1.0 is perfect. The older 2016 guidelines scored 0.638. The brand-new 2025 rules? Only 0.602.

Even in tough cases — people with normal pumping strength — the AI stayed sharp. It beat not just the guidelines, but also a popular risk score called H2FPEF.

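In the U.S. and Japan, where heart pressures were measured directly, the AI was better at telling apart people with and without high pressure. In Japan, it scored 0.816 on a similar 0-to-1 measure called the AUC, well above the 0.634 and 0.694 scored by the 2016 and 2025 rules. In the U.S. group, it scored 0.879, ahead of the 2025 rules (0.822) but not statistically different from the 2016 rules (0.812).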

But there’s a catch.

This doesn’t mean the tool is available yet.

The AI isn’t in hospitals. It hasn’t been tested on diverse populations outside these studies. And it can’t replace a doctor — at least not now.

Experts say tools like this could help primary care clinics and smaller hospitals where heart specialists aren’t always available.

“It fills a real gap,” said one cardiologist not involved in the study. “We’ve been using the same rules for years, even when they don’t fit every patient.”

For patients like Maria, this could mean faster answers. No more “your heart looks fine” when they feel anything but fine.

But the AI isn’t perfect. Its main test group came from a single long-running study, and the two groups with direct pressure measurements were small (83 and 130 people). It may not work as well in other populations.

Also, it needs high-quality echo data. If the scan is blurry or incomplete, the AI can’t help.

And while it predicts risk well, it doesn’t tell doctors what to do next. That part still needs human judgment.

The road ahead will take time. The model must be tested in real clinics. Doctors need to learn how to use it. Regulators will need to approve it — just like a new drug.

Some companies are already building AI tools for heart scans. But widespread use is likely years away.

Still, this study shows AI isn’t just futuristic hype. It’s starting to outperform established guideline rules, including ones updated as recently as 2025, using the same tests we already do.

One day, your routine heart ultrasound might include a silent second opinion — from a computer that sees deeper.

And for millions with silent heart strain, that second look could come just in time.

Researchers plan to test the AI in real-time clinical settings, with results expected in the next few years. Wider use will depend on validation across diverse populations, integration into hospital systems, and regulatory approval.

Study Details

Study type: Cohort
Sample size: n = 5,450
Evidence: Level 3
Published: Apr 2026
Original Abstract
Background: Accurate assessment of diastolic function and left ventricular (LV) filling pressure is central to heart failure diagnosis and risk stratification. Contemporary guideline algorithms rely on complex parameters that are not consistently available in routine clinical practice.
Objective: To compare the diagnostic and prognostic performance of the 2016 American Society of Echocardiography/European Association of Cardiovascular Imaging (ASE/EACVI) and 2025 ASE guidelines with a deep learning model based on routinely acquired echocardiographic variables.
Methods: This study evaluated the guideline-based algorithms and a deep learning model in participants from the Atherosclerosis Risk in Communities (ARIC) cohort (n=5450) for prognostication and two invasive hemodynamic validation cohorts from the United States (n=83) and Japan (n=130) for detection of elevated left ventricular filling pressure.
Results: In the ARIC cohort, the deep learning model demonstrated superior prognostic performance compared with the 2016 and 2025 guidelines (C-index: 0.676 vs. 0.638 and 0.602, respectively; both p<0.001). Similar findings were observed among participants with preserved ejection fraction (C-index: 0.660 vs. 0.628 and 0.590; both p<0.001), with improved performance compared with the H2FPEF score (C-index: 0.660 vs. 0.607; p<0.001). In the US hemodynamic validation cohort, the deep learning model showed higher diagnostic performance than the 2025 guidelines (AUC: 0.879 vs. 0.822; p=0.041) and similar performance compared with the 2016 guidelines (AUC: 0.879 vs. 0.812; p=0.138). In the Japanese hemodynamic validation cohort, the deep learning model outperformed both guidelines (AUC: 0.816 vs. 0.634 and 0.694; both p<0.05).
Conclusions: A deep learning model leveraging routinely available echocardiographic parameters demonstrated improved diagnostic and prognostic performance compared with contemporary guideline-based approaches, potentially offering a scalable alternative for assessing diastolic function and left ventricular filling pressures.