A new study shows AI can read doctors' notes and find missed treatment steps for kids with ADHD.
Why a Doctor’s Note Matters More Than You Think
Imagine a parent brings their 5-year-old to the doctor. The child is struggling with focus, impulsivity, and behavior at home and school. The doctor diagnoses ADHD and writes a plan in the electronic health record.
But here’s the problem: That plan might include a key step—recommending parent training in behavior management—that the parent never hears about. Or the health system never knows if that recommendation was made.
Why? Because checking every doctor’s note to see if that step was followed is slow, expensive, and nearly impossible at scale.
Now, imagine an AI that can read thousands of notes in minutes and flag which kids should have gotten this recommendation.
That’s what this new study is about.
The ADHD Care Gap Most Parents Never See
ADHD is one of the most common childhood neurodevelopmental disorders. In the U.S., about 6% of children are diagnosed with ADHD.
For kids under age 6, guidelines strongly recommend starting with parent training in behavior management (PTBM)—not medication. This training teaches parents strategies to manage challenging behaviors at home.
But studies show that many young children with ADHD never get this recommendation. In this study, only about 26% of kids had it documented in their first visit.
Why the gap?
Doctors are busy. Notes are long and written in free text. Manually reviewing charts to see if PTBM was recommended is tedious and costly. So, health systems often skip it—meaning they can’t measure or improve care quality.
Old way: A team of humans reads through thousands of clinical notes, one by one, to check if PTBM was recommended. This takes months and costs a lot.
New way: An AI reads the notes in minutes, flags which ones mention PTBM, and explains why it made that call.
But here’s the twist: The AI isn’t just guessing. It’s trained to look for specific language in the “assessment and plan” section of the note and provide evidence for its decision.
How AI Thinks Like a Doctor (Sort Of)
Think of the AI like a super-fast medical detective.
When a doctor writes a note, they often include a section called “Assessment and Plan.” This is where they summarize the diagnosis and next steps.
The AI scans this section for clues that PTBM was recommended. For example, it might look for phrases like:
- “Recommend parent behavior training”
- “Refer to behavioral therapy”
- “Suggest parenting class”
It’s like a spell-checker for care quality—spotting what should be there.
But unlike a simple keyword search, the AI uses context. It understands that “parent training” might be phrased differently, and it can tell the difference between a recommendation and a general discussion.
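To see why context matters, here is a minimal illustrative sketch, not the study's actual method. It contrasts a naive keyword match, which flags any mention of parent training, with a slightly smarter rule that also requires a recommendation verb. The note snippets and cue words are made up for illustration.

```python
import re

# Hypothetical note snippets; real clinical notes are longer and messier.
notes = [
    "Plan: Recommend parent behavior training; follow up in 3 months.",
    "Discussed that parent training exists, but family declined referral.",
    "Refer to behavioral therapy for parent-focused strategies.",
]

KEYWORDS = ["parent behavior training", "behavioral therapy",
            "parenting class", "parent training"]

def keyword_match(note: str) -> bool:
    """Naive baseline: flags any mention, recommendation or not."""
    text = note.lower()
    return any(k in text for k in KEYWORDS)

# Verbs that signal the doctor is actively recommending something.
RECOMMEND_CUES = re.compile(r"\b(recommend|refer|suggest)\b", re.IGNORECASE)

def recommendation_match(note: str) -> bool:
    """Smarter rule: require a recommendation verb alongside the mention."""
    return keyword_match(note) and bool(RECOMMEND_CUES.search(note))

for note in notes:
    print(keyword_match(note), recommendation_match(note))
```

The second note trips the keyword baseline but not the recommendation rule, because the doctor only discussed the option. An LLM goes further still: instead of hand-written cue words, it models the meaning of the whole "Assessment and Plan" section.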
The study tested three AI models: Claude-3.5, GPT-4o, and LLaMA-3.3-70B. Each one read the same set of doctor’s notes and tried to identify whether PTBM was recommended.
How the Study Worked
Researchers looked at 542 children aged 4–6 years who were diagnosed with ADHD or ADHD symptoms between 2020 and 2024. All were seen in a California pediatric network with 27 clinics.
They took the first ADHD-related visit for each child and analyzed the doctor’s note.
A subset of 122 notes—including all cases where the AI models disagreed—was manually reviewed by experts. This helped measure how well the AI performed.
The goal: See if the AI could match expert human review in identifying PTBM recommendations.
All three AI models performed well, but one stood out.
Claude-3.5 was the most balanced:
- Correctly identified 89% of kids who should have gotten PTBM (sensitivity)
- Was right 95% of the time when it flagged a recommendation (positive predictive value)
- Overall accuracy score: 92%
LLaMA-3.3-70B was close behind:
- 91% sensitivity
- 89% positive predictive value
- Overall accuracy: 90%
GPT-4o had the highest precision (97%) but missed more cases (82% sensitivity), and its explanations were rated lower by experts.
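The metrics in these comparisons come from a standard confusion matrix. Here is a short sketch of how sensitivity, positive predictive value, and related scores are computed; the counts below are hypothetical, not the study's data.

```python
def classifier_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute common evaluation metrics from confusion-matrix counts.

    tp: model flagged a note, and PTBM really was recommended
    fp: model flagged a note, but PTBM was not recommended
    fn: model missed a note that did recommend PTBM
    tn: model correctly left an unrelated note unflagged
    """
    sensitivity = tp / (tp + fn)        # share of true cases the model catches
    ppv = tp / (tp + fp)                # share of the model's flags that are correct
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * sensitivity * ppv / (sensitivity + ppv)  # balances the two
    return {"sensitivity": sensitivity, "ppv": ppv,
            "accuracy": accuracy, "f1": f1}

# Illustrative counts only (e.g., for a 122-note review set)
m = classifier_metrics(tp=30, fp=2, fn=4, tn=86)
print(m)
```

The trade-off in the results above falls out of these formulas: a model can push PPV up by flagging only the clearest cases, but that raises false negatives and drags sensitivity down, which is the pattern GPT-4o showed.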
Using the best model (Claude-3.5), the study found that only 26.4% of kids had documented PTBM recommendations at their first ADHD visit.
That’s a big care gap—and one that AI could help close.
But There’s a Catch
This doesn’t mean AI is ready to replace human chart review everywhere.
The study was done in one health network in California. The AI models were tested on notes from a specific time period and setting.
Also, the AI is only as good as the notes it reads. If a doctor doesn’t document a recommendation, the AI can’t find it.
What Experts Say
The study used a framework called QUEST to rate how well the AI explained its decisions. Claude-3.5 ranked highest for clarity and usefulness.
Researchers say this explainability is key. Doctors and health systems need to trust the AI’s output—and understand why it made a certain call.
This transparency could make AI more acceptable for real-world use in quality improvement.
If you’re a parent of a young child with ADHD, this research won’t change your next doctor’s visit.
But over time, it could help health systems ensure more kids get the right first-line treatment—like parent training—before considering medication.
If you’re a doctor or health system leader, this could be a tool to improve care quality without adding hours of paperwork.
To be clear, this AI tool isn’t in clinics yet.
The study only looked at one type of recommendation in one age group. It didn’t test whether AI could track if families actually followed through with training.
Also, the AI models are not perfect. They can make mistakes, and they rely on how well doctors document their plans.
Next steps include testing these AI tools in more health systems and with different types of clinical notes.
Researchers also want to see if AI can help track whether families actually complete parent training—not just whether it was recommended.
If successful, this approach could be used for other conditions and treatments, making quality measurement faster, cheaper, and more reliable.
For now, it’s a promising step toward smarter, more transparent health care.