A study compared notes written by doctors with those generated by ChatGPT for children with knee problems. The research included 20 patients with an average age of 14.2 years. Doctors rated the human-written summaries higher than the AI versions on overall quality. The difference was statistically significant in several categories, including consistency over time and accuracy of accident descriptions.
The patients were followed for about 28 months. Although the AI summaries matched human writing style in some respects, they described specific medical details less accurately. No safety issues were reported during the study period. Overall, the researchers found that human documentation captured important clinical information significantly better.
The main takeaway is that large language models are not yet ready to replace human medical documentation in pediatric orthopaedic practice, and should not be used without careful oversight. The findings support hybrid workflows in which AI assists but does not replace human clinical judgement. Although small, the study suggests that doctors should continue to write their own notes for these patients, using AI only as an aid.