Older patients often leave the hospital with too many medicines. For people aged 65 or older who stay in the hospital for two days or more, this can be dangerous. Doctors must decide which drugs to keep and which to stop, but reading thousands of medical notes is hard work.
A team in New South Wales, Australia, built a tool to help. It combines fixed text-matching rules with AI language models. They tested it on records from 850 patients at six public hospitals. The system looked at notes about antibiotics and opioids. It found over 9,600 medication mentions and pulled out more than 1,000 sentences that might suggest stopping a drug.
The computer did well at this job. When checked against expert human reviewers, one model reached an F1 score of 0.91, where 1 is a perfect score. The agreement between raters was also substantial. Processing took about 12.6 seconds on average for each patient record, so doctors would not have to wait hours for results. The tool runs on local hospital computers, which helps keep patient data private.
Testing showed one recurring problem. Sometimes the system thought a medicine was to be stopped after discharge when it had actually been finished during the stay. This was the most common mistake, and the researchers are working to fix it.
The tool could give hospitals a practical, low-cost way to find advice about stopping medicines. The study did not test whether using it makes patients safer.
Hybrid NLP and LLM system extracts deprescribing recommendations from electronic health records for older hospitalized patients
New tool helps doctors stop unnecessary drugs for older hospital patients
AI-generated summary of the cited source, checked by automated accuracy review.
This retrospective cohort study assessed a two-stage hybrid system combining rule-based natural language processing and open-source large language models. The system was evaluated using data from 850 patients aged 65 years or older who were hospitalized for 48 hours or more. The evaluation took place across six public hospitals in New South Wales, Australia. The primary outcome was the automated extraction of deprescribing recommendations from electronic health records. No comparator was reported for this technical validation study.
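The two-stage design described above can be illustrated with a minimal Python sketch. It is not the study's implementation: the trigger terms, the sentence splitter, and the function names (`candidate_sentences`, `extract_recommendations`, `toy_classifier`) are all hypothetical, and the second stage uses a simple stand-in function where the study used open-source large language models.

```python
import re
from typing import Callable, List

# Hypothetical trigger terms; the study's actual rules are not given in this summary.
TRIGGER_PATTERN = re.compile(
    r"\b(cease|ceased|stop|stopped|discontinue|discontinued|withhold|wean|taper)\b",
    re.IGNORECASE,
)


def split_sentences(note: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", note) if s.strip()]


def candidate_sentences(note: str) -> List[str]:
    """Stage 1 (rule-based): keep only sentences that contain a trigger term."""
    return [s for s in split_sentences(note) if TRIGGER_PATTERN.search(s)]


def extract_recommendations(note: str, classify: Callable[[str], bool]) -> List[str]:
    """Stage 2: keep the candidates that the classifier accepts.

    In the study this step was performed by an LLM; here any callable works.
    """
    return [s for s in candidate_sentences(note) if classify(s)]


def toy_classifier(sentence: str) -> bool:
    """Stand-in for the LLM (assumption): accept sentences that mention
    the GP or discharge, as a crude proxy for post-discharge advice."""
    return bool(re.search(r"\b(GP|discharge)\b", sentence, re.IGNORECASE))


# Invented example note, not taken from the study data.
note = (
    "Patient admitted with pneumonia. "
    "IV antibiotics ceased on day 3. "
    "GP to wean oxycodone over two weeks after discharge. "
    "Continue paracetamol."
)

recommendations = extract_recommendations(note, toy_classifier)
print(recommendations)
```

In this invented note, stage one keeps two sentences and the stand-in classifier keeps only the one addressed to the GP. Filtering with cheap rules first means the slower model only has to read a small share of the text.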
The system extracted 9,631 medications with a median of 11 per patient. It also identified 1,061 candidate sentences. Model 2 achieved an F1 score of 0.91 and an accuracy of 0.90. Processing time averaged 12.6 seconds per patient record. Inter-rater reliability showed substantial agreement with a Cohen's kappa of 0.70.
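For readers less familiar with the metrics reported above, the following Python sketch shows how F1, accuracy, and Cohen's kappa are computed for binary labels. The labels are invented toy data, not the study's annotations, and the function names are illustrative.

```python
from typing import Sequence, Tuple


def confusion_counts(a: Sequence[bool], b: Sequence[bool]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn), treating `a` as the reference and `b` as the prediction."""
    tp = sum(1 for x, y in zip(a, b) if x and y)
    fp = sum(1 for x, y in zip(a, b) if not x and y)
    fn = sum(1 for x, y in zip(a, b) if x and not y)
    tn = sum(1 for x, y in zip(a, b) if not x and not y)
    return tp, fp, fn, tn


def f1_score(truth: Sequence[bool], pred: Sequence[bool]) -> float:
    """Harmonic mean of precision and recall: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(truth, pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def accuracy(truth: Sequence[bool], pred: Sequence[bool]) -> float:
    """Share of items on which prediction and reference agree."""
    tp, fp, fn, tn = confusion_counts(truth, pred)
    return (tp + tn) / (tp + fp + fn + tn)


def cohens_kappa(rater_a: Sequence[bool], rater_b: Sequence[bool]) -> float:
    """Agreement between two raters, corrected for agreement expected by chance."""
    tp, fp, fn, tn = confusion_counts(rater_a, rater_b)
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement from each rater's marginal "yes" and "no" rates.
    expected = ((tp + fn) / n) * ((tp + fp) / n) + ((fp + tn) / n) * ((fn + tn) / n)
    return (observed - expected) / (1 - expected) if expected != 1 else 1.0


# Toy labels (assumption): True means "sentence is a deprescribing recommendation".
truth = [True, True, True, True, False, False, False, False, True, False]
pred = [True, True, True, False, False, False, False, True, True, False]

print(f1_score(truth, pred), accuracy(truth, pred), cohens_kappa(truth, pred))
```

On this toy data F1 and accuracy are both 0.8 while kappa is about 0.6, which shows why kappa usually reads lower than raw agreement: it discounts the matches two raters would reach by chance.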
Safety and tolerability data were not reported. The study did not collect clinical outcome data or adverse event information. The most common misclassification was incorrectly identifying actions completed during hospitalization as post-discharge recommendations. Funding or conflicts of interest were not reported.
For practice, the system's relevance lies in enabling cost-efficient, privacy-compliant local deployment. Clinicians should note that this study validates a technical tool rather than demonstrating clinical benefit. Because no clinical outcome data were collected, the results support no conclusions about clinical outcomes or about any causal link between deprescribing and outcomes.