Imagine having symptoms that could be leukemia, but you can't get to a specialist for the complex tests needed to confirm it. This is a reality for many people around the world. A research team tested whether an artificial intelligence (AI) tool could help by using only the results of standard blood tests to predict the type of leukemia a person might have. They tested it on a large, diverse group of over 6,200 patients from 20 centers worldwide. The tool was very good at spotting certain types, such as acute myeloid leukemia and a specific subtype called promyelocytic leukemia. But there was a big catch: to reach that high accuracy, the tool had to refuse to make a prediction for the vast majority of patients, between 71% and 93% of the time. That's not very helpful for doctors. So the team refined the tool using a different method. The new version refused far less often, excluding only about 12% of patients, and its accuracy at spotting acute myeloid leukemia improved for the uncertain cases the original tool had struggled with. They also retrained the tool specifically to work better for children. The work shows that AI could one day be a useful support tool, helping more people get a faster, initial indication of their condition using tests they can already get.
AI tool predicts acute leukemia subtypes with AUROC 0.94 for AML, 0.98 for promyelocytic, 0.84 for ALL in international cohort
Can a simple blood test help diagnose leukemia? An AI tool shows promise but still needs work
AI-generated summary of the cited source, checked by automated accuracy review.
This retrospective study assembled an international cohort of 6206 leukemia patients from 20 centers to test and refine an artificial intelligence (AI) tool designed to support leukemia diagnosis using standard laboratory results. The goal was to address health disparities by potentially improving access to diagnosis.
The pretrained algorithm was first run unchanged on this diverse cohort, yielding varying accuracy metrics. With a confidence cutoff applied to its predictions, the 2000-fold bootstrapped area under the receiver operating characteristic curve (AUROC) was 0.94 for acute myeloid leukemia (AML), 0.98 for the promyelocytic subtype, and 0.84 for acute lymphoblastic leukemia (ALL). However, the confidence cutoff excluded a substantial proportion of patients from receiving any prediction, ranging from 70.8% to 92.5%.
To improve the tool's utility, the researchers enhanced its accuracy and robustness while maintaining generalizability. They implemented an ensemble method combining Isolation Forest and Local Outlier Factor, two outlier-detection techniques. On a hold-out test set, this refinement increased the AUROC for AML from 0.72 to 0.84 among patients who fell below the initial confidence threshold. Importantly, the improved model excluded only 12.1% of patients from predictions, a marked reduction from the earlier exclusion rates. According to the abstract, the algorithm was also retrained specifically for pediatric patients.
The study demonstrates a process of international testing and iterative refinement of an AI diagnostic support tool, showing that modifications can substantially reduce the rate of excluded patients while improving performance for a subset of cases.
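To make the trade-off between a confidence cutoff and the exclusion rate concrete, here is a minimal Python sketch of a bootstrapped AUROC computed only on predictions that pass a cutoff. It is not the study's code. The function names, the definition of confidence as max(p, 1 - p), the cutoff of 0.8, the percentile interval, and the synthetic scores are all assumptions made for illustration; the summary does not say how the authors defined or chose any of them.

```python
import random


def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def apply_confidence_cutoff(labels, scores, cutoff):
    """Keep predictions whose confidence max(p, 1 - p) reaches the cutoff.

    ASSUMPTION: this confidence definition is illustrative only.
    Returns (kept_labels, kept_scores, excluded_fraction).
    """
    kept = [(y, s) for y, s in zip(labels, scores) if max(s, 1.0 - s) >= cutoff]
    excluded = 1.0 - len(kept) / len(labels)
    return [y for y, _ in kept], [s for _, s in kept], excluded


def bootstrap_auroc(labels, scores, n_boot=2000, seed=0):
    """Mean AUROC and a 95% percentile interval over n_boot resamples."""
    if len(set(labels)) < 2:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    mean = sum(stats) / n_boot
    return mean, stats[int(0.025 * n_boot)], stats[int(0.975 * n_boot) - 1]


if __name__ == "__main__":
    # Synthetic scores only: positives skew high, negatives skew low.
    rng = random.Random(42)
    labels = [1 if i % 2 == 0 else 0 for i in range(80)]
    scores = [rng.betavariate(4, 2) if y else rng.betavariate(2, 4) for y in labels]

    kept_y, kept_s, excluded = apply_confidence_cutoff(labels, scores, cutoff=0.8)
    mean, low, high = bootstrap_auroc(kept_y, kept_s, n_boot=2000)
    print(f"excluded: {excluded:.1%}")
    print(f"AUROC on confident cases: {mean:.2f} (95% CI {low:.2f} to {high:.2f})")
```

Raising the cutoff in this sketch generally pushes the AUROC up while excluding more cases, which mirrors the pattern the summary describes: strong headline metrics that apply only to the minority of patients who receive a prediction.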