Machine learning models predict ciprofloxacin resistance in Shigella isolates from Ontario

New Test Predicts Drug Failure Before It Starts

medRxiv · Published April 16, 2026 · Study authors: Gohari, M. R.; Zhang, P.; Villegas, A.; Rosella, L. C.; Patel, S. N.; Hopkins, J. P.; Duvvuri, V. R. · Editorial oversight: Dr. Amelia Tan, PhD · Internal Medicine & Chronic Disease

AI-generated summary of the cited source, checked by automated accuracy review.

Key Takeaway

Machine learning models built on k-mer features may improve prediction of ciprofloxacin resistance in Shigella isolates, but the evidence so far comes only from isolates collected in Ontario, Canada.

This primary research article from Ontario, Canada, evaluated machine learning models using k-mer features derived from whole-genome sequencing (WGS) data to predict ciprofloxacin resistance in 1,424 Shigella isolates. The study compared models based on chromosomal determinants alone versus those incorporating both chromosomal and plasmid-mediated determinants, testing k-mer lengths of 11, 15, 21, and 31.

A k-mer length of 11 produced the highest area under the receiver operating characteristic curve (AUC) and the lowest Brier score, although differences across k-mer lengths were modest. The Random Forest classifier achieved the most consistent performance across models, and including plasmid-mediated determinants improved predictive accuracy relative to chromosomal-only models.
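For readers unfamiliar with the two metrics, here is a minimal sketch of how each is computed. The labels and probabilities are invented for illustration and are not results from the study, and the function names are hypothetical. AUC measures how well a model ranks resistant isolates above susceptible ones (higher is better); the Brier score measures how close the predicted probabilities are to the true outcome (lower is better).

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true, y_prob):
    """Probability that a random resistant isolate scores higher than a
    random susceptible one; ties count as half a win."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = 0.0
    for pp in pos:
        for pn in neg:
            if pp > pn:
                wins += 1.0
            elif pp == pn:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = resistant, 0 = susceptible (not from the study)
labels = [1, 1, 0, 0, 1, 0]
probs = [0.9, 0.7, 0.2, 0.65, 0.6, 0.1]

auc = roc_auc(labels, probs)
brier = brier_score(labels, probs)
print(round(auc, 3), round(brier, 3))  # prints: 0.889 0.122
```

In this toy example one susceptible isolate (probability 0.65) outranks one resistant isolate (0.6), which is why the AUC falls just short of 1.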

The authors note that applications to Shigella remain limited, and the study does not report follow-up duration, effect sizes, absolute numbers, or p-values and confidence intervals for the primary outcomes. The research is confined to isolates from Ontario, Canada, and the generalizability beyond this context is uncertain.

For practice, the authors suggest potential integration into genomic antimicrobial resistance surveillance and digital public health frameworks. However, the evidence is observational and specific to bacterial isolates; it does not establish clinical efficacy or any effect on patient outcomes. The findings should be interpreted cautiously, and further validation is needed.

The Frustrating Reality of Diarrhea

Imagine you have a stomach bug that makes you feel terrible. You go to the doctor, and they give you a pill to kill the bacteria. You take the pill, but the bacteria are still there. This is a nightmare for anyone with an infection.

Doctors usually test the bacteria in a lab to see which pills work. But this test takes days. By the time you get the results, you might have been sick for a week.

The bacteria causing this illness are getting smarter. They are learning to ignore the pills we use to kill them. This is called antimicrobial resistance. When bacteria ignore our medicines, infections become harder to treat.

This problem is getting worse in places like Ontario, Canada. Doctors are worried because the bugs are changing faster than we can make new medicines. We need a faster way to know which drug will work before we even start the treatment.

The Old Way vs. The New Way

For a long time, scientists have grown the bacteria in a dish and exposed them to different antibiotics to see which ones stop the growth. This is slow, and it tells us little about why the bacteria resist a drug.

But here's the twist. Scientists can now use a sequencing machine to read the entire genetic code of the bacteria. Think of this code like a long instruction manual for the bug. The old way was to wait and watch how the bug behaves in a dish. The new way uses a computer to scan the manual for the passages that signal resistance.

The bacteria have a genetic code that tells them how to survive. Sometimes, a small change in this code acts like a lock-picking tool. It lets the bacteria ignore the antibiotic.

Scientists use a method called machine learning. Imagine teaching a child to recognize a cat. You show them many pictures of cats. Eventually, the child can spot a cat in a crowd without you pointing it out.

The computer does this with DNA. It looks at tiny pieces of the genetic code, called k-mers. You can think of k-mers as small words in the genetic sentence. The computer learns which "words" mean the bacteria are resistant.
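To make the idea of "small words" concrete, here is a minimal sketch of how k-mers are read off a DNA sequence: every overlapping stretch of k letters counts as one k-mer. The sequence below is made up for illustration and is not a real gene, and `extract_kmers` is a hypothetical helper name, not the study's code.

```python
def extract_kmers(sequence, k):
    """Return every overlapping substring of length k, in order of position."""
    sequence = sequence.upper()
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


# Hypothetical 12-letter fragment (not a real gene sequence)
fragment = "ATGGCGTACCTG"

# k = 11 is one of the lengths the study tested
kmers = extract_kmers(fragment, 11)
print(kmers)  # prints: ['ATGGCGTACCT', 'TGGCGTACCTG']
```

A sequence of length n yields n - k + 1 k-mers, so shorter k-mers produce more, less specific "words" and longer k-mers produce fewer, more specific ones.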

Researchers looked at data from 1,424 bacteria samples collected in Ontario between 2018 and 2025. They used a computer program to predict if the bacteria would ignore ciprofloxacin. This is a common antibiotic used to treat severe diarrhea.

They tested different ways to teach the computer. They also checked if looking at just the main genetic code was enough or if they needed to look at extra pieces floating outside the main code too.
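One way to picture this comparison is as two tables of yes/no features: one built only from the chromosomal genes (gyrA and parC), and one that also includes the plasmid-borne gene (qnr). The sketch below is an assumption about how such tables could be built, not the authors' pipeline. The isolate names, sequences, and helper functions are hypothetical, and k = 4 is used only to keep the toy example small (the study tested k = 11, 15, 21, and 31).

```python
def kmer_set(sequence, k):
    """Set of distinct k-mers in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def presence_matrix(isolates, targets, k):
    """Build a 0/1 table: one row per isolate, one column per (gene, k-mer)."""
    per_isolate = {}
    for name, genes in isolates.items():
        found = set()
        for target in targets:
            # An isolate lacking the gene contributes no k-mers for it
            seq = genes.get(target, "")
            found |= {(target, kmer) for kmer in kmer_set(seq, k)}
        per_isolate[name] = found
    vocabulary = sorted(set().union(*per_isolate.values()))
    rows = {
        name: [1 if feature in found else 0 for feature in vocabulary]
        for name, found in per_isolate.items()
    }
    return vocabulary, rows


# Hypothetical isolates with made-up sequences (not real gene data)
isolates = {
    "isolate_A": {"gyrA": "ACGTAC", "parC": "TTGACA"},
    "isolate_B": {"gyrA": "ACGTTC", "parC": "TTGACA", "qnr": "GGCATG"},
}

chrom_vocab, chrom_rows = presence_matrix(isolates, ["gyrA", "parC"], k=4)
comb_vocab, comb_rows = presence_matrix(isolates, ["gyrA", "parC", "qnr"], k=4)
print(len(chrom_vocab), len(comb_vocab))  # prints: 8 11
```

The combined table has extra columns that only light up for isolates carrying the plasmid gene, which is the additional information a classifier could use.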

The computer did well. According to the authors, the models predicted resistance accurately, and one type of model, called a Random Forest, was the most consistent. It worked better when it looked at both the main genetic code and the extra floating pieces.

The computer found the specific spots in the genetic code that caused the problem. This helps scientists understand exactly how the bacteria are learning to fight back. It is like finding the exact switch that turns the resistance on.

But there's a catch. This technology is not available in every doctor's office yet.

The authors report that this method is accurate and that its reasoning can be inspected. Scientists can see which parts of the DNA drove each prediction. This transparency builds trust. It suggests the computer is not guessing; it is reading the genetic evidence.

This fits into a bigger plan to track how bugs change. Public health officials can use this data to stop outbreaks before they get big. It turns a complex genetic puzzle into a simple yes or no answer for doctors.

If you or a loved one has a severe infection, your doctor might use this kind of data soon. It means you could get the right medicine faster. You would not have to wait days for a lab test to tell you what works.

However, you should still talk to your doctor about your treatment. They know your specific situation best. Do not stop taking your medicine just because you read about a new test.

This study was done on samples from one region in Canada. The bacteria in other parts of the world might be different. Also, this is still a research tool. It has not been approved for regular use in hospitals yet.

The next step is to test this tool in more places and to see if it works for other types of bacteria too. If it proves reliable, it could become a standard part of how doctors treat infections. This could save lives by getting the right medicine to patients much faster.

Study Details

Evidence: Level 5

Published: Apr 2026

Original Abstract

Antimicrobial resistance (AMR) is a growing global public health threat that complicates the treatment and control of bacterial infections. Shigella spp., a leading cause of bacterial diarrhea worldwide, has increasingly exhibited resistance to multiple antimicrobial agents that are commonly recommended as therapy for severe shigellosis. Although conventional antimicrobial susceptibility testing (AST) remains the reference standard, it is time-consuming and provides limited insight into the genetic mechanisms underlying resistance. Whole-genome sequencing (WGS) has emerged as a complementary approach for AMR detection by enabling direct identification of resistance genetic determinants encoded in bacterial genomes. Machine learning (ML) methods applied to genomic features such as k-mers have shown promise for predicting resistance phenotypes from WGS data; however, applications to Shigella remain limited. In this study, we developed and evaluated an interpretable ML framework for predicting ciprofloxacin resistance using k-mer features derived from WGS data of 1,424 Shigella isolates collected in Ontario, Canada, between 2018 and 2025. K-mers were extracted from known gene targets associated with ciprofloxacin resistance, including chromosomal quinolone resistance-determining regions (QRDRs: gyrA and parC) and plasmid-mediated determinants (qnr). Supervised ML approaches were trained and compared. We evaluated the influence of k-mer lengths (k = 11, 15, 21 and 31) on predictive performance and model interpretability, and compared models based on chromosomal determinants alone with models incorporating both chromosomal and plasmid-mediated determinants. The Random Forest classifier achieved the most consistent performance across models. Inclusion of plasmid-mediated determinants improved predictive accuracy relative to chromosomal-only models. Although differences across k-mer lengths were modest, k = 11 produced the highest area under the receiver operating characteristic curve (AUC) and the lowest Brier score. SHAP analyses localized high-impact features within QRDRs of gyrA and parC, supporting biological interpretability. These findings demonstrate that biologically-informed k-mer-based ML models can accurately and transparently predict ciprofloxacin resistance in Shigella, supporting their potential integration into genomic AMR surveillance and digital public health frameworks.

Author summary

In this study, we used genome sequencing data to develop machine learning models that predict ciprofloxacin resistance for Shigella directly from bacterial DNA. We focused on small DNA fragments (k-mers) derived from known resistance genes and mutations. Among the approaches tested, a Random Forest model showed the most consistent performance. Combining chromosomal mutations with plasmid-mediated resistance genes improved prediction accuracy and helped identify key genetic regions associated with resistance. These findings demonstrate that machine learning applied to genomic data can accurately and interpretably predict antibiotic resistance, supporting its potential use in genomic surveillance and public health monitoring.
