
Technical report validates AI tool for classifying MPXV subclades using genomic data

Key Takeaway
Recognize that this AI classifier assigned MPXV clades and subclades with weighted average accuracies of 90.2% (XGBoost) and 95% (CNN) in computational validation; clinical efficacy is unproven.

This publication is a technical report and software description focused on methodological development and tool validation. It describes CladePredictor-MPXV, an alignment-free AI-based classifier built from an ensemble of XGBoost and CNN models. The tool was trained on 3,866 MPXV genomes and is designed to classify both complete and partial sequences. The authors position it as an alternative to traditional viral genome classifiers, which rely on alignment methods that scale poorly.
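To make "alignment-free" concrete: the abstract states that the XGBoost models were trained on 3-mers. The Python sketch below shows one common way to turn a sequence into a fixed-length 3-mer frequency vector without aligning it to a reference. It is an illustration only, not the authors' code; the function name `kmer_frequencies`, the normalization to frequencies, and the skipping of windows with ambiguous bases are all assumptions.

```python
from itertools import product

ALPHABET = "ACGT"
K = 3  # the abstract states that the XGBoost models were trained on 3-mers

# All 4**3 = 64 possible 3-mers in a fixed order, so every genome maps to
# a vector with the same layout regardless of its length.
KMERS = ["".join(p) for p in product(ALPHABET, repeat=K)]
INDEX = {kmer: i for i, kmer in enumerate(KMERS)}


def kmer_frequencies(sequence: str) -> list[float]:
    """Return normalized 3-mer frequencies for one nucleotide sequence.

    Windows containing anything other than A, C, G, T (for example N) are
    skipped; that handling is an assumption of this sketch.
    """
    seq = sequence.upper()
    counts = [0] * len(KMERS)
    for i in range(len(seq) - K + 1):
        idx = INDEX.get(seq[i:i + K])
        if idx is not None:
            counts[idx] += 1
    total = sum(counts)
    # A sequence with no valid window yields an all-zero vector.
    return [c / total for c in counts] if total else [0.0] * len(KMERS)
```

A vector like this could be passed as one row of a feature matrix to any tabular classifier; how the report actually builds and scales its features is not described in the text here.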

The primary outcome was classification accuracy for clade (I vs II) and subclade (Ia vs Ib) assignment. The weighted average accuracy was 90.2% for the XGBoost model and 95% for the CNN model. These figures were measured on a phylogenetically distinct validation set of complete (>= 188,000 nucleotides) and partial MPXV genomes. No adverse events or safety data were reported, as this is a computational tool validation. Secondary outcomes were not reported in the source material.
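The report does not define "weighted average accuracy" in the text available here. One plausible reading, assumed in the Python sketch below, is the mean of per-class accuracy weighted by each class's share of the validation set. The function name and the toy labels are hypothetical and are not data from the report.

```python
from collections import Counter


def weighted_average_accuracy(y_true: list[str], y_pred: list[str]) -> float:
    """Support-weighted mean of per-class accuracy (per-class recall).

    Assumed interpretation; the report may compute its metric differently.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    support = Counter(y_true)  # number of true samples per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    n = len(y_true)
    # Each class contributes its accuracy scaled by its share of the samples.
    return sum((support[c] / n) * (correct[c] / support[c]) for c in support)


# Made-up labels for illustration only.
y_true = ["Ia", "Ia", "Ib", "Ib", "Ib", "IIb", "IIb", "IIb", "IIb", "IIb"]
y_pred = ["Ia", "Ib", "Ib", "Ib", "Ib", "IIb", "IIb", "IIb", "IIb", "Ia"]
score = weighted_average_accuracy(y_true, y_pred)
```

Under this weighting the result equals the overall fraction of correctly classified genomes, so a large class such as IIb can dominate the figure. Per-class results would be needed to judge performance on rarer subclades.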

The setting and safety metrics were not reported. Practice relevance is described as providing a fast and efficient framework for assigning complete and partial MPXV genomes to subclades Ia, Ib, and IIb. Clinicians should not interpret the accuracy metrics as clinical efficacy, and causality between tool use and public health outcomes should not be inferred. The evidence is limited to computational performance without clinical outcome data. Funding and conflicts of interest were not reported in the provided text.

Study Details

Evidence Level: 5
Published: Apr 2026
Original Abstract
Poxviruses constitute a threat to human health. Since 2022, two public health emergencies of international concern due to global spread of mpox viruses (MPXVs) were declared. The emergence of the novel MPXV subclade Ib has placed the global health community on alert as sustained human-to-human and travel-related transmission is prevalent in Africa and 30 non-African countries. Metagenomic and outbreak surveillance data often generates complete as well as partial assemblies of genomes which then require efficient taxonomic classification. Traditional viral genome classifiers rely on poorly scalable alignment methods creating computational bottlenecks in taxonomic classifications. Here, we present CladePredictor-MPXV: an alignment-free AI-based classifier of complete and partial MPXV genomes. Our classification framework consists of an ensemble of XGBoost and CNNs to classify between subclades Ia, Ib and IIb. CladePredictor-MPXV was trained with 3,866 MPXV genomes. XGBoost models were trained with 3-mers which are representative of the global feature space of complete MPXV genomes. CNNs were trained with short-range, position-independent sequence patterns to assign clades to partial genomes with a minimum size of 1000 nucleotides. Our XGBoost instance attained a weighted average accuracy of 90.2% while our CNN instance attained a weighted average accuracy of 95% in classifying clade (I vs II) and subclade (Ia vs Ib) from complete (>= 188,000 nucleotides) and partial MPXV genomes on a phylogenetically distinct validation set. CladePredictor-MPXV is freely available at https://clade-predictor.microbiologyandimmunology.dal.ca and provides a fast and efficient framework for the assignment of clades to MPXV subclade Ia, Ib, and IIb complete and partial genomes.
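The abstract gives two length thresholds: complete genomes are at least 188,000 nucleotides, and partial genomes need at least 1000 nucleotides for the CNNs. The Python sketch below shows how an ensemble might route a sequence by length. The routing rule itself is an assumption, as the abstract does not say how the tool chooses between models; the names `choose_model`, `COMPLETE_MIN_NT`, and `PARTIAL_MIN_NT` are hypothetical.

```python
COMPLETE_MIN_NT = 188_000  # abstract: complete genomes are >= 188,000 nucleotides
PARTIAL_MIN_NT = 1_000     # abstract: partial genomes need at least 1000 nucleotides


def choose_model(sequence: str) -> str:
    """Pick a model family by sequence length (assumed routing, not confirmed)."""
    n = len(sequence)
    if n >= COMPLETE_MIN_NT:
        return "xgboost"  # 3-mer features described for complete genomes
    if n >= PARTIAL_MIN_NT:
        return "cnn"      # short-range patterns described for partial genomes
    raise ValueError(
        f"sequence has {n} nucleotides; at least {PARTIAL_MIN_NT} are required"
    )
```

For example, `choose_model("A" * 5000)` returns `"cnn"` under these assumptions, while a 500-nucleotide fragment is rejected rather than classified.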