Technical report validates AI tool for classifying MPXV subclades using genomic data
This publication is a technical report and software description focused on methodological development and tool validation. It describes CladePredictor-MPXV, an alignment-free, AI-based classifier that uses an ensemble of XGBoost and convolutional neural network (CNN) models. The analysis used a dataset of 3,866 MPXV genomes, including both complete and partial sequences. The study aimed to compare the new tool against traditional viral genome classifiers that rely on alignment methods.
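To illustrate what "alignment-free" can mean in practice, the sketch below turns a nucleotide sequence into a fixed-length vector of k-mer frequencies, which a model such as XGBoost or a CNN could take as input without any alignment step. This is an assumption made for illustration only: the summary above does not say how CladePredictor-MPXV featurizes genomes, and the function name `kmer_profile` and the choice of `k` are hypothetical.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"


def kmer_profile(sequence: str, k: int = 3) -> list[float]:
    """Return normalized k-mer frequencies in fixed lexicographic order.

    Hypothetical helper: k-mer counting is one common alignment-free
    featurization, not necessarily the one used by CladePredictor-MPXV.
    """
    seq = sequence.upper()
    valid = set(ALPHABET)
    # Count every window of length k, skipping windows that contain
    # ambiguity codes such as N (common in partial genomes).
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= valid
    )
    total = sum(counts.values())
    # Emit all 4**k possible k-mers in a fixed order so that every
    # genome maps to a vector of the same length.
    all_kmers = ("".join(p) for p in product(ALPHABET, repeat=k))
    return [counts[kmer] / total if total else 0.0 for kmer in all_kmers]
```

Because the vector length depends only on `k` and not on sequence length, complete and partial genomes yield inputs of the same shape, which is one reason this style of featurization suits mixed datasets.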
The primary outcome was classification accuracy for MPXV subclades Ia, Ib, and IIb. The weighted average accuracy was 90.2% for the XGBoost model and 95% for the CNN model. These figures reflect the algorithm's performance on the genomic dataset described above. No adverse events or safety data were reported, as this is a computational tool validation. Secondary outcomes were not reported in the source material.
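A weighted average of this kind is usually computed by weighting each subclade's score by the number of genomes in that subclade, so that larger classes contribute more. The sketch below shows that arithmetic under that assumption. The per-subclade scores and counts are made-up placeholders, not values from the report, which gives only the aggregate figures; the function name `weighted_average` is likewise hypothetical.

```python
def weighted_average(scores: dict[str, float], support: dict[str, int]) -> float:
    """Average per-class scores, weighting each class by its sample count."""
    total = sum(support.values())
    if total == 0:
        raise ValueError("support must contain at least one sample")
    return sum(scores[label] * support[label] for label in support) / total


# Placeholder values for illustration only (NOT from the report).
example_scores = {"Ia": 0.9, "Ib": 0.8, "IIb": 1.0}
example_support = {"Ia": 10, "Ib": 10, "IIb": 80}

example_result = weighted_average(example_scores, example_support)
print(f"weighted average: {example_result:.3f}")
```

With class sizes this uneven, the result is pulled toward the score of the largest class, which is why a weighted average can look strong even when a small subclade is classified less reliably.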
The setting and safety metrics were not reported. The practice relevance is described as providing a fast and efficient framework for assigning complete and partial MPXV genomes to subclades Ia, Ib, and IIb. Clinicians should not interpret the accuracy metrics as clinical efficacy, and causality between tool use and public health outcomes should not be inferred. The evidence is limited to computational performance, with no clinical outcome data. Funding and conflicts of interest were not reported in the provided text.