Methodological review assesses phenotypes for classifying respiratory viruses in English health data

Key Takeaway
Phenotypes can help classify respiratory viruses in coded health data when testing information is unavailable, but their results should be interpreted with caution.

This methodological review examines specific and sensitive phenotypes for classifying three respiratory viruses (respiratory syncytial virus [RSV], influenza, and COVID-19) in English coded health data, used with the approval of NHS England and covering September 2016 to August 2024. The review compares the phenotypes with one another and with publicly available surveillance data, focusing on classification accuracy and seasonal patterns as primary outcomes and on the risk of misclassification for mild and severe cases as secondary outcomes.

The key findings indicate that cases identified by both phenotypes show seasonal patterns similar to those in surveillance data, though no effect sizes or statistical measures are reported. For mild cases, sensitive phenotypes carry a higher risk of misclassification than specific phenotypes. For severe cases, the risk of misclassification is higher in infants than in older adults, irrespective of the phenotype used. The review does not provide accuracy rates, sample sizes, or confidence intervals, which limits quantitative interpretation.
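The trade-off between the two phenotype types can be illustrated with a minimal sketch. The source does not describe how the phenotypes are defined, so everything here is an assumption made for illustration: the code lists are hypothetical placeholders (not real clinical codes), and the rule that a sensitive phenotype also accepts broader, non-specific codes is a common convention rather than the authors' confirmed method.

```python
# Illustrative only: placeholder labels, NOT real clinical codes and
# NOT the code lists used in the study.
SPECIFIC_CODES = {"RSV_CONFIRMED", "RSV_BRONCHIOLITIS"}        # hypothetical
NONSPECIFIC_CODES = {"BRONCHIOLITIS_UNSPECIFIED", "WHEEZE"}    # hypothetical


def is_case(record_codes, phenotype="specific"):
    """Return True if a patient record counts as a case under the phenotype.

    Assumed rules (not taken from the source):
    - "specific": requires at least one virus-specific code.
    - "sensitive": accepts a virus-specific code OR a non-specific code.
    """
    if phenotype not in ("specific", "sensitive"):
        raise ValueError(f"unknown phenotype: {phenotype!r}")
    codes = set(record_codes)
    if codes & SPECIFIC_CODES:
        return True
    if phenotype == "sensitive":
        return bool(codes & NONSPECIFIC_CODES)
    return False


# A mild presentation coded only with a non-specific symptom.
mild_record = ["WHEEZE"]
print(is_case(mild_record, "specific"))   # not counted as a case
print(is_case(mild_record, "sensitive"))  # counted, possibly wrongly
```

Under these assumed rules, the sensitive phenotype captures records that the specific phenotype misses, at the cost of counting some records whose symptoms have another cause. That is one plausible reading of why misclassification risk for mild cases would be higher with sensitive phenotypes.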

The authors present the phenotypes as a solution for use in the absence of testing information, which implies that classification relies on coded data alone and may have inherent gaps. Funding and conflicts of interest are not reported, and safety aspects such as adverse events are not addressed. For practice, the review suggests that these phenotypes could be used to classify respiratory viruses from coded health records when testing information is unavailable, but clinicians should interpret the results with caution given the methodological nature of the work and the lack of detailed validation metrics.

Study Details

Evidence Level: 5
Published: Apr 2026

Original Abstract
Electronic health records (EHRs) are a rich source of data which can be used to analyse health outcomes using computable phenotypes. With the approval of NHS England we used the OpenSAFELY secure analytics platform to design and assess phenotypes to classify three key respiratory viruses - respiratory syncytial virus (RSV), influenza, and COVID-19 - in English coded health data between September 2016 and August 2024. We compared specific and sensitive phenotypes to one another and to publicly available surveillance data. Cases from both phenotypes showed similar seasonal patterns to surveillance data. Sensitive phenotypes led to increased risk of misclassification than specific phenotypes for mild cases. For severe cases the risk of misclassification was higher in infants than for older adults, irrespective of the phenotype used. The phenotypes presented here offer a solution to classifying respiratory viruses from coded health records in the absence of testing information.