This cohort study analyzed whole-genome sequencing data from 504,000 participants in the Emirati Genome Program in the United Arab Emirates. Population-scale genomic evidence was compared against global reference data and against pathogenicity classifications derived from clinically ascertained cohorts. The study aimed to reassess the pathogenicity of known variants and to identify novel variants in inherited retinal disease (IRD). Secondary outcomes included variant penetrance, cohort enrichment, family-based concordance, and panel composition.
Analysis of individuals with IRD-compatible genotypes identified 24,100 variants: 259 pathogenic/likely pathogenic (P/LP) variants, 11,888 variants of uncertain significance (VUS), and 11,953 novel variants. Of these, 533 were prioritized. P/LP variants showed the highest proportional enrichment at 15.4%, compared with 3.4% for VUS and 3.2% for novel variants. The strongest population-level signals were P/LP variants in KCNV2 and ABCA4, both markedly enriched in the Emirati cohort relative to global reference data.
Restricting prioritization to P/LP variants alone would have captured only 37 variants; including VUS and novel variants expanded the panel more than 14-fold. Conversely, 9.3% of P/LP variants lacked population-level enrichment, including 3.1% with null penetrance despite adequate carrier representation. Family-based concordance supported 404 of the 533 prioritized variants, which define the population-calibrated panel; 91.8% of these supported variants were VUS or novel.
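As a quick arithmetic check on the reported figures, the short Python sketch below recomputes the variant total, the panel expansion, and the share of prioritized variants supported by family-based concordance. The variable names are illustrative choices, not terms from the study; the counts are those given in the abstract.

```python
# Counts as reported in the abstract; variable names are illustrative only.
p_lp, vus, novel = 259, 11_888, 11_953
total_variants = p_lp + vus + novel

prioritized_all = 533        # prioritized across P/LP, VUS, and novel variants
prioritized_p_lp_only = 37   # prioritized if restricted to P/LP variants
concordant = 404             # prioritized variants supported by family-based concordance

fold_expansion = prioritized_all / prioritized_p_lp_only
concordance_fraction = concordant / prioritized_all

print(total_variants)                  # 24100
print(round(fold_expansion, 1))        # 14.4
print(round(concordance_fraction, 3))  # 0.758
```

The ratio 533/37 is consistent with the "more than 14-fold" expansion, and roughly three quarters of prioritized variants carried family-based support.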
Safety data, including adverse events and tolerability, were not reported. A key limitation of existing pathogenicity classifications is their reliance on clinically ascertained cohorts, in which penetrance is often assumed to be near-complete; in this population, the penetrance of IRD-associated variants was heterogeneous and frequently lower than inferred from clinical cohorts. The study underscores that assertion-based pathogenicity classifications require recalibration against real-world disease expression before deployment in population-scale genomic screening, particularly in underrepresented populations.
Background: Pathogenicity classifications for rare disease variants are largely derived from clinically ascertained cohorts, in which penetrance is often assumed to be near-complete. As genomic sequencing expands into population-scale screening, particularly in underrepresented populations, both the penetrance of known variants and the relevance of novel or population-enriched variants require systematic evaluation. Inherited retinal diseases (IRDs), genetically heterogeneous and clinically variable, represent an ideal setting in which to assess how ascertainment-based pathogenicity assignments translate to real-world population risk.
Methods: We analyzed whole-genome sequencing data from 504,000 participants in the Emirati Genome Program, including 426,382 individuals with linked longitudinal electronic health records and 75,184 genetically reconstructed family units. IRD phenotypes were defined through a phenome-wide association approach to derive enriched retinal ICD-10 code sets. Variants in 339 IRD-associated genes were assessed using a population-based framework integrating enrichment testing, penetrance tiering, and family-based concordance across pathogenic or likely pathogenic variants (P/LP), variants of uncertain significance (VUS), and novel variants not previously classified in clinical variant databases, yielding a population-calibrated IRD panel.
Results: Among 24,100 variants observed in individuals with IRD-compatible genotypes (259 P/LP, 11,888 VUS, and 11,953 novel variants), 533 variants were prioritized. P/LP variants showed the highest proportional enrichment (15.4% vs. 3.4% of VUS and 3.2% of novel variants), with the strongest population-level signals being P/LP variants in KCNV2 and ABCA4, both markedly enriched in the Emirati cohort relative to global reference data and consistent with their predominance among affected individuals at regional ophthalmology clinics.
Restricting prioritization to P/LP variants alone would have captured only 37 variants; inclusion of VUS and novel variants expanded the panel more than 14-fold. Conversely, 9.3% of P/LP variants lacked population-level enrichment, including 3.1% with null penetrance despite adequate carrier representation. Family-based concordance supported 404 of 533 prioritized variants, defining the population-calibrated panel; 91.8% of these were VUS or novel.
Conclusions: Penetrance of IRD-associated variants is heterogeneous and frequently lower than inferred from clinical cohorts. Population-scale calibration identified clinically relevant VUS and novel variants, including population-specific risk alleles absent from global databases, while revealing limited real-world support for a subset of established P/LP classifications. These findings underscore that assertion-based pathogenicity classifications require recalibration against real-world disease expression before deployment in population-scale genomic screening, particularly in underrepresented populations.
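The abstract does not state how penetrance or enrichment were computed. Purely as an illustration, the sketch below uses two common definitions, which are assumptions here: penetrance as the fraction of carriers with the phenotype, and enrichment as an odds ratio from a 2x2 carrier-by-phenotype table. The function names and the example counts are hypothetical and are not taken from the study.

```python
def penetrance(affected_carriers: int, carriers: int) -> float:
    """Fraction of variant carriers who show the phenotype (assumed definition)."""
    if carriers <= 0:
        raise ValueError("carriers must be positive")
    return affected_carriers / carriers


def odds_ratio(carr_aff: int, carr_unaff: int,
               noncarr_aff: int, noncarr_unaff: int) -> float:
    """Odds ratio of the phenotype in carriers versus non-carriers (assumed enrichment measure)."""
    if carr_unaff == 0 or noncarr_aff == 0:
        raise ValueError("odds ratio undefined when a denominator cell is zero")
    return (carr_aff * noncarr_unaff) / (carr_unaff * noncarr_aff)


# Hypothetical counts for illustration only: 40 carriers, 10 of them affected;
# 10,000 non-carriers, 100 of them affected.
print(penetrance(10, 40))             # 0.25
print(odds_ratio(10, 30, 100, 9900))  # 33.0
```

Under these definitions, a variant with "null penetrance" would have no affected carriers, so `penetrance` returns 0.0 however many carriers are observed.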