Phenotype database PAVS shows utility for gene prioritization in rare disease cases

Key Takeaway
Consider PAVS as a potential population-specific resource for phenotype-driven gene prioritization in rare disease evaluation.

This observational database development and evaluation study assessed PAVS (Phenotype-Associated Variants in Saudi Arabia), a curated, publicly accessible database that links patient-level phenotypes to genomic variants. The database integrates 5,132 Saudi clinical cases from four Saudi cohorts, 522 cases from a mixed-population cohort, 1,856 cases from the Deciphering Developmental Disorders (DDD) study, and 9,588 literature phenopackets. The primary outcome was the utility of the phenotype annotations for gene prioritization using semantic similarity, with results compared against global literature-curated databases.
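To make "gene prioritization using semantic similarity" concrete, the following is a minimal sketch of the general idea, not the method used by the PAVS authors, whose similarity measure is not described in the abstract. It assumes a toy ontology with made-up term identifiers (not real HPO IDs), hypothetical gene names, and Jaccard overlap of ancestor-closed term sets as a stand-in similarity measure.

```python
# Toy ontology: child -> parents. Term IDs are invented for illustration
# and are NOT real Human Phenotype Ontology identifiers.
PARENTS = {
    "T:seizure": ["T:neuro"],
    "T:ataxia": ["T:neuro"],
    "T:neuro": ["T:root"],
    "T:cardiomyopathy": ["T:cardiac"],
    "T:cardiac": ["T:root"],
    "T:root": [],
}

# Hypothetical gene-to-phenotype annotations (assumption, not PAVS data).
GENE_PHENOTYPES = {
    "GENE_A": ["T:seizure", "T:ataxia"],
    "GENE_B": ["T:cardiomyopathy"],
    "GENE_C": ["T:ataxia"],
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, []))
    return seen


def closure(terms):
    """Union of the ancestor sets of every term in the list."""
    result = set()
    for term in terms:
        result |= ancestors(term)
    return result


def jaccard(terms_a, terms_b):
    """Jaccard overlap of two ancestor-closed term sets (stand-in measure)."""
    closed_a, closed_b = closure(terms_a), closure(terms_b)
    union = closed_a | closed_b
    return len(closed_a & closed_b) / len(union) if union else 0.0


def rank_genes(patient_terms):
    """Sort genes by similarity to the patient, best match first."""
    scores = {
        gene: jaccard(patient_terms, annotated)
        for gene, annotated in GENE_PHENOTYPES.items()
    }
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


ranking = rank_genes(["T:seizure"])
for gene, score in ranking:
    print(f"{gene}\t{score:.2f}")
```

Because the comparison uses ancestor-closed sets, a patient term can partially match a gene annotated with a related but different term through their shared parent. Measures based on information content are a common alternative to plain Jaccard overlap.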

The main result was that phenotype annotations in PAVS could place the correct gene near the top of the ranking, with a reported ROCAUC of 0.89 for gene prioritization. The abstract does not report absolute counts, p-values, or confidence intervals for this metric. Safety, tolerability, and adverse event data were not reported, because this was a database evaluation study rather than a clinical intervention trial.
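For readers less familiar with the metric, a ROC AUC can be read as the probability that a randomly chosen correct gene receives a higher similarity score than a randomly chosen incorrect one, with ties counted as one half. The sketch below computes it directly from that pairwise definition. The labels and scores are invented for illustration and are not PAVS results, and the way the study pooled cases before computing its ROCAUC is not stated in the abstract.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positives against negatives.

    labels: 1 for the correct (causal) gene, 0 for any other candidate.
    scores: similarity score for each candidate, higher meaning more similar.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")

    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0   # correct gene scored above an incorrect one
            elif pos == neg:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical candidate genes pooled across two cases (made-up values).
labels = [1, 0, 0, 1, 0, 0]
scores = [0.9, 0.4, 0.7, 0.6, 0.2, 0.6]

auc = roc_auc(labels, scores)
print(f"ROC AUC = {auc:.4f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to the correct gene always scoring highest, which gives some context for the reported 0.89.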

Key limitations of the study were not explicitly reported. In terms of practice relevance, this work addresses a gap in population-specific genotype-phenotype resources and provides a benchmark for phenotype-driven variant prioritization in under-represented populations. However, clinicians should interpret these findings cautiously: the study evaluates a database, reports clear differences from global literature-curated databases, and does not directly assess clinical outcomes or generalizability beyond the evaluation context.

Study Details

Study type: Cohort
Evidence: Level 3
Published: Apr 2026
Original Abstract
Genotype-phenotype databases are essential for variant interpretation and disease gene discovery. Genetic variation differs among human populations, mainly in allele frequencies and haplotype patterns shaped by ancestry and demographic history. Population-specific genotypes can influence traits and disease risk; this makes population specific characterization important. Most existing resources focus on the characterization of a population's genetic background, but do not represent the resulting phenotypes. We have developed PAVS (Phenotype-Associated Variants in Saudi Arabia), a curated, publicly accessible database that integrates 5,132 Saudi clinical cases from four Saudi cohorts and 522 cases from analysis of a mixed-population cohort, together with 1,856 cases from the Deciphering Developmental Disorders study (DDD) and 9,588 literature phenopackets. Each case record describes patient-level phenotypes, encoded with the Human Phenotype Ontology (HPO), and links them to genomic variants, gene identifiers, zygosity, pathogenicity classifications, and disease diagnoses mapped to standardized disease terminologies. The data is represented in Phenopackets format and as a knowledge graph in RDF. Additionally, a web interface provides phenotype-based similarity search, gene and variant browsers, and an HPO hierarchy explorer. We evaluate the utility of the phenotype annotations for gene prioritization using semantic similarity. While there are clear differences to global literature-curated databases, phenotypes in PAVS can successfully rank the correct gene at high rank (ROCAUC: 0.89). PAVS addresses a gap in population-specific genotype-phenotype resources and provides a benchmark for phenotype-driven variant prioritization in under-represented populations.