TY - JOUR
T1 - Unbiased phenotype and genotype matching maximizes gene discovery and diagnostic yield
AU - Rips, Jonathan
AU - Halstuk, Orli
AU - Fuchs, Adina
AU - Lang, Ziv
AU - Sido, Tal
AU - Gershon-Naamat, Shiri
AU - Abu-Libdeh, Bassam
AU - Edvardson, Simon
AU - Salah, Somaya
AU - Breuer, Oded
AU - Hadhud, Mohamad
AU - Eden, Sharon
AU - Simon, Itamar
AU - Slae, Mordechai
AU - Damseh, Nadirah S.
AU - Abu-Libdeh, Abdulsalam
AU - Eskin-Schwartz, Marina
AU - Birk, Ohad S.
AU - Varga, Julia
AU - Schueler-Furman, Ora
AU - Rosenbluh, Chaggai
AU - Elpeleg, Orly
AU - Yanovsky-Dagan, Shira
AU - Mor-Shaked, Hagar
AU - Harel, Tamar
N1 - Publisher Copyright: © 2024 American College of Medical Genetics and Genomics
PY - 2024/4
Y1 - 2024/4
AB - Purpose: Widespread application of next-generation sequencing, combined with data exchange platforms, has provided molecular diagnoses for countless families. To maximize diagnostic yield, we implemented an unbiased semi-automated genematching algorithm based on genotype and phenotype matching. Methods: Rare homozygous variants identified in 2 or more affected individuals, but not in healthy individuals, were extracted from our local database of ∼12,000 exomes. Phenotype similarity scores (PSS), based on human phenotype ontology terms, were assigned to each pair of individuals matched at the genotype level using HPOsim. Results: 33,792 genotype-matched pairs were discovered, representing variants in 7567 unique genes. There was an enrichment of PSS ≥0.1 among pathogenic/likely pathogenic variant-level pairs (94.3% in pathogenic/likely pathogenic variant-level matches vs 34.75% in all matches). We highlighted founder or region-specific variants as an internal positive control and proceeded to identify candidate disease genes. Variant-level matches were particularly helpful in cases involving inframe indels and splice region variants beyond the canonical splice sites, which may otherwise have been disregarded, allowing for detection of candidate disease genes, such as KAT2A, RPAIN, and LAMP3. Conclusion: Semi-automated genotype matching combined with PSS is a powerful tool to resolve variants of uncertain significance and to identify candidate disease genes.
KW - Exome sequencing
KW - Genotype matching
KW - HPO terms
KW - KAT2A
KW - Phenotype similarity scores
KW - RPAIN
KW - Variants of uncertain significance
UR - http://www.scopus.com/inward/record.url?scp=85183950416&partnerID=8YFLogxK
U2 - 10.1016/j.gim.2024.101068
DO - 10.1016/j.gim.2024.101068
M3 - Article
C2 - 38193396
AN - SCOPUS:85183950416
SN - 1098-3600
VL - 26
JO - Genetics in Medicine
JF - Genetics in Medicine
IS - 4
M1 - 101068
ER -