Abstract
Over the last two decades, the Genome-Wide Association Study (GWAS) has become a canonical tool for exploratory genetic research, generating countless gene-phenotype associations. Despite its accomplishments, several limitations still hinder its success, including low statistical power and uncertainty about the causality of implicated variants. We introduce PWAS (Proteome-Wide Association Study), a new method for detecting protein-coding genes associated with phenotypes through protein function alterations. PWAS aggregates the signal of all variants jointly affecting a protein-coding gene and assesses their overall impact on the protein’s function using machine-learning and probabilistic models. Subsequently, it tests whether the gene exhibits functional variability between individuals that correlates with the phenotype of interest. By collecting the genetic signal across many variants in light of their rich proteomic context, PWAS can detect subtle patterns that standard GWAS and other methods overlook. It can also capture more complex modes of heritability, including recessive inheritance. Furthermore, the discovered associations are supported by a concrete molecular model, thus reducing the gap to inferring causality. To demonstrate its applicability to a wide range of human traits, we applied PWAS to a cohort derived from the UK Biobank (~330K individuals) and evaluated it on 49 prominent phenotypes. 23% of the significant PWAS associations in that cohort (2,998 of 12,896) were missed by standard GWAS. A comparison of PWAS with existing methods demonstrates its capacity to recover causal protein-coding genes and to highlight new associations with plausible biological mechanisms.
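The two-step idea described in the abstract (aggregate per-variant effects into a gene-level functional score for each individual, then test that score against the phenotype) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy data, the per-variant "intact" probabilities, the independence assumption between allele copies, and the use of a plain Pearson correlation in place of a proper statistical test are all assumptions made for illustration.

```python
import math

def damage_scores(genotype, intact_prob):
    """Return (dominant_like, recessive_like) damage scores for one person.

    genotype[i]    : allele count (0, 1, or 2) of variant i in this person
    intact_prob[i] : hypothetical predicted probability that one copy of
                     variant i leaves the protein's function intact
    Assumption: allele copies damage the protein independently.
    """
    # Expand to one damage probability per carried allele copy.
    damage = [1.0 - p for g, p in zip(genotype, intact_prob) for _ in range(g)]
    p_zero = 1.0  # probability that no carried allele is damaging
    p_one = 0.0   # probability that exactly one carried allele is damaging
    for d in damage:
        p_one = p_one * (1.0 - d) + p_zero * d
        p_zero = p_zero * (1.0 - d)
    dominant_like = 1.0 - p_zero           # at least one damaging hit
    recessive_like = 1.0 - p_zero - p_one  # at least two damaging hits
    return dominant_like, recessive_like

def pearson(x, y):
    """Pearson correlation, used here as a stand-in for an association test."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Toy data (entirely made up): three variants in one gene, six individuals.
intact_prob = [0.9, 0.2, 0.5]
genotypes = [
    [0, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 2, 0],
    [0, 1, 1],
    [0, 0, 0],
]
phenotype = [0, 0, 0, 1, 1, 0]

scores = [damage_scores(g, intact_prob) for g in genotypes]
dominant = [s[0] for s in scores]
recessive = [s[1] for s in scores]

r_dominant = pearson(dominant, phenotype)
r_recessive = pearson(recessive, phenotype)
print(f"dominant-like r = {r_dominant:.3f}, recessive-like r = {r_recessive:.3f}")
```

In this toy cohort only individuals carrying two likely-damaging alleles show the phenotype, so the recessive-like score tracks the phenotype more closely than the dominant-like score. A real analysis would replace the correlation with a regression that includes covariates and would correct for multiple testing across genes.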
Original language | English |
---|---|
Title of host publication | Research in Computational Molecular Biology - 24th Annual International Conference, RECOMB 2020, Proceedings |
Editors | Russell Schwartz |
Publisher | Springer |
Pages | 237-239 |
Number of pages | 3 |
ISBN (Print) | 9783030452568 |
State | Published - 2020 |
Event | 24th Annual Conference on Research in Computational Molecular Biology, RECOMB 2020 - Padua, Italy. Duration: 10 May 2020 → 13 May 2020 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 12074 LNBI |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 24th Annual Conference on Research in Computational Molecular Biology, RECOMB 2020 |
---|---|
Country/Territory | Italy |
City | Padua |
Period | 10/05/20 → 13/05/20 |
Bibliographical note
Publisher Copyright: © Springer Nature Switzerland AG 2020.
Keywords
- GWAS
- Machine learning
- Protein function
- Recessive heritability
- UK Biobank