TY - JOUR
T1 - Quantifying gene selection in cancer through protein functional alteration bias
AU - Brandes, Nadav
AU - Linial, Nathan
AU - Linial, Michal
N1 - Publisher Copyright: © The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research.
PY - 2019/7/26
Y1 - 2019/7/26
N2 - Compiling the catalogue of genes actively involved in cancer is an ongoing endeavor, with profound implications for the understanding and treatment of the disease. An abundance of computational methods have been developed to screen the genome for candidate driver genes based on genomic data of somatic mutations in tumors. Existing methods make many implicit and explicit assumptions about the distribution of random mutations. We present FABRIC, a new framework for quantifying the selection of genes in cancer by assessing the effects of de-novo somatic mutations on protein-coding genes. Using a machine-learning model, we quantified the functional effects of ∼3M somatic mutations extracted from over 10 000 human cancerous samples, and compared them against the effects of all possible single-nucleotide mutations in the coding human genome. We detected 593 protein-coding genes showing statistically significant bias towards harmful mutations. These genes, discovered without any prior knowledge, show an overwhelming overlap with known cancer genes, but also include many overlooked genes. FABRIC is designed to avoid false discoveries by comparing each gene to its own background model using rigorous statistics, making minimal assumptions about the distribution of random somatic mutations. The framework is an open-source project with a simple command-line interface.
AB - Compiling the catalogue of genes actively involved in cancer is an ongoing endeavor, with profound implications for the understanding and treatment of the disease. An abundance of computational methods have been developed to screen the genome for candidate driver genes based on genomic data of somatic mutations in tumors. Existing methods make many implicit and explicit assumptions about the distribution of random mutations. We present FABRIC, a new framework for quantifying the selection of genes in cancer by assessing the effects of de-novo somatic mutations on protein-coding genes. Using a machine-learning model, we quantified the functional effects of ∼3M somatic mutations extracted from over 10 000 human cancerous samples, and compared them against the effects of all possible single-nucleotide mutations in the coding human genome. We detected 593 protein-coding genes showing statistically significant bias towards harmful mutations. These genes, discovered without any prior knowledge, show an overwhelming overlap with known cancer genes, but also include many overlooked genes. FABRIC is designed to avoid false discoveries by comparing each gene to its own background model using rigorous statistics, making minimal assumptions about the distribution of random somatic mutations. The framework is an open-source project with a simple command-line interface.
UR - http://www.scopus.com/inward/record.url?scp=85077665031&partnerID=8YFLogxK
U2 - 10.1093/nar/gkz546
DO - 10.1093/nar/gkz546
M3 - Article
C2 - 31334812
AN - SCOPUS:85077665031
SN - 0305-1048
VL - 47
SP - 6642
EP - 6655
JO - Nucleic Acids Research
JF - Nucleic Acids Research
IS - 13
ER -