TY - JOUR
T1 - Identification of introns harboring functional sequence elements through positional conservation
AU - Chorev, Michal
AU - Joseph Bekker, Alan
AU - Goldberger, Jacob
AU - Carmel, Liran
N1 - Publisher Copyright: © 2017 The Author(s).
PY - 2017/12/1
Y1 - 2017/12/1
N2 - Many human introns carry out a function, in the sense that they are critical to maintain normal cellular activity. Their identification is fundamental to understanding cellular processes and disease. However, being noncoding elements, such functional introns are poorly predicted based on traditional approaches of sequence and structure conservation. Here, we generated a dataset of human functional introns that carry out different types of functions. We showed that functional introns share common characteristics, such as higher positional conservation along the coding sequence and reduced loss rates, regardless of their specific function. A unique property of the data is that if an intron is unknown to be functional, it still does not mean that it is indeed non-functional. We developed a probabilistic framework that explicitly accounts for this unique property, and predicts which specific human introns are functional. We show that we successfully predict function even when the algorithm is trained on introns with a different type of function. This ability has many implications in studying regulatory networks, gene regulation, the effect of mutations outside exons on human disease, and on our general understanding of intron evolution and their functional exaptation in mammals.
UR - http://www.scopus.com/inward/record.url?scp=85021310526&partnerID=8YFLogxK
U2 - 10.1038/s41598-017-04476-0
DO - 10.1038/s41598-017-04476-0
M3 - Article
C2 - 28646210
AN - SCOPUS:85021310526
SN - 2045-2322
VL - 7
JO - Scientific Reports
JF - Scientific Reports
IS - 1
M1 - 4201
ER -