TY - JOUR
T1 - Optimization of co-evolution analysis through phylogenetic profiling reveals pathway-specific signals
AU - Bloch, Idit
AU - Sherill-Rofe, Dana
AU - Stupp, Doron
AU - Unterman, Irene
AU - Beer, Hodaya
AU - Sharon, Elad
AU - Tabach, Yuval
N1 - Publisher Copyright: © The Author(s) 2020.
PY - 2020/7/15
Y1 - 2020/7/15
AB - Summary: The exponential growth in available genomic data is expected to reach full sequencing of a million genomes in the coming decade. Improving and developing methods to analyze these genomes and to reveal their utility is of major interest in a wide variety of fields, such as comparative and functional genomics, evolution and bioinformatics. Phylogenetic profiling is an established method for predicting functional interactions between proteins based on similarities in their evolutionary patterns across species. Proteins that function together (i.e. generate complexes, interact in the same pathways or improve adaptation to environmental niches) tend to show coordinated evolution across the tree of life. The normalized phylogenetic profiling (NPP) method takes into account minute changes in proteins across species to identify protein co-evolution. Despite the success of this method, it is still not clear what set of parameters is required for optimal use of co-evolution in predicting functional interactions. Moreover, it is not clear if pathway evolution or function should direct parameter choice. Here, we create a reliable and usable NPP construction pipeline. We explore the effect of parameter selection on functional interaction prediction using NPP from 1028 genomes, both separately and in various value combinations. We identify several parameter sets that optimize performance for pathways with certain biological annotation. This work reveals the importance of choosing the right parameters for optimized function prediction based on a biological context.
UR - http://www.scopus.com/inward/record.url?scp=85088651912&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btaa281
DO - 10.1093/bioinformatics/btaa281
M3 - Article
C2 - 32353123
AN - SCOPUS:85088651912
SN - 1367-4803
VL - 36
SP - 4116
EP - 4125
JO - Bioinformatics
JF - Bioinformatics
IS - 14
ER -