TY - CHAP
T1 - The little known universe of short proteins in insects:
T2 - A machine learning approach
AU - Ofer, Dan
AU - Rappoport, Nadav
AU - Linial, Michal
PY - 2015
Y1 - 2015
N2 - Modern genomics and proteomics technologies are turning out immense quantities of sequenced proteins. The only feasible way to assign functions to this flood of sequences is by applying state-of-the-art computational methods for automated functional annotation. We illustrate the significance of machine learning tools in identifying and annotating short bioactive proteins and peptides from insect genomes. Over 500,000 full-length proteins from insects are currently archived in databases, of which ~15% are short proteins. Currently, most short sequences remain uncharacterized. We developed a platform to systematically identify the functional class of short toxin-like peptides in metazoa. We present data from eight representative genomes (140,000 proteins) that cover the main phylogenetic branches of Hexapoda. The platform is a trained machine-predictor that successfully identified ~800 toxin-like candidates, 250 of them predicted with high confidence. These proteins' functions include ion channel inhibition, protease inhibitors, antimicrobial peptides, and components of the innate immune system. Our systematic approach can be expanded to new genomes and other biological classes of proteins. Using similar methodologies, we illustrate the success of identifying overlooked neuropeptide precursors. The systematic discovery of insect neuropeptides and short toxin-like proteins allows developing new strategies for pest control and manipulating insects' behavior. The overlooked secreted short peptides are discussed with respect to their evolution and potential applications in biotechnology.
AB - Modern genomics and proteomics technologies are turning out immense quantities of sequenced proteins. The only feasible way to assign functions to this flood of sequences is by applying state-of-the-art computational methods for automated functional annotation. We illustrate the significance of machine learning tools in identifying and annotating short bioactive proteins and peptides from insect genomes. Over 500,000 full-length proteins from insects are currently archived in databases, of which ~15% are short proteins. Currently, most short sequences remain uncharacterized. We developed a platform to systematically identify the functional class of short toxin-like peptides in metazoa. We present data from eight representative genomes (140,000 proteins) that cover the main phylogenetic branches of Hexapoda. The platform is a trained machine-predictor that successfully identified ~800 toxin-like candidates, 250 of them predicted with high confidence. These proteins' functions include ion channel inhibition, protease inhibitors, antimicrobial peptides, and components of the innate immune system. Our systematic approach can be expanded to new genomes and other biological classes of proteins. Using similar methodologies, we illustrate the success of identifying overlooked neuropeptide precursors. The systematic discovery of insect neuropeptides and short toxin-like proteins allows developing new strategies for pest control and manipulating insects' behavior. The overlooked secreted short peptides are discussed with respect to their evolution and potential applications in biotechnology.
U2 - 10.1007/978-3-319-24235-4_8
DO - 10.1007/978-3-319-24235-4_8
M3 - Chapter
SN - 978-3-319-24233-0
SN - 978-3-319-37165-8
VL - 1
T3 - Entomology in Focus (ENFO)
SP - 177
EP - 202
BT - Short Views on Insect Genomics and Proteomics
A2 - Raman, Chandrasekar
A2 - Goldsmith, Marian R.
A2 - Agunbiade, Tolulope A.
PB - Springer International Publishing AG
CY - Cham
ER -