Abstract
ProTarget is a Web-based tool for the automatic prediction of fold novelty. It offers the structural genomics community a method for target selection by providing an online analysis of any new or pre-existing sequence for its relationship to any previously solved three-dimensional structure. ProTarget takes as input an amino acid sequence. Regions of this sequence that exhibit high similarity to an existing PDB (Protein Data Bank) sequence are removed, leaving one or more subsequences. Each of these subsequences is then analyzed against a clustering of the protein space to determine the likelihood of its representing a new structural superfamily. This likelihood is derived from the distance in the clustering between the (sub)sequence and sequences that have known structures. The output of ProTarget is a graphical visualization of the protein of interest together with the likelihood that a protein sequence represents a novel structural superfamily. ProTarget is updated regularly and currently covers over 160 000 protein sequences from the SwissProt and PDB databases. ProTarget is available at http://www.protarget.cs.huji.ac.il.
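The masking step described above (removing regions that closely match a PDB sequence and keeping what is left) can be illustrated with a minimal sketch. This is not ProTarget's implementation: the function name `remaining_subsequences`, the half-open 0-based interval convention, and the `min_length` parameter are assumptions made for illustration, and the similarity search that would produce the matched regions is not shown.

```python
from typing import List, Tuple


def remaining_subsequences(
    sequence: str,
    matched_regions: List[Tuple[int, int]],
    min_length: int = 1,
) -> List[Tuple[int, int, str]]:
    """Return the stretches of `sequence` not covered by any matched region.

    `matched_regions` holds hypothetical (start, end) pairs, 0-based and
    half-open, marking regions assumed to be highly similar to a known
    structure. Each result is (start, end, subsequence).
    """
    # Clip regions to the sequence and merge overlapping or adjacent ones.
    merged: List[Tuple[int, int]] = []
    for start, end in sorted(matched_regions):
        start, end = max(0, start), min(len(sequence), end)
        if start >= end:
            continue  # empty or out-of-range region
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))

    # Walk the gaps between merged regions and keep those long enough.
    leftovers: List[Tuple[int, int, str]] = []
    cursor = 0
    for start, end in merged:
        if start - cursor >= min_length:
            leftovers.append((cursor, start, sequence[cursor:start]))
        cursor = end
    if len(sequence) - cursor >= min_length:
        leftovers.append((cursor, len(sequence), sequence[cursor:]))
    return leftovers


if __name__ == "__main__":
    # Toy 10-residue sequence with two overlapping matched regions.
    for start, end, sub in remaining_subsequences("MKTAYIAKQR", [(2, 5), (4, 7)]):
        print(start, end, sub)
```

In this sketch each leftover subsequence would then be passed on to the clustering-based analysis; the `min_length` cutoff is only a placeholder for whatever criterion the real tool applies to short fragments.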
Original language | English |
---|---|
Pages (from-to) | W81-W84 |
Journal | Nucleic Acids Research |
Volume | 33 |
Issue number | SUPPL. 2 |
State | Published - Jul 2005 |
Bibliographical note
Funding Information: Ilona Kifer and O.S. jointly developed the algorithm and the validation tests underlying ProTarget prediction. We thank Ilona for setting up the ProtoMap-based version of ProTarget. We thank Nati Linial and Elon Portugaly for fruitful discussions throughout this study. The authors wish to thank the outstanding ProtoNet team and especially Alex Savenok for the ProTarget web design. M.L. is a member of the Sudarsky Center for Computational Biology (SCCB) at the Hebrew University of Jerusalem. This study was partially supported by the CESG consortium (NIMSG, NIH) and the European SPINE consortium. Funding to pay the Open Access publication charges for this article was provided by the National Science Foundation under grant DBI-0218798 and the National Institutes of Health under grant HG 02602-01.