ProtoMap: Automatic classification of protein sequences and hierarchy of protein families

Golan Yona, Nathan Linial*, Michal Linial

*Corresponding author for this work

Research output: Contribution to journal › Review article › peer-review

128 Scopus citations

Abstract

The ProtoMap site offers an exhaustive classification of all proteins in the SWISS-PROT database into groups of related proteins. The classification is based on analysis of all pairwise similarities among protein sequences. The analysis makes essential use of transitivity to identify homologies among proteins. Within each group of the classification, every two members are either directly or transitively related. However, transitivity is applied restrictively in order to prevent unrelated proteins from clustering together. The classification is done at different levels of confidence, and yields a hierarchical organization of all proteins. The resulting classification splits the protein space into well-defined groups of proteins, which are closely correlated with natural biological families and superfamilies. Many clusters contain protein sequences that are not classified by other databases. The hierarchical organization suggested by our analysis may help in detecting finer subfamilies in families of known proteins. In addition, it brings forth interesting relationships between protein families, upon which local maps for the neighborhood of protein families can be sketched. The ProtoMap web server can be accessed at http://www.protomap.cs.huji.ac.il.

Original language: English
Pages (from-to): 49-55
Number of pages: 7
Journal: Nucleic Acids Research
Volume: 28
Issue number: 1
State: Published - 1 Jan 2000
