Abstract
Motivation: A major goal in structural genomics is to enrich the catalogue of proteins whose 3D structures are known. To address this goal, we mapped over 10 000 proteins with solved structures onto a graph of all Swissprot protein sequences (release 36, ∼73 000 proteins) provided by ProtoMap, with the aim of sorting proteins according to their likelihood of belonging to new superfamilies. We hypothesized that proteins within neighbouring clusters tend to share common structural superfamilies or folds. If true, the likelihood of finding new superfamilies increases in clusters that are distal from other solved structures within the graph.

Results: We defined an order relation between unsolved proteins according to their 'distance' from solved structures in the graph, and sorted ∼48 000 proteins. Our list can be partitioned into three groups: ∼35 000 proteins sharing a cluster with at least one known structure; ∼6500 proteins in clusters with no solved structure but with neighbouring clusters containing known structures; and the remaining ∼6100 proteins (in 1274 clusters). We tested the quality of the order relation using thousands of recently solved structures that were not included when the order was defined. The tests show that our order is significantly better (P-value ∼10⁻⁵) than a random order. More interestingly, the order within the union of the second and third groups, and the order within the third group alone, perform better than random (P-values: 0.0008 and 0.15, respectively) and are better than alternative orders created using PSI-BLAST. Herein, we present a method for selecting targets to be used in structural genomics projects.
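The idea of ordering clusters by their 'distance' from solved structures can be illustrated with a small sketch. This is not the authors' order relation, which the abstract does not specify; it assumes, purely for illustration, that distance is the number of graph edges to the nearest cluster containing a solved structure, and it ignores any edge weights. The function name `rank_clusters` and the toy cluster graph are hypothetical.

```python
from collections import deque


def rank_clusters(neighbours, solved):
    """Rank clusters from farthest to nearest solved structure.

    neighbours: dict mapping a cluster id to an iterable of adjacent cluster ids
    solved: set of cluster ids containing at least one solved structure
    Returns (ranking, dist), where dist holds hop counts for reachable clusters.
    Assumption: distance = unweighted hop count (not the paper's definition).
    """
    # Multi-source breadth-first search: every solved cluster starts at 0.
    dist = {c: 0 for c in solved}
    queue = deque(solved)
    while queue:
        current = queue.popleft()
        for nxt in neighbours.get(current, ()):
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)

    # Clusters never reached from a solved structure count as infinitely far.
    inf = float("inf")
    all_clusters = set(neighbours) | {n for ns in neighbours.values() for n in ns}
    ranking = sorted(all_clusters, key=lambda c: (-dist.get(c, inf), c))
    return ranking, dist


# Toy graph (hypothetical): A holds a solved structure, D is isolated.
toy_neighbours = {"A": ["B"], "B": ["A", "C"], "C": ["B"], "D": []}
ranking, dist = rank_clusters(toy_neighbours, solved={"A"})
print(ranking)
```

Under this simplification, distance 0 corresponds to the first group described above, distance 1 to the second, and everything farther (or unreachable) to the third, so the clusters at the top of the ranking would be the ones proposed first as targets.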
Original language | English |
---|---|
Pages (from-to) | 899-907 |
Number of pages | 9 |
Journal | Bioinformatics |
Volume | 18 |
Issue number | 7 |
State | Published - 2002 |
Bibliographical note
Funding Information: We thank Nati Linial for his mathematical advice and suggestions. We thank Adi Shraibman and Ori Mosenzon for their ideas and Yonatan Bilu for critically reading the manuscript and for his valuable suggestions. This study was partially supported by the Israeli Ministry of Science, the Israeli Ministry of Defence and the Horowitz Foundation.