Selecting targets for structural determination by navigating in a graph of protein families

Elon Portugaly, Ilona Kifer, Michal Linial*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

13 Scopus citations


Motivation: A major goal in structural genomics is to enrich the catalogue of proteins whose 3D structures are known. To address this problem, we mapped over 10 000 proteins with solved structures onto a graph of all Swissprot protein sequences (release 36, ∼73 000 proteins) provided by ProtoMap, with the goal of sorting proteins according to their likelihood of belonging to new superfamilies. We hypothesized that proteins within neighbouring clusters tend to share common structural superfamilies or folds. If true, the likelihood of finding new superfamilies increases in clusters that are distal from solved structures within the graph.

Results: We defined an order relation between unsolved proteins according to their 'distance' from solved structures in the graph, and sorted ∼48 000 proteins. Our list can be partitioned into three groups: ∼35 000 proteins sharing a cluster with at least one known structure; ∼6500 proteins in clusters with no solved structure but with neighbouring clusters containing known structures; and the remaining ∼6100 proteins (in 1274 clusters). We tested the quality of the order relation using thousands of recently solved structures that were not included when the order was defined. The tests show that our order is significantly better (P-value ∼10⁻⁵) than a random order. More interestingly, the order within the union of the second and third groups, and the order within the third group alone, perform better than random (P-values: 0.0008 and 0.15, respectively) and are better than alternative orders created using PSI-BLAST. Herein, we present a method for selecting targets to be used in structural genomics projects.
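
The idea of ordering clusters by their 'distance' from solved structures can be illustrated with a small sketch. The abstract does not specify how the distance is computed, so the code below assumes the simplest reading: an unweighted hop count in the cluster graph, found by breadth-first search started from every cluster that contains a solved structure. The function name, the toy graph and the cluster labels are all hypothetical and are not taken from ProtoMap or from the paper.

```python
from collections import deque


def rank_clusters_by_distance(neighbours, solved):
    """Return (order, dist) for a cluster graph.

    neighbours: dict mapping a cluster to its adjacent clusters (hypothetical input).
    solved: clusters that contain at least one solved structure.
    Distance is an ASSUMED unweighted hop count, not the paper's actual measure.
    """
    # Multi-source BFS: every solved cluster starts at distance 0.
    dist = {c: 0 for c in solved}
    queue = deque(solved)
    while queue:
        current = queue.popleft()
        for nxt in neighbours.get(current, ()):
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)

    # Clusters with no path to any solved cluster are treated as infinitely far.
    all_clusters = set(neighbours) | set(solved)
    for adjacent in neighbours.values():
        all_clusters.update(adjacent)
    for c in all_clusters:
        dist.setdefault(c, float("inf"))

    # Most distant clusters first; ties broken by name for a stable order.
    order = sorted(all_clusters, key=lambda c: (-dist[c], c))
    return order, dist


# Toy chain A - B - C - D plus an isolated cluster E; only A is solved.
neighbours = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"], "E": []}
order, dist = rank_clusters_by_distance(neighbours, {"A"})
print(order)
```

Under this assumption, distance 0 corresponds to the first group described above (clusters containing a known structure), distance 1 to the second group (clusters neighbouring a solved cluster), and distance 2 or more to the third group.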

Original language: American English
Pages (from-to): 899-907
Number of pages: 9
Issue number: 7
State: Published - 2002

Bibliographical note

Funding Information:
We thank Nati Linial for his mathematical advice and suggestions. We thank Adi Shraibman and Ori Mosenzon for their ideas and Yonatan Bilu for critically reading the manuscript and for his valuable suggestions. This study was partially supported by the Israeli Ministry of Science, the Israeli Ministry of Defence and the Horowitz Foundation.

