Probabilities for having a new fold on the basis of a map of all protein sequences

Elon Portugaly*, Michal Linial

*Corresponding author for this work

Research output: Contribution to conference › Paper › peer-review

Abstract

Predicting which proteins have new, currently unknown structural folds is a major problem in the study of protein structure. To address this problem, we studied the location of all proteins with solved structures within the map of all known protein sequences provided by ProtoMap. The mutual distances among solved structures in this map are used to derive a probabilistic model, from which we estimate the probability that an unsolved protein has a new fold. The probabilities were based on data from SCOP release 1.37, and the results were evaluated against the more recent SCOP pre-release 1.41. Our predicted probabilities for unsolved proteins to have a new fold correlate very well with the proportion of new folds among recently released structures. Thus, information about the structure of proteins can be inferred from a global relational view of protein sequences. Finally, the same procedure was applied to estimate probabilities on the basis of SCOP 1.41. A list of the highest-scoring proteins is provided: about 80 non-membranous proteins that belong to clusters with more than 5 proteins and have the highest probability of having a new fold. A rational selection of those targets for 3D structure determination is expected to accelerate the pace of new fold discovery.

Original language: English
Pages: 237-244
Number of pages: 8
State: Published - 2000
Event: RECOMB 2000: 4th Annual International Conference on Computational Molecular Biology - Tokyo, Japan
Duration: 8 Apr 2000 to 11 Apr 2000
