Minimizing and learning energy functions for side-chain prediction

Chen Yanover*, Ora Schueler-Furman, Yair Weiss

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

16 Scopus citations

Abstract

Side-chain prediction is an important subproblem of the general protein folding problem. Despite much progress in side-chain prediction, performance is far from satisfactory. For example, the ROSETTA program, which uses simulated annealing to select the minimum-energy conformations, correctly predicts the first two side-chain angles for approximately 72% of the buried residues in a standard data set. Is further improvement more likely to come from better search methods or from better energy functions? Because exact minimization of the energy is NP-hard, it is difficult to answer this question systematically. In this paper, we present a novel search method and a novel method for learning energy functions from training data, both based on Tree-Reweighted Belief Propagation (TRBP). We find that TRBP can find the global optimum of the ROSETTA energy function in a few minutes of computation for approximately 85% of the proteins in a standard benchmark set. TRBP can also effectively bound the partition function, which enables using the Conditional Random Fields (CRF) framework for learning. Interestingly, finding the global minimum does not significantly improve side-chain prediction for an energy function based on ROSETTA's default energy terms (an improvement of less than 0.1%), while learning new weights gives a significant boost from 72% to 78%. Using a recently modified ROSETTA energy function with a softer Lennard-Jones repulsive term, finding the global optimum does improve prediction accuracy, from 77% to 78%. Here again, learning new weights improves side-chain modeling even further, to 80%. Finally, the highest accuracy (82.6%) is obtained using an extended rotamer library and CRF-learned weights. Our results suggest that combining machine learning with approximate inference can improve the state of the art in side-chain prediction.
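
The search question raised in the abstract can be illustrated with a minimal sketch, assuming a toy pairwise energy over discrete rotamer choices. The energy values, the three-residue chain, and the function names below are hypothetical and are not ROSETTA's energy terms. Exhaustive enumeration stands in for an exact minimizer, and a greedy one-residue-at-a-time descent stands in for a local search; neither is TRBP or simulated annealing.

```python
import itertools

def total_energy(assignment, unary, pairwise):
    """Sum of single-residue terms plus pairwise interaction terms."""
    energy = sum(unary[i][r] for i, r in enumerate(assignment))
    for (i, j), table in pairwise.items():
        energy += table[assignment[i]][assignment[j]]
    return energy

def exhaustive_minimum(unary, pairwise):
    """Global minimum by enumeration; feasible only for toy instances."""
    choices = [range(len(u)) for u in unary]
    return min(itertools.product(*choices),
               key=lambda a: total_energy(a, unary, pairwise))

def greedy_descent(unary, pairwise, start):
    """Change one residue's rotamer at a time while the energy strictly drops."""
    current = list(start)
    improved = True
    while improved:
        improved = False
        for i in range(len(unary)):
            def energy_with(r):
                return total_energy(current[:i] + [r] + current[i + 1:],
                                    unary, pairwise)
            best = min(range(len(unary[i])), key=energy_with)
            if energy_with(best) < total_energy(current, unary, pairwise):
                current[i] = best
                improved = True
    return tuple(current)

# Hypothetical toy instance: 3 residues, 2 rotamers each, chain interactions.
unary = [[0, 1], [0, 1], [0, 1]]
coupling = [[0, 5], [5, -4]]  # favourable only when both neighbours pick rotamer 1
pairwise = {(0, 1): coupling, (1, 2): coupling}

global_best = exhaustive_minimum(unary, pairwise)         # (1, 1, 1), energy -5
local_best = greedy_descent(unary, pairwise, (0, 0, 0))   # stays at (0, 0, 0), energy 0
```

In this toy instance the greedy search stays at its starting assignment because every single-rotamer change raises the energy, while enumeration reaches the lower-energy assignment. Whether closing such a gap also improves prediction accuracy depends on the energy function, which is the question the abstract addresses.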

Original language: American English
Title of host publication: Research in Computational Molecular Biology - 11th Annual International Conference, RECOMB 2007, Proceedings
Publisher: Springer Verlag
Pages: 381-395
Number of pages: 15
ISBN (Print): 3540716807, 9783540716808
State: Published - 2007
Event: 11th Annual International Conference on Research in Computational Molecular Biology, RECOMB 2007 - Oakland, CA, United States
Duration: 21 Apr 2007 to 25 Apr 2007

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4453 LNBI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 11th Annual International Conference on Research in Computational Molecular Biology, RECOMB 2007
Country/Territory: United States
City: Oakland, CA
Period: 21/04/07 to 25/04/07
