TY - JOUR
T1 - CombFold: predicting structures of large protein assemblies using a combinatorial assembly algorithm and AlphaFold2
AU - Shor, Ben
AU - Schneidman-Duhovny, Dina
N1 - Publisher Copyright: © The Author(s) 2024.
PY - 2024/3
Y1 - 2024/3
N2 - Deep learning models, such as AlphaFold2 and RosettaFold, enable high-accuracy protein structure prediction. However, large protein complexes are still challenging to predict due to their size and the complexity of interactions between multiple subunits. Here we present CombFold, a combinatorial and hierarchical assembly algorithm for predicting structures of large protein complexes utilizing pairwise interactions between subunits predicted by AlphaFold2. CombFold accurately predicted (TM-score >0.7) 72% of the complexes among the top-10 predictions in two datasets of 60 large, asymmetric assemblies. Moreover, the structural coverage of predicted complexes was 20% higher compared to corresponding Protein Data Bank entries. We applied the method on complexes from Complex Portal with known stoichiometry but without known structure and obtained high-confidence predictions. CombFold supports the integration of distance restraints based on crosslinking mass spectrometry and fast enumeration of possible complex stoichiometries. CombFold’s high accuracy makes it a promising tool for expanding structural coverage beyond monomeric proteins.
UR - http://www.scopus.com/inward/record.url?scp=85184169655&partnerID=8YFLogxK
U2 - 10.1038/s41592-024-02174-0
DO - 10.1038/s41592-024-02174-0
M3 - Article
C2 - 38326495
AN - SCOPUS:85184169655
SN - 1548-7091
VL - 21
SP - 477
EP - 487
JO - Nature Methods
JF - Nature Methods
IS - 3
ER -