TY - JOUR
T1 - A deep learning model for predicting optimal distance range in crosslinking mass spectrometry data
AU - Cohen, Shon
AU - Schneidman-Duhovny, Dina
N1 - Publisher Copyright: © 2023 The Authors. Proteomics published by Wiley-VCH GmbH.
PY - 2023/9
Y1 - 2023/9
N2 - Macromolecular assemblies play an important role in all cellular processes. While there has recently been significant progress in protein structure prediction based on deep learning, large protein complexes cannot be predicted with these approaches. The integrative structure modeling approach characterizes multi-subunit complexes by computational integration of data from fast and accessible experimental techniques. Crosslinking mass spectrometry is one such technique that provides spatial information about the proximity of crosslinked residues. One of the challenges in interpreting crosslinking datasets is designing a scoring function that, given a structure, can quantify how well it fits the data. Most approaches set an upper bound on the distance between Cα atoms of crosslinked residues and calculate a fraction of satisfied crosslinks. However, the distance spanned by the crosslinker greatly depends on the neighborhood of the crosslinked residues. Here, we design a deep learning model for predicting the optimal distance range for a crosslinked residue pair based on the structures of their neighborhoods. We find that our model can predict the distance range with the area under the receiver-operator curve of 0.86 and 0.7 for intra- and inter-protein crosslinks, respectively. Our deep scoring function can be used in a range of structure modeling applications.
KW - crosslinking mass spectrometry
KW - deep learning
KW - protein structure
KW - protein–protein interactions
UR - http://www.scopus.com/inward/record.url?scp=85158126630&partnerID=8YFLogxK
U2 - 10.1002/pmic.202200341
DO - 10.1002/pmic.202200341
M3 - Article
C2 - 37070547
AN - SCOPUS:85158126630
SN - 1615-9853
VL - 23
JO - Proteomics
JF - Proteomics
IS - 17
M1 - 2200341
ER -