TY - JOUR
T1 - Automatic Inference of Sequence from Low-Resolution Crystallographic Data
AU - Ben-Aharon, Ziv
AU - Levitt, Michael
AU - Kalisman, Nir
N1 - Publisher Copyright: © 2018 Elsevier Ltd
PY - 2018/11/6
Y1 - 2018/11/6
N2 - At resolutions worse than 3.5 Å, the electron density is weak or nonexistent at the locations of the side chains. Consequently, the assignment of the protein sequences to their correct positions along the backbone is a difficult problem. In this work, we propose a fully automated computational approach to assign sequence at low resolution. It is based on our surprising observation that standard reciprocal-space indicators, such as the initial unrefined R value, are sensitive enough to detect an erroneous sequence assignment of even a single backbone position. Our approach correctly determines the amino acid type for 15%, 13%, and 9% of the backbone positions in crystallographic datasets with resolutions of 4.0 Å, 4.5 Å, and 5.0 Å, respectively. We implement these findings in an application for threading a sequence onto a backbone structure. For the three resolution ranges, the application threads 83%, 81%, and 64% of the sequences exactly as in the deposited PDB structures. Ben-Aharon et al. find that certain crystallographic measures are more informative than previously assumed. They use these findings to solve a difficult technical problem in low-resolution crystallography: the identification of the amino acid types along the protein backbone.
KW - automatic threading
KW - low-resolution crystallography
KW - model building
KW - reciprocal-space indicators
UR - http://www.scopus.com/inward/record.url?scp=85056433853&partnerID=8YFLogxK
U2 - 10.1016/j.str.2018.08.011
DO - 10.1016/j.str.2018.08.011
M3 - Article
C2 - 30293812
AN - SCOPUS:85056433853
SN - 0969-2126
VL - 26
SP - 1546
EP - 1554.e2
JO - Structure
JF - Structure
IS - 11
ER -