Serine substitutions are linked to codon usage and differ for variable and conserved protein regions

Gregory W. Schwartz, Tair Shauli, Michal Linial, Uri Hershberg*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

7 Scopus citations


Serine is the only amino acid that is encoded by two disjoint codon sets (TCN and AGY), so that a tandem substitution of two nucleotides is required to switch between the two sets. We show that these codon sets underlie distinct substitution patterns at positions subject to purifying and diversifying selection. We found that in humans, positions that are conserved among ~100 vertebrates, and thus subject to purifying selection, are enriched for substitutions involving serine (TCN, denoted S′), proline, and alanine (S′PA). In contrast, the less conserved positions are enriched for serine encoded with AGY codons (denoted S″), glycine, and asparagine (GS″N). We tested this phenomenon in the HIV envelope glycoprotein (gp120) and the V-gene that encodes B-cell receptors/antibodies. These fast-evolving proteins both have hypervariable positions, which are under diversifying selection, closely adjacent to highly conserved structural regions. In both instances, we identified an opposite abundance of the two groups of serine substitutions, with enrichment of S′PA in the conserved positions and GS″N in the hypervariable regions. Finally, we analyzed the substitutions across 60,000 individual human exomes to show that, when serine has a specific functional constraint of phosphorylation capability, S′ codons are 32-fold less prone than S″ to substitutions to threonine or tyrosine that could potentially retain the phosphorylation site capacity. Combined, our results, which cover evolutionary signals at different temporal scales, demonstrate that through its encoding by two codon sets, serine allows for the existence of alternating substitution patterns within positions of functional maintenance versus sites of rapid diversification.
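The codon geometry behind the abstract can be checked directly against the standard genetic code. The following is a minimal sketch, not the authors' analysis pipeline: it assumes the standard (NCBI table 1) code, and the names `CODON_TABLE`, `hamming`, and `single_step_neighbors` are hypothetical helpers introduced only for illustration. It confirms that no TCN codon is a single nucleotide away from an AGY codon, and lists which amino acids are reachable by one nucleotide change from each serine codon set.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# The two disjoint serine codon sets named in the abstract.
TCN = [c for c, aa in CODON_TABLE.items() if aa == "S" and c.startswith("TC")]
AGY = [c for c, aa in CODON_TABLE.items() if aa == "S" and c.startswith("AG")]


def hamming(a, b):
    """Number of positions at which two equal-length codons differ."""
    return sum(x != y for x, y in zip(a, b))


def single_step_neighbors(codons):
    """Amino acids reachable by one nucleotide change (serine and stop excluded)."""
    reachable = set()
    for codon in codons:
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                aa = CODON_TABLE[mutant]
                if aa not in ("S", "*"):
                    reachable.add(aa)
    return reachable


# Fewest nucleotide changes needed to move between the two serine sets.
min_distance = min(hamming(a, b) for a in TCN for b in AGY)
tcn_neighbors = single_step_neighbors(TCN)
agy_neighbors = single_step_neighbors(AGY)

print(min_distance)                           # → 2
print(sorted(tcn_neighbors - agy_neighbors))  # → ['A', 'F', 'L', 'P', 'W', 'Y']
print(sorted(agy_neighbors - tcn_neighbors))  # → ['G', 'I', 'N', 'R']
```

Under this assumption, proline (P) and alanine (A) are single-step neighbors of TCN codons only, while glycine (G) and asparagine (N) are single-step neighbors of AGY codons only, which matches the composition of the S′PA and GS″N groups. The sketch reports only mutational adjacency; it says nothing about the selection effects the study measures.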

Original language: American English
Article number: 17238
Journal: Scientific Reports
Issue number: 1
State: Published - 1 Dec 2019

Bibliographical note

Publisher Copyright:
© 2019, The Author(s).

