Evolutionary and functional lessons from human-specific amino acid substitution matrices

Tair Shauli, Nadav Brandes, Michal Linial*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Scopus citation


Human genetic variation in coding regions is fundamental to the study of protein structure and function. Most methods for interpreting missense variants rely on substitution measures derived from homologous proteins across different species. In this study, we introduce human-specific amino acid (AA) substitution matrices that are based on genetic variation in the modern human population. We analyzed the frequencies of >4.8M single nucleotide variants (SNVs) at codon and AA resolution and compiled human-centric substitution matrices that are fundamentally different from classic cross-species matrices (e.g., BLOSUM, PAM). Our matrices are asymmetric, with some AA replacements showing a significant directional preference. Moreover, these AA matrices are only partly predicted by nucleotide substitution rates. We further tested the utility of our matrices in exposing functional signals in experimentally validated protein annotations. A significant reduction in AA transition frequencies was observed across nine post-translational modification (PTM) types and four ion-binding sites. Our results suggest a purifying selection signal in the human proteome across a diverse set of functional protein annotations and provide an empirical baseline for interpreting human genetic variation in coding regions.
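
To illustrate what an asymmetric, directed substitution matrix means in practice, the following is a minimal sketch and not the authors' pipeline. It tallies directed (reference AA to alternate AA) replacements from a toy list of missense variants, normalizes each row by the reference residue, and reports a simple log2 ratio as a measure of directional preference. The function names, the toy variant list, the row normalization, and the +1 pseudocount are all assumptions made for illustration; the abstract does not specify how the published matrices were normalized.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def substitution_counts(variants):
    """Count directed (ref -> alt) amino acid replacements.

    `variants` is an iterable of (ref_aa, alt_aa) one-letter codes.
    Synonymous pairs and non-standard residues are skipped.
    """
    counts = Counter()
    for ref, alt in variants:
        if ref in AMINO_ACIDS and alt in AMINO_ACIDS and ref != alt:
            counts[(ref, alt)] += 1
    return counts


def row_normalize(counts):
    """Turn counts into per-reference frequencies (each row sums to 1).

    Assumption: normalizing by the reference residue is one plausible
    choice, not necessarily the one used for the published matrices.
    """
    totals = Counter()
    for (ref, _alt), n in counts.items():
        totals[ref] += n
    return {(ref, alt): n / totals[ref] for (ref, alt), n in counts.items()}


def asymmetry(counts, a, b):
    """Log2 ratio of a->b versus b->a counts, with a +1 pseudocount.

    Positive values mean a->b is observed more often than b->a.
    """
    return math.log2((counts[(a, b)] + 1) / (counts[(b, a)] + 1))


# Hypothetical toy data, invented purely for illustration.
toy_variants = [
    ("R", "H"), ("R", "H"), ("R", "H"), ("H", "R"),
    ("A", "V"), ("V", "A"), ("A", "T"),
]

counts = substitution_counts(toy_variants)
freqs = row_normalize(counts)

print(asymmetry(counts, "R", "H"))  # directional preference for R->H
print(freqs[("A", "V")])            # share of A replacements that go to V
```

In a symmetric matrix such as BLOSUM, the entry for R to H equals the entry for H to R by construction; keeping the two directions as separate counts is what allows a directional preference to show up at all.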

Original language: American English
Article number: lqab079
Journal: NAR Genomics and Bioinformatics
Issue number: 3
State: Published - 1 Sep 2021

Bibliographical note

Publisher Copyright:
© The Author(s) 2021.

