Revealing lineage-related signals in single-cell gene expression using random matrix theory

Mor Nitzan*, Michael P. Brenner

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

2 Scopus citations


Gene expression profiles of a cellular population, generated by single-cell RNA sequencing (scRNA-seq), contain rich information about biological state, including cell type, cell cycle phase, gene regulatory patterns, and location within the tissue of origin. A major challenge is to disentangle information about these different biological states from one another, and in particular to distinguish them from cell lineage, since the correlation of cellular expression patterns is necessarily contaminated by ancestry. Here, we use a recent advance in random matrix theory, discovered in the context of protein phylogeny, to identify differentiation or ancestry-related processes in single-cell data. Qin and Colwell [C. Qin, L. J. Colwell, Proc. Natl. Acad. Sci. U.S.A. 115, 690-695 (2018)] showed that ancestral relationships in protein sequences create a power-law signature in the covariance eigenvalue distribution. We demonstrate the existence of such signatures in scRNA-seq data and that the genes driving them are indeed related to differentiation and developmental pathways. We predict the existence of similar power-law signatures for cells along linear trajectories and demonstrate this for linearly differentiating systems. Furthermore, we generalize to show that the same signatures can arise for cells along tissue-specific spatial trajectories. We illustrate these principles in diverse tissues and organisms, including the mammalian epidermis and lung, Drosophila whole-embryo, adult Hydra, dendritic cells, the intestinal epithelium, and cells undergoing induced pluripotent stem cell (iPSC) reprogramming. We show how these results can be used to interpret the gradual dynamics of lineage structure along iPSC reprogramming. Together, we provide a framework that can be used to identify signatures of specific biological processes in single-cell data without prior knowledge and to identify candidate genes associated with these processes.
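As a rough illustration of the kind of spectral check the abstract describes, the following Python sketch computes the covariance eigenvalues of a cells-by-genes matrix and fits a straight line to the leading eigenvalues on a log-log rank plot. It is not the authors' pipeline: the function names, the synthetic random-walk model standing in for a linear trajectory, and the choice of the top 20 eigenvalues are all assumptions made for illustration.

```python
import numpy as np


def covariance_spectrum(expr):
    """Eigenvalues of the sample covariance of a cells x genes matrix, largest first.

    Hypothetical helper: each gene (column) is centered across cells, and the
    eigenvalues are obtained from the singular values of the centered matrix.
    """
    centered = expr - expr.mean(axis=0, keepdims=True)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    return singular_values**2 / (expr.shape[0] - 1)


def loglog_slope(eigenvalues, top_k):
    """Least-squares slope of log(eigenvalue) against log(rank) for the top_k values."""
    ranks = np.arange(1, top_k + 1)
    slope, _intercept = np.polyfit(np.log(ranks), np.log(eigenvalues[:top_k]), 1)
    return slope


rng = np.random.default_rng(0)
n_cells, n_genes = 200, 1000

# Control: independent Gaussian noise, with no lineage or trajectory structure.
noise = rng.normal(size=(n_cells, n_genes))

# Assumed toy model of a linear trajectory: each cell's profile is the
# previous cell's profile plus a small random step (a random walk over cells).
trajectory = np.cumsum(rng.normal(size=(n_cells, n_genes)), axis=0)

slope_noise = loglog_slope(covariance_spectrum(noise), top_k=20)
slope_traj = loglog_slope(covariance_spectrum(trajectory), top_k=20)
print(f"log-log slope, noise:      {slope_noise:.2f}")
print(f"log-log slope, trajectory: {slope_traj:.2f}")
```

In this toy setting the trajectory data should show a much steeper, roughly linear decay on the log-log rank plot than the unstructured control. Real scRNA-seq data would need normalization and gene filtering first, which this sketch omits.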

Original language: American English
Article number: e1913931118
Journal: Proceedings of the National Academy of Sciences of the United States of America
Issue number: 11
State: Published - 16 Mar 2021

Bibliographical note

Publisher Copyright:
© 2021 National Academy of Sciences. All rights reserved.


  • Cellular lineage
  • Random matrix theory
  • Single-cell data
  • Spectral analysis

