VarNMF: non-negative probabilistic factorization with source variation

Ela Fallik, Nir Friedman*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

MOTIVATION: Non-negative matrix factorization (NMF) is a powerful tool often applied to genomic data to identify non-negative latent components that constitute linearly mixed samples. It is useful when the observed signal combines contributions from multiple sources, such as cell types in bulk measurements of heterogeneous tissue. NMF accounts for two types of variation between samples: disparities in the proportions of sources and observation noise. However, in many settings, there is also non-trivial variation between samples in the contribution of each source to the mixed data. This variation cannot be accurately modeled using the NMF framework.
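To make the standard NMF setting concrete, the following is a minimal sketch of a generic NMF fit using the classic multiplicative-update rule for squared error. It is not the authors' code: the function name `nmf`, the synthetic data, and all parameter values are assumptions chosen for illustration. Rows of `V` play the role of mixed samples, `W` holds per-sample source proportions, and `H` holds a single fixed profile per source, shared by all samples.

```python
import numpy as np

def nmf(V, k, n_iter=200, seed=0, eps=1e-10):
    """Factor a non-negative matrix V (samples x features) as W @ H.

    W (samples x k) holds per-sample source weights and H (k x features)
    holds one fixed profile per source. Uses multiplicative updates for
    the squared (Frobenius) error; eps guards against division by zero.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        # Updates multiply by non-negative ratios, so W and H stay non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic, noise-free mixed samples built from 3 hypothetical sources.
rng = np.random.default_rng(1)
W_true = rng.random((30, 3))
H_true = rng.random((3, 50))
V = W_true @ H_true

W, H = nmf(V, k=3)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_err:.4f}")
```

In this formulation every sample is explained by the same `H`, so any sample-to-sample difference in a source's own signal has nowhere to go except the weights or the noise term.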

RESULTS: We present VarNMF, a probabilistic extension of NMF that explicitly models this variation in source values. We show that by modeling sources as non-negative distributions, we can recover source variation from mixed samples alone, without observing any of the sources directly. We apply VarNMF to a cell-free ChIP-seq dataset of two cancer cohorts and a healthy cohort, demonstrating that VarNMF provides a better estimation of the data distribution. Moreover, VarNMF extracts cancer-associated source distributions that decouple the tumor characteristics from the amount of tumor contribution and identify patient-specific disease behaviors. This decomposition highlights the inter-tumor variability that is obscured in the mixed samples.
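The difference between a fixed source profile and a source distribution can be illustrated with a small simulation. This is a sketch under stated assumptions, not the VarNMF model or its inference procedure: the abstract says only that sources are modeled as non-negative distributions, so the Gamma prior, the Poisson observation noise, and every variable name and parameter value below are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_sources, n_features = 200, 2, 5

# Per-sample mixing proportions and the mean profile of each source.
W = rng.random((n_samples, n_sources))
mean_H = rng.uniform(5, 20, size=(n_sources, n_features))

# NMF-style data: every sample mixes the SAME source profiles.
lam_fixed = W @ mean_H
V_fixed = rng.poisson(lam_fixed)  # Poisson noise is an assumption

# Source variation: each sample i gets its own non-negative profile H_i.
# Gamma is an assumed distribution; scale is set so that E[H_i] = mean_H.
shape = 2.0
H_i = rng.gamma(shape, mean_H / shape, size=(n_samples, n_sources, n_features))
lam_var = np.einsum("ik,ikj->ij", W, H_i)  # sum_k W[i, k] * H_i[i, k, j]
V_var = rng.poisson(lam_var)

# Spread of each data set around the fixed-profile prediction W @ mean_H.
resid_fixed = np.var(V_fixed - lam_fixed)
resid_var = np.var(V_var - lam_fixed)
print(f"residual variance, fixed sources:   {resid_fixed:.1f}")
print(f"residual variance, varying sources: {resid_var:.1f}")
```

Under these assumptions the second data set shows extra spread around `W @ mean_H` that observation noise alone does not produce, which is the kind of variation a fixed-profile factorization has no term for.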

AVAILABILITY AND IMPLEMENTATION: Code is available at https://github.com/Nir-Friedman-Lab/VarNMF.

Original language: English
Article number: btae758
Journal: Bioinformatics
Volume: 41
Issue number: 1
State: Published - 1 Jan 2025

Bibliographical note

Publisher Copyright:
© 2024 The Author(s). Published by Oxford University Press.
