From promoter sequence to expression: A probabilistic framework

Eran Segal, Yoseph Barash, Itamar Simon, Nir Friedman*, Daphne Koller

*Corresponding author for this work

Research output: Contribution to conference › Paper › peer-review

82 Scopus citations

Abstract

We present a probabilistic framework that models the process by which transcriptional binding explains the mRNA expression of different genes. Our joint probabilistic model unifies the two key components of this process: the prediction of gene regulation events from sequence motifs in the gene's promoter region, and the prediction of mRNA expression from combinations of gene regulation events in different settings. Our approach has several advantages. By learning promoter sequence motifs that are directly predictive of expression data, it can improve the identification of binding site patterns. It is also able to identify combinatorial regulation via interactions of different transcription factors. Finally, the general framework allows us to integrate additional data sources, including data from the recent binding localization assays. We demonstrate our approach on the cell cycle data of Spellman et al., combined with the binding localization information of Simon et al. We show that the learned model predicts expression from sequence, and that it identifies coherent co-regulated groups with significant transcription factor motifs. It also provides valuable biological insight into the domain via these co-regulated "modules" and the combinatorial regulation effects that govern their behavior.

Original language: English
Pages: 263-272
Number of pages: 10
State: Published - 2002
Event: RECOMB 2002: Proceedings of the Sixth Annual International Conference on Computational Biology - Washington, DC, United States
Duration: 18 Apr 2002 to 21 Apr 2002

Conference

Conference: RECOMB 2002: Proceedings of the Sixth Annual International Conference on Computational Biology
Country/Territory: United States
City: Washington, DC
Period: 18/04/02 to 21/04/02
