Much of a cell's activity is organized as a network of interacting modules: sets of genes coregulated to respond to different conditions. We present a probabilistic method for identifying regulatory modules from gene expression data. Our procedure identifies modules of coregulated genes, their regulators and the conditions under which regulation occurs, generating testable hypotheses in the form 'regulator X regulates module Y under conditions W'. We applied the method to a Saccharomyces cerevisiae expression data set, showing its ability to identify functionally coherent modules and their correct regulators. We present microarray experiments supporting three novel predictions, suggesting regulatory roles for previously uncharacterized proteins.
Bibliographical note
Funding Information:
We thank L. Garwin, M. Scott, G. Simchen and L. Stryer for their useful comments on earlier versions of this manuscript and A. Kaushal, T. Pham, A. Tanay and R. Yelensky for technical help with software and visualization. E.S., D.K. and N.F. were supported by a National Science Foundation grant under the Information Technology Research program. E.S. was also supported by a Stanford Graduate Fellowship. M.S. was supported by the Stanford University School of Medicine Dean’s Fellowship. A.R. was supported by the Colton Foundation. D.P. was supported by an Eshkol Fellowship. N.F. was also supported by an Alon Fellowship, by the Harry & Abe Sherman Senior Lectureship in Computer Science and by the Israeli Ministry of Science.