TY - JOUR
T1 - Approaches to multiplicity issues in complex research in microarray analysis
AU - Yekutieli, Daniel
AU - Reiner-Benaim, Anat
AU - Benjamini, Yoav
AU - Elmer, Gregory I.
AU - Kafkafi, Neri
AU - Letwin, Noah E.
AU - Lee, Norman H.
PY - 2006/11
Y1 - 2006/11
N2 - The multiplicity problem is evident in the simplest form of statistical analysis of gene expression data: the identification of differentially expressed genes. In more complex analyses, the problem is compounded by the multiplicity of hypotheses per gene. Thus, in some cases, it may be necessary to consider testing millions of hypotheses. We present three general approaches for addressing multiplicity in large research problems: (a) use the scalability of false discovery rate (FDR) controlling procedures; (b) apply FDR-controlling procedures to a selected subset of hypotheses; (c) apply hierarchical FDR-controlling procedures. We also offer a general framework for ensuring reproducible results in complex research, where a researcher faces more than just one large research problem. We demonstrate these approaches by analyzing the results of a complex experiment involving the study of gene expression levels in different brain regions across multiple mouse strains.
AB - The multiplicity problem is evident in the simplest form of statistical analysis of gene expression data: the identification of differentially expressed genes. In more complex analyses, the problem is compounded by the multiplicity of hypotheses per gene. Thus, in some cases, it may be necessary to consider testing millions of hypotheses. We present three general approaches for addressing multiplicity in large research problems: (a) use the scalability of false discovery rate (FDR) controlling procedures; (b) apply FDR-controlling procedures to a selected subset of hypotheses; (c) apply hierarchical FDR-controlling procedures. We also offer a general framework for ensuring reproducible results in complex research, where a researcher faces more than just one large research problem. We demonstrate these approaches by analyzing the results of a complex experiment involving the study of gene expression levels in different brain regions across multiple mouse strains.
KW - False discovery rate
KW - Hierarchical testing
KW - High throughput analysis
UR - http://www.scopus.com/inward/record.url?scp=33750022361&partnerID=8YFLogxK
U2 - 10.1111/j.1467-9574.2006.00343.x
DO - 10.1111/j.1467-9574.2006.00343.x
M3 - Article
AN - SCOPUS:33750022361
SN - 0039-0402
VL - 60
SP - 414
EP - 437
JO - Statistica Neerlandica
JF - Statistica Neerlandica
IS - 4
ER -