A comparative review of statistical methods for discovering differentially expressed genes in replicated microarray experiments

Research output: Contribution to journal › Article › peer-review

397 Scopus citations

Abstract

Motivation: A common task in analyzing microarray data is to determine which genes are differentially expressed across two kinds of tissue samples or samples obtained under two experimental conditions. Several statistical methods have recently been proposed to accomplish this goal when there are replicated samples under each condition. However, it may not be clear how these methods compare with each other. Our main goal here is to compare three methods, the t-test, a regression modeling approach (Thomas et al., Genome Res., 11, 1227-1236, 2001) and a mixture model approach (Pan et al., http://www.biostat.umn.edu/cgi-bin/rrs?print+2001, 2001a,b), with particular attention to their different modeling assumptions.

Results: We point out that all three methods are based on the two-sample t-statistic or a minor variation of it, but they differ in how they associate a statistical significance level with the corresponding statistic, leading to possibly large differences in the resulting significance levels and the numbers of genes detected. In particular, we give an explicit formula for the test statistic used in the regression approach. Using the leukemia data of Golub et al. (Science, 285, 531-537, 1999), we illustrate these points. We also briefly compare the results with those of several other methods, including the empirical Bayesian method of Efron et al. (J. Am. Stat. Assoc., to appear, 2001) and the Significance Analysis of Microarray (SAM) method of Tusher et al.
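The central point of the abstract, that one test statistic can receive quite different significance levels depending on the reference (null) distribution, can be illustrated with a minimal sketch. It is not the implementation of any of the three reviewed methods. It assumes an unequal-variance form of the two-sample t-statistic for a single gene and contrasts two generic ways of assigning a p-value: a standard normal approximation and a permutation null. The function names and the toy expression values are hypothetical and chosen only for illustration.

```python
import math
import random
from statistics import NormalDist, mean, variance


def two_sample_t(x, y):
    """Unequal-variance two-sample t-statistic for one gene (assumed form)."""
    se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    if se == 0:
        return 0.0  # degenerate case: no variation in either group
    return (mean(x) - mean(y)) / se


def normal_p_value(t):
    """Two-sided p-value using a standard normal reference distribution."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided p-value using a null built by randomly relabelling samples."""
    rng = random.Random(seed)
    observed = abs(two_sample_t(x, y))
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        t = two_sample_t(pooled[:len(x)], pooled[len(x):])
        if abs(t) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Made-up log-expression values for one gene under two conditions.
    condition_a = [7.9, 8.4, 8.1, 8.8, 8.3]
    condition_b = [7.1, 7.6, 7.0, 7.4, 7.8]
    t = two_sample_t(condition_a, condition_b)
    print("t-statistic:", round(t, 3))
    print("normal-approximation p-value:", normal_p_value(t))
    print("permutation p-value:", permutation_p_value(condition_a, condition_b))
```

With only a handful of replicates per condition, the two p-values for the same statistic can differ noticeably, which is the kind of discrepancy the comparison in the abstract is concerned with.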

Original language: English (US)
Pages (from-to): 546-554
Number of pages: 9
Journal: Bioinformatics
Volume: 18
Issue number: 4
State: Published - 2002
