TY - JOUR
T1 - Clustering gene expression profile data by selective shrinkage
AU - Ishwaran, Hemant
AU - Rao, J. Sunil
PY - 2008/9/1
Y1 - 2008/9/1
N2 - Clustering of gene expression profiles is a widely used approach for finding macroscopic data structure. A complication in such analyses is that not all genes are informative for forming clusters and different clusters might have different transcription regulation. Driven by these considerations, we present a novel two-stage clustering approach. The first stage identifies informative genes by adaptive variable selection using pseudo-samples modeled by a high dimensional multigroup ANOVA model. Variables are selected using a rescaled spike and slab Bayesian hierarchical model having a special selective shrinkage property. The second stage uses output from the first stage for clustering. We demonstrate why selective shrinkage occurs, and by extension, why it is useful for the clustering paradigm. We analyze a human gene atlas expression dataset where the question of interest is to look for tissue-specific transcription regulation and investigate whether tissues can be grouped together due to similar genomic control.
UR - http://www.scopus.com/inward/record.url?scp=49349102807&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=49349102807&partnerID=8YFLogxK
U2 - 10.1016/j.spl.2008.01.003
DO - 10.1016/j.spl.2008.01.003
M3 - Article
AN - SCOPUS:49349102807
SN - 0167-7152
VL - 78
SP - 1490
EP - 1497
JO - Statistics and Probability Letters
JF - Statistics and Probability Letters
IS - 12
ER -