Motivation: To understand cancer etiology, it is important to explore the molecular changes in cellular processes from the normal state to the cancerous state. Because genes interact with each other during cellular processes, carcinogenesis-related genes may form differential co-expression patterns with other genes in different cell states. In this study, we develop a statistical method for identifying differential gene-gene co-expression patterns in different cell states.
Results: For efficient pattern recognition, we extend the traditional F-statistic and obtain an Expected Conditional F-statistic (ECF-statistic), which incorporates statistical information on location and correlation. We also propose a statistical method for data transformation. Our approach is applied to a microarray gene expression dataset from a prostate cancer study. For a gene of interest, our method can select other genes that have differential gene-gene co-expression patterns with this gene in different cell states. The 10 most frequently selected genes include hepsin, GSTP1 and AMACR, which have recently been proposed to be associated with prostate carcinogenesis. However, GSTP1 and AMACR cannot be identified by studying differential gene expression alone. Using the tumor suppressor genes TP53, PTEN and RB1, we identify seven genes, again including hepsin, GSTP1 and AMACR. We show that genes associated with cancer may have differential gene-gene expression patterns with many other genes in different cell states. By discovering such patterns, we may be able to identify carcinogenesis-related genes.
Bibliographical note
Funding Information:
We thank two anonymous reviewers for their valuable comments. This work was supported in part by NSF grant DMS 0241160 and NIH grant GM 59507.