We aim to estimate multiple networks in the presence of sample heterogeneity, where the independent samples (i.e., observations) may come from different and unknown populations or distributions. Specifically, we consider penalized estimation of multiple precision matrices in the framework of a Gaussian mixture model. A major innovation is to exploit the commonalities across the multiple precision matrices through possibly nonconvex fusion regularization, which, for example, enables simultaneous discovery of unknown disease subtypes and detection of differential gene (dys)regulations in functional genomics. Within the EM algorithm, we embed one of two recently proposed methods for estimating multiple precision matrices in Gaussian graphical models. We demonstrate the feasibility and potential usefulness of the proposed methods in an application to glioblastoma subtype discovery and differential gene network analysis with a microarray gene expression data set. We also conduct realistic simulation studies to evaluate and compare the performance of various methods.
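For orientation, a penalized Gaussian mixture objective of the kind described above can be sketched as follows. This is an illustrative form only, not a formula taken from the abstract: the symbols (K components, mixing proportions \pi_k, means \mu_k, precision matrices \Omega_k with entries \omega_{k,st}, tuning parameters \lambda_1, \lambda_2, \tau) and the truncated-lasso choice of J are assumptions.

```latex
% Illustrative sketch under assumed notation; not quoted from the paper.
% \phi(x; \mu, \Sigma) denotes the multivariate normal density.
\max_{\{\pi_k,\mu_k,\Omega_k\}_{k=1}^{K}}\;
\sum_{i=1}^{n}\log\Bigl[\sum_{k=1}^{K}\pi_k\,
  \phi\bigl(x_i;\mu_k,\Omega_k^{-1}\bigr)\Bigr]
\;-\;\lambda_1\sum_{k=1}^{K}\sum_{s\neq t}\lvert\omega_{k,st}\rvert
\;-\;\lambda_2\sum_{k<k'}\sum_{s\neq t}
  J\bigl(\lvert\omega_{k,st}-\omega_{k',st}\rvert\bigr),
\qquad J(z)=\min(z,\tau).
```

In this sketch the first penalty encourages sparsity within each network, while the second shrinks corresponding entries of different precision matrices toward each other. Taking J(z) = z would give a convex fused-lasso penalty, and the truncated form shown is one possible nonconvex alternative. An EM algorithm for such an objective would alternate between computing posterior component probabilities for each sample and updating the precision matrices by a penalized multiple-network estimator.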
Bibliographical note
Funding Information:
The authors are grateful to the Editor and reviewers for constructive comments. This research was supported by NIH grants R01GM081535, R01GM113250, R01HL105397 and R01HL116720, and by the Minnesota Supercomputing Institute.
- Disease subtype discovery
- Gaussian graphical model
- Gene expression
- Model-based clustering
- Non-convex penalty