The cluster graphical lasso for improved estimation of Gaussian graphical models

Kean Ming Tan, Daniela Witten, Ali Shojaie

Research output: Contribution to journalArticlepeer-review

34 Scopus citations

Abstract

The task of estimating a Gaussian graphical model in the high-dimensional setting is considered. The graphical lasso, which involves maximizing the Gaussian log likelihood subject to a lasso penalty, is a well-studied approach for this task. A surprising connection between the graphical lasso and hierarchical clustering is introduced: the graphical lasso in effect performs a two-step procedure, in which (1) single linkage hierarchical clustering is performed on the variables in order to identify connected components, and then (2) a penalized log likelihood is maximized on the subset of variables within each connected component. Thus, the graphical lasso determines the connected components of the estimated network via single linkage clustering. The single linkage clustering is known to perform poorly in certain finite-sample settings. Therefore, the cluster graphical lasso, which involves clustering the features using an alternative to single linkage clustering, and then performing the graphical lasso on the subset of variables within each cluster, is proposed. Model selection consistency for this technique is established, and its improved performance relative to the graphical lasso is demonstrated in a simulation study, as well as in applications to a university webpage and a gene expression data sets.

Original languageEnglish (US)
Pages (from-to)23-36
Number of pages14
JournalComputational Statistics and Data Analysis
Volume85
DOIs
StatePublished - May 2015

Bibliographical note

Publisher Copyright:
© 2014 Elsevier B.V. All rights reserved.

Keywords

  • Hierarchical clustering
  • High-dimensional setting
  • Network
  • Single linkage clustering
  • Sparsity

Fingerprint

Dive into the research topics of 'The cluster graphical lasso for improved estimation of Gaussian graphical models'. Together they form a unique fingerprint.

Cite this