Variable selection with the strong heredity constraint and its oracle property

Nam Hee Choi, William Li, Ji Zhu

Research output: Contribution to journal, Article, peer-review

118 Scopus citations

Abstract

In this paper, we extend the LASSO method (Tibshirani 1996) for simultaneously fitting a regression model and identifying important interaction terms. Unlike most of the existing variable selection methods, our method automatically enforces the heredity constraint, that is, an interaction term can be included in the model only if the corresponding main terms are also included in the model. Furthermore, we extend our method to generalized linear models, and show that it performs as well as if the true model were given in advance, that is, the oracle property as in Fan and Li (2001) and Fan and Peng (2004). The proof of the oracle property is given in online supplemental materials. Numerical results on both simulation data and real data indicate that our method tends to remove irrelevant variables more effectively and provide better prediction performance than previous work (Yuan, Joseph, and Lin 2007; Zhao, Rocha, and Yu 2009) as well as the classical LASSO method.
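
The heredity constraint stated in the abstract can be illustrated with a small check. The Python sketch below is not the authors' estimation procedure; it only tests whether a given set of selected terms satisfies the constraint. The function name `satisfies_strong_heredity` and the representation of main terms as strings and interactions as pairs of strings are assumptions made for this example.

```python
from typing import Iterable, Set, Tuple


def satisfies_strong_heredity(
    main_effects: Iterable[str],
    interactions: Iterable[Tuple[str, str]],
) -> bool:
    """Return True if every selected interaction has both of its
    main terms among the selected main effects (hypothetical helper)."""
    selected: Set[str] = set(main_effects)
    # An interaction (a, b) is admissible only when a and b are both selected.
    return all(a in selected and b in selected for a, b in interactions)


if __name__ == "__main__":
    # Both parents of the x1:x2 interaction are in the model.
    print(satisfies_strong_heredity(["x1", "x2"], [("x1", "x2")]))
    # x2 is missing, so the x1:x2 interaction violates the constraint.
    print(satisfies_strong_heredity(["x1"], [("x1", "x2")]))
```

A model with no interaction terms satisfies the constraint trivially, which the sketch reflects because `all` over an empty sequence returns `True`.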

Original language: English (US)
Pages (from-to): 354-364
Number of pages: 11
Journal: Journal of the American Statistical Association
Volume: 105
Issue number: 489
State: Published - Mar 2010

Bibliographical note

Funding Information:
Nam Hee Choi is Lecturer, Department of Statistics, University of Michigan, Ann Arbor, MI 48109. William Li is Professor, Carlson School of Management, University of Minnesota, Minneapolis, MN 55455. Ji Zhu is Associate Professor, Department of Statistics, University of Michigan, Ann Arbor, MI 48109. We thank Rayjean Hung, Stefano Porru, Paolo Boffetta, and John Witte for sharing the bladder cancer dataset. Choi and Zhu are partially supported by grants DMS-0705532 and DMS-0748389 from the National Science Foundation.


Keywords

  • Heredity structure
  • LASSO
  • Regularization
