Abstract
For some modeling problems, a population may be better assessed as an aggregate of unknown subpopulations, each with a distinct relationship between a response and associated variables. The finite mixture of regressions (FMR) model, in which an outcome is derived from one of a finite number of linear regression models, is a natural tool in this setting. In this article, we first propose a new penalized regression approach. We then demonstrate that the proposed approach identifies subpopulations and their corresponding models better than a semiparametric FMR method does. Our new method fits a model for each person via grouping pursuit, using a new group-truncated L1 penalty that shrinks the differences between estimated parameter vectors. This causes the individuals' models to cluster into a few common models, in turn revealing previously unknown subpopulations. Moreover, by varying the penalty strength, the new method can reveal a hierarchical structure among the subpopulations, which can be useful in exploratory analyses. Simulations using FMR models and a real-data analysis show that the method performs well.
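The abstract does not state the exact form of the group-truncated L1 penalty, so the following is only a minimal sketch under an assumption: that it follows the common truncated L1 shape, min(x / tau, 1), applied to the Euclidean norm of each pairwise difference between individuals' coefficient vectors. The function name `group_tlp_penalty` and the parameters `tau` (truncation threshold) and `lam` (penalty strength) are hypothetical and are not taken from the paper.

```python
import math


def group_tlp_penalty(betas, tau, lam):
    """Illustrative group-truncated L1 penalty on pairwise differences.

    Assumed form (not confirmed by the abstract):
        lam * sum_{i<j} min(||beta_i - beta_j||_2 / tau, 1)

    betas : list of per-individual coefficient vectors (equal lengths)
    tau   : truncation threshold; differences larger than tau cost the same
    lam   : overall penalty strength
    """
    total = 0.0
    n = len(betas)
    for i in range(n):
        for j in range(i + 1, n):
            # Euclidean distance between the two coefficient vectors
            dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(betas[i], betas[j])))
            # Truncation: the contribution of any one pair is capped at 1
            total += min(dist / tau, 1.0)
    return lam * total


# Two individuals share a model; a third has a different one.
example_betas = [[0.0, 0.0], [0.0, 0.0], [3.0, 4.0]]
penalty_loose = group_tlp_penalty(example_betas, tau=10.0, lam=1.0)
penalty_tight = group_tlp_penalty(example_betas, tau=1.0, lam=1.0)
```

Under this assumed form, pairs with identical coefficient vectors contribute nothing, which is what encourages individuals' models to merge. Pairs that differ by more than `tau` contribute a fixed amount, so clearly distinct subpopulations are not pulled toward each other any further.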
Original language | English (US) |
---|---|
Pages (from-to) | 783-807 |
Number of pages | 25 |
Journal | Statistica Sinica |
Volume | 30 |
Issue number | 2 |
State | Published - Apr 2020 |
Bibliographical note
Publisher Copyright: © 2020 Institute of Statistical Science. All rights reserved.
Keywords
- FMR
- Group LASSO
- Group TLP
- Grouping pursuit
- Penalized regression
- Semiparametric