Abstract
Predictor envelopes model the response variable by using a subspace of dimension d extracted from the full space of all p input variables. Predictor envelopes have a close connection to partial least squares and enjoy improved estimation efficiency in theory. As such, predictor envelopes have become increasingly popular in chemometrics. Often, d is much smaller than p, which seemingly enhances the interpretability of the envelope model. However, the process of estimating the envelope subspace adds complexity to the final fitted model. To better understand the complexity of predictor envelopes, we study their effective degrees of freedom (EDF) in a variety of settings. We find that in many cases, a d-dimensional predictor envelope model can have far more than d + 1 EDF and often has close to p + 1. However, the EDF of a predictor envelope depend heavily on the structure of the underlying data-generating model, and there are settings under which predictor envelopes can have substantially reduced model complexity.
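The abstract does not say how EDF is computed. The following is a minimal sketch of one common approach, assuming the covariance definition EDF = (1/σ²) Σᵢ Cov(ŷᵢ, yᵢ) and a Monte Carlo estimate over responses simulated with the predictors held fixed. It does not implement the envelope estimator: a d-component PLS1 regression stands in for it, chosen only because of the connection to partial least squares noted above. All function names, the simulated data, and the settings (n, p, d, number of replicates) are illustrative assumptions.

```python
import numpy as np


def ols_fit_predict(X, y):
    """Fitted values from ordinary least squares with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A @ coef


def pls1_fit_predict(X, y, d):
    """Fitted values from a d-component PLS1 fit (NIPALS) with an intercept.

    Assumption: used here as a stand-in for a d-dimensional predictor
    envelope; it is NOT the envelope estimator itself.
    """
    y_mean = y.mean()
    Xk = X - X.mean(axis=0)          # centered predictors, deflated in the loop
    yk = y - y_mean                  # centered response, deflated in the loop
    fitted = np.full(len(y), y_mean)
    for _ in range(d):
        w = Xk.T @ yk                # weight vector: covariance direction
        norm = np.linalg.norm(w)
        if norm < 1e-12:             # nothing left to extract
            break
        w /= norm
        t = Xk @ w                   # score vector
        tt = t @ t
        q = (t @ yk) / tt            # regression of the response on the score
        p_load = Xk.T @ t / tt       # predictor loadings
        fitted += q * t
        Xk = Xk - np.outer(t, p_load)
        yk = yk - q * t
    return fitted


def monte_carlo_edf(fit_predict, X, mean, sigma, n_rep, rng):
    """Estimate EDF = sum_i Cov(yhat_i, y_i) / sigma^2 by simulation."""
    n = len(mean)
    Y = mean + sigma * rng.standard_normal((n_rep, n))
    Yhat = np.array([fit_predict(X, y) for y in Y])
    # Sample covariance of (yhat_i, y_i) across replicates, summed over i.
    cov_sum = ((Yhat - Yhat.mean(axis=0)) * (Y - Y.mean(axis=0))).sum() / (n_rep - 1)
    return cov_sum / sigma**2


# Hypothetical simulation settings, chosen only to keep the example small.
rng = np.random.default_rng(0)
n, p, sigma = 50, 5, 1.0
X = rng.standard_normal((n, p))
beta = rng.standard_normal(p)
mean = X @ beta

edf_ols = monte_carlo_edf(ols_fit_predict, X, mean, sigma, 2000, rng)
edf_pls1 = monte_carlo_edf(
    lambda X_, y_: pls1_fit_predict(X_, y_, d=1), X, mean, sigma, 2000, rng
)
print(f"OLS EDF estimate (reference value p + 1 = {p + 1}): {edf_ols:.2f}")
print(f"1-component PLS EDF estimate (naive count d + 1 = 2): {edf_pls1:.2f}")
```

Comparing the stand-in's estimate against d + 1 and p + 1 mirrors the comparison described in the abstract. The OLS estimate serves as a sanity check, since a linear fit with an intercept and p predictors has exactly p + 1 EDF.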
Original language | English (US) |
---|---|
Pages (from-to) | 528-541 |
Number of pages | 14 |
Journal | Journal of Data Science |
Volume | 19 |
Issue number | 4 |
State | Published - Oct 2021 |
Bibliographical note
Publisher Copyright: © 2021 Center for Applied Statistics, School of Statistics, Renmin University of China. All rights reserved.
Keywords
- dimension reduction
- effective degrees of freedom
- envelopes
- Monte Carlo