Abstract
We study flexible modeling of clustered data using marginal generalized additive partial linear models with a diverging number of covariates. Generalized estimating equations are used to fit the model, with the nonparametric functions approximated by polynomial splines. We investigate the asymptotic properties in a "large n, diverging p" framework. More specifically, we establish the consistency and asymptotic normality of the estimators for the linear parameters under mild conditions. We propose a procedure based on penalized estimating equations for simultaneous variable selection and estimation. The proposed variable selection procedure enjoys the oracle property and allows the number of parameters in the linear part to increase at the same order as the sample size under some general conditions. Extensive Monte Carlo simulations demonstrate that the proposed methods work well with moderate sample sizes. A dataset is analyzed to illustrate the application.
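For orientation, the model class and fitting equations named in the abstract can be sketched in generic notation. The abstract itself introduces no symbols, so every symbol below ($Y_{ij}$, $X_{ij}$, $Z_{ijk}$, $g$, $\beta$, $\alpha_k$, $B_k$, $\gamma_k$, $D_i$, $V_i$, $A_i$, $R$, $\tau$) is an assumption borrowed from the usual GEE and spline conventions and may differ from the notation used in the paper.

```latex
% Marginal generalized additive partial linear model (assumed notation):
% cluster i = 1, ..., n; observation j = 1, ..., m_i within cluster i.
% X_{ij}: covariates entering linearly (dimension p_n, diverging with n).
% Z_{ijk}: covariates entering through unknown smooth functions alpha_k.
\[
  \mu_{ij} = E\left(Y_{ij} \mid X_{ij}, Z_{ij}\right), \qquad
  g(\mu_{ij}) = X_{ij}^{\top}\beta + \sum_{k=1}^{d} \alpha_k(Z_{ijk}).
\]

% Polynomial spline approximation of each nonparametric component,
% with B_k a vector of spline basis functions and gamma_k its coefficients:
\[
  \alpha_k(z) \approx B_k(z)^{\top}\gamma_k, \qquad k = 1, \dots, d.
\]

% Generalized estimating equations for theta = (beta, gamma), where
% Y_i and mu_i stack the m_i observations of cluster i,
% D_i = \partial \mu_i / \partial \theta^{\top},
% A_i is the diagonal matrix of marginal variances, and
% R(tau) is a working correlation matrix:
\[
  \sum_{i=1}^{n} D_i^{\top} V_i^{-1}\left(Y_i - \mu_i(\theta)\right) = 0,
  \qquad V_i = A_i^{1/2} R(\tau) A_i^{1/2}.
\]
```

The variable selection procedure mentioned in the abstract adds a penalty term on the components of $\beta$ to the estimating equations; its exact form is not stated in the abstract and is therefore not reproduced here.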
Original language | English (US) |
---|---|
Pages (from-to) | 173-196 |
Number of pages | 24 |
Journal | Statistica Sinica |
Volume | 24 |
Issue number | 1 |
State | Published - Jan 2014 |
Keywords
- Clustered data
- GEE
- High dimension
- Injective function
- Marginal regression
- Polynomial splines