Abstract
We propose a procedure associated with the idea of the E-M algorithm for model selection in the presence of missing data. The idea extends the concept of parameters to include both the model and the parameters under the model, and thus allows the model to be part of the E-M iterations. We develop the procedure, known as the E-MS algorithm, under the assumption that the class of candidate models is finite. Some special cases of the procedure are considered, including E-MS with the generalized information criteria (GIC), and E-MS with the adaptive fence (AF; Jiang et al.). We prove numerical convergence of the E-MS algorithm as well as consistency in model selection of the limiting model of the E-MS convergence, for E-MS with GIC and E-MS with AF. We study the impact on model selection of different missing data mechanisms. Furthermore, we carry out extensive simulation studies on the finite-sample performance of the E-MS with comparisons to other procedures. The methodology is also illustrated on a real data analysis involving QTL mapping for an agricultural study on barley grains. Supplementary materials for this article are available online.
Original language | English (US) |
---|---|
Pages (from-to) | 1136-1147 |
Number of pages | 12 |
Journal | Journal of the American Statistical Association |
Volume | 110 |
Issue number | 511 |
State | Published - Jul 3 2015 |
Externally published | Yes |
Bibliographical note
Publisher Copyright: © 2015, American Statistical Association.
Keywords
- Backcross experiments
- Conditional sampling
- Consistency
- Convergence
- Missing data mechanism
- Model selection
- Regression