### Abstract

Model selection for marginal regression analysis of longitudinal data is challenging owing to the presence of correlation and the difficulty of specifying the full likelihood, particularly for correlated categorical data. The paper introduces a novel Bayesian information criterion type model selection procedure based on the quadratic inference function, which does not require the full likelihood or quasi-likelihood. With probability approaching 1, the criterion selects the most parsimonious correct model. Although a working correlation matrix is assumed, there is no need to estimate the nuisance parameters in the working correlation matrix; moreover, the model selection procedure is robust against the misspecification of the working correlation matrix. The criterion proposed can also be used to construct a data-driven Neyman smooth test for checking the goodness of fit of a postulated model. This test is especially useful and often yields much higher power in situations where the classical directional test behaves poorly. The finite sample performance of the model selection and model checking procedures is demonstrated through Monte Carlo studies and analysis of a clinical trial data set.
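
For orientation, the construction the abstract refers to can be sketched as follows. The notation is an assumption introduced here for illustration and is not quoted from the paper, whose exact definitions may differ. Suppose there are $N$ subjects with response vectors $y_i$, marginal means $\mu_i(\beta)$ with derivative matrix $\dot\mu_i$, and diagonal marginal variance matrices $A_i$. Suppose also that the inverse working correlation matrix is approximated by a linear combination of known basis matrices $M_1,\dots,M_m$. The quadratic inference function stacks one estimating function per basis matrix and combines them in a quadratic form:

$$
\bar g_N(\beta) = \frac{1}{N}\sum_{i=1}^{N} g_i(\beta),
\qquad
g_i(\beta) =
\begin{pmatrix}
\dot\mu_i^{\top} A_i^{-1/2} M_1 A_i^{-1/2}\,(y_i-\mu_i) \\
\vdots \\
\dot\mu_i^{\top} A_i^{-1/2} M_m A_i^{-1/2}\,(y_i-\mu_i)
\end{pmatrix},
$$

$$
Q_N(\beta) = N\,\bar g_N(\beta)^{\top} C_N(\beta)^{-1}\,\bar g_N(\beta),
\qquad
C_N(\beta) = \frac{1}{N}\sum_{i=1}^{N} g_i(\beta)\,g_i(\beta)^{\top}.
$$

Because the coefficients of the basis matrices never appear in $Q_N$, the nuisance correlation parameters do not need to be estimated. A Bayesian information criterion type rule then plausibly takes the form below, with $Q_N$ playing the role of minus twice the log-likelihood, $\hat\beta_{\mathcal M}$ the minimizer of $Q_N$ under candidate model $\mathcal M$, and $|\mathcal M|$ its number of regression parameters; the candidate with the smallest value is selected:

$$
\mathrm{BIQIF}(\mathcal M) = Q_N(\hat\beta_{\mathcal M}) + |\mathcal M|\,\log N .
$$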

Original language | English (US) |
---|---|
Pages (from-to) | 177-190 |
Number of pages | 14 |
Journal | Journal of the Royal Statistical Society. Series B: Statistical Methodology |
Volume | 71 |
Issue number | 1 |
DOIs | |
State | Published - Jan 1 2009 |

### Keywords

- Bayes information criterion
- Correlated data
- Generalized estimating equations
- Longitudinal data
- Marginal model
- Model checking
- Model selection
- Neyman smooth test
- Quadratic inference function