Abstract
Making informed decisions about model adequacy has been an outstanding issue for regression models with discrete outcomes. Standard assessment tools for such outcomes (e.g., deviance residuals) often show a large discrepancy from the hypothesized pattern even under the true model and are not informative, especially when data are highly discrete (e.g., binary). To fill this gap, we propose a quasi-empirical residual distribution function for general discrete (e.g., ordinal and count) outcomes that serves as an alternative to the empirical distribution function of Cox–Snell residuals. The assessment tool we propose is a principled approach and does not require injecting noise into the data. When at least one continuous covariate is available, we show asymptotically that the proposed function converges uniformly to the identity function under the correctly specified model, even with highly discrete outcomes. Through simulation studies, we demonstrate empirically that the proposed quasi-empirical residual distribution function outperforms commonly used residuals for various model assessment tasks, since it is close to the hypothesized pattern under the true model and significantly departs from this pattern under model misspecification, and is thus an effective assessment tool. Supplementary materials for this article are available online.
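The motivating problem can be seen in a small simulation. The sketch below is not the quasi-empirical residual distribution function proposed in the article, which the abstract does not define. It only illustrates why distribution-function-based residuals break down for discrete outcomes. It assumes the residual is the model's conditional distribution function evaluated at the observed outcome, F(Y | X), computed under the true model. The logistic and exponential data-generating models, the sample size, and the helper name `sup_distance_from_identity` are illustrative choices.

```python
import math
import random


def sup_distance_from_identity(values):
    """Sup distance between the ECDF of `values` and the identity on [0, 1]."""
    v = sorted(values)
    n = len(v)
    d = 0.0
    for i, u in enumerate(v):
        # The ECDF steps from i/n up to (i+1)/n at u; compare both sides with u.
        d = max(d, abs((i + 1) / n - u), abs(i / n - u))
    return d


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 5000
binary_res, continuous_res = [], []

for _ in range(n):
    x = rng.uniform(-1.0, 1.0)  # one continuous covariate

    # Binary outcome from a logistic model (hypothetical coefficients).
    p = 1.0 / (1.0 + math.exp(-(0.5 + x)))
    y = 1 if rng.random() < p else 0
    # F(y | x) under the true model: 1 - p when y = 0, and 1 when y = 1.
    binary_res.append(1.0 if y == 1 else 1.0 - p)

    # Continuous outcome from an exponential model with rate exp(0.5 + x).
    rate = math.exp(0.5 + x)
    t = rng.expovariate(rate)
    # F(t | x) = 1 - exp(-rate * t) is exactly Uniform(0, 1) under the true model.
    continuous_res.append(1.0 - math.exp(-rate * t))

d_binary = sup_distance_from_identity(binary_res)
d_continuous = sup_distance_from_identity(continuous_res)
print(f"binary outcome:     sup |ECDF - identity| = {d_binary:.3f}")
print(f"continuous outcome: sup |ECDF - identity| = {d_continuous:.3f}")
```

For the continuous outcome the empirical distribution of the residuals stays close to the identity, as expected. For the binary outcome it is far from the identity even though the model is correct, because a large share of the residuals sit at exactly 1. This is the kind of uninformative discrepancy the abstract describes.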
Original language | English (US) |
---|---|
Pages (from-to) | 1019-1035 |
Number of pages | 17 |
Journal | Journal of Computational and Graphical Statistics |
Volume | 30 |
Issue number | 4 |
Early online date | May 12 2021 |
State | Published - May 12 2021 |
Bibliographical note
Publisher Copyright: © 2021 The Author(s). Published with license by Taylor & Francis Group, LLC.
Keywords
- Cox–Snell residuals
- Generalized linear models
- Goodness of fit
- Insurance claim frequency
- m-asymptotics