Assessment of Regression Models With Discrete Outcomes Using Quasi-Empirical Residual Distribution Functions

Research output: Contribution to journal › Article › peer-review

Abstract

Making informed decisions about model adequacy has been an outstanding issue for regression models with discrete outcomes. Standard assessment tools for such outcomes (e.g., deviance residuals) often show a large discrepancy from the hypothesized pattern even under the true model and are not informative, especially when data are highly discrete (e.g., binary). To fill this gap, we propose a quasi-empirical residual distribution function for general discrete (e.g., ordinal and count) outcomes that serves as an alternative to the empirical distribution function of Cox–Snell residuals. The assessment tool we propose is a principled approach and does not require injecting noise into the data. When at least one continuous covariate is available, we show asymptotically that the proposed function converges uniformly to the identity function under the correctly specified model, even with highly discrete outcomes. Through simulation studies, we demonstrate empirically that the proposed quasi-empirical residual distribution function outperforms commonly used residuals for various model assessment tasks, since it is close to the hypothesized pattern under the true model and significantly departs from this pattern under model misspecification, and is thus an effective assessment tool. Supplementary materials for this article are available online.
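The motivation in the abstract can be illustrated with a small simulation. This is a minimal sketch, not the quasi-empirical residual distribution function proposed in the article. It assumes residuals of the probability-integral-transform form F(Y | X) evaluated under the true model, and compares their empirical distribution function with the identity function for a continuous and for a binary outcome. The models, sample size, and names such as `sup_distance_from_identity` are hypothetical choices made for illustration only.

```python
import numpy as np

def sup_distance_from_identity(u):
    """Largest gap between the empirical distribution function of u
    and the identity function on [0, 1]."""
    u = np.sort(np.asarray(u, dtype=float))
    n = u.size
    upper = np.arange(1, n + 1) / n - u   # EDF at each point minus the point
    lower = u - np.arange(0, n) / n       # point minus EDF just before it
    return float(max(upper.max(), lower.max()))

rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)  # one continuous covariate

# Continuous outcome: exponential with rate exp(0.5 * x) (illustrative model).
rate = np.exp(0.5 * x)
y_cont = rng.exponential(scale=1.0 / rate)
u_cont = 1.0 - np.exp(-rate * y_cont)  # F(y | x) under the true model

# Binary outcome: logistic model with P(Y = 1 | x) = 1 / (1 + exp(-x)).
p = 1.0 / (1.0 + np.exp(-x))
y_bin = rng.binomial(1, p)
u_bin = np.where(y_bin == 1, 1.0, 1.0 - p)  # F(y | x) under the true model

d_continuous = sup_distance_from_identity(u_cont)
d_binary = sup_distance_from_identity(u_bin)
print(d_continuous, d_binary)
```

In this sketch the continuous-outcome residuals stay close to the identity line, while the binary-outcome residuals pile up at 1 and sit far from it even though the model is correct, which is the kind of discrepancy the abstract describes for highly discrete data.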

Original language: English (US)
Journal: Journal of Computational and Graphical Statistics
Early online date: May 12 2021
State: Published - May 12 2021

Bibliographical note

Publisher Copyright:
© 2021 The Author(s). Published with license by Taylor & Francis Group, LLC.

Keywords

  • Cox–Snell residuals
  • Generalized linear models
  • Goodness of fit
  • Insurance claim frequency
  • m-asymptotics
