Adaptive model selection and assessment for exponential family distributions

Xiaotong Shen, Hsin Cheng Huang, Jimmy Ye

Research output: Contribution to journal › Article

26 Scopus citations

Abstract

In many scientific and engineering problems, selecting the optimal model from a large pool of candidate models is important, particularly in data mining. In the literature, model assessment for non-normal distributions has received little attention. Indeed, many existing model selection criteria, such as the Bayes information criterion and C_p, may not be suitable when the conditional mean and variance of the response are dependent, as in generalized linear model regression. In this article we propose a new adaptive model selection criterion and construct an approximately unbiased Kullback-Leibler loss estimator for model assessment in the context of exponential family distributions. This permits comparison of arbitrarily complex modeling procedures. Our proposal uses generalized degrees of freedom, extending the concept originally proposed for the normal distribution. The proposed procedure is implemented for the binomial and Poisson distributions, and its small-sample operating characteristics are examined via simulations. The usefulness of the method is demonstrated by an application to a study of the effect of air pollution on certain respiratory diseases. Numerical analyses support the utility of the methodology.

Original language: English (US)
Pages (from-to): 306-317
Number of pages: 12
Journal: Technometrics
Volume: 46
Issue number: 3
State: Published - Aug 2004

Keywords

  • Adaptive penalty
  • Cross-validation
  • Loss estimation
  • Parametric and nonparametric regression
  • Trees
  • Variable selection
