Optimal model assessment, selection, and combination

Xiaotong T Shen, Hsin Cheng Huang

Research output: Contribution to journal › Article › peer-review

49 Scopus citations


Central to statistical theory and application is statistical modeling, which typically involves choosing a single model or combining a number of models of different sizes and from different sources. Whereas model selection seeks a single best modeling procedure, model combination combines the strength of different modeling procedures. In this article we look at several key issues and argue that model assessment is the key to model selection and combination. Most important, we introduce a general technique of optimal model assessment based on data perturbation, thus yielding optimal selection, in particular model selection and combination. From a frequentist perspective, we advocate model combination over a selected subset of modeling procedures, because it controls bias while reducing variability, hence yielding better performance in terms of the accuracy of estimation and prediction. To realize the potential of model combination, we develop methodologies for determining the optimal tuning parameter, such as weights and subsets for combining via optimal model assessment. We present simulated and real data examples to illustrate main aspects.
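As a rough illustration of the data-perturbation idea named in the abstract, the sketch below estimates the effective degrees of freedom of a modeling procedure by repeatedly adding small Gaussian noise to the response, refitting, and measuring how sensitive each fitted value is to its own perturbation. The abstract does not spell out the authors' procedure, so this is a generic recipe, not their method. The names `perturbation_df` and `make_ols_fit`, the perturbation size `tau`, and the replicate count `n_perturb` are all assumptions made for illustration.

```python
import numpy as np


def perturbation_df(fit, y, tau=0.5, n_perturb=2000, seed=0):
    """Monte Carlo estimate of the effective degrees of freedom of `fit` at `y`.

    `fit` maps a response vector to a vector of fitted values of the same
    length. `tau` (perturbation size) and `n_perturb` are illustrative
    defaults, not values taken from the article.
    """
    rng = np.random.default_rng(seed)
    n = y.shape[0]
    # One row of independent N(0, tau^2) perturbations per replicate.
    deltas = rng.normal(0.0, tau, size=(n_perturb, n))
    # Refit the procedure on each perturbed response.
    fits = np.stack([fit(y + d) for d in deltas])
    # For each coordinate i, the least-squares slope of the i-th fitted value
    # on the i-th perturbation estimates the sensitivity d(fit_i)/d(y_i).
    dc = deltas - deltas.mean(axis=0)
    fc = fits - fits.mean(axis=0)
    slopes = (dc * fc).sum(axis=0) / (dc ** 2).sum(axis=0)
    # Summing the sensitivities gives the degrees-of-freedom estimate.
    return float(slopes.sum())


def make_ols_fit(X):
    """Hypothetical helper: ordinary least squares as a modeling procedure."""
    def fit(y):
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        return X @ beta
    return fit


# Simulated data (made up for this sketch): intercept plus two predictors.
rng = np.random.default_rng(1)
n, p = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

df_hat = perturbation_df(make_ols_fit(X), y)
print(round(df_hat, 2))
```

For ordinary least squares the fitted values are linear in the response, so the estimate should land close to the number of regression coefficients, which makes this a convenient sanity check. The same function accepts any procedure that returns fitted values, including ones that select or combine models internally.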

Original language: English (US)
Pages (from-to): 554-568
Number of pages: 15
Journal: Journal of the American Statistical Association
Issue number: 474
State: Published - Jun 2006

Bibliographical note

Funding Information:
Xiaotong Shen is Professor, School of Statistics, University of Minnesota, Minneapolis, MN 55455 (E-mail: xshen@stat.umn.edu). Hsin-Cheng Huang is Associate Research Fellow, Institute of Statistical Science, Academia Sinica, Taipei 115, Taiwan (E-mail: hchuang@stat.sinica.edu.tw). Shen’s research was supported in part by National Science Foundation, grant IIS-0328802. The authors thank the editor, the associate editor, two anonymous referees, Yuhong Yang, and David Anderson for helpful comments and suggestions.


Keywords

  • Data perturbation
  • Degrees of freedom
  • Dependent
  • Modeling uncertainty
  • Non/semiparametric
  • Parametric
  • Prediction

