Estimation of generalization error: Random and fixed inputs

Junhui Wang, Xiaotong T Shen

Research output: Contribution to journal › Article › peer-review

12 Scopus citations


In multicategory classification, an estimated generalization error is often used to quantify a classifier's generalization ability. As a result, the quality of estimation of the generalization error becomes crucial in tuning and combining classifiers. This article proposes an estimation methodology for the generalization error that permits a treatment of both fixed and random inputs, in contrast to the conditional classification error commonly used in the statistics literature. In particular, we derive a novel data perturbation technique that jointly perturbs both inputs and outputs to estimate the generalization error. We show that the proposed technique yields optimal tuning and combination, as measured by generalization. We also demonstrate via simulation that it outperforms cross-validation for both fixed and random designs in the context of margin classification. The results support the utility of the proposed methodology.
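The abstract describes estimating generalization error by jointly perturbing both inputs and outputs, rather than resampling as in cross-validation. The paper's exact estimator is not reproduced on this page; the sketch below is only a generic, hypothetical illustration of the idea, assuming Gaussian input perturbations and random label flips on a toy nearest-centroid classifier (all names and noise choices here are illustrative assumptions, not the authors' method).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-class data: random inputs X, labels y from a noisy linear rule.
n, d = 200, 2
X = rng.normal(size=(n, d))
y = (X[:, 0] + 0.5 * rng.normal(size=n) > 0).astype(int)

def fit_centroids(X, y):
    # Nearest-centroid classifier: one mean vector per class.
    return np.stack([X[y == k].mean(axis=0) for k in (0, 1)])

def predict(centroids, X):
    # Assign each point to the class with the closest centroid.
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def perturbed_error(X, y, sigma=0.1, reps=50):
    # Hypothetical estimator (illustration only): average the error of the
    # fitted classifier over replicated copies of the data in which BOTH
    # inputs (Gaussian noise) and outputs (label flips) are jointly
    # perturbed, mimicking fresh draws from the population.
    centroids = fit_centroids(X, y)
    errs = []
    for _ in range(reps):
        Xp = X + sigma * rng.normal(size=X.shape)
        flip = rng.random(len(y)) < sigma
        yp = np.where(flip, 1 - y, y)
        errs.append((predict(centroids, Xp) != yp).mean())
    return float(np.mean(errs))

err_hat = perturbed_error(X, y)
```

Unlike K-fold cross-validation, no data are held out: every perturbed replicate reuses the full sample, which is what makes a joint treatment of fixed and random inputs possible in this style of estimator.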

Original language: English (US)
Pages (from-to): 569-588
Number of pages: 20
Journal: Statistica Sinica
Issue number: 2
State: Published - Apr 1 2006


Keywords

  • Averaging
  • Logistic
  • Margins
  • Penalization
  • Support vector


