Comparing learning methods for classification

Research output: Contribution to journal › Article › peer-review

22 Scopus citations

Abstract

We address the consistency property of cross validation (CV) for classification. Sufficient conditions are obtained on the data splitting ratio to ensure that the better classifier between two candidates will be favored by CV with probability approaching 1. Interestingly, it turns out that for comparing two general learning methods, the ratio of the training sample size to the evaluation sample size does not have to approach 0 for consistency in selection, as is required for comparing parametric regression models (Shao, 1993). In fact, the ratio may be allowed to converge to infinity or any positive constant, depending on the situation. We also discuss confidence intervals and sequential instability in selection for comparing classifiers.
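As an illustration of the setting the abstract describes, the following is a minimal sketch of comparing two learning methods by repeated data splitting, where `n_train` controls the ratio of training size to evaluation size. It is not the paper's procedure or its conditions: the function `holdout_compare`, the two toy learners, and the simulated one-dimensional data are all hypothetical names and assumptions introduced here for illustration only.

```python
import random


def make_data(n, seed=0):
    """Simulated two-class data (an assumption): x ~ Normal(2 * y, 1)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        data.append((rng.gauss(2.0 * y, 1.0), y))
    return data


def fit_majority(train):
    """Toy learner A: always predict the majority label of the training set."""
    ones = sum(y for _, y in train)
    label = 1 if 2 * ones >= len(train) else 0
    return lambda x: label


def fit_threshold(train):
    """Toy learner B: cut at the midpoint between the two class means."""
    xs0 = [x for x, y in train if y == 0]
    xs1 = [x for x, y in train if y == 1]
    if not xs0 or not xs1:
        # Only one class observed: fall back to the majority rule.
        return fit_majority(train)
    mean0, mean1 = sum(xs0) / len(xs0), sum(xs1) / len(xs1)
    cut = (mean0 + mean1) / 2.0
    upper = 1 if mean1 >= mean0 else 0  # label assigned above the cut
    return lambda x: upper if x > cut else 1 - upper


def holdout_compare(data, fit_a, fit_b, n_train, n_splits=20, seed=0):
    """Average evaluation error of two learners over random splits.

    Each split trains both learners on n_train points and evaluates them
    on the remaining len(data) - n_train points.
    """
    n_eval = len(data) - n_train
    if n_train <= 0 or n_eval <= 0:
        raise ValueError("need 0 < n_train < len(data)")
    rng = random.Random(seed)
    err_a = err_b = 0
    for _ in range(n_splits):
        shuffled = data[:]
        rng.shuffle(shuffled)
        train, evaluation = shuffled[:n_train], shuffled[n_train:]
        pred_a, pred_b = fit_a(train), fit_b(train)
        err_a += sum(pred_a(x) != y for x, y in evaluation)
        err_b += sum(pred_b(x) != y for x, y in evaluation)
    total = n_splits * n_eval
    rate_a, rate_b = err_a / total, err_b / total
    winner = "a" if rate_a <= rate_b else "b"
    return winner, rate_a, rate_b


data = make_data(200)
# n_train = 100 gives a training-to-evaluation ratio of 1 (a positive constant).
winner, rate_a, rate_b = holdout_compare(data, fit_majority, fit_threshold, n_train=100)
print(winner, round(rate_a, 3), round(rate_b, 3))
```

Changing `n_train` while holding `len(data)` fixed varies the splitting ratio, which is the quantity the abstract's sufficient conditions concern; the sketch itself makes no claim about which ratios yield consistent selection.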

Original language: English (US)
Pages (from-to): 635-657
Number of pages: 23
Journal: Statistica Sinica
Volume: 16
Issue number: 2
State: Published - Apr 2006

Keywords

  • Classification
  • Comparing learning methods
  • Consistency in selection
  • Cross validation paradox
  • Sequential instability
