TY - JOUR
T1 - Minimax nonparametric classification - Part II
T2 - Model selection for adaptation
AU - Yang, Yuhong
PY - 1999
Y1 - 1999
N2 - We study nonparametric estimation of a conditional probability for classification based on a collection of finite-dimensional models. For the sake of flexibility, different types of models, linear or nonlinear, are allowed as long as each satisfies a dimensionality assumption. We show that with a suitable model selection criterion, the penalized maximum-likelihood estimator has risk bounded by an index of resolvability expressing a good tradeoff among approximation error, estimation error, and model complexity. The bound does not require any assumption on the target conditional probability and can be used to demonstrate the adaptivity of estimators based on model selection. Examples are given with both splines and neural nets, and problems of high-dimensional estimation are considered. The resulting adaptive estimator is shown to behave optimally or near optimally over Sobolev classes (with unknown orders of interaction and smoothness) and classes of functions with integrable Fourier transform of the gradient. In terms of rates of convergence, the performance is the same as if one knew in advance which of these classes contains the true conditional probability. The corresponding classifier also converges optimally or nearly optimally simultaneously over these classes.
AB - We study nonparametric estimation of a conditional probability for classification based on a collection of finite-dimensional models. For the sake of flexibility, different types of models, linear or nonlinear, are allowed as long as each satisfies a dimensionality assumption. We show that with a suitable model selection criterion, the penalized maximum-likelihood estimator has risk bounded by an index of resolvability expressing a good tradeoff among approximation error, estimation error, and model complexity. The bound does not require any assumption on the target conditional probability and can be used to demonstrate the adaptivity of estimators based on model selection. Examples are given with both splines and neural nets, and problems of high-dimensional estimation are considered. The resulting adaptive estimator is shown to behave optimally or near optimally over Sobolev classes (with unknown orders of interaction and smoothness) and classes of functions with integrable Fourier transform of the gradient. In terms of rates of convergence, the performance is the same as if one knew in advance which of these classes contains the true conditional probability. The corresponding classifier also converges optimally or nearly optimally simultaneously over these classes.
UR - http://www.scopus.com/inward/record.url?scp=0033356514&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0033356514&partnerID=8YFLogxK
U2 - 10.1109/18.796369
DO - 10.1109/18.796369
M3 - Article
AN - SCOPUS:0033356514
SN - 0018-9448
VL - 45
SP - 2285
EP - 2292
JO - IEEE Transactions on Information Theory
JF - IEEE Transactions on Information Theory
IS - 7
ER -