Information criteria are widely used in model selection and have been shown to possess desirable theoretical properties. For classification, Claeskens et al. (2008) proposed the support vector machine information criterion for feature selection and provided encouraging numerical evidence, but gave no theoretical justification. This work aims to fill that gap by providing theoretical justification for the support vector machine information criterion in both fixed and diverging model spaces. We first derive a uniform convergence rate for the support vector machine solution and then show that a modification of the support vector machine information criterion achieves model selection consistency even when the number of features diverges at an exponential rate of the sample size. This consistency result can be further applied to selecting the optimal tuning parameter for various penalized support vector machine methods. Finite-sample performance of the proposed information criterion is investigated using Monte Carlo studies and one real-world gene selection problem.
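As a rough illustration of how an information criterion of this kind can rank candidate feature sets, the sketch below scores a linear classifier by its total hinge loss plus a BIC-type penalty on the number of selected features. The abstract does not state the criterion's formula, so the form used here (hinge loss plus `penalty_scale * model_size * log(n)`) is an assumption, and the function names, the `penalty_scale` parameter, the toy data, and the hand-set coefficients are all hypothetical.

```python
import math
from typing import Sequence


def hinge_loss_sum(X: Sequence[Sequence[float]], y: Sequence[int],
                   weights: Sequence[float], intercept: float) -> float:
    """Total hinge loss of a linear classifier; labels must be +1 or -1."""
    total = 0.0
    for row, label in zip(X, y):
        score = intercept + sum(w * x for w, x in zip(weights, row))
        total += max(0.0, 1.0 - label * score)
    return total


def svm_information_criterion(X: Sequence[Sequence[float]], y: Sequence[int],
                              weights: Sequence[float], intercept: float,
                              penalty_scale: float = 1.0) -> float:
    """Assumed BIC-type criterion: hinge loss + penalty_scale * |S| * log(n).

    |S| is the number of nonzero weights. penalty_scale is a hypothetical
    knob; a value growing with n would penalize large models more heavily.
    """
    n = len(y)
    model_size = sum(1 for w in weights if w != 0.0)
    return hinge_loss_sum(X, y, weights, intercept) + penalty_scale * model_size * math.log(n)


# Toy data (hypothetical): four points, two features, labels in {+1, -1}.
X = [[2.0, 0.0], [1.0, 1.0], [-1.0, 0.0], [-2.0, -1.0]]
y = [1, 1, -1, -1]

# Two hand-set candidate models; a real workflow would fit an SVM per feature subset.
ic_small = svm_information_criterion(X, y, weights=[1.0, 0.0], intercept=0.0)
ic_large = svm_information_criterion(X, y, weights=[1.0, 0.5], intercept=0.0)

# Both candidates separate the data with zero hinge loss, so the penalty decides.
best = "small" if ic_small < ic_large else "large"
print(best, ic_small, ic_large)
```

Under these assumptions the one-feature model is preferred, because both candidates fit the toy data equally well and the penalty favors the smaller feature set.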
|Original language|English (US)|
|Journal|Journal of Machine Learning Research|
|State|Published - Apr 1 2016|
Bibliographical note
Funding Information:
We thank the Action Editor Professor Jie Peng and two referees for very constructive comments and suggestions which have improved the presentation of the paper. Wu's research is partially supported by NSF grant DMS-1055210, NIH/NCI grants P01-CA142538 and R01-CA149569. Wang's research is supported by NSF grant DMS-1308960. Li's research was partially supported by NIH/NIDA grants P50-DA10075 and P50-DA036107. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NSF, NIH, NCI, or NIDA.
©2016 Xiang Zhang, Yichao Wu, Lan Wang and Runze Li.
Keywords
- Bayesian information criterion
- Diverging model spaces
- Feature selection
- Support vector machines