Training support vector machine using adaptive clustering

Daniel Boley, Dongwei Cao

Research output: Contribution to conference › Paper › peer-review

50 Scopus citations

Abstract

Training support vector machines involves a huge optimization problem, and many specially designed algorithms have been proposed. In this paper, we propose an algorithm called ClusterSVM that accelerates the training process by exploiting the distributional properties of the training data, that is, the natural clustering of the training data and the overall layout of these clusters relative to the decision boundary of support vector machines. The proposed algorithm first partitions the training data into several pairwise disjoint clusters. Then, the representatives of these clusters are used to train an initial support vector machine, based on which we can approximately identify the support vectors and non-support vectors. After replacing the cluster containing only non-support vectors with its representative, the number of training data can be significantly reduced, thereby speeding up the training process. The proposed ClusterSVM has been tested against the popular training algorithm SMO on both artificial and real data, and a significant speedup was observed. The complexity of ClusterSVM scales with the square of the number of support vectors and, after a further improvement, it is expected to scale with the square of the number of non-boundary support vectors.
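The cluster-then-reduce idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes plain k-means in place of PDDP, a batch subgradient solver for a linear SVM in place of SMO, cluster centroids as the representatives, and a single reduction pass with a hypothetical margin tolerance `tol`. All function names and the toy data are made up for illustration.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sqdist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def centroid(cluster):
    return tuple(sum(col) / len(cluster) for col in zip(*cluster))

def kmeans(points, k, iters=10):
    # Assumption: k-means with farthest-point initialisation stands in for PDDP.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sqdist(p, c) for c in centers)))
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda j: sqdist(p, centers[j]))
            clusters[nearest].append(p)
        clusters = [c for c in clusters if c]      # drop empty clusters
        centers = [centroid(c) for c in clusters]
    return clusters

def train_linear_svm(X, y, weights=None, lam=0.01, epochs=500, lr=0.05):
    # Assumption: batch subgradient descent on the weighted hinge loss, not SMO.
    d = len(X[0])
    w, b = [0.0] * d, 0.0
    weights = weights or [1.0] * len(X)
    total = sum(weights)
    for _ in range(epochs):
        gw, gb = [lam * wi for wi in w], 0.0
        for xi, yi, ci in zip(X, y, weights):
            if yi * (dot(w, xi) + b) < 1:          # margin violated
                for j in range(d):
                    gw[j] -= ci * yi * xi[j] / total
                gb -= ci * yi / total
        w = [wi - lr * g for wi, g in zip(w, gw)]
        b -= lr * gb
    return w, b

def cluster_svm(X, y, k=3, tol=0.2):
    # Step 1: partition each class into disjoint clusters and take centroids.
    reps, rep_y, rep_size, groups = [], [], [], []
    for label in (-1, 1):
        pts = [x for x, yi in zip(X, y) if yi == label]
        for members in kmeans(pts, k):
            reps.append(centroid(members))
            rep_y.append(label)
            rep_size.append(float(len(members)))
            groups.append(members)
    # Step 2: initial SVM trained on the representatives only.
    w, b = train_linear_svm(reps, rep_y, rep_size)
    # Step 3: keep clusters near the margin, collapse the rest to one weighted point.
    Xr, yr, cr = [], [], []
    for rep, label, size, members in zip(reps, rep_y, rep_size, groups):
        if label * (dot(w, rep) + b) <= 1 + tol:
            Xr += members
            yr += [label] * len(members)
            cr += [1.0] * len(members)
        else:
            Xr.append(rep)
            yr.append(label)
            cr.append(size)
    # Step 4: final SVM on the reduced training set.
    return train_linear_svm(Xr, yr, cr), len(Xr)

# Toy data (hypothetical): three well-separated blobs per class in 2D.
rng = random.Random(0)
X, y = [], []
for label in (-1, 1):
    for cx, cy in ((2, 0), (5, 4), (5, -4)):
        for _ in range(15):
            X.append((label * cx + rng.uniform(-0.5, 0.5), cy + rng.uniform(-0.5, 0.5)))
            y.append(label)

(w, b), n_reduced = cluster_svm(X, y)
accuracy = sum(1 for xi, yi in zip(X, y) if yi * (dot(w, xi) + b) > 0) / len(X)
print(n_reduced, "of", len(X), "points used in the final training; accuracy", accuracy)
```

In this sketch the clusters far from the decision boundary collapse to single weighted points, so the final training run sees fewer examples than the full set. The adaptive refinement of clusters that the title refers to is not reproduced here.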

Original language: English (US)
Pages: 126-137
Number of pages: 12
State: Published - 2004
Event: Proceedings of the Fourth SIAM International Conference on Data Mining - Lake Buena Vista, FL, United States
Duration: Apr 22 2004 → Apr 24 2004

Other

Other: Proceedings of the Fourth SIAM International Conference on Data Mining
Country/Territory: United States
City: Lake Buena Vista, FL
Period: 4/22/04 → 4/24/04

Keywords

  • Clustering
  • Optimization
  • PDDP
  • Support vector machine
