We describe a new methodology for supervised learning with sparse data, i.e., when the number of input features (d) is much larger than the number of training samples (n). Under the proposed approach, the d available input features are split into t subsets, effectively yielding a larger number (t*n) of labeled training samples in a lower-dimensional input space (of dimensionality d/t). This modified training data is then used to estimate a classifier that makes predictions in the lower-dimensional space. In this paper, a standard SVM is used to train the classifier. During testing (prediction), the group of t predictions made by the SVM classifier must be combined via intelligent post-processing rules in order to make a prediction for a test input in the original d-dimensional space. The novelty of our approach lies in the design and empirical validation of these post-processing rules under the Group Learning setting. We demonstrate that such post-processing rules effectively reflect general (common-sense) a priori knowledge about the application data. Specifically, we propose two different post-processing schemes and demonstrate their effectiveness for two real-life application domains: handwritten digit recognition and seizure prediction from iEEG signals. These empirical results show superior performance of the Group Learning approach for sparse data under both balanced and unbalanced classification settings.
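The data transformation described in the abstract can be illustrated with a minimal sketch. Several details here are assumptions, not taken from the paper: the features are split into contiguous, equal-sized chunks; each chunk inherits the label of its parent sample; and a plain majority vote stands in for the paper's post-processing rules, which the abstract does not specify. The function names (`split_features`, `expand_training_set`, `majority_vote`, `predict_group`) are hypothetical, and `classify` is a placeholder for any trained classifier operating in the d/t-dimensional space.

```python
from collections import Counter
from typing import Callable, Hashable, List, Sequence, Tuple


def split_features(x: Sequence[float], t: int) -> List[List[float]]:
    """Split one d-dimensional input into t contiguous chunks of size d/t.

    Contiguous, equal-sized chunks are an assumption of this sketch.
    """
    d = len(x)
    if t <= 0 or d % t != 0:
        raise ValueError("this sketch requires t > 0 and d divisible by t")
    size = d // t
    return [list(x[i * size:(i + 1) * size]) for i in range(t)]


def expand_training_set(
    X: Sequence[Sequence[float]], y: Sequence[Hashable], t: int
) -> Tuple[List[List[float]], List[Hashable]]:
    """Turn n samples in d dimensions into t*n samples in d/t dimensions.

    Each chunk inherits the label of the sample it came from (assumed).
    """
    X_group: List[List[float]] = []
    y_group: List[Hashable] = []
    for x, label in zip(X, y):
        for chunk in split_features(x, t):
            X_group.append(chunk)
            y_group.append(label)
    return X_group, y_group


def majority_vote(chunk_predictions: Sequence[Hashable]) -> Hashable:
    """Stand-in post-processing rule: return the most frequent prediction.

    This is NOT one of the paper's two schemes; it is only a placeholder.
    """
    return Counter(chunk_predictions).most_common(1)[0][0]


def predict_group(
    classify: Callable[[List[float]], Hashable], x: Sequence[float], t: int
) -> Hashable:
    """Predict a label for a d-dimensional test input from its t chunk predictions."""
    return majority_vote([classify(chunk) for chunk in split_features(x, t)])
```

In practice, `classify` would wrap a classifier trained on the output of `expand_training_set`, for example an SVM as in the paper. For binary labels, an odd value of t avoids ties in the majority vote.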
|Original language||English (US)|
|Title of host publication||2019 International Joint Conference on Neural Networks, IJCNN 2019|
|Publisher||Institute of Electrical and Electronics Engineers Inc.|
|State||Published - Jul 2019|
|Event||2019 International Joint Conference on Neural Networks, IJCNN 2019 - Budapest, Hungary|
Duration: Jul 14 2019 → Jul 19 2019
|Name||Proceedings of the International Joint Conference on Neural Networks|
|Conference||2019 International Joint Conference on Neural Networks, IJCNN 2019|
|Period||7/14/19 → 7/19/19|
Bibliographical note
Funding Information:
ACKNOWLEDGMENT: This work was supported in part by NIH grants UH2NS095495 and R01NS092882.
© 2019 IEEE.
- Group Learning
- binary classification
- digit recognition
- feature selection
- histogram of projections
- seizure prediction
- unbalanced data