Learning an appropriate metric is critical for effectively capturing complex data characteristics. Metric learning for categorical data with hierarchical coupling relationships and local heterogeneous distributions is very challenging yet rarely explored. This paper proposes Heterogeneous mEtric Learning with hIerarchical Couplings (HELIC for short) for this type of categorical data. HELIC captures both low-level value-to-attribute and high-level attribute-to-class hierarchical couplings, and reveals the intrinsic heterogeneities embedded in each level of couplings. Theoretical analyses of its effectiveness and generalization error bound verify that HELIC effectively represents these complexities. Extensive experiments on 30 data sets with diverse characteristics demonstrate that HELIC-enabled classification significantly improves accuracy (by up to 40.93 percent) compared with five state-of-the-art baselines.
|Original language||English (US)|
|Number of pages||14|
|Journal||IEEE Transactions on Knowledge and Data Engineering|
|State||Published - Jul 1 2018|
Bibliographical note
Funding Information:
This work was supported in part by the National Natural Science Foundation of China under Grant 61672528.
Keywords
- Metric learning
- categorical data
- coupling learning
- distance metric
- heterogeneity learning
- non-IID learning
- similarity measure