Principal component analysis (PCA) is widely used for feature extraction and dimensionality reduction, with documented merits in diverse tasks involving high-dimensional data. PCA handles one dataset at a time, however, and is ill-suited to analyzing multiple datasets jointly. In certain data science settings, one is interested in extracting the most discriminative information from one dataset of particular interest (the target data) relative to the other(s) (the background data). To this end, this paper puts forth a novel approach, termed discriminative PCA (dPCA), for such discriminative analytics of multiple datasets. Under certain conditions, dPCA is proved to be least-squares optimal in recovering the latent subspace vector unique to the target data relative to background data. To account for nonlinear data correlations, the (linear) dPCA models for one or multiple background datasets are generalized through kernel-based learning. Interestingly, all dPCA variants admit an analytical solution obtainable with a single (generalized) eigenvalue decomposition. Finally, substantial dimensionality reduction tests using synthetic and real datasets are provided to corroborate the merits of the proposed methods.
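As a rough illustration of the single generalized eigendecomposition mentioned above, the following Python sketch assumes (the abstract does not state this) that the sought directions maximize the ratio of target variance to background variance, so that they solve C_xx u = λ C_yy u for the sample covariances C_xx (target) and C_yy (background). The function name `dpca_sketch`, the ridge term `eps`, and the synthetic data are hypothetical choices made only for illustration; this is not the authors' implementation.

```python
import numpy as np


def dpca_sketch(target, background, k=1, eps=1e-6):
    """Sketch: top-k directions of high target variance relative to background.

    target, background: arrays of shape (n_features, n_samples).
    Returns (directions, ratios); directions has unit-norm columns.
    """
    # Center each dataset with its own sample mean.
    x = target - target.mean(axis=1, keepdims=True)
    y = background - background.mean(axis=1, keepdims=True)
    c_xx = x @ x.T / x.shape[1]
    c_yy = y @ y.T / y.shape[1]
    # Assumption: a small ridge keeps the background covariance invertible.
    c_yy = c_yy + eps * np.eye(c_yy.shape[0])

    # Reduce C_xx u = lambda C_yy u to a standard symmetric eigenproblem
    # using the Cholesky factor C_yy = L L^T: M = L^{-1} C_xx L^{-T}.
    chol = np.linalg.cholesky(c_yy)
    half = np.linalg.solve(chol, c_xx)   # L^{-1} C_xx
    m = np.linalg.solve(chol, half.T)    # L^{-1} C_xx L^{-T}
    m = (m + m.T) / 2                    # remove round-off asymmetry

    eigvals, eigvecs = np.linalg.eigh(m)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]
    # Map back to the original coordinates: u = L^{-T} v.
    directions = np.linalg.solve(chol.T, eigvecs[:, order])
    directions /= np.linalg.norm(directions, axis=0, keepdims=True)
    return directions, eigvals[order]


# Synthetic demo (hypothetical data): both datasets share a strong component
# along feature 0; only the target has extra variance along feature 2.
rng = np.random.default_rng(0)
background = rng.standard_normal((5, 2000))
background[0] *= 10.0
target = rng.standard_normal((5, 2000))
target[0] *= 10.0
target[2] += 3.0 * rng.standard_normal(2000)

directions, ratios = dpca_sketch(target, background, k=1)

# For contrast: plain PCA on the target alone follows the shared component.
pca_direction = np.linalg.eigh(np.cov(target))[1][:, -1]
```

In this toy setting, plain PCA on the target aligns with the shared high-variance feature, whereas the generalized eigenvector concentrates on the feature whose variance is specific to the target.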
Bibliographical note
Funding Information:
Manuscript received May 13, 2018; revised August 24, 2018 and October 17, 2018; accepted November 22, 2018. Date of publication December 6, 2018; date of current version December 24, 2018. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Byonghyo Shim. This work was supported in part by NSF grants 1711471, 1500713, 1514056, 1505970, and the NIH grant 1R01GM104975-01. This paper was presented in part at the 43rd IEEE International Conference on Acoustics, Speech, and Signal Processing, Calgary, AB, Canada, April 15–20, 2018, and in part at the 52nd Asilomar Conference on Signals, Systems and Computers, Pacific Grove, CA, USA, October 28–31, 2018. (Corresponding author: Gang Wang.) The authors are with the Digital Technology Center and the Department of Electrical and Computer Engineering, University of Minnesota, Minneapolis, MN 55455 USA (e-mail: firstname.lastname@example.org; email@example.com; firstname.lastname@example.org). Digital Object Identifier 10.1109/TSP.2018.2885478
- Principal component analysis
- discriminative analytics
- kernel learning
- multiple background datasets