Canonical correlation analysis of datasets with a common source graph

Jia Chen, Gang Wang, Yanning Shen, Georgios B Giannakis

Research output: Contribution to journal › Article

5 Citations (Scopus)

Abstract

Canonical correlation analysis (CCA) is a powerful technique for discovering whether or not hidden sources are commonly present in two (or more) datasets. Its well-appreciated merits include dimensionality reduction, clustering, classification, feature selection, and data fusion. The standard CCA, however, does not exploit the geometry of the common sources, which may be available from the given data or can be deduced from (cross-) correlations. In this paper, this extra information provided by the common sources generating the data is encoded in a graph, and is invoked as a graph regularizer. This leads to a novel graph-regularized CCA approach, that is termed graph (g) CCA. The novel gCCA accounts for the graph-induced knowledge of common sources, while minimizing the distance between the wanted canonical variables. Tailored for diverse practical settings where the number of data is smaller than the data vector dimensions, the dual formulation of gCCA is developed too. One such setting includes kernels that are incorporated to account for nonlinear data dependencies. The resultant graph-kernel CCA is also obtained in closed form. Finally, corroborating image classification tests over several real datasets are presented to showcase the merits of the novel linear, dual, and kernel approaches relative to competing alternatives.
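To make the idea of a graph regularizer concrete, the following is a minimal NumPy sketch of one way a Laplacian penalty can enter a two-view CCA. It is an illustration under stated assumptions, not the formulation or algorithm of the paper: the function names, the parameters `gamma` and `eps`, the choice to add the penalty to both views, and the path graph in the usage lines are all hypothetical. The sketch whitens each view with its penalized covariance and takes a singular value decomposition, which is equivalent to solving the corresponding generalized eigenproblem.

```python
import numpy as np


def inv_sqrt(mat):
    """Inverse square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(mat)
    return (vecs / np.sqrt(vals)) @ vecs.T


def graph_regularized_cca(X, Y, L, gamma=0.1, eps=1e-6, k=1):
    """Hypothetical sketch: CCA with a graph-Laplacian penalty on both views.

    X: (Dx, N) and Y: (Dy, N) hold N paired samples as columns.
    L: (N, N) Laplacian of an assumed graph over the N samples.
    Returns projections U (Dx, k), V (Dy, k) and the top-k singular values.
    """
    Xc = X - X.mean(axis=1, keepdims=True)
    Yc = Y - Y.mean(axis=1, keepdims=True)
    # Covariance plus Laplacian penalty; eps * I keeps the matrices invertible.
    Axx = Xc @ Xc.T + gamma * Xc @ L @ Xc.T + eps * np.eye(X.shape[0])
    Ayy = Yc @ Yc.T + gamma * Yc @ L @ Yc.T + eps * np.eye(Y.shape[0])
    Cxy = Xc @ Yc.T
    Wx, Wy = inv_sqrt(Axx), inv_sqrt(Ayy)
    # Whitening turns the generalized eigenproblem into an ordinary SVD.
    P, s, Qt = np.linalg.svd(Wx @ Cxy @ Wy)
    return Wx @ P[:, :k], Wy @ Qt.T[:, :k], s[:k]


# Usage on synthetic data: two views driven by one smooth shared source.
rng = np.random.default_rng(0)
N = 200
source = np.sin(np.linspace(0, 4 * np.pi, N))
X = np.outer(rng.normal(size=5), source) + 0.1 * rng.normal(size=(5, N))
Y = np.outer(rng.normal(size=4), source) + 0.1 * rng.normal(size=(4, N))
# Path-graph Laplacian: consecutive samples are neighbours (an assumption).
A = np.diag(np.ones(N - 1), 1) + np.diag(np.ones(N - 1), -1)
L = np.diag(A.sum(axis=1)) - A
U, V, corr = graph_regularized_cca(X, Y, L, gamma=0.1)
```

Because the penalty grows with how much a projected variable varies across graph edges, larger values of `gamma` favor canonical variables that are smooth over the assumed graph; setting `gamma=0` reduces the sketch to ridge-stabilized standard CCA.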

Original language: English (US)
Article number: 8408767
Pages (from-to): 4398-4405
Number of pages: 8
Journal: IEEE Transactions on Signal Processing
Volume: 66
Issue number: 16
DOI: 10.1109/TSP.2018.2853130
State: Published - Aug 15 2018

Fingerprint

Image classification
Data fusion
Feature extraction
Geometry

Keywords

  • Dimensionality reduction
  • Laplacian regularization
  • correlation analysis
  • generalized eigen-decomposition
  • signal processing over graphs

Cite this

Canonical correlation analysis of datasets with a common source graph. / Chen, Jia; Wang, Gang; Shen, Yanning; Giannakis, Georgios B.

In: IEEE Transactions on Signal Processing, Vol. 66, No. 16, 8408767, 15.08.2018, p. 4398-4405.

@article{811d149b87ab4a99b83bfef5c4a20d16,
title = "Canonical correlation analysis of datasets with a common source graph",
abstract = "Canonical correlation analysis (CCA) is a powerful technique for discovering whether or not hidden sources are commonly present in two (or more) datasets. Its well-appreciated merits include dimensionality reduction, clustering, classification, feature selection, and data fusion. The standard CCA, however, does not exploit the geometry of the common sources, which may be available from the given data or can be deduced from (cross-) correlations. In this paper, this extra information provided by the common sources generating the data is encoded in a graph, and is invoked as a graph regularizer. This leads to a novel graph-regularized CCA approach, that is termed graph (g) CCA. The novel gCCA accounts for the graph-induced knowledge of common sources, while minimizing the distance between the wanted canonical variables. Tailored for diverse practical settings where the number of data is smaller than the data vector dimensions, the dual formulation of gCCA is developed too. One such setting includes kernels that are incorporated to account for nonlinear data dependencies. The resultant graph-kernel CCA is also obtained in closed form. Finally, corroborating image classification tests over several real datasets are presented to showcase the merits of the novel linear, dual, and kernel approaches relative to competing alternatives.",
keywords = "Dimensionality reduction, Laplacian regularization, correlation analysis, generalized eigen-decomposition, signal processing over graphs",
author = "Jia Chen and Gang Wang and Yanning Shen and Giannakis, {Georgios B}",
year = "2018",
month = "8",
day = "15",
doi = "10.1109/TSP.2018.2853130",
language = "English (US)",
volume = "66",
pages = "4398--4405",
journal = "IEEE Transactions on Signal Processing",
issn = "1053-587X",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "16",

}

TY - JOUR
T1 - Canonical correlation analysis of datasets with a common source graph
AU - Chen, Jia
AU - Wang, Gang
AU - Shen, Yanning
AU - Giannakis, Georgios B
PY - 2018/8/15
Y1 - 2018/8/15

N2 - Canonical correlation analysis (CCA) is a powerful technique for discovering whether or not hidden sources are commonly present in two (or more) datasets. Its well-appreciated merits include dimensionality reduction, clustering, classification, feature selection, and data fusion. The standard CCA, however, does not exploit the geometry of the common sources, which may be available from the given data or can be deduced from (cross-) correlations. In this paper, this extra information provided by the common sources generating the data is encoded in a graph, and is invoked as a graph regularizer. This leads to a novel graph-regularized CCA approach, that is termed graph (g) CCA. The novel gCCA accounts for the graph-induced knowledge of common sources, while minimizing the distance between the wanted canonical variables. Tailored for diverse practical settings where the number of data is smaller than the data vector dimensions, the dual formulation of gCCA is developed too. One such setting includes kernels that are incorporated to account for nonlinear data dependencies. The resultant graph-kernel CCA is also obtained in closed form. Finally, corroborating image classification tests over several real datasets are presented to showcase the merits of the novel linear, dual, and kernel approaches relative to competing alternatives.

AB - Canonical correlation analysis (CCA) is a powerful technique for discovering whether or not hidden sources are commonly present in two (or more) datasets. Its well-appreciated merits include dimensionality reduction, clustering, classification, feature selection, and data fusion. The standard CCA, however, does not exploit the geometry of the common sources, which may be available from the given data or can be deduced from (cross-) correlations. In this paper, this extra information provided by the common sources generating the data is encoded in a graph, and is invoked as a graph regularizer. This leads to a novel graph-regularized CCA approach, that is termed graph (g) CCA. The novel gCCA accounts for the graph-induced knowledge of common sources, while minimizing the distance between the wanted canonical variables. Tailored for diverse practical settings where the number of data is smaller than the data vector dimensions, the dual formulation of gCCA is developed too. One such setting includes kernels that are incorporated to account for nonlinear data dependencies. The resultant graph-kernel CCA is also obtained in closed form. Finally, corroborating image classification tests over several real datasets are presented to showcase the merits of the novel linear, dual, and kernel approaches relative to competing alternatives.

KW - Dimensionality reduction
KW - Laplacian regularization
KW - correlation analysis
KW - generalized eigen-decomposition
KW - signal processing over graphs
UR - http://www.scopus.com/inward/record.url?scp=85049696772&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85049696772&partnerID=8YFLogxK
U2 - 10.1109/TSP.2018.2853130
DO - 10.1109/TSP.2018.2853130
M3 - Article
AN - SCOPUS:85049696772
VL - 66
SP - 4398
EP - 4405
JO - IEEE Transactions on Signal Processing
JF - IEEE Transactions on Signal Processing
SN - 1053-587X
IS - 16
M1 - 8408767
ER -