Scalable and Flexible Multiview MAX-VAR Canonical Correlation Analysis

Xiao Fu, Kejun Huang, Mingyi Hong, Nikolaos Sidiropoulos, Anthony Man Cho So

Research output: Contribution to journalArticlepeer-review

36 Scopus citations

Abstract

Generalized canonical correlation analysis (GCCA) aims at finding latent low-dimensional common structure from multiple views (feature vectors in different domains) of the same entities. Unlike principal component analysis that handles a single view, (G)CCA is able to integrate information from different feature spaces. Here we focus on MAX-VAR GCCA, a popular formulation that has recently gained renewed interest in multilingual processing and speech modeling. The classic MAX-VAR GCCA problem can be solved optimally via eigen-decomposition of a matrix that compounds the (whitened) correlation matrices of the views; but this solution has serious scalability issues, and is not directly amenable to incorporating pertinent structural constraints such as nonnegativity and sparsity on the canonical components. We posit regularized MAX-VAR GCCA as a nonconvex optimization problem and propose an alternating optimization-based algorithm to handle it. Our algorithm alternates between inexact solutions of a regularized least squares subproblem and a manifold-constrained nonconvex subproblem, thereby achieving substantial memory and computational savings. An important benefit of our design is that it can easily handle structure-promoting regularization. We show that the algorithm globally converges to a critical point at a sublinear rate, and approaches a global optimal solution at a linear rate when no regularization is considered. Judiciously designed simulations and large-scale word embedding tasks are employed to showcase the effectiveness of the proposed algorithm.

Original languageEnglish (US)
Article number7912373
Pages (from-to)4150-4165
Number of pages16
JournalIEEE Transactions on Signal Processing
Volume65
Issue number16
DOIs
StatePublished - Aug 15 2017

Bibliographical note

Publisher Copyright:
© 2017 IEEE.

Keywords

  • Canonical correlation analysis (CCA)
  • MAX-VAR
  • feature selection
  • multiview CCA
  • optimization
  • scalability
  • word embedding

Fingerprint

Dive into the research topics of 'Scalable and Flexible Multiview MAX-VAR Canonical Correlation Analysis'. Together they form a unique fingerprint.

Cite this