GRASSMANNIAN DIFFUSION MAPS-BASED DIMENSION REDUCTION AND CLASSIFICATION FOR HIGH-DIMENSIONAL DATA

Ketson R. Dos Santos, Dimitrios G. Giovanis, Michael D. Shields

Research output: Contribution to journalArticlepeer-review

7 Scopus citations

Abstract

This work introduces the Grassmannian diffusion maps (GDMaps), a novel nonlinear dimensionality reduction technique that defines the affinity between points through their representation as low-dimensional subspaces corresponding to points on the Grassmann manifold. The method is designed for applications, such as image recognition and data-based classification of constrained high-dimensional data where each data point itself is a high-dimensional object (i.e., a large matrix) that can be compactly represented in a lower-dimensional subspace. The GDMaps is composed of two stages. The first is a pointwise linear dimensionality reduction wherein each high-dimensional object is mapped onto the Grassmann manifold representing the low-dimensional subspace on which it resides. The second stage is a multipoint nonlinear kernel-based dimension reduction using diffusion maps to identify the subspace structure of the points on the Grassmann manifold. To this end, an appropriate Grassmannian kernel is used to construct the transition matrix of a random walk on a graph connecting points on the Grassmann manifold. Spectral analysis of the transition matrix yields low-dimensional Grassmannian diffusion coordinates embedding the data into a low-dimensional reproducing kernel Hilbert space. Further, a novel data classification/recognition technique is developed based on the construction of an overcomplete dictionary of reduced dimension whose atoms are given by the Grassmannian diffusion coordinates. Three examples are considered. First, a "toy"example shows that the GDMaps can identify an appropriate parametrization of structured points on the unit sphere. The second example demonstrates the ability of the GDMaps to revealing the intrinsic subspace structure of high-dimensional random field data. In the last example, a face recognition problem is solved considering face images subject to varying illumination conditions, changes in face expressions, and occurrence of occlusions. The technique presented high recognition rates (i.e., 95% in the best case) using a fraction of the data required by conventional methods.

Original languageEnglish (US)
Pages (from-to)B250-B274
JournalSIAM Journal on Scientific Computing
Volume44
Issue number2
DOIs
StatePublished - 2022
Externally publishedYes

Bibliographical note

Funding Information:
\ast Submitted to the journal's Computational Methods in Science and Engineering section September 28, 2020; accepted for publication (in revised form) September 24, 2021; published electronically March 15, 2022. https://doi.org/10.1137/20M137001X Funding: This work was supported by the U.S. Department of Energy under grant DESC0020428 and by internal funds provided by Johns Hopkins University. \dagger Department of Civil and Systems Engineering, Johns Hopkins University, Baltimore, MD 21218 USA ([email protected], [email protected], [email protected]).

Publisher Copyright:
Copyright © 2022 by SIAM.

Keywords

  • data classification
  • diffusion maps
  • dimension reduction
  • face recognitio
  • Grassmann manifold

Fingerprint

Dive into the research topics of 'GRASSMANNIAN DIFFUSION MAPS-BASED DIMENSION REDUCTION AND CLASSIFICATION FOR HIGH-DIMENSIONAL DATA'. Together they form a unique fingerprint.

Cite this