We propose a spectral clustering method based on local principal components analysis (PCA). After performing local PCA in selected neighborhoods, the algorithm builds a nearest neighbor graph weighted according to a discrepancy between the principal subspaces in the neighborhoods, and then applies spectral clustering. As opposed to standard spectral methods based solely on pairwise distances between points, our algorithm is able to resolve intersections. We establish theoretical guarantees for simpler variants within a prototypical mathematical framework for multi-manifold clustering, and evaluate our algorithm on various simulated data sets.
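The pipeline described in the abstract (local PCA, a graph weighted by the discrepancy between principal subspaces, then spectral clustering) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The Gaussian form of the affinity, the function names, and the parameters `radius`, `dim`, `eps`, and `eta` are hypothetical. The sign-based two-way split stands in for a full spectral clustering step, and a dense affinity matrix is used in place of a nearest neighbor graph.

```python
import numpy as np

def local_projectors(X, radius, dim):
    """Projector onto the top-`dim` principal subspace of each point's neighborhood."""
    projectors = []
    for x in X:
        nbrs = X[np.linalg.norm(X - x, axis=1) <= radius]
        centered = nbrs - nbrs.mean(axis=0)
        # Rows of vt are the principal directions of the centered neighborhood.
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        basis = vt[:dim].T
        projectors.append(basis @ basis.T)
    return np.array(projectors)

def subspace_affinity(X, radius, dim, eps, eta):
    """Affinity that decays with both Euclidean distance and subspace discrepancy.

    Assumption: a product of two Gaussian kernels; other choices are possible.
    """
    P = local_projectors(X, radius, dim)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Discrepancy: spectral norm of the difference between local projectors.
    disc = np.linalg.norm(P[:, None] - P[None, :], ord=2, axis=(2, 3))
    return np.exp(-(dist / eps) ** 2) * np.exp(-(disc / eta) ** 2)

def two_way_spectral_split(W):
    """Split into two groups by the sign of the second eigenvector
    of the symmetric normalized graph Laplacian (a simplification)."""
    d = W.sum(axis=1)
    L = np.eye(len(W)) - W / np.sqrt(np.outer(d, d))
    _, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    return (vecs[:, 1] > 0).astype(int)

# Toy data: two line segments in the plane that intersect at the origin.
t = np.linspace(-1, 1, 20)
X = np.vstack([np.column_stack([t, np.zeros(20)]),
               np.column_stack([np.zeros(20), t])])
W = subspace_affinity(X, radius=0.25, dim=1, eps=0.3, eta=0.3)
labels = two_way_spectral_split(W)
```

On this toy example, points close to the intersection but lying on different lines are near each other in Euclidean distance, so a purely distance-based affinity would link them strongly. The subspace term suppresses those links because the local principal directions differ.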
| Field | Value |
|---|---|
| Original language | English (US) |
| Number of pages | 57 |
| Journal | Journal of Machine Learning Research |
| State | Published - Mar 1 2017 |
Bibliographical note
Funding Information:
This work was partially supported by grants from the National Science Foundation (DMS 0915160, 0915064, 0956072, 1418386, 1513465). We would like to thank Jan Rataj for helpful discussion around Lemma 3 and Xu Wang for his sharp proofreading. We also gratefully acknowledge the comments, suggestions, and scrutiny of an anonymous referee. We would also like to acknowledge support from the Institute for Mathematics and its Applications (IMA). For one thing, the authors first learned about the research of Goldberg et al. (2009) there, at the Multi-Manifold Data Modeling and Applications workshop in the Fall of 2008, and this was the main inspiration for our paper. Also, part of our work was performed while TZ was a postdoctoral fellow at the IMA, and also while EAC and GL were visiting the IMA.
Keywords
- Intersecting clusters
- Local principal component analysis
- Multi-manifold clustering
- Spectral clustering