Abstract
We propose distributed solutions to the problem of Robust Subspace Recovery (RSR). Our setting assumes a huge dataset in an ad hoc network without a central processor, where each node has access only to one chunk of the dataset. Furthermore, part of the whole dataset lies around a low-dimensional subspace and the other part is composed of outliers that lie away from that subspace. The goal is to recover the underlying subspace for the whole dataset, without transferring the data itself between the nodes. We first apply the Consensus Based Gradient method to the Geometric Median Subspace algorithm for RSR. For this purpose, we propose an iterative solution for the local dual minimization problem and establish its r-linear convergence. We then explain how to distributedly implement the Reaper and Fast Median Subspace algorithms for RSR. The proposed algorithms display competitive performance on both synthetic and real data.
Original language | English (US) |
---|---|
Pages (from-to) | A3067-A3090 |
Journal | SIAM Journal on Scientific Computing |
Volume | 40 |
Issue number | 5 |
DOIs | |
State | Published - 2018 |
Bibliographical note
Funding Information:\ast Submitted to the journal's Methods and Algorithms for Scientific Computing section May 24, 2017; accepted for publication (in revised form) June 18, 2018; published electronically September 25, 2018. http://www.siam.org/journals/sisc/40-5/M113165.html Funding: This work was supported by NSF awards DMS-09-56072 and DMS-14-18386 and the Feinberg Foundation Visiting Faculty Program Fellowship of the Weizmann Institute of Science. \dagger School of Mathematics, University of Minnesota, Twin Cities, Minneapolis, MN 55455 (huroy002@umn.edu, lerman@umn.edu).
Funding Information:
This work was supported by NSF awards DMS-09-56072 and DMS-14-18386 and the Feinberg Foundation Visiting Faculty Program Fellowship of the Weizmann Institute of Science.
Publisher Copyright:
© 2018 Society for Industrial and Applied Mathematics.
Keywords
- Consensus-based algorithms
- Distributed algorithms
- Geometric median
- Principal component analysis (PCA)
- Robust subspace recovery (RSR)