TY - GEN

T1 - On the sample complexity of robust PCA

AU - Coudron, Matthew

AU - Lerman, Gilad

PY - 2012

Y1 - 2012

N2 - We estimate the rate of convergence and sample complexity of a recent robust estimator for a generalized version of the inverse covariance matrix. This estimator is used in a convex algorithm for robust subspace recovery (i.e., robust PCA). Our model assumes a sub-Gaussian underlying distribution and an i.i.d. sample from it. Our main result shows with high probability that the norm of the difference between the generalized inverse covariance of the underlying distribution and its estimator from an i.i.d. sample of size N is of order O(N-0.5+∈) for arbitrarily small ∈ > 0 (affecting the probabilistic estimate); this rate of convergence is close to the one of direct covariance estimation, i.e., O(N-0.5). Our precise probabilistic estimate implies for some natural settings that the sample complexity of the generalized inverse covariance estimation when using the Frobenius norm is O(D2+δ) for arbitrarily small δ > 0 (whereas the sample complexity of direct covariance estimation with Frobenius norm is O(D2)). These results provide similar rates of convergence and sample complexity for the corresponding robust subspace recovery algorithm. To the best of our knowledge, this is the only work analyzing the sample complexity of any robust PCA algorithm.

AB - We estimate the rate of convergence and sample complexity of a recent robust estimator for a generalized version of the inverse covariance matrix. This estimator is used in a convex algorithm for robust subspace recovery (i.e., robust PCA). Our model assumes a sub-Gaussian underlying distribution and an i.i.d. sample from it. Our main result shows with high probability that the norm of the difference between the generalized inverse covariance of the underlying distribution and its estimator from an i.i.d. sample of size N is of order O(N-0.5+∈) for arbitrarily small ∈ > 0 (affecting the probabilistic estimate); this rate of convergence is close to the one of direct covariance estimation, i.e., O(N-0.5). Our precise probabilistic estimate implies for some natural settings that the sample complexity of the generalized inverse covariance estimation when using the Frobenius norm is O(D2+δ) for arbitrarily small δ > 0 (whereas the sample complexity of direct covariance estimation with Frobenius norm is O(D2)). These results provide similar rates of convergence and sample complexity for the corresponding robust subspace recovery algorithm. To the best of our knowledge, this is the only work analyzing the sample complexity of any robust PCA algorithm.

UR - http://www.scopus.com/inward/record.url?scp=84877726395&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84877726395&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84877726395

SN - 9781627480031

T3 - Advances in Neural Information Processing Systems

SP - 3221

EP - 3229

BT - Advances in Neural Information Processing Systems 25

T2 - 26th Annual Conference on Neural Information Processing Systems 2012, NIPS 2012

Y2 - 3 December 2012 through 6 December 2012

ER -