We propose a general theory for studying the geometry of nonconvex objective functions with underlying symmetric structures. Specifically, we characterize the locations of stationary points and the null space of the associated Hessian matrices through the lens of invariant groups. As a major motivating example, we apply the proposed general theory to characterize the global geometry of the low-rank matrix factorization problem. In particular, we illustrate how the rotational symmetry group gives rise to infinitely many nonisolated strict saddle points and equivalent global minima of the objective function. By explicitly identifying all stationary points, we divide the entire parameter space into three regions: (R1) the region containing the neighborhoods of all strict saddle points, where the objective has negative curvature; (R2) the region containing the neighborhoods of all global minima, where the objective enjoys strong convexity along certain directions; and (R3) the complement of the above regions, where the gradient has sufficiently large magnitude. We further extend our result to the matrix sensing problem. This allows us to establish strong global convergence guarantees for popular iterative algorithms with arbitrary initialization.
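The rotational symmetry mentioned in the abstract can be checked numerically on a toy instance. The sketch below is illustrative only and is not taken from the paper: it assumes the commonly used symmetric objective f(X) = 1/4 ||X X^T - M||_F^2, and the matrices `X_star` and `X`, the angle 0.83, and all helper names are hypothetical choices. It shows that f(XR) = f(X) for a rotation R, that every rotated copy of a global minimizer is again a global minimizer, and that the gradient vanishes at the origin.

```python
import math

def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def residual(X, M):
    """X X^T - M, entry by entry."""
    product = matmul(X, transpose(X))
    return [[p - m for p, m in zip(prow, mrow)] for prow, mrow in zip(product, M)]

def objective(X, M):
    """Assumed objective: f(X) = 1/4 * ||X X^T - M||_F^2."""
    return 0.25 * sum(v * v for row in residual(X, M) for v in row)

def gradient(X, M):
    """Gradient of the assumed objective: (X X^T - M) X."""
    return matmul(residual(X, M), X)

def rotation(theta):
    """2 x 2 rotation matrix, an element of the symmetry group for rank 2."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

# Hypothetical 3 x 2 ground-truth factor and the rank-2 target it generates.
X_star = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
M = matmul(X_star, transpose(X_star))

# Arbitrary test point and an arbitrary rotation angle.
X = [[0.5, -1.0], [2.0, 0.3], [-0.7, 1.2]]
R = rotation(0.83)

f_X = objective(X, M)                              # value at X
f_XR = objective(matmul(X, R), M)                  # value at the rotated point X R
f_min_rotated = objective(matmul(X_star, R), M)    # rotated minimizer
grad_at_zero = gradient([[0.0, 0.0]] * 3, M)       # gradient at the origin

print(f_X, f_XR, f_min_rotated, grad_at_zero)
```

Because R R^T = I, the product (XR)(XR)^T equals X X^T, so the two objective values agree up to floating-point error and the set of global minima forms a continuous orbit rather than isolated points.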
|Original language||English (US)|
|Title of host publication||2018 Information Theory and Applications Workshop, ITA 2018|
|Publisher||Institute of Electrical and Electronics Engineers Inc.|
|State||Published - Oct 23 2018|
|Event||2018 Information Theory and Applications Workshop, ITA 2018 - San Diego, United States|
|Duration||Feb 11 2018 → Feb 16 2018|
Bibliographical note
Publisher Copyright: © 2018 IEEE.