Linearized ADMM Converges to Second-Order Stationary Points for Non-Convex Problems

Songtao Lu, Jason Lee, Meisam Razaviyayn, Mingyi Hong

Research output: Contribution to journal › Article › Peer-review



In this work, a gradient-based primal-dual method of multipliers is proposed for solving a class of linearly constrained non-convex problems. We show that with random initialization of the primal and dual variables, the algorithm is able to compute second-order stationary points (SOSPs) with probability one. Further, we present applications of the proposed method in popular signal processing and machine learning problems such as decentralized matrix factorization and decentralized training of overparameterized neural networks. One of the key steps in the analysis is to construct a new loss function for these problems such that the required convergence conditions (especially the gradient Lipschitz conditions) can be satisfied without changing the global optimal points.
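The generic iteration described above — a primal gradient step on the augmented Lagrangian followed by a dual ascent step — can be sketched as follows. This is a minimal illustration, not the paper's exact algorithm: the step sizes, the toy non-convex objective, and the iteration count are all illustrative assumptions.

```python
import numpy as np

def grad_primal_dual(grad_f, A, b, x0, rho=1.0, alpha=0.05, iters=2000):
    """Gradient-based primal-dual sketch for min f(x) s.t. Ax = b.

    One primal gradient step on the augmented Lagrangian
    L(x, lam) = f(x) + lam^T (Ax - b) + (rho/2) ||Ax - b||^2,
    followed by a dual gradient-ascent step.
    """
    x, lam = x0.copy(), np.zeros(A.shape[0])
    for _ in range(iters):
        # Gradient of the augmented Lagrangian in x
        g = grad_f(x) + A.T @ lam + rho * A.T @ (A @ x - b)
        x = x - alpha * g                 # linearized (gradient) primal update
        lam = lam + rho * (A @ x - b)     # dual ascent on the constraint residual
    return x, lam

# Toy non-convex instance: f(x) = (1/4)||x||^4 - (1/2)||x||^2 with one
# linear constraint. Random initialization (as in the paper's SOSP result)
# breaks symmetry so the iterates avoid the strict saddle on the line.
rng = np.random.default_rng(0)
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
grad_f = lambda x: (x @ x - 1.0) * x
x, lam = grad_primal_dual(grad_f, A, b, rng.standard_normal(2))
print(np.abs(A @ x - b))   # constraint residual, shrinking toward zero
```

For this toy instance the iterates settle at a point satisfying both the linear constraint and approximate stationarity of the Lagrangian, consistent with the second-order guarantees the paper establishes for the actual method.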

Original language: English (US)
Article number: 9503322
Pages (from-to): 4859-4874
Number of pages: 16
Journal: IEEE Transactions on Signal Processing
State: Published - Aug 2 2021

Bibliographical note

Funding Information:
Manuscript received July 8, 2020; revised April 13, 2021; accepted May 25, 2021. Date of publication August 2, 2021; date of current version September 3, 2021. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. V. Gripon. The work of Mingyi Hong was supported in part by the National Science Foundation under Grants CIF-1910385 and CNS-2003033, and in part by AFOSR under Grant 19RT0424. This paper was presented in part at the International Conference on Machine Learning, Stockholm, Sweden [1]. (Corresponding author: Mingyi Hong.) Songtao Lu is with the IBM Thomas J. Watson Research Center, Yorktown Heights, NY 10598 USA.

Publisher Copyright:
© 1991-2012 IEEE.


Keywords

  • First-order stationary points (FOSPs)
  • alternating direction method of multipliers (ADMM)
  • neural networks
  • non-convex optimization
  • second-order stationary points (SOSPs)

