Abstract
In this paper, we analyze the convergence of the zeroth-order stochastic projected gradient descent (ZO-SPGD) method for constrained convex and nonconvex optimization scenarios where only objective function values (not gradients) are directly available. We show statistical properties of a new random gradient estimator, constructed through random direction samples drawn from a bounded uniform distribution. We prove that ZO-SPGD yields an $O\left(\frac{d}{bq\sqrt{T}} + \frac{1}{\sqrt{T}}\right)$ convergence rate for convex but non-smooth optimization, where $d$ is the number of optimization variables, $b$ is the minibatch size, $q$ is the number of random direction samples for gradient estimation, and $T$ is the number of iterations. For nonconvex optimization, we show that ZO-SPGD achieves an $O\left(\frac{1}{\sqrt{T}}\right)$ convergence rate but suffers an additional $O\left(\frac{d+q}{bq}\right)$ error. Our theoretical investigation of ZO-SPGD provides a general framework to study the convergence rate of zeroth-order algorithms.
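To make the procedure concrete, the following is a minimal Python sketch of a ZO-SPGD-style loop. It is an illustration under stated assumptions, not the paper's exact construction: the directions are sampled uniformly from the unit Euclidean sphere (one instance of a bounded uniform distribution), the constraint set is taken to be an $\ell_2$ ball, the objective is treated as a deterministic oracle (so the minibatch dimension $b$ is omitted), and all function names, step sizes, and smoothing parameters are hypothetical choices.

```python
import numpy as np

def zo_gradient_estimate(f, x, q=10, mu=1e-3):
    """Two-point random gradient estimator with q direction samples.

    Assumption: directions u_j are drawn uniformly from the unit sphere;
    the d/(mu*q) scaling gives an estimate of the gradient of a smoothed
    version of f using only function-value queries.
    """
    d = x.size
    g = np.zeros(d)
    for _ in range(q):
        u = np.random.randn(d)
        u /= np.linalg.norm(u)  # uniform direction on the unit sphere
        g += (f(x + mu * u) - f(x)) * u
    return (d / (mu * q)) * g

def project_l2_ball(x, radius=1.0):
    """Euclidean projection onto {x : ||x||_2 <= radius} (illustrative constraint set)."""
    n = np.linalg.norm(x)
    return x if n <= radius else (radius / n) * x

def zo_spgd(f, x0, T=500, eta=0.05, q=10, mu=1e-3, project=project_l2_ball):
    """ZO-SPGD sketch: estimate the gradient from function values, step, project."""
    x = x0.copy()
    for _ in range(T):
        g = zo_gradient_estimate(f, x, q=q, mu=mu)
        x = project(x - eta * g)
    return x

# Usage: minimize a convex quadratic over the unit l2 ball.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((5, 5))
    A = A.T @ A + np.eye(5)          # positive definite, so f is convex
    c = rng.standard_normal(5)
    f = lambda x: 0.5 * x @ A @ x - c @ x
    x_hat = zo_spgd(f, x0=np.zeros(5))
    print("objective at returned point:", f(x_hat))
```

In line with the abstract, increasing the number of direction samples `q` (and, in the stochastic setting, the minibatch size `b`) reduces the variance of the gradient estimate, at the cost of more function-value queries per iteration.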
| Original language | English (US) |
| --- | --- |
| Title of host publication | 2018 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2018 - Proceedings |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 1179-1183 |
| Number of pages | 5 |
| ISBN (Electronic) | 9781728112954 |
| DOIs | |
| State | Published - Jul 2 2018 |
| Event | 2018 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2018 - Anaheim, United States. Duration: Nov 26 2018 → Nov 29 2018 |
Publication series
| Name | 2018 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2018 - Proceedings |
| --- | --- |
Conference
| Conference | 2018 IEEE Global Conference on Signal and Information Processing, GlobalSIP 2018 |
| --- | --- |
| Country/Territory | United States |
| City | Anaheim |
| Period | 11/26/18 → 11/29/18 |
Bibliographical note
Funding Information: The authors gratefully acknowledge support from the DARPA YFA, Grant N66001-14-1-4047.
Publisher Copyright:
© 2018 IEEE.
Keywords
- Nonconvex optimization
- Projected gradient descent
- Zeroth-order optimization