In this paper, we design and analyze a new zeroth-order (ZO) stochastic optimization algorithm, ZO-signSGD, which enjoys the dual advantages of gradient-free operations and signSGD. The latter requires only the sign information of gradient estimates but is able to achieve a comparable or even better convergence speed than SGD-type algorithms. Our study shows that ZO-signSGD requires √d times more iterations than signSGD, leading to a convergence rate of O(√d/√T) under some mild conditions, where d is the number of optimization variables and T is the number of iterations. In addition, we analyze the effects of different types of gradient estimators on the convergence of ZO-signSGD, and propose several variants of ZO-signSGD with an O(√d/√T) convergence rate. On the application side, we explore the connection between ZO-signSGD and black-box adversarial attacks in robust deep learning. Our empirical evaluations on the image classification datasets MNIST and CIFAR-10 demonstrate the superior performance of ZO-signSGD on the generation of adversarial examples from black-box neural networks.
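The idea summarized in the abstract, a sign-based update driven by a gradient estimate built only from function evaluations, can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the forward-difference estimator with directions drawn uniformly from the unit sphere, the constant step size, the hypothetical names `zo_gradient_estimate` and `zo_sign_sgd`, and the parameters `mu` (smoothing radius) and `q` (number of random directions). The paper analyzes several estimator variants that this sketch does not attempt to reproduce.

```python
import numpy as np


def zo_gradient_estimate(f, x, mu, q, rng):
    """Average q forward-difference estimates along random unit directions.

    Assumed estimator (illustrative): (d / mu) * (f(x + mu*u) - f(x)) * u,
    with u drawn uniformly from the unit sphere in R^d.
    """
    d = x.size
    fx = f(x)
    grad = np.zeros(d)
    for _ in range(q):
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)  # normalize to a unit direction
        grad += (d / mu) * (f(x + mu * u) - fx) * u
    return grad / q


def zo_sign_sgd(f, x0, steps=500, lr=0.01, mu=1e-3, q=10, seed=0):
    """Sign-based descent that queries only function values of f."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(steps):
        g = zo_gradient_estimate(f, x, mu, q, rng)
        x -= lr * np.sign(g)  # only the sign of the estimate is used
    return x


if __name__ == "__main__":
    # Toy black-box objective (hypothetical): a shifted quadratic.
    def f(x):
        return float(np.sum((x - 1.0) ** 2))

    x0 = np.zeros(5)
    x_final = zo_sign_sgd(f, x0)
    print("f(x0) =", f(x0), " f(x_final) =", f(x_final))
```

Because each update moves every coordinate by exactly `lr`, the iterates cannot settle closer to a minimizer than roughly the step size; a decaying step size would be the natural refinement, though this sketch keeps it constant for brevity.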
|Original language||English (US)|
|State||Published - Jan 1 2019|
|Event||7th International Conference on Learning Representations, ICLR 2019 - New Orleans, United States (May 6 2019 → May 9 2019)|