Abstract
In the era of big data, training a machine learning model over a large-scale dataset is challenging on a single machine, and even over a distributed system with a central controller. In this paper, we propose a gradient-tracking based nonconvex stochastic decentralized (GNSD) algorithm for solving nonconvex optimization problems, where the data is partitioned into multiple parts and processed by local computational resources. By exchanging parameters between neighboring nodes over a network, GNSD is able to find first-order stationary points (FOSPs) efficiently. Our theoretical analysis guarantees that, with a diminishing step-size, the convergence rate of GNSD to FOSPs matches the well-known $\mathcal{O}(1/\sqrt{T})$ rate of stochastic gradient descent. Finally, we perform extensive numerical experiments on computational clusters to demonstrate the advantage of GNSD over other state-of-the-art methods.
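To make the abstract's description concrete, below is a minimal sketch of the standard decentralized stochastic gradient-tracking recursion that a method like GNSD builds on: each node mixes parameters with its neighbors via a mixing matrix and descends along a tracker of the network-average gradient. The oracle `stoch_grad`, the doubly-stochastic matrix `W`, and the fixed step-size `eta` are illustrative assumptions, not the paper's exact update.

```python
# Illustrative sketch only; GNSD's precise recursion is given in the paper.
import numpy as np

def gradient_tracking_sketch(stoch_grad, W, x0, eta=0.01, T=1000):
    """Decentralized stochastic gradient tracking (assumed form).

    stoch_grad : callable (node_index, params) -> stochastic gradient (hypothetical oracle)
    W          : (n, n) doubly-stochastic mixing matrix of the network
    x0         : (n, d) initial parameters, one row per node
    eta        : step-size (a diminishing schedule yields the O(1/sqrt(T)) rate)
    T          : number of iterations
    """
    n, _ = x0.shape
    x = x0.copy()
    g = np.stack([stoch_grad(i, x[i]) for i in range(n)])  # local stochastic gradients
    y = g.copy()                      # gradient trackers, initialized to local gradients
    for _ in range(T):
        # Mix parameters with neighbors, then descend along the tracker direction.
        x = W @ x - eta * y
        g_new = np.stack([stoch_grad(i, x[i]) for i in range(n)])
        # Mix trackers and add the fresh gradient difference, so each y[i]
        # tracks the average of the nodes' stochastic gradients.
        y = W @ y + g_new - g
        g = g_new
    return x.mean(axis=0)             # consensus estimate across nodes
```

The tracker update is what distinguishes gradient tracking from plain decentralized SGD: the correction term `g_new - g` keeps each node's search direction aligned with the network-wide average gradient even though data is partitioned across nodes.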
| Original language | English (US) |
|---|---|
| Title of host publication | 2019 IEEE Data Science Workshop, DSW 2019 - Proceedings |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 315-321 |
| Number of pages | 7 |
| ISBN (Electronic) | 9781728107080 |
| DOIs | |
| State | Published - Jun 2019 |
| Event | 2019 IEEE Data Science Workshop, DSW 2019 - Minneapolis, United States. Duration: Jun 2 2019 → Jun 5 2019 |
Publication series

| Name | 2019 IEEE Data Science Workshop, DSW 2019 - Proceedings |
|---|---|
Conference

| Conference | 2019 IEEE Data Science Workshop, DSW 2019 |
|---|---|
| Country/Territory | United States |
| City | Minneapolis |
| Period | 6/2/19 → 6/5/19 |
Bibliographical note
Funding Information: †Equal contribution. The authors of this paper have been supported by NSF grants CMMI-1727757 and CCF-1526078, AFOSR grant 15RT0767, and a Digital Technology Initiative Seed Grant from the Digital Technology Center at the University of Minnesota.
Publisher Copyright:
© 2019 IEEE.
Keywords
- Stochastic
- decentralized
- gradient tracking
- neural networks
- nonconvex optimization