Communication-efficient distributed learning via lazily aggregated quantized gradients

Jun Sun, Tianyi Chen, Georgios B. Giannakis, Zaiyue Yang

Research output: Contribution to journal › Conference article › peer-review

52 Scopus citations


This paper develops a novel aggregated gradient approach for distributed machine learning that adaptively compresses gradient communication. The key idea is to first quantize the computed gradients, and then skip less informative quantized gradient communications by reusing outdated gradients. Quantizing and skipping result in 'lazy' worker-server communications, which justifies the term Lazily Aggregated Quantized gradient, henceforth abbreviated as LAQ. LAQ provably attains the same linear convergence rate as gradient descent in the strongly convex case, while effecting major savings in communication overhead, in both transmitted bits and communication rounds. Empirically, experiments with real data corroborate a significant communication reduction compared to existing gradient- and stochastic gradient-based algorithms.
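The quantize-then-skip idea can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the exact quantizer or skipping criterion, so the uniform grid quantizer, the fixed threshold `skip_tol`, the diagonal quadratic worker losses, and all function names below are assumptions chosen only to make the mechanism concrete.

```python
def quantize(vec, ref, bits):
    """Hypothetical uniform quantizer: snap vec to a grid centered at ref."""
    radius = max(abs(v - r) for v, r in zip(vec, ref))
    if radius == 0.0:
        return list(ref)
    levels = 2 ** bits - 1
    step = 2.0 * radius / levels
    # Each coordinate moves to the nearest grid point in [r - radius, r + radius].
    return [r - radius + step * round((v - r + radius) / step)
            for v, r in zip(vec, ref)]


def local_gradient(x, curv, target):
    """Gradient of the assumed local loss 0.5 * sum_i curv_i * (x_i - target_i)^2."""
    return [c * (xi - t) for c, xi, t in zip(curv, x, target)]


def run_lazy_quantized_gd(workers, dim, steps=200, lr=0.1, bits=4, skip_tol=1e-6):
    """Gradient descent where each worker uploads a quantized gradient only
    when it differs enough from its previous upload (assumed fixed threshold)."""
    x = [0.0] * dim
    # Server-side copy of the last quantized gradient received from each worker.
    last_sent = [[0.0] * dim for _ in workers]
    uploads = 0
    for _ in range(steps):
        for m, (curv, target) in enumerate(workers):
            grad = local_gradient(x, curv, target)
            q = quantize(grad, last_sent[m], bits)
            innovation = sum((a - b) ** 2 for a, b in zip(q, last_sent[m]))
            if innovation > skip_tol:
                last_sent[m] = q      # communicate the new quantized gradient
                uploads += 1
            # Otherwise skip: the server keeps reusing the outdated gradient.
        aggregate = [sum(g[i] for g in last_sent) for i in range(dim)]
        x = [xi - lr * gi for xi, gi in zip(x, aggregate)]
    return x, uploads


# Three hypothetical workers, each given as (curvatures, targets).
workers = [
    ([1.0, 2.0], [1.0, -1.0]),
    ([0.5, 1.0], [2.0, 0.0]),
    ([1.5, 0.5], [0.0, 3.0]),
]

if __name__ == "__main__":
    x, uploads = run_lazy_quantized_gd(workers, dim=2)
    print("final iterate:", x)
    print("uploads:", uploads, "of", 200 * len(workers), "possible")
```

In this toy setting each upload costs `bits` per coordinate instead of a full floating-point value, and skipped rounds cost nothing, which mirrors the two sources of savings named in the abstract. With a fixed threshold the iterate only reaches a small neighborhood of the minimizer; the convergence guarantee stated in the abstract applies to the paper's own criterion, not to this simplification.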

Original language: English (US)
Journal: Advances in Neural Information Processing Systems
State: Published - 2019
Event: 33rd Annual Conference on Neural Information Processing Systems, NeurIPS 2019 - Vancouver, Canada
Duration: Dec 8 2019 to Dec 14 2019

Bibliographical note

Funding Information:
The work by J. Sun and Z. Yang is supported in part by the Shenzhen Committee on Science and Innovations under Grant GJHZ20180411143603361, in part by the Department of Science and Technology of Guangdong Province under Grant 2018A050506003, and in part by the Natural Science Foundation of China under Grant 61873118. The work by J. Sun is also supported by the China Scholarship Council. The work by G. Giannakis is supported in part by NSF grants 1500713 and 1711471.

Publisher Copyright:
© 2019 Neural information processing systems foundation. All rights reserved.


