Federated Variance-Reduced Stochastic Gradient Descent with Robustness to Byzantine Attacks

Zhaoxian Wu, Qing Ling, Tianyi Chen, Georgios B. Giannakis

Research output: Contribution to journal › Article › peer-review

111 Scopus citations


This paper deals with distributed finite-sum optimization for learning over multiple workers in the presence of malicious Byzantine attacks. Most resilient approaches so far combine stochastic gradient descent (SGD) with different robust aggregation rules. However, the sizeable SGD-induced stochastic gradient noise challenges discerning malicious messages sent by the Byzantine attackers from noisy stochastic gradients sent by the 'honest' workers. This motivates reducing the variance of stochastic gradients as a means of robustifying SGD. To this end, a novel Byzantine attack resilient distributed (Byrd-) SAGA approach is introduced for federated learning tasks involving multiple workers. Rather than the mean employed by distributed SAGA, the novel Byrd-SAGA relies on the geometric median to aggregate the corrected stochastic gradients sent by the workers. When less than half of the workers are Byzantine attackers, Byrd-SAGA attains provably linear convergence to a neighborhood of the optimal solution, with the asymptotic learning error determined by the number of Byzantine workers. Numerical tests corroborate the robustness to various Byzantine attacks, as well as the merits of Byrd-SAGA over Byzantine attack resilient distributed SGD.
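As context for the abstract's key idea, the sketch below contrasts robust aggregation by geometric median (as in Byrd-SAGA) with plain averaging (as in distributed SAGA) when a minority of workers send malicious messages. This is an illustrative approximation, not the authors' implementation: the geometric median is computed with the standard Weiszfeld iteration, and the worker messages, dimensions, and tolerances are made-up example values.

```python
import numpy as np

def geometric_median(points, max_iter=100, tol=1e-8):
    """Weiszfeld's algorithm: iteratively approximates the geometric median,
    the point minimizing the sum of Euclidean distances to the inputs
    (here, the messages sent by the workers)."""
    z = points.mean(axis=0)  # initialize at the plain mean
    for _ in range(max_iter):
        dist = np.linalg.norm(points - z, axis=1)
        dist = np.maximum(dist, tol)  # guard against division by zero
        w = 1.0 / dist
        z_new = (w[:, None] * points).sum(axis=0) / w.sum()
        if np.linalg.norm(z_new - z) < tol:
            break
        z = z_new
    return z

# Example: 9 honest workers send low-variance (SAGA-corrected) stochastic
# gradients near the true gradient g_star; 4 Byzantine workers (< half)
# send large outliers. All values below are illustrative assumptions.
rng = np.random.default_rng(0)
g_star = np.array([1.0, -2.0, 0.5])
honest = g_star + 0.01 * rng.standard_normal((9, 3))
byzantine = 100.0 * np.ones((4, 3))
messages = np.vstack([honest, byzantine])

gm = geometric_median(messages)   # robust aggregate (Byrd-SAGA style)
avg = messages.mean(axis=0)       # plain mean (distributed SAGA style)
```

The geometric median stays close to the honest workers' gradients because the minority of outliers contributes only a bounded pull, whereas the mean is dragged arbitrarily far by a single large Byzantine message; this is why the paper pairs variance reduction (which tightens the honest cluster) with median-based aggregation.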

Original language: English (US)
Article number: 9153949
Pages (from-to): 4583-4596
Number of pages: 14
Journal: IEEE Transactions on Signal Processing
State: Published - 2020

Bibliographical note

Funding Information:
Manuscript received December 29, 2019; revised May 19, 2020; accepted July 20, 2020. Date of publication July 31, 2020; date of current version September 1, 2020. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Vincent Gripon. The work of Qing Ling was supported in part by NSF China under Grants 61573331 and 61973324 and in part by Fundamental Research Funds for the Central Universities. The work of Georgios B. Giannakis was supported by NSF under Grants 1509040, 1508993, 1711471, and 1901134. This article was presented in part at the IEEE International Conference on Acoustics, Speech, and Signal Processing, Barcelona, Spain, May 4–8, 2020. (Corresponding author: Qing Ling.) Zhaoxian Wu and Qing Ling are with the School of Data and Computer Science and Guangdong Province Key Laboratory of Computational Science, Sun Yat-Sen University, Guangzhou 510006, China (e-mail: wuzhx23@mail2.sysu.edu.cn; lingqing556@mail.sysu.edu.cn).

Publisher Copyright:
© 1991-2012 IEEE.


Keywords

  • Byzantine attacks
  • Distributed finite-sum optimization
  • Gradient noise
  • Variance reduction


