Providing privacy protection has been one of the primary motivations of Federated Learning (FL). Recently, there has been a line of work on combining FL with the formal privacy notion of differential privacy. To guarantee client-level differential privacy in FL algorithms, the clients' transmitted model updates have to be clipped before privacy noise is added. This clipping operation is substantially different from its counterpart, gradient clipping, in centralized differentially private SGD, and it has not been well understood. In this paper, we first empirically demonstrate that clipped FedAvg can perform surprisingly well even with substantial data heterogeneity when training neural networks, partly because the clients' updates become similar for several popular deep architectures. Based on this key observation, we provide a convergence analysis of a differentially private (DP) FedAvg algorithm and highlight the relationship between clipping bias and the distribution of the clients' updates. To the best of our knowledge, this is the first work that rigorously investigates theoretical and empirical issues regarding the clipping operation in FL algorithms.
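As a rough illustration of the clip-then-noise step described in the abstract, the following is a minimal sketch and not the paper's algorithm. The function names `clip_update` and `dp_aggregate`, the choice to add Gaussian noise to the averaged update, and the `noise_std` parameter are all assumptions made for this example. Calibrating the noise to a privacy budget is out of scope here.

```python
import math
import random


def clip_update(update, clip_norm):
    """Scale `update` so that its L2 norm is at most `clip_norm` (hypothetical helper)."""
    norm = math.sqrt(sum(x * x for x in update))
    # Updates already inside the ball of radius clip_norm are left unchanged.
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def dp_aggregate(updates, clip_norm, noise_std, rng):
    """Average the clipped client updates, then add Gaussian noise per coordinate.

    Assumption: noise is added to the average, with standard deviation
    `noise_std` chosen by the caller.
    """
    clipped = [clip_update(u, clip_norm) for u in updates]
    dim = len(clipped[0])
    avg = [sum(u[i] for u in clipped) / len(clipped) for i in range(dim)]
    return [a + rng.gauss(0.0, noise_std) for a in avg]


# Three made-up client updates with very different magnitudes.
rng = random.Random(0)
client_updates = [[3.0, 4.0], [0.3, 0.4], [-6.0, 8.0]]
noisy_avg = dp_aggregate(client_updates, clip_norm=1.0, noise_std=0.1, rng=rng)
```

Clipping each client's whole update, rather than per-example gradients as in centralized DP-SGD, is what bounds any single client's influence on the average. It also introduces a bias whenever updates exceed `clip_norm`.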
|Field|Value|
|---|---|
|Original language|English (US)|
|Number of pages|20|
|Journal|Proceedings of Machine Learning Research|
|State|Published - 2022|
|Event|39th International Conference on Machine Learning, ICML 2022, Baltimore, United States (Jul 17 2022 → Jul 23 2022)|
Bibliographical note
Funding Information:
We thank the anonymous reviewers for valuable feedback on the merit of the work and for helpful suggestions on improving the presentation. Z. S. Wu was supported in part by NSF grant CNS #2120603, a CMU CyLab 2021 grant, a Google Faculty Research Award, and a Mozilla Research Grant. M. Hong, X. Chen, and X. Zhang were supported in part by NSF grants CIF-1910385 and CMMI-1727757 and by AFOSR grant 19RT0424.
Copyright © 2022 by the author(s)