TiFL: A Tier-based Federated Learning System

Zheng Chai, Ahsan Ali, Syed Zawad, Stacey Truex, Ali Anwar, Nathalie Baracaldo, Yi Zhou, Heiko Ludwig, Feng Yan, Yue Cheng

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

249 Scopus citations

Abstract

Federated Learning (FL) enables learning a shared model across many clients without violating privacy requirements. One of the key attributes in FL is the heterogeneity that exists in both resource and data due to the differences in computation and communication capacity, as well as the quantity and content of data among different clients. We conduct a case study to show that heterogeneity in resource and data has a significant impact on training time and model accuracy in conventional FL systems. To this end, we propose TiFL, a Tier-based Federated Learning System, which divides clients into tiers based on their training performance and selects clients from the same tier in each training round to mitigate the straggler problem caused by heterogeneity in resource and data quantity. To further tame the heterogeneity caused by non-IID (non-Independent and Identically Distributed) data and resources, TiFL employs an adaptive tier selection approach to update the tiering on-the-fly based on the observed training performance and accuracy. We prototype TiFL in an FL testbed following Google's FL architecture and evaluate it using state-of-the-art FL benchmarks. Experimental evaluation shows that TiFL outperforms conventional FL in various heterogeneous conditions. With the proposed adaptive tier selection policy, we demonstrate that TiFL achieves much faster training while achieving the same or better test accuracy across the board.
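
The tiering idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the equal-size tier split, the accuracy-shortfall weighting, and every name used here (`assign_tiers`, `tier_probabilities`, `select_clients`, the latency and accuracy values) are assumptions made for illustration only. The sketch groups clients by measured training latency, draws one tier per round, and samples all of that round's clients from the drawn tier so that fast clients do not wait on slow ones.

```python
import random
from typing import Dict, List, Sequence, Tuple


def assign_tiers(latencies: Dict[str, float], num_tiers: int) -> List[List[str]]:
    """Sort clients by measured latency and split them into near-equal tiers.

    Tier 0 holds the fastest clients. Equal-size splitting is an assumption
    of this sketch, not something stated in the abstract.
    """
    ordered = sorted(latencies, key=latencies.get)
    size, extra = divmod(len(ordered), num_tiers)
    tiers: List[List[str]] = []
    start = 0
    for i in range(num_tiers):
        end = start + size + (1 if i < extra else 0)
        tiers.append(ordered[start:end])
        start = end
    return tiers


def tier_probabilities(tier_accuracy: Sequence[float]) -> List[float]:
    """Hypothetical adaptive rule: weight each tier by its accuracy shortfall,
    so tiers on which the model does worse are selected more often."""
    shortfall = [max(1.0 - acc, 1e-6) for acc in tier_accuracy]
    total = sum(shortfall)
    return [s / total for s in shortfall]


def select_clients(
    tiers: List[List[str]],
    probs: Sequence[float],
    clients_per_round: int,
    rng: random.Random,
) -> Tuple[int, List[str]]:
    """Draw one tier, then sample every client for this round from that tier."""
    tier_id = rng.choices(range(len(tiers)), weights=probs, k=1)[0]
    pool = tiers[tier_id]
    chosen = rng.sample(pool, min(clients_per_round, len(pool)))
    return tier_id, chosen
```

A round might then look like this, with made-up latencies (seconds) and per-tier accuracies:

```python
latencies = {"a": 1.0, "b": 9.5, "c": 1.2, "d": 4.8, "e": 5.1, "f": 10.3}
tiers = assign_tiers(latencies, num_tiers=3)   # fastest clients land in tiers[0]
probs = tier_probabilities([0.9, 0.8, 0.7])    # recomputed as accuracies change
tier_id, chosen = select_clients(tiers, probs, 2, random.Random(0))
```

Recomputing `probs` between rounds is what makes the selection adaptive in this sketch; a fixed `probs` would correspond to a static tier policy.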

Original language: English (US)
Title of host publication: HPDC 2020 - Proceedings of the 29th International Symposium on High-Performance Parallel and Distributed Computing
Publisher: Association for Computing Machinery, Inc
Pages: 125-136
Number of pages: 12
ISBN (Electronic): 9781450370523
State: Published - Jun 23 2020
Externally published: Yes
Event: 29th International Symposium on High-Performance Parallel and Distributed Computing, HPDC 2020 - Stockholm, Sweden
Duration: Jun 23 2020 to Jun 26 2020

Publication series

Name: HPDC 2020 - Proceedings of the 29th International Symposium on High-Performance Parallel and Distributed Computing

Conference

Conference: 29th International Symposium on High-Performance Parallel and Distributed Computing, HPDC 2020
Country/Territory: Sweden
City: Stockholm
Period: 6/23/20 to 6/26/20

Bibliographical note

Publisher Copyright:
© 2020 ACM.

Keywords

  • data heterogeneity
  • edge computing
  • federated learning
  • non-IID
  • resource heterogeneity
  • stragglers
