Abstract
In Federated Learning (FL), clients independently train local models and share them with a central aggregator to build a global model. The prohibition on accessing clients' data, combined with collaborative training, makes FL appealing for applications with data-privacy concerns, such as medical imaging. However, these FL characteristics pose unprecedented challenges for debugging. When a global model's performance deteriorates, identifying the responsible rounds and clients is a major pain point. Developers resort to trial-and-error debugging with subsets of clients, hoping to increase the global model's accuracy, or they let future FL rounds retune the model; both approaches are time-consuming and costly. We design a systematic fault localization framework, FedDebug, that advances FL debugging on two novel fronts. First, FedDebug enables interactive debugging of real-time collaborative training in FL by leveraging record and replay techniques to construct a simulation that mirrors live FL. FedDebug's breakpoint can help inspect an FL state (round, client, and global model) and move between rounds and clients' models seamlessly, enabling a fine-grained step-by-step inspection. Second, FedDebug automatically identifies the client(s) responsible for lowering the global model's performance without any testing data or labels, both of which are essential for existing debugging techniques. FedDebug's strengths come from adapting differential testing in conjunction with neuron activations to determine the client(s) deviating from normal behavior. FedDebug achieves 100% accuracy in finding a single faulty client and 90.3% accuracy in finding multiple faulty clients. FedDebug's interactive debugging incurs 1.2% overhead during training, while it localizes a faulty client in only 2.1% of a round's training time. With FedDebug, we bring effective debugging practices to federated learning, improving the quality and productivity of FL application developers.
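The abstract does not give algorithmic detail, so the following is only a minimal sketch of the general idea of differential testing over neuron activations, not FedDebug's implementation. It assumes each client model is a single ReLU layer stored as a list of weight rows, feeds every client the same randomly generated inputs (so no test data or labels are needed), and flags the client whose set of firing neurons most often disagrees with the majority. The names `activation_pattern`, `localize_faulty_client`, and `make_clients`, and the choice to simulate a faulty client by negating its weights, are all hypothetical.

```python
import random


def activation_pattern(weights, x):
    """Return, per hidden neuron, whether its ReLU output is positive for x."""
    return [sum(w * xi for w, xi in zip(row, x)) > 0 for row in weights]


def localize_faulty_client(client_weights, num_inputs=50, seed=0):
    """Return the index of the client whose activations deviate most.

    Assumption: a client is "suspicious" when its neurons disagree with the
    per-neuron majority across all clients on shared random inputs.
    """
    rng = random.Random(seed)
    dim = len(client_weights[0][0])
    deviation = [0] * len(client_weights)
    for _ in range(num_inputs):
        # Random input: no real test data or labels are required.
        x = [rng.random() for _ in range(dim)]
        patterns = [activation_pattern(w, x) for w in client_weights]
        # zip(*patterns) yields one tuple per neuron, spanning all clients.
        for neuron in zip(*patterns):
            majority = sum(neuron) * 2 > len(neuron)
            for client, fired in enumerate(neuron):
                if fired != majority:
                    deviation[client] += 1
    return max(range(len(deviation)), key=deviation.__getitem__)


def make_clients(num_clients=5, faulty=3, neurons=8, dim=6, seed=1):
    """Build toy clients: healthy ones are near-copies of a base model,
    while the (hypothetical) faulty one has its weights negated."""
    rng = random.Random(seed)
    base = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(neurons)]
    clients = []
    for c in range(num_clients):
        if c == faulty:
            clients.append([[-w for w in row] for row in base])
        else:
            clients.append(
                [[w + rng.uniform(-0.01, 0.01) for w in row] for row in base]
            )
    return clients


if __name__ == "__main__":
    print(localize_faulty_client(make_clients()))
```

Comparing clients against each other, rather than against ground-truth labels, is what lets a scheme like this run at the aggregator, where only the submitted models are available. A real system would need to handle deep networks and more than one faulty client.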
| Original language | English (US) |
|---|---|
| Title of host publication | Proceedings - 2023 IEEE/ACM 45th International Conference on Software Engineering, ICSE 2023 |
| Publisher | IEEE Computer Society |
| Pages | 512-523 |
| Number of pages | 12 |
| ISBN (Electronic) | 9781665457019 |
| DOIs | |
| State | Published - Jul 26 2023 |
| Event | 45th IEEE/ACM International Conference on Software Engineering, ICSE 2023 - Melbourne, Australia; Duration: May 15 2023 → May 16 2023 |
Publication series
| Name | Proceedings - International Conference on Software Engineering |
|---|---|
| ISSN (Print) | 0270-5257 |
Conference
| Conference | 45th IEEE/ACM International Conference on Software Engineering, ICSE 2023 |
|---|---|
| Country/Territory | Australia |
| City | Melbourne |
| Period | 5/15/23 → 5/16/23 |
Bibliographical note
Publisher Copyright: © 2023 IEEE.
Keywords
- CNN
- client
- fault localization
- federated learning
- neural networks
- software debugging
- testing
Fingerprint
Dive into the research topics of 'FedDebug: Systematic Debugging for Federated Learning Applications'. Together they form a unique fingerprint.