Building Large Models from Small Distributed Models: A Layer Matching Approach

Xinwei Zhang, Bingqing Song, Mehrdad Honarkhah, Jie Ding, Mingyi Hong

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Cross-device federated learning (FL) has emerged as a promising approach to leverage the collective power of numerous clients for training machine learning models using their local data and computational resources. Meanwhile, huge models have recently been developed to fully utilize such a large amount of data and achieve high performance on various tasks. However, training such large models requires substantial storage and computational resources, and resource limitations of the individual client devices in the FL systems often prevent the utilization of large modern deep learning models. In this paper, we address this challenge by proposing a novel federated layer matching (FLM) algorithm that enables the server to construct a deep server model by leveraging relatively shallow client models. The FLM algorithm dynamically matches and combines similar layers from the client models into the server model while incorporating dissimilar layers as new layers. By employing this strategy, clients can train smaller models that align with their device capacities, while the server can aggregate and derive a larger, more powerful server model by leveraging the distributed client data. Our empirical results demonstrate that the FLM algorithm enables the construction of a server model that is significantly larger than the individual client models, and the server model matches the performance of the same-sized model trained in the centralized setting. Further, the model constructed by FLM significantly outperforms the model obtained using the classical model averaging algorithm under the same amount of communication and client computational resources.
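The match-or-append idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' FLM implementation: the flat-list layer representation, the cosine-similarity measure, the fixed `threshold`, the plain averaging step, and the function names `cosine_similarity` and `merge_client_layers` are all assumptions made for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two flattened weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def merge_client_layers(server_layers, client_layers, threshold=0.9):
    """Merge one client's layers into the server's layer list.

    Hypothetical rule (an assumption, not taken from the paper): a client
    layer whose best cosine similarity to an existing server layer reaches
    `threshold` is averaged into that layer; otherwise it is appended as a
    new server layer.
    """
    merged = [list(w) for w in server_layers]
    n_base = len(merged)  # only compare against layers the server already had
    for w_client in client_layers:
        sims = [cosine_similarity(w_client, merged[i]) for i in range(n_base)]
        best = max(range(n_base), key=lambda i: sims[i]) if n_base else None
        if best is not None and sims[best] >= threshold:
            # Similar layer: combine by simple averaging.
            merged[best] = [(s + c) / 2 for s, c in zip(merged[best], w_client)]
        else:
            # Dissimilar layer: keep it as an additional server layer.
            merged.append(list(w_client))
    return merged


server = [[1.0, 1.0, 1.0, 1.0], [1.0, -1.0, -1.0, 1.0]]
client = [[3.0, 3.0, 3.0, 3.0], [1.0, 0.0, 0.0, -1.0]]
merged = merge_client_layers(server, client)
print(len(merged))  # 3 layers: one averaged, one unchanged, one newly added
```

In this toy run the server model ends up deeper than either input: the first client layer is merged into an existing server layer, while the second is dissimilar to every server layer and is added as a new one.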

Original language: English (US)
Title of host publication: 2024 IEEE 13rd Sensor Array and Multichannel Signal Processing Workshop, SAM 2024
Publisher: IEEE Computer Society
ISBN (Electronic): 9798350344813
State: Published - 2024
Event: 13rd IEEE Sensor Array and Multichannel Signal Processing Workshop, SAM 2024 - Corvallis, United States
Duration: Jul 8 2024 to Jul 11 2024

Publication series

Name: Proceedings of the IEEE Sensor Array and Multichannel Signal Processing Workshop
ISSN (Electronic): 2151-870X

Conference

Conference: 13rd IEEE Sensor Array and Multichannel Signal Processing Workshop, SAM 2024
Country/Territory: United States
City: Corvallis
Period: 7/8/24 to 7/11/24

Bibliographical note

Publisher Copyright:
© 2024 IEEE.