Communication overhead in distributed systems accounts for a large fraction of total execution time and limits the scalability of applications running on these systems. We propose an approximate communication scheme based on the discrete cosine transform (DCT) that exploits the error resilience of several widely used applications and improves communication efficiency by substantially reducing message lengths. The scheme is implemented in the Message Passing Interface (MPI) library. Evaluated on several representative MPI applications on a real cluster system, it reduces the fraction of total execution time devoted to communication from 59% to 23%, even after accounting for the computational overhead of DCT encoding. For many communication-intensive applications, the scheme reduces total execution time with little loss in result quality.
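The abstract does not describe the encoding in detail, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each message block is transformed with an orthonormal DCT-II and that only the first `keep` low-frequency coefficients are sent. The names `dct`, `idct`, `encode`, `decode`, and `keep` are hypothetical, and no MPI calls are shown.

```python
import math

def dct(block):
    """Orthonormal DCT-II of a list of floats."""
    n = len(block)
    out = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        out.append(scale * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, x in enumerate(block)))
    return out

def idct(coeffs, n):
    """Inverse of dct() for a block of length n.

    Coefficients beyond len(coeffs) are treated as zero, which is what
    makes the reconstruction approximate after truncation.
    """
    out = []
    for i in range(n):
        out.append(sum(math.sqrt((1.0 if k == 0 else 2.0) / n) * c
                       * math.cos(math.pi * (i + 0.5) * k / n)
                       for k, c in enumerate(coeffs)))
    return out

def encode(block, keep):
    """Sender side (assumed): keep only the first `keep` coefficients."""
    return dct(block)[:keep]

def decode(payload, n):
    """Receiver side (assumed): rebuild an n-element approximation."""
    return idct(payload, n)

def squared_error(a, b):
    """Sum of squared differences between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Illustrative data only: a smooth 32-element block.
block = [math.sin(2 * math.pi * i / 64) for i in range(32)]
payload = encode(block, keep=8)       # 8 values sent instead of 32
approx = decode(payload, len(block))  # lossy reconstruction
print(len(payload), squared_error(block, approx))
```

Because the transform is orthonormal, the squared reconstruction error equals the energy of the dropped coefficients, so `keep` trades message length directly against accuracy.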
|Original language||English (US)|
|Title of host publication||2019 IEEE 38th International Performance Computing and Communications Conference, IPCCC 2019|
|Publisher||Institute of Electrical and Electronics Engineers Inc.|
|State||Published - Oct 2019|
|Event||38th IEEE International Performance Computing and Communications Conference, IPCCC 2019 - London, United Kingdom (Duration: Oct 29 2019 → Oct 31 2019)|
Bibliographical note
Funding Information:
This work was supported in part by U.S. National Science Foundation grant no. CCF-1438286. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF.
© 2019 IEEE.