A fundamental tradeoff between computation and communication in distributed computing

Songze Li, Mohammad Ali Maddah-Ali, Qian Yu, A. Salman Avestimehr

Research output: Contribution to journalArticlepeer-review

310 Scopus citations

Abstract

How can we optimally trade extra computing power to reduce the communication load in distributed computing? We answer this question by characterizing a fundamental tradeoff between computation and communication in distributed computing, i.e., the two are inversely proportional to each other. More specifically, a general distributed computing framework, motivated by commonly used structures like MapReduce, is considered, where the overall computation is decomposed into computing a set of “Map” and “Reduce” functions distributedly across multiple computing nodes. A coded scheme, named “coded distributed computing” (CDC), is proposed to demonstrate that increasing the computation load of the Map functions by a factor of r (i.e., evaluating each function at r carefully chosen nodes) can create novel coding opportunities that reduce the communication load by the same factor. An information-theoretic lower bound on the communication load is also provided, which matches the communication load achieved by the CDC scheme. As a result, the optimal computation-communication tradeoff in distributed computing is exactly characterized. Finally, the coding techniques of CDC is applied to the Hadoop TeraSort benchmark to develop a novel CodedTeraSort algorithm, which is empirically demonstrated to speed up the overall job execution by 1.97× - 3.39×, for typical settings of interest.

Original languageEnglish (US)
Pages (from-to)109-128
Number of pages20
JournalIEEE Transactions on Information Theory
Volume64
Issue number1
DOIs
StatePublished - 2018
Externally publishedYes

Bibliographical note

Publisher Copyright:
© 2017 IEEE.

Keywords

  • Coded multicasting
  • Coded terasort
  • Computation-communication tradeoff
  • Distributed computing
  • MapReduce

Fingerprint

Dive into the research topics of 'A fundamental tradeoff between computation and communication in distributed computing'. Together they form a unique fingerprint.

Cite this