We consider the data shuffling problem, in which a master node is connected to a set of worker nodes, via a shared link, in order to communicate a set of files to the worker nodes. The master node has access to a database of files. In every shuffling iteration, each worker node processes a new subset of files, and has excess storage to partially cache the remaining files. We characterize the exact rate-memory trade-off for the worst-case shuffling under the assumption that cached files are uncoded, by deriving the minimum communication rate for a given storage capacity per worker node. As a byproduct, the exact rate-memory trade-off for any random shuffling is characterized when the number of files is equal to the number of worker nodes. We propose a novel deterministic and systematic coded shuffling scheme, which improves the state of the art. Then, we prove the optimality of our proposed scheme by deriving a matching lower bound and showing that the placement phase of the proposed coded shuffling scheme is optimal over all shuffles.
|Original language||English (US)|
|Title of host publication||2018 IEEE International Symposium on Information Theory, ISIT 2018|
|Publisher||Institute of Electrical and Electronics Engineers Inc.|
|Number of pages||5|
|State||Published - Aug 15 2018|
|Event||2018 IEEE International Symposium on Information Theory, ISIT 2018 - Vail, United States|
Duration: Jun 17 2018 → Jun 22 2018
|Name||IEEE International Symposium on Information Theory - Proceedings|
|Other||2018 IEEE International Symposium on Information Theory, ISIT 2018|
|Period||6/17/18 → 6/22/18|
Bibliographical notePublisher Copyright:
© 2018 IEEE.