General Coded Computing in a Probabilistic Straggler Regime

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Coded computing has demonstrated promising results in addressing straggler resiliency in distributed computing systems. However, most coded computing schemes are designed for exact computation, requiring the number of responding servers to exceed a certain recovery threshold. Additionally, these schemes are tailored to highly structured functions. Recently, new coded computing schemes for general computing functions, where exact computation is replaced with approximate computation, have emerged. In these schemes, the availability of additional results leads to more accurate estimations of the computational tasks. This flexibility introduces new questions that need to be addressed. This paper considers a practically important scenario in the context of general coded computing, where each server may become a straggler with probability p, independently of others. We theoretically analyze the approximation error of two existing general coded computing schemes: Berrut Approximate Coded Computing (BACC) and Learning-Theoretic Coded Computing (LeTCC). Under the probabilistic straggler configuration, we demonstrate that the average approximation error for BACC and LeTCC converges to zero at rates of at least O(log1/p4(N) · N-2) and O(log1 / p3(N) · N-3), respectively. This is perhaps surprising, as earlier results do not indicate convergence when the number of stragglers scales with the total number of servers N. However, in this case, despite the average number of stragglers being Np, the independence of servers in becoming stragglers allows the approximation error to converge to zero. These theoretical results are validated through experiments on various computational tasks, including deep neural networks.

Original languageEnglish (US)
Title of host publicationISIT 2025 - 2025 IEEE International Symposium on Information Theory, Proceedings
PublisherInstitute of Electrical and Electronics Engineers Inc.
ISBN (Electronic)9798331543990
DOIs
StatePublished - 2025
Event2025 IEEE International Symposium on Information Theory, ISIT 2025 - Ann Arbor, United States
Duration: Jun 22 2025Jun 27 2025

Publication series

NameIEEE International Symposium on Information Theory - Proceedings
ISSN (Print)2157-8095

Conference

Conference2025 IEEE International Symposium on Information Theory, ISIT 2025
Country/TerritoryUnited States
CityAnn Arbor
Period6/22/256/27/25

Bibliographical note

Publisher Copyright:
© 2025 IEEE.

Fingerprint

Dive into the research topics of 'General Coded Computing in a Probabilistic Straggler Regime'. Together they form a unique fingerprint.

Cite this