Minimizing Tardiness for Data-Intensive Applications in Heterogeneous Systems: A Matching Theory Perspective

Ke Xu, Liang Lv, Tong Li, Meng Shen, Haiyang Wang, Kun Yang

Research output: Contribution to journal › Article

Abstract

The increasing data requirements of Internet applications have driven a surge of new programming paradigms and complex scheduling algorithms for handling data-intensive workloads. Because the flows in these workloads are growing in volume and variety, their raw data are often processed on Intermediate Processing Nodes (IPNs) before being sent to servers. However, this intermediate processing constraint is rarely considered in existing flow computing models. This paper aims to minimize the tardiness of data-intensive applications under the intermediate processing constraint. Motivating cases show that tardiness is affected by both IPN locations and flow dispatching strategies. Based on the observation that dispatching flows to IPNs essentially builds a matching between flows and IPNs, a novel solution based on matching theory is proposed. In the deployment phase, a tardiness-aware deferred acceptance algorithm is developed to optimize IPN locations. In the operation phase, the Power-of-D paradigm and matching theory are combined to dispatch flows efficiently. Evaluation results show that the solution effectively minimizes the total tardiness of data-intensive applications in heterogeneous systems.

Original language: English (US)
Article number: 8772187
Pages (from-to): 144-158
Number of pages: 15
Journal: IEEE Transactions on Parallel and Distributed Systems
Volume: 31
Issue number: 1
DOI: 10.1109/TPDS.2019.2930992
State: Published - Jan 1 2020

Keywords

  • Heterogeneous system
  • data-intensive application
  • matching theory
  • power-of-D

Cite this

Minimizing Tardiness for Data-Intensive Applications in Heterogeneous Systems: A Matching Theory Perspective. / Xu, Ke; Lv, Liang; Li, Tong; Shen, Meng; Wang, Haiyang; Yang, Kun.

In: IEEE Transactions on Parallel and Distributed Systems, Vol. 31, No. 1, 8772187, 01.01.2020, p. 144-158.

@article{8b3f84f6bb56413e8e89075d80fbf578,
title = "Minimizing Tardiness for Data-Intensive Applications in Heterogeneous Systems: A Matching Theory Perspective",
abstract = "The increasing data requirements of Internet applications have driven a dramatic surge in developing new programming paradigms and complex scheduling algorithms to handle data-intensive workloads. Due to the expanding volume and the variety of such flows, their raw data are often processed on Intermediate Processing Nodes (IPNs) before being sent to servers. However, the intermediate processing constraint is rarely considered in existing flow computing models. This paper aims to minimize the tardiness of data-intensive applications in the presence of intermediate processing constraint. Motivating cases show that the tardiness is affected by both IPN locations and flow dispatching strategies. Based on the observation that dispatching flows to IPNs is essentially building a matching between flows and IPNs, a novel solution is proposed based on matching theory. In the deployment phase, a tardiness-aware deferred acceptance algorithm is developed to optimize IPN locations. In the operation phase, the Power-of-D paradigm and matching theory are combined together to dispatch flows efficiently. Evaluation results show that our solution effectively minimizes the total tardiness of data-intensive applications in heterogeneous systems.",
keywords = "Heterogeneous system, data-intensive application, matching theory, power-of-D",
author = "Ke Xu and Liang Lv and Tong Li and Meng Shen and Haiyang Wang and Kun Yang",
year = "2020",
month = "1",
day = "1",
doi = "10.1109/TPDS.2019.2930992",
language = "English (US)",
volume = "31",
pages = "144--158",
journal = "IEEE Transactions on Parallel and Distributed Systems",
issn = "1045-9219",
publisher = "IEEE Computer Society",
number = "1",

}

TY - JOUR

T1 - Minimizing Tardiness for Data-Intensive Applications in Heterogeneous Systems

T2 - A Matching Theory Perspective

AU - Xu, Ke

AU - Lv, Liang

AU - Li, Tong

AU - Shen, Meng

AU - Wang, Haiyang

AU - Yang, Kun

PY - 2020/1/1

Y1 - 2020/1/1

AB - The increasing data requirements of Internet applications have driven a dramatic surge in developing new programming paradigms and complex scheduling algorithms to handle data-intensive workloads. Due to the expanding volume and the variety of such flows, their raw data are often processed on Intermediate Processing Nodes (IPNs) before being sent to servers. However, the intermediate processing constraint is rarely considered in existing flow computing models. This paper aims to minimize the tardiness of data-intensive applications in the presence of intermediate processing constraint. Motivating cases show that the tardiness is affected by both IPN locations and flow dispatching strategies. Based on the observation that dispatching flows to IPNs is essentially building a matching between flows and IPNs, a novel solution is proposed based on matching theory. In the deployment phase, a tardiness-aware deferred acceptance algorithm is developed to optimize IPN locations. In the operation phase, the Power-of-D paradigm and matching theory are combined together to dispatch flows efficiently. Evaluation results show that our solution effectively minimizes the total tardiness of data-intensive applications in heterogeneous systems.

KW - Heterogeneous system

KW - data-intensive application

KW - matching theory

KW - power-of-D

UR - http://www.scopus.com/inward/record.url?scp=85070685168&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85070685168&partnerID=8YFLogxK

U2 - 10.1109/TPDS.2019.2930992

DO - 10.1109/TPDS.2019.2930992

M3 - Article

AN - SCOPUS:85070685168

VL - 31

SP - 144

EP - 158

JO - IEEE Transactions on Parallel and Distributed Systems

JF - IEEE Transactions on Parallel and Distributed Systems

SN - 1045-9219

IS - 1

M1 - 8772187

ER -