VarCatcher: A Framework for Tackling Performance Variability of Parallel Workloads on Multi-Core

Weihua Zhang, Xiaofeng Ji, Bo Song, Shiqiang Yu, Haibo Chen, Tao Li, Pen Chung Yew, Wenyun Zhao

Research output: Contribution to journal › Article › peer-review

7 Scopus citations

Abstract

The non-deterministic nature of multi-threaded workloads running on multi-core platforms often leads to notable performance variability from run to run. Such variability makes experimental results prone to misinterpretation or misguided claims. To deal with it, statistical inference methods are usually used to summarize experimental results at certain confidence levels by running the experiments or measurements a large number of times. However, such statistical results are often too vague or too simplistic. They are not sufficient to help users understand the causes of the variability, to support more in-depth analysis of the results, or to reproduce the results for validation during design space exploration. To allow better analyzability and reproducibility, we propose a framework to tackle such variability, called VarCatcher. The key to VarCatcher is to characterize a parallel execution using a Parallel Characteristics Vector (PCV). A clustering-based approach is then used to group runs with similar execution characteristics; these groups can later be used to analyze results in depth, to customize evaluation strategies, to reproduce results under variability, to determine the impact of features, or to assist performance diagnosis. We have built a prototype of VarCatcher that includes a user-level toolset for runtime monitoring and measurement using the Intel Processor Trace feature on commodity Intel processors, as well as an architecture extension, with very low runtime overheads (around 3 and 0.01 percent, respectively). Several case studies confirm that VarCatcher enables appealing features such as in-depth result analysis, customized evaluation strategies, and reproducibility.
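The idea of grouping runs by their execution characteristics can be illustrated with a minimal sketch. It is not VarCatcher's implementation: the abstract does not specify what a PCV contains or which clustering algorithm is used, so the three per-run features, the `cluster_runs` helper, the threshold value, and the use of simple leader clustering are all assumptions made for illustration.

```python
import math


def normalize(vectors):
    """Scale every feature to [0, 1] so no single feature dominates the distance."""
    dims = len(vectors[0])
    lows = [min(v[d] for v in vectors) for d in range(dims)]
    highs = [max(v[d] for v in vectors) for d in range(dims)]
    return [
        [
            (v[d] - lows[d]) / (highs[d] - lows[d]) if highs[d] > lows[d] else 0.0
            for d in range(dims)
        ]
        for v in vectors
    ]


def cluster_runs(vectors, threshold):
    """Leader clustering (an assumed stand-in for the paper's method).

    Each run joins the first cluster whose leader lies within `threshold`
    (Euclidean distance on normalized features); otherwise it starts a new
    cluster. Returns a list of clusters, each a list of run indices.
    """
    scaled = normalize(vectors)
    leaders = []   # normalized vector of the first run in each cluster
    clusters = []  # run indices per cluster
    for i, v in enumerate(scaled):
        for c, leader in enumerate(leaders):
            if math.dist(v, leader) <= threshold:
                clusters[c].append(i)
                break
        else:
            leaders.append(v)
            clusters.append([i])
    return clusters


# Hypothetical per-run vectors: (lock wait in ms, context switches, migrations).
runs = [
    (12.0, 40, 3),
    (11.5, 42, 3),
    (30.2, 95, 11),
    (29.8, 97, 10),
    (12.3, 41, 4),
]

groups = cluster_runs(runs, threshold=0.3)
print(groups)
```

With these made-up numbers, the runs fall into two groups, which could then be summarized or re-examined separately rather than averaged into one figure. Leader clustering is order-dependent; it is used here only because it is short and deterministic.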

Original language: English (US)
Article number: 7576653
Pages (from-to): 1215-1228
Number of pages: 14
Journal: IEEE Transactions on Parallel and Distributed Systems
Volume: 28
Issue number: 4
State: Published - Apr 1 2017

Bibliographical note

Funding Information:
We are grateful for support from the National Key Research and Development Program of China (No. 2016YFB0800104), the National Natural Science Foundation of China (No. 61672160), and the Shanghai Science and Technology Development Funds (No. 16JC1400801). We would like to thank all our anonymous reviewers for their valuable feedback on the paper.

Keywords

  • Variability
  • evaluation
  • multi core
  • parallel application
