The non-deterministic nature of multi-threaded workloads running on multi-core platforms often leads to notable run-to-run performance variability. Such variability makes experimental results prone to misinterpretation and misguided claims. The usual remedy is statistical inference: the experiment or measurement is repeated a large number of times and the results are summarized at a given confidence level. However, such statistical summaries are often too vague or too simplistic. They do not help users understand the causes of the variability, nor do they support more in-depth analysis of the results or reproduction of the results for validation during design space exploration. To improve analyzability and reproducibility, we propose VarCatcher, a framework for tackling such variability. The key to VarCatcher is to characterize a parallel execution using a Parallel Characteristics Vector (PCV). A clustering-based approach then groups runs with similar execution characteristics; the resulting groups can later be used to analyze results in depth, to customize evaluation strategies, to reproduce results under variability, to determine the impact of features, or to assist performance diagnosis. We have built a prototype of VarCatcher that includes a user-level toolset for runtime monitoring and measurement using the Intel Processor Trace feature on commodity Intel processors, as well as an architecture extension, with very low runtime overheads (around 3 and 0.01 percent, respectively). Several case studies confirm that VarCatcher enables appealing features such as in-depth result analysis, customized evaluation strategies, and reproducibility.
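The idea of grouping runs by their execution characteristics can be illustrated with a minimal sketch. It is not the VarCatcher implementation: the abstract does not specify what a PCV contains or which clustering algorithm is used, so the two features per run (lock-wait fraction and thread-migration count), the greedy leader clustering, the threshold, and all sample values below are assumptions chosen only for illustration.

```python
from math import sqrt


def normalize(vectors):
    """Scale every dimension to [0, 1] so that no single feature dominates."""
    dims = list(zip(*vectors))
    lows = [min(d) for d in dims]
    spans = [(max(d) - min(d)) or 1.0 for d in dims]
    return [tuple((x - lo) / s for x, lo, s in zip(v, lows, spans))
            for v in vectors]


def distance(a, b):
    """Euclidean distance between two equally sized vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_runs(vectors, threshold):
    """Greedy leader clustering (a stand-in, not the paper's algorithm).

    Each run joins the first cluster whose leader lies within `threshold`;
    otherwise the run becomes the leader of a new cluster.
    """
    leaders, labels = [], []
    for v in vectors:
        for idx, leader in enumerate(leaders):
            if distance(v, leader) <= threshold:
                labels.append(idx)
                break
        else:
            leaders.append(v)
            labels.append(len(leaders) - 1)
    return labels


# Hypothetical characteristic vectors, one per run:
# (fraction of time waiting on locks, number of thread migrations)
vectors = [(0.10, 4), (0.11, 5), (0.35, 20), (0.12, 4), (0.33, 22)]
# Hypothetical measured runtimes in seconds for the same five runs
runtimes = [12.1, 12.3, 15.8, 12.0, 16.1]

labels = cluster_runs(normalize(vectors), threshold=0.5)
print(labels)  # [0, 0, 1, 0, 1]

# Summarize runtime per cluster instead of over all runs at once.
for cluster in sorted(set(labels)):
    members = [t for t, lab in zip(runtimes, labels) if lab == cluster]
    print(cluster, len(members), sum(members) / len(members))
```

Summarizing each group separately, rather than averaging over all runs, is what lets a slow group of runs be tied back to a shared execution characteristic instead of being absorbed into a single wide confidence interval.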
|Original language|English (US)|
|Number of pages|14|
|Journal|IEEE Transactions on Parallel and Distributed Systems|
|State|Published - Apr 1 2017|
Bibliographical note
Funding Information:
We are grateful for support from the National Key Research and Development Program of China (No. 2016YFB0800104), the National Natural Science Foundation of China (No. 61672160), and the Shanghai Science and Technology Development Funds (16JC1400801). We thank our anonymous reviewers for their valuable feedback on the paper.
- multi core
- parallel application