Evaluating benchmark subsetting approaches

Joshua J. Yi, Resit Sendag, Lieven Eeckhout, Ajay Joshi, David J. Lilja, Lizy K. John

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

22 Scopus citations


To reduce simulation time to a tractable amount, or because of compilation (or related) problems, computer architects often simulate only a subset of the benchmarks in a benchmark suite. However, if the architect chooses a subset of benchmarks that is not representative, the subsequent simulation results will, at best, be misleading or, at worst, yield incorrect conclusions. To address this problem, computer architects have recently proposed several statistically based approaches to subset a benchmark suite. While some of these approaches are well-grounded statistically, what has not yet been thoroughly evaluated is each approach's: 1) absolute accuracy, 2) relative accuracy across a range of processor and memory subsystem enhancements, and 3) representativeness and coverage for a range of subset sizes. Specifically, this paper evaluates statistically based subsetting approaches based on principal components analysis (PCA) and the Plackett and Burman (P&B) design, in addition to prevailing approaches such as integer vs. floating-point, core vs. memory-bound, by language, and at random. Our results show that the two statistically based approaches, PCA and P&B, have the best absolute and relative accuracy for CPI and energy-delay product (EDP), produce subsets that are the most representative, and choose benchmark and input set pairs that are the best distributed across the benchmark space. To achieve a 5% absolute CPI and EDP error across a wide range of configurations, PCA and P&B typically need about 17 benchmark and input set pairs, while the other five approaches often choose more than 30 benchmark and input set pairs.
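As a rough illustration of the "absolute CPI error" of a subset, the sketch below assumes the error is the percent difference between the mean CPI over the chosen subset and the mean CPI over the full suite. The paper may define the metric differently. The function name, benchmark names, and CPI values are all hypothetical.

```python
from statistics import mean


def absolute_cpi_error(cpi_by_benchmark, subset):
    """Percent error of the subset's mean CPI relative to the full suite's mean CPI.

    Assumption: a plain arithmetic mean is used for both the subset and the
    full suite; this is an illustrative definition, not the paper's.
    """
    full_mean = mean(cpi_by_benchmark.values())
    subset_mean = mean(cpi_by_benchmark[name] for name in subset)
    return abs(subset_mean - full_mean) / full_mean * 100.0


# Hypothetical per-benchmark CPI values for one processor configuration.
cpi = {"bench_a": 1.0, "bench_b": 2.0, "bench_c": 3.0, "bench_d": 2.0}

# A subset that spans the CPI range versus one that is skewed toward low CPI.
balanced_error = absolute_cpi_error(cpi, ["bench_a", "bench_c"])
skewed_error = absolute_cpi_error(cpi, ["bench_a", "bench_b"])
```

Under this assumed definition, a subsetting approach would be judged by the smallest subset size at which the error stays under a target such as 5% across the configurations of interest.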

Original language: English (US)
Title of host publication: Proceedings of the 2006 IEEE International Symposium on Workload Characterization, IISWC - 2006
Number of pages: 12
State: Published - 2006
Event: IEEE International Symposium on Workload Characterization, IISWC-2006 - San Jose, CA, United States
Duration: Oct 25 2006 → Oct 27 2006


