TY - GEN
T1 - Accurate statistical approaches for generating representative workload compositions
AU - Eeckhout, Lieven
AU - Sundareswara, Rashmi
AU - Yi, Joshua J.
AU - Lilja, David J.
AU - Schrater, Paul R.
N1 - Copyright 2011 Elsevier B.V., All rights reserved.
PY - 2005
Y1 - 2005
N2 - Composing a representative workload is a crucial step in the design process of a microprocessor. The workload should be representative of the target application domain, yet contain as little redundancy as possible so that total simulation time does not grow unnecessarily. As a result, an important trade-off must be made between workload representativeness and simulation accuracy on the one hand and simulation speed on the other. Previous work used statistical data analysis techniques to identify a subset of representative benchmarks and corresponding inputs from a large set of candidate benchmarks and inputs. These methodologies measure a number of program characteristics, apply Principal Components Analysis (PCA) to them, and then identify distinct program behaviors among the benchmarks using cluster analysis. In this paper we propose Independent Components Analysis (ICA) as a better alternative to PCA because it does not assume that the original data set has a Gaussian distribution, which allows ICA to better find the important axes in the workload space. Our experimental results using SPEC CPU2000 benchmarks show that ICA significantly outperforms PCA: ICA yields smaller benchmark subsets that are more accurate than those found by PCA.
AB - Composing a representative workload is a crucial step in the design process of a microprocessor. The workload should be representative of the target application domain, yet contain as little redundancy as possible so that total simulation time does not grow unnecessarily. As a result, an important trade-off must be made between workload representativeness and simulation accuracy on the one hand and simulation speed on the other. Previous work used statistical data analysis techniques to identify a subset of representative benchmarks and corresponding inputs from a large set of candidate benchmarks and inputs. These methodologies measure a number of program characteristics, apply Principal Components Analysis (PCA) to them, and then identify distinct program behaviors among the benchmarks using cluster analysis. In this paper we propose Independent Components Analysis (ICA) as a better alternative to PCA because it does not assume that the original data set has a Gaussian distribution, which allows ICA to better find the important axes in the workload space. Our experimental results using SPEC CPU2000 benchmarks show that ICA significantly outperforms PCA: ICA yields smaller benchmark subsets that are more accurate than those found by PCA.
UR - http://www.scopus.com/inward/record.url?scp=33749055123&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33749055123&partnerID=8YFLogxK
U2 - 10.1109/IISWC.2005.1526001
DO - 10.1109/IISWC.2005.1526001
M3 - Conference contribution
AN - SCOPUS:33749055123
SN - 0780394615
SN - 9780780394612
T3 - Proceedings of the 2005 IEEE International Symposium on Workload Characterization, IISWC-2005
SP - 56
EP - 66
BT - Proceedings of the 2005 IEEE International Symposium on Workload Characterization, IISWC-2005
T2 - 2005 IEEE International Symposium on Workload Characterization, IISWC-2005
Y2 - 6 October 2005 through 8 October 2005
ER -