TY - JOUR
T1 - ASLOP
T2 - A field-access affinity-based structure data layout optimizer
AU - Yan, Jia Nian
AU - He, Jiang Zhou
AU - Chen, Wen Guang
AU - Yew, Pen Chung
AU - Zheng, Wei Min
PY - 2011/9
Y1 - 2011/9
N2 - By rearranging the data, data layout optimizations improve the utilization of a cache line between two of its successive refills, thus reducing the total number of cache line refills and improving the performance of a program. In this paper, we show that to enable structure data layout optimizations to be effective, two parameters, namely intra-instance affinity and inter-instance affinity, need to be considered at the same time in order to model the cache line utilization more accurately. We also propose a lightweight approach to measure intra-instance affinity and inter-instance affinity to avoid complex memory trace analyses. A prototype, called ASLOP, has been implemented in the Open64 compiler and evaluated using benchmarks from SPEC CPU 2000, SPEC CPU 2006 and Olden benchmark suites that have extensive structure types. Our approach can achieve up to 48.1% performance improvement over the original programs, and 11.9% over the optimized programs using maximal reshaping, an existing approach that is known to produce close to the best results, on the two platforms we tested.
AB - By rearranging the data, data layout optimizations improve the utilization of a cache line between two of its successive refills, thus reducing the total number of cache line refills and improving the performance of a program. In this paper, we show that to enable structure data layout optimizations to be effective, two parameters, namely intra-instance affinity and inter-instance affinity, need to be considered at the same time in order to model the cache line utilization more accurately. We also propose a lightweight approach to measure intra-instance affinity and inter-instance affinity to avoid complex memory trace analyses. A prototype, called ASLOP, has been implemented in the Open64 compiler and evaluated using benchmarks from SPEC CPU 2000, SPEC CPU 2006 and Olden benchmark suites that have extensive structure types. Our approach can achieve up to 48.1% performance improvement over the original programs, and 11.9% over the optimized programs using maximal reshaping, an existing approach that is known to produce close to the best results, on the two platforms we tested.
KW - compiler optimization
KW - data-layout optimization
KW - inter-instance affinity
KW - intra-instance affinity
KW - memory hierarchy
UR - http://www.scopus.com/inward/record.url?scp=79961020409&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79961020409&partnerID=8YFLogxK
U2 - 10.1007/s11432-011-4265-0
DO - 10.1007/s11432-011-4265-0
M3 - Article
AN - SCOPUS:79961020409
SN - 1674-733X
VL - 54
SP - 1769
EP - 1783
JO - Science China Information Sciences
JF - Science China Information Sciences
IS - 9
ER -