TY - JOUR
T1 - Access region locality for high-bandwidth processor memory system design
AU - Cho, Sangyeun
AU - Yew, Pen Chang
AU - Lee, Gyungho
PY - 1999/12/1
Y1 - 1999/12/1
AB - This paper studies an interesting yet less explored behavior of memory access instructions, called access region locality. Unlike traditional temporal and spatial data locality, which focuses on individual memory locations and how accesses to those locations are inter-related, access region locality concerns each static memory instruction and its range of access locations at run time. We consider a program's data, heap, and stack regions in this paper. Our experimental study using a set of SPEC95 benchmark programs shows that most memory reference instructions access a single region at run time. It also shows that the access region of a memory instruction can be predicted accurately at run time by examining the instruction's addressing mode and its past access region history. We develop a simple run-time access region predictor that is similar in structure to a branch predictor. We describe and evaluate a superscalar processor with two distinct sets of memory pipelines, driven by the access region predictor. Experimental results indicate that the proposed mechanism is very effective in providing high memory bandwidth to the processor, resulting in performance comparable to or better than that of a conventional memory design with a heavily multi-ported data cache, which can lead to much higher hardware complexity.
UR - http://www.scopus.com/inward/record.url?scp=0033311287&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0033311287&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:0033311287
SN - 1072-4451
SP - 136
EP - 146
JO - Proceedings of the Annual International Symposium on Microarchitecture
JF - Proceedings of the Annual International Symposium on Microarchitecture
T2 - Proceedings of the 1999 32nd Annual ACM/IEEE International Symposium on Microarchitecture, MICRO-32
Y2 - 16 November 1999 through 18 November 1999
ER -