In this paper, we develop a compiler algorithm for detecting references to stale data in shared-memory multiprocessors. The algorithm consists of two key analysis techniques: stale reference detection and locality-preserving analysis. Stale reference detection finds the memory reference patterns that may violate cache coherence, while locality-preserving analysis minimizes the number of such stale references by analyzing both temporal and spatial reuse. By computing the regions referenced by arrays inside loops, we extend the previous scalar algorithms [7, 9] to obtain a more precise analysis. We have implemented the algorithm in the Polaris parallelizing compiler, and, using execution-driven simulations of the Perfect Club benchmarks, we demonstrate how unnecessary cache misses can be eliminated by automatic stale reference detection.
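To make the notion of a stale reference concrete, the following is a minimal toy sketch and not the algorithm developed in the paper. It assumes a simplified model in which a program is a sequence of parallel epochs, each array access covers a contiguous index interval, and a read is potentially stale when the reading processor may hold a cached copy of some element that a different processor overwrote in a later epoch that still precedes the read. The names `Access`, `common_region`, and `potentially_stale_reads` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    """One array access in the assumed (hypothetical) epoch model."""
    epoch: int  # index of the parallel epoch containing the access
    proc: int   # processor that performs the access
    kind: str   # "read" or "write"
    lo: int     # first array index touched (inclusive)
    hi: int     # last array index touched (inclusive)


def common_region(*accesses):
    """Return the index interval shared by all accesses, or None if empty."""
    lo = max(a.lo for a in accesses)
    hi = min(a.hi for a in accesses)
    return (lo, hi) if lo <= hi else None


def potentially_stale_reads(accesses):
    """Flag reads that may observe an outdated cached copy.

    A read r is flagged when some element of its region was
    (1) accessed earlier by the same processor, so a cached copy may exist,
    (2) then written by a different processor in a later epoch,
    (3) with that write occurring in an epoch before the read.
    """
    stale = []
    for r in accesses:
        if r.kind != "read":
            continue
        flagged = False
        for w in accesses:
            if w.kind != "write" or w.proc == r.proc or w.epoch >= r.epoch:
                continue
            for c in accesses:
                # c is an earlier access by the reader that may have cached data
                if c.proc == r.proc and c.epoch < w.epoch:
                    if common_region(c, w, r) is not None:
                        flagged = True
                        break
            if flagged:
                break
        if flagged:
            stale.append(r)
    return stale


trace = [
    Access(0, 0, "read", 0, 9),    # processor 0 may cache elements 0..9
    Access(1, 1, "write", 5, 14),  # processor 1 overwrites elements 5..14
    Access(2, 0, "read", 8, 12),   # elements 8..9 may be stale in processor 0's cache
    Access(2, 0, "read", 0, 4),    # untouched by the write, so not stale
    Access(2, 2, "read", 5, 14),   # processor 2 has no earlier cached copy
]
result = potentially_stale_reads(trace)
```

In this trace only the read of elements 8 to 12 by processor 0 is flagged. Tracking whole regions rather than treating the array as a single scalar is what lets the read of elements 0 to 4 be left alone, which is the kind of precision gain the region-based extension is aiming for.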
IEEE Symposium on Parallel and Distributed Processing - Proceedings
Published - Jan 1 1996
Proceedings of the 1996 10th International Parallel Processing Symposium - Honolulu, HI, USA
Duration: Apr 15 1996 → Apr 19 1996