Abstract
Cache coherence mechanisms in shared-memory multiprocessors typically use either updating or invalidating to prevent access to stale data, but neither enforcement strategy is the best choice for all programs. We present a compile-time optimization that uses the look-ahead capability of the compiler to select updating, invalidating, or neither for each write reference in a program, thereby producing the best overall memory performance. We implement this optimization in the Parafrase-2 compiler for memory references to scalar variables and use trace-driven simulations to compare the performance of this compiler-assisted adaptive coherence enforcement with that of hardware-only mechanisms. We find that this compiler optimization can produce miss ratios comparable to those of an updating-only mechanism while frequently reducing the total network traffic below that of any of the hardware-only mechanisms.
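The per-write choice among updating, invalidating, or neither can be illustrated with a small sketch. This is not the Parafrase-2 implementation, whose analysis the abstract does not describe. It is a toy heuristic under stated assumptions: the look-ahead is modeled as a fully known, ordered list of references, and the names `Event` and `choose_actions` as well as the decision rule itself are hypothetical. The rule assumed here is: update when a processor that already holds a copy reads the variable before it is written again, invalidate when other copies exist but are not reread, and do neither when no other processor holds a copy.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical event format: (processor id, "R" or "W", variable name).
Event = Tuple[int, str, str]


def choose_actions(trace: List[Event]) -> List[Tuple[int, str]]:
    """Return (trace index, action) for every write in the trace.

    Toy rule (an assumption, not the paper's algorithm):
      update     - another processor holding a copy rereads the variable
                   before the next write to it
      invalidate - other copies exist but none is reread before the next write
      neither    - no other processor holds a copy
    """
    sharers: Dict[str, Set[int]] = {}  # variable -> processors assumed to cache it
    decisions: List[Tuple[int, str]] = []

    for i, (proc, op, var) in enumerate(trace):
        holders = sharers.setdefault(var, set())
        if op == "R":
            holders.add(proc)  # a read leaves a cached copy at the reader
            continue

        others = holders - {proc}

        # Look ahead until the next write to the same variable.
        reread = False
        for p2, op2, var2 in trace[i + 1:]:
            if var2 != var:
                continue
            if op2 == "W":
                break
            if p2 in others:
                reread = True
                break

        if reread:
            action = "update"
            holders.add(proc)        # existing copies stay valid
        elif others:
            action = "invalidate"
            sharers[var] = {proc}    # only the writer keeps a copy
        else:
            action = "neither"
            sharers[var] = {proc}
        decisions.append((i, action))

    return decisions
```

For example, given the trace `[(0, "W", "x"), (1, "R", "x"), (0, "W", "x"), (1, "R", "x"), (0, "W", "x"), (0, "W", "x")]`, the sketch labels the four writes neither, update, invalidate, and neither in that order. A real compiler would have to derive this information from static analysis rather than from an exact trace.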
Original language | English (US) |
---|---|
Pages (from-to) | 69-78 |
Number of pages | 10 |
Issue number | A-50 |
State | Published - Dec 1 1994 |
Event | Proceedings of the IFIP WG10.3 Working Conference on Parallel Architectures and Compilation Techniques (PACT'94) - Montreal, Can |
Duration | Aug 24 1994 → Aug 26 1994 |