Cache coherence mechanisms in shared-memory multiprocessors typically use either updating or invalidating to prevent access to stale data, but neither enforcement strategy is the best choice for all programs. We present a compile-time optimization that uses the compiler's look-ahead capability to select updating, invalidating, or neither for each write reference in a program, with the goal of producing the best overall memory performance. We implement this optimization in the Parafrase-2 compiler for memory references to scalar variables and use trace-driven simulations to compare the performance of this compiler-assisted adaptive coherence enforcement with that of hardware-only mechanisms. We find that the optimization can produce miss ratios comparable to those of an updating-only mechanism while frequently reducing total network traffic below that produced by any of the hardware-only mechanisms.
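The per-write selection described above can be illustrated with a toy trace simulation. This is a minimal sketch under stated assumptions, not the Parafrase-2 implementation. The cost model is an assumption: one message per update, invalidation, or read-miss fetch, with memory always holding the current value. The threshold rule in `choose_action` and all function and variable names are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set, Tuple

Trace = List[Tuple[int, str]]  # (processor id, "R" for read or "W" for write)


def choose_action(trace: Trace, i: int, holders: Set[int]) -> str:
    """Pick an action for the write at index i by looking ahead to the next write.

    Hypothetical heuristic (an assumption, not the paper's algorithm):
    - "neither" if no processor holding a copy reads before the next write,
    - "update" if at least half of the other holders will re-read the value,
    - "invalidate" otherwise.
    """
    writer = trace[i][0]
    others = holders - {writer}
    readers: Set[int] = set()
    for proc, op in trace[i + 1:]:
        if op == "W":
            break  # the look-ahead window ends at the next write
        if proc != writer:
            readers.add(proc)
    reusing = readers & others  # holders that would otherwise see stale data
    if not reusing:
        return "neither"
    if 2 * len(reusing) >= len(others):
        return "update"
    return "invalidate"


def simulate(trace: Trace, policy: str) -> Dict[str, int]:
    """Replay a trace for one scalar variable under a coherence policy.

    policy is "update", "invalidate", or "adaptive" (per-write selection).
    """
    version = 0
    copies: Dict[int, int] = {}  # processor -> version of its cached copy
    messages = misses = stale_reads = 0
    for i, (proc, op) in enumerate(trace):
        if op == "W":
            version += 1
            action = policy
            if policy == "adaptive":
                action = choose_action(trace, i, set(copies))
            others = [q for q in copies if q != proc]
            if action == "update":
                for q in others:
                    copies[q] = version  # push the new value to each holder
                messages += len(others)
            elif action == "invalidate":
                for q in others:
                    del copies[q]  # discard each remote copy
                messages += len(others)
            # "neither": remote copies are left untouched (possibly stale)
            copies[proc] = version
        elif proc not in copies:
            misses += 1
            messages += 1  # fetch the current value from memory
            copies[proc] = version
        elif copies[proc] != version:
            stale_reads += 1  # would indicate a coherence violation
    return {"messages": messages, "misses": misses, "stale_reads": stale_reads}


# Hypothetical trace: processors 1-3 read once, then processor 0 writes
# repeatedly while only processor 1 keeps reading.
demo_trace: Trace = [(1, "R"), (2, "R"), (3, "R")] + [(0, "W"), (1, "R")] * 3 + [(0, "W")]

if __name__ == "__main__":
    for name in ("update", "invalidate", "adaptive"):
        print(name, simulate(demo_trace, name))
```

On this hand-made trace, and only under the toy cost model above, the adaptive policy sends fewer messages than either fixed policy, and its miss count falls between the two. This says nothing about the measured results in the paper, which come from trace-driven simulation of real programs.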
|Original language||English (US)|
|Number of pages||10|
|State||Published - Dec 1 1994|
|Event||Proceedings of the IFIP WG10.3 Working Conference on Parallel Architectures and Compilation Techniques (PACT'94) - Montreal, Can|
|Duration||Aug 24 1994 → Aug 26 1994|