TY - GEN
T1 - Compiler optimization of memory-resident value communication between speculative threads
AU - Zhai, Antonia
AU - Colohan, Christopher B.
AU - Steffan, J. Gregory
AU - Mowry, Todd C.
PY - 2004
Y1 - 2004
AB - Efficient inter-thread value communication is essential for improving performance in Thread-Level Speculation (TLS). Although several mechanisms for improving value communication using hardware support have been proposed, there is relatively little work on exploiting the potential of compiler optimization. Building on recent research on compiler optimization of scalar value communication between speculative threads, we propose compiler techniques for the optimization of memory-resident values. In TLS, data dependences through memory-resident values are tracked by the underlying hardware and preserved by re-executing any speculative thread that violates a dependence; however, re-execution incurs a large performance penalty and should be used only to resolve data dependences that are infrequent. In contrast, value communication for frequently-occurring data dependences must be very efficient. In this paper, we propose using the compiler to first identify frequently-occurring memory-resident data dependences, then insert synchronization for communicating values to preserve these dependences. We find that by synchronizing frequently-occurring data dependences we can significantly improve the efficiency of parallel execution. A comparison between compiler-inserted and hardware-inserted memory synchronization reveals that the two techniques are complementary, with each technique benefitting different benchmarks.
UR - http://www.scopus.com/inward/record.url?scp=3042567406&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=3042567406&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:3042567406
SN - 0769521029
SN - 9780769521022
T3 - International Symposium on Code Generation and Optimization, CGO
SP - 39
EP - 50
BT - International Symposium on Code Generation and Optimization, CGO 2004
T2 - International Symposium on Code Generation and Optimization, CGO 2004
Y2 - 20 March 2004 through 24 March 2004
ER -