TY - GEN
T1 - Exploiting TLS parallelism at multiple loop-nest levels
AU - Packirisamy, Venkatesan
AU - Zhai, Antonia B
PY - 2009
Y1 - 2009
N2 - As the number of cores integrated onto a single chip increases, architecture and compiler designers are challenged with the difficulty of utilizing these cores to improve the performance of a single application. Thread-level speculation (TLS) can potentially help by allowing possibly dependent threads to speculatively execute in parallel. Extracting speculative threads from sequential applications is key to efficient TLS execution. Previous work on thread extraction has focused on parallelizing iterations from a single loop-nest level or function continuation. However, the amount of parallelism available at a single loop-nest level is sometimes limited, and we are forced to look for parallelism across multiple loop-nest levels. In this paper we propose SpecOPTAL, a compiler algorithm that statically allocates cores to threads extracted from different levels of loop nests. We show that a subset of SPEC 2006 benchmarks is able to benefit from the proposed technique.
AB - As the number of cores integrated onto a single chip increases, architecture and compiler designers are challenged with the difficulty of utilizing these cores to improve the performance of a single application. Thread-level speculation (TLS) can potentially help by allowing possibly dependent threads to speculatively execute in parallel. Extracting speculative threads from sequential applications is key to efficient TLS execution. Previous work on thread extraction has focused on parallelizing iterations from a single loop-nest level or function continuation. However, the amount of parallelism available at a single loop-nest level is sometimes limited, and we are forced to look for parallelism across multiple loop-nest levels. In this paper we propose SpecOPTAL, a compiler algorithm that statically allocates cores to threads extracted from different levels of loop nests. We show that a subset of SPEC 2006 benchmarks is able to benefit from the proposed technique.
UR - http://www.scopus.com/inward/record.url?scp=77949634324&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77949634324&partnerID=8YFLogxK
U2 - 10.1109/ICPADS.2009.143
DO - 10.1109/ICPADS.2009.143
M3 - Conference contribution
AN - SCOPUS:77949634324
SN - 9780769539003
T3 - Proceedings of the International Conference on Parallel and Distributed Systems - ICPADS
SP - 205
EP - 212
BT - ICPADS '09 - 15th International Conference on Parallel and Distributed Systems
T2 - 15th International Conference on Parallel and Distributed Systems, ICPADS '09
Y2 - 8 December 2009 through 11 December 2009
ER -