Low-overhead, high-speed multi-core barrier synchronization

John Sartori, Rakesh Kumar

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

21 Scopus citations


Whereas efficient barrier implementations were once a concern only in high-performance computing, recent trends in core integration make the topic relevant even for general-purpose CMPs. While the nature of CMP applications requires low latency, the cost of low-latency barrier implementations using hardware-based techniques can be prohibitive for CMPs, where die area represents opportunities for throughput and yield. Similarly, whereas traditional multiprocessor barrier implementations were developed primarily for dedicated environments, scheduling and multi-programming on CMPs require more adaptable barrier implementations. In this paper, we present and evaluate three barrier implementations that are hybrids of software and dedicated hardware barriers and are specifically tailored for CMPs. The implementations leverage the unique characteristics of CMPs and provide low latency comparable to that of dedicated hardware networks at a fraction of the cost. The implementations also support adaptability, enabling efficient multi-programming and dynamic remapping of the barrier network.
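
For readers unfamiliar with the software side of the comparison, the following is a minimal sketch of a generic centralized sense-reversing barrier, the kind of pure-software baseline that hybrid designs aim to improve on. It is not one of the three implementations evaluated in the paper, whose details this record does not give. The names `SenseReversingBarrier` and `run_phases` are hypothetical, and a condition variable stands in for the spin-waiting a real CMP implementation would likely use.

```python
import threading


class SenseReversingBarrier:
    """Illustrative centralized software barrier (not the paper's design)."""

    def __init__(self, num_threads):
        self.num_threads = num_threads
        self.count = num_threads        # threads still expected in this phase
        self.sense = False              # global sense, flipped once per phase
        self.cond = threading.Condition()
        self.local = threading.local()  # holds each thread's private sense

    def wait(self):
        # Each thread flips its private sense when it enters a new phase.
        my_sense = not getattr(self.local, "sense", False)
        self.local.sense = my_sense
        with self.cond:
            self.count -= 1
            if self.count == 0:
                # Last arrival resets the counter and releases the others.
                self.count = self.num_threads
                self.sense = my_sense
                self.cond.notify_all()
            else:
                # Block until the global sense matches this thread's sense.
                while self.sense != my_sense:
                    self.cond.wait()


def run_phases(num_threads=4, num_phases=3):
    """Run workers through several phases and record the phase order."""
    barrier = SenseReversingBarrier(num_threads)
    log = []
    log_lock = threading.Lock()

    def worker():
        for phase in range(num_phases):
            with log_lock:
                log.append(phase)
            barrier.wait()  # nobody starts phase + 1 until all finish phase

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

Calling `run_phases(4, 3)` returns a log in which every entry for one phase precedes every entry for the next, which is the ordering guarantee a barrier provides. Every thread contends on one shared counter here, which is the latency cost the abstract contrasts with dedicated hardware networks.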

Original language: English (US)
Title of host publication: High Performance Embedded Architectures and Compilers - 5th International Conference, HiPEAC 2010, Proceedings
Number of pages: 17
State: Published - Mar 25 2010
Event: 5th International Conference on High Performance Embedded Architectures and Compilers, HiPEAC 2010 - Pisa, Italy
Duration: Jan 25 2010 to Jan 27 2010

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5952 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Other: 5th International Conference on High Performance Embedded Architectures and Compilers, HiPEAC 2010

