In a large shared-memory multiprocessor system, a large number of simultaneous accesses to a single shared variable (called a hot spot in ) can degrade the performance of its shared memory system. Software combining  is an inexpensive alternative to the hardware combining networks [3, 9] for tackling this problem. This paper gives software combining algorithms for three different types of hot-spot accesses: (1) barrier synchronizations in parallel loops, (2) fetch-and-add type of operations, and (3) P and V operations on semaphores. They include most of the general hot-spot access patterns. By using software combining trees to distribute hot-spot accessings, the number of processors that can access the same location is greatly reduced. In these algorithms, the completion time of a hot-spot access is in the order O(log2N) in a multiprocessor system with N processors, assuming that the delay of a switch element in an interconnection network is a constant time, O(1).
Bibliographical noteFunding Information:
* This work was supported in part by the National Science Foundation under Grants US NSF DCR84-06916 and US NSF DCR84-10110, the U.S. Department of Energy under Grant NS DOE DE FG02-85ER25001, and by the donations of IBM and Digital Equipment Corporations.