In this paper, we present new methods for load balancing of unstructured tree computations on large-scale SIMD machines, and analyze the scalability of these and other existing schemes. An efficient formulation of tree search on a SIMD machine comprises of two major components: (i) a triggering mechanism, which determines when the search space redistribution must occur to balance search space over processors; and (ii) a scheme to redistribute the search space. We have devised a new redistribution mechanism and a new triggering mechanism. Either of these can be used in conjunction with triggering and redistribution mechanisms developed by other researchers. We analyze the scalability of these mechanisms, and verify the results experimentally. The analysis and experiments show that our new load balancing methods are highly scalable on SIMD architectures. Their scalability is shown to be no worse than that of the best load balancing schemes on MIMD architectures. We verify our theoretical results by implementing the 15-puzzle problem on a CM-21 SIMD parallel computer.
|Original language||English (US)|
|Title of host publication||Proceedings of the 1992 ACM/IEEE conference on Supercomputing, Supercomputing 1992|
|Publisher||Association for Computing Machinery|
|Number of pages||10|
|State||Published - Dec 1 1992|
|Event||1992 ACM/IEEE conference on Supercomputing, Supercomputing 1992 - Minneapolis, United States|
Duration: Nov 16 1992 → Nov 20 1992
|Name||Proceedings of the International Conference on Supercomputing|
|Other||1992 ACM/IEEE conference on Supercomputing, Supercomputing 1992|
|Period||11/16/92 → 11/20/92|
Bibliographical noteFunding Information:
1 Introduction Tree search is central to solving a variety of problems in artificial intelligence [12, 26], combinatorial optimization [11, 21], operations research  and Monte-Carlo evaluations of functional integrals . The trees that need to be searched for most practical problems happen to be quite large, and for many tree search algorithms, different parts can be searched relatively independently. These trees tend to be highly irregular in nature and hence, a naive scheme for partitioning the search space can result in highly uneven distribution of work among processors and lead to poor overall performance. The job of partitioning irregular search spaces is particularly difficult for SIMD parallel computers such as the CM-2, in which all processors work in lock-step to execute the same *This work was supported by IST/SDIO through the Army Research Office grant #28408-MA-SDI and by the Army High Performance Computing Research Center at the University of Minnesota.
© 1992 IEEE.