Solving a system of equations of the form Tx = y, where T is a sparse triangular matrix, is required after the factorization phase in direct methods for solving systems of linear equations. A few parallel formulations have been proposed recently. The common belief is that a parallel formulation using a two-dimensional distribution of T is unscalable. In this paper, we propose the first known efficient, scalable parallel algorithm that uses a two-dimensional block-cyclic distribution of T. The algorithm is shown to be applicable to dense as well as sparse triangular solvers. Since most of the known highly scalable algorithms employed in the factorization phase yield a two-dimensional distribution of T, our algorithm avoids the redistribution cost incurred by the one-dimensional algorithms. We present parallel runtime and scalability analyses of the proposed two-dimensional algorithm. The dense triangular solver is shown to be scalable, and the sparse triangular solver is shown to be at least as scalable as the dense solver. We also show that it is optimal for one class of sparse systems. Experimental results show that the sparse triangular solver has good speedup characteristics and yields high performance for a variety of sparse systems.
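For readers unfamiliar with the two ingredients named in the abstract, the following is a minimal serial sketch in Python, not the parallel algorithm proposed in the paper. It shows forward substitution for a sparse lower triangular system Tx = y, and the ownership rule of a two-dimensional block-cyclic distribution. The function names, the dictionary-per-row storage, and the parameters `block`, `prow`, and `pcol` are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple


def block_cyclic_owner(i: int, j: int, block: int,
                       prow: int, pcol: int) -> Tuple[int, int]:
    """Return the (process row, process column) owning entry (i, j).

    Assumes square blocks of size `block` dealt out cyclically over a
    prow x pcol process grid (illustrative parameters, not from the paper).
    """
    return (i // block) % prow, (j // block) % pcol


def sparse_lower_solve(rows: List[Dict[int, float]],
                       y: List[float]) -> List[float]:
    """Serial forward substitution for T x = y with T lower triangular.

    rows[i] maps a column index j <= i to the nonzero T[i][j]; the
    diagonal entry rows[i][i] must be present and nonzero.
    """
    n = len(y)
    x = [0.0] * n
    for i in range(n):
        acc = y[i]
        # Subtract contributions of already-computed unknowns; only the
        # stored nonzeros of row i are visited.
        for j, t_ij in rows[i].items():
            if j < i:
                acc -= t_ij * x[j]
        x[i] = acc / rows[i][i]
    return x


# T = [[2, 0, 0], [0, 4, 0], [1, 0, 5]], y = [2, 8, 11]
T_rows = [{0: 2.0}, {1: 4.0}, {0: 1.0, 2: 5.0}]
x = sparse_lower_solve(T_rows, [2.0, 8.0, 11.0])
owner = block_cyclic_owner(5, 2, block=2, prow=2, pcol=2)
```

In this sketch, row 1 does not depend on row 0 because T[1][0] is zero. Independence of this kind is what sparse solvers can exploit for parallelism, while the owner mapping indicates which process would hold each entry if T were left in the layout produced by a two-dimensional factorization.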
|Original language||English (US)|
|Number of pages||7|
|State||Published - Dec 1 1997|
|Event||Proceedings of the 1997 4th International Conference on High Performance Computing, HiPC - Bangalore, India (Dec 18 1997 → Dec 21 1997)|