TY - JOUR
T1 - Parallel matrix-vector product using approximate hierarchical methods
AU - Grama, Ananth
AU - Kumar, Vipin
AU - Sameh, Ahmed
PY - 1995
Y1 - 1995
N2 - Matrix-vector products (mat-vecs) form the core of iterative methods used for solving dense linear systems. Often, these systems arise in the solution of integral equations used in electromagnetics, heat transfer, and wave propagation. In this paper, we present a parallel approximate method for computing mat-vecs used in the solution of integral equations. We use this method to compute dense mat-vecs of hundreds of thousands of elements. The combined speedups obtained from the use of approximate methods and parallel processing represent an improvement of several orders of magnitude over exact mat-vecs on uniprocessors. We demonstrate that our parallel formulation incurs minimal parallel processing overhead and scales up to a large number of processors. We study the impact of varying the accuracy of the approximate mat-vec on overall time and on parallel efficiency. Experimental results are presented for 256-processor Cray T3D and Thinking Machines CM5 parallel computers. We have achieved computation rates in excess of 5 GFLOPS on the T3D.
AB - Matrix-vector products (mat-vecs) form the core of iterative methods used for solving dense linear systems. Often, these systems arise in the solution of integral equations used in electromagnetics, heat transfer, and wave propagation. In this paper, we present a parallel approximate method for computing mat-vecs used in the solution of integral equations. We use this method to compute dense mat-vecs of hundreds of thousands of elements. The combined speedups obtained from the use of approximate methods and parallel processing represent an improvement of several orders of magnitude over exact mat-vecs on uniprocessors. We demonstrate that our parallel formulation incurs minimal parallel processing overhead and scales up to a large number of processors. We study the impact of varying the accuracy of the approximate mat-vec on overall time and on parallel efficiency. Experimental results are presented for 256-processor Cray T3D and Thinking Machines CM5 parallel computers. We have achieved computation rates in excess of 5 GFLOPS on the T3D.
UR - http://www.scopus.com/inward/record.url?scp=0029430861&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0029430861&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:0029430861
SN - 1063-9535
VL - 2
SP - 2065
EP - 2084
JO - Proceedings of the ACM/IEEE Supercomputing Conference
JF - Proceedings of the ACM/IEEE Supercomputing Conference
T2 - Proceedings of the 1995 ACM/IEEE Supercomputing Conference. Part 2 (of 2)
Y2 - 3 December 1995 through 8 December 1995
ER -