We propose several implementations of Gaussian elimination for solving banded linear systems on multiprocessors. Three simple architectures are considered: a multiprocessor ring, a grid array, and a hypercube. Our complexity analysis fully accounts for communication delays by using simple models that incorporate both latency and actual transfer time. When the number of processors is small relative to the bandwidth of the linear system, a row-interleaved implementation of the Gaussian elimination algorithm is attractive. Otherwise, a two-dimensional grid is essential for achieving higher speedup. Of the three architectures, the hypercube gives the smallest communication latency.
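
As an illustration of the row-interleaved scheme and of a communication model with both latency and transfer time, the following is a minimal serial sketch and not the implementation analyzed in the paper. It assumes no pivoting, a symmetric half-bandwidth `half_bw`, row `i` owned by processor `i % p`, and a linear message cost `latency + per_word * n_words`. All function and parameter names are hypothetical.

```python
def message_time(n_words, latency, per_word):
    """Assumed linear cost model: fixed start-up latency plus per-word transfer time."""
    return latency + per_word * n_words


def solve_banded_row_interleaved(A, b, half_bw, p):
    """Gaussian elimination without pivoting on a banded system, simulated serially.

    Row i is treated as owned by processor i % p (row interleaving).
    At step k, the pivot row is counted as one message to every other
    processor that owns a row inside the band below the pivot.
    Returns the solution x and the simulated message count.
    """
    n = len(b)
    A = [[float(v) for v in row] for row in A]  # work on copies
    b = [float(v) for v in b]
    messages = 0

    for k in range(n - 1):
        # Only rows k+1 .. k+half_bw can hold a nonzero in column k.
        last = min(k + half_bw, n - 1)
        receivers = {i % p for i in range(k + 1, last + 1)} - {k % p}
        messages += len(receivers)
        for i in range(k + 1, last + 1):  # work done by processor i % p
            m = A[i][k] / A[k][k]
            for j in range(k, last + 1):  # pivot row is zero beyond column k+half_bw
                A[i][j] -= m * A[k][j]
            b[i] -= m * b[k]

    # Back substitution on the banded upper-triangular factor.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        hi = min(i + half_bw, n - 1)
        s = sum(A[i][j] * x[j] for j in range(i + 1, hi + 1))
        x[i] = (b[i] - s) / A[i][i]
    return x, messages


# Usage: a diagonally dominant tridiagonal system whose exact solution is all ones.
n = 6
A = [[4.0 if i == j else (-1.0 if abs(i - j) == 1 else 0.0) for j in range(n)]
     for i in range(n)]
b = [sum(row) for row in A]
x, msgs = solve_banded_row_interleaved(A, b, half_bw=1, p=2)
cost = msgs * message_time(n_words=2, latency=5.0, per_word=0.5)
```

In this sketch, each elimination step touches at most `half_bw` rows, so at most `half_bw` processors can work at once. That is consistent with row interleaving being attractive only when the number of processors is small relative to the bandwidth.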