A programmable systolic array cell for signal processing applications is described. The cell uses two chips: the 16-b NCR45CM16 CMOS Multiplier/Accumulator (MAC) for arithmetic, and the Systolic Array Controller (SAC) for routing data and controlling the MAC. The SAC has a 64 by 18 b static RAM that is accessed twice each cycle: once to read a control word and once to read or write a data word. The SAC has two 16-b data streams and one 6-b address stream. A 16-b bidirectional port routes data between the 71-pin SAC and the 24-pin MAC. All major cell resources can operate concurrently. The practical details of implementing systolic array algorithms on an array of SAC/MAC cells are presented, and a library of macros for commonly used program segments is described. Key issues discussed include programming the MAC, scaling operands, loading RAM, synchronizing cells, delaying data, unloading results, combining the macros into a program, and pipelining a program. Two systolic algorithms are developed: matrix multiplication on a linear array, and matrix multiplication on a two-dimensional array. With a two-dimensional array, a series of pipelined matrix-matrix multiplications uses the MAC every cycle.
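To make the two-dimensional case concrete, the following is a minimal Python sketch of a generic output-stationary systolic array for matrix multiplication. It is an illustration only, not the SAC/MAC program or macro library from the paper. The function name `systolic_matmul`, the skewed input schedule, and the register layout are assumptions made for this example, and the 16-b word length, operand scaling, and RAM accesses of the real cell are not modeled.

```python
def systolic_matmul(A, B):
    """Simulate a generic 2-D systolic array computing C = A * B.

    Hypothetical model: cell (i, j) holds one accumulator, receives an
    A operand from its left neighbor and a B operand from the neighbor
    above, performs one multiply-accumulate per cycle, and forwards both
    operands unchanged on the next cycle.
    """
    n, k, m = len(A), len(B), len(B[0])
    assert all(len(row) == k for row in A), "inner dimensions must match"

    acc = [[0] * m for _ in range(n)]    # one accumulator per cell
    a_reg = [[0] * m for _ in range(n)]  # operand each cell passes right
    b_reg = [[0] * m for _ in range(n)]  # operand each cell passes down

    # The last useful product reaches cell (n-1, m-1) at cycle n+m+k-3.
    cycles = n + m + k - 2
    for t in range(cycles):
        new_a = [[0] * m for _ in range(n)]
        new_b = [[0] * m for _ in range(n)]
        for i in range(n):
            for j in range(m):
                if j == 0:
                    # Row i of A enters the left edge delayed by i cycles.
                    s = t - i
                    a_in = A[i][s] if 0 <= s < k else 0
                else:
                    a_in = a_reg[i][j - 1]
                if i == 0:
                    # Column j of B enters the top edge delayed by j cycles.
                    s = t - j
                    b_in = B[s][j] if 0 <= s < k else 0
                else:
                    b_in = b_reg[i - 1][j]
                acc[i][j] += a_in * b_in  # the multiply-accumulate step
                new_a[i][j] = a_in
                new_b[i][j] = b_in
        # All cells update their registers simultaneously at the clock edge.
        a_reg, b_reg = new_a, new_b
    return acc, cycles


if __name__ == "__main__":
    C, cycles = systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
    print(C, cycles)
```

In this sketch a single product leaves many cells idle while the operand wavefront fills and drains the array. Streaming a second pair of matrices in directly behind the first would keep the cells busy, which is the general idea behind pipelining a series of multiplications; how the SAC/MAC cells achieve this is covered in the paper itself.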
| Original language | English (US) |
| Number of pages | 13 |
| Journal | IEEE Transactions on Acoustics, Speech, and Signal Processing |
| State | Published - Jul 1990 |