Design and Programming of a Flexible, Cost-Effective Systolic Array Cell for Digital Signal Processing

Ross A.W. Smith, Mike Dillon, Gerald E. Sobelman

Research output: Contribution to journal, Article, peer-review

4 Scopus citations


A programmable systolic array cell for signal processing applications is described. The cell uses two chips: the 16-b NCR45CM16 CMOS Multiplier/Accumulator (MAC) for arithmetic, and the Systolic Array Controller (SAC) for routing data and controlling the MAC. The SAC has a 64 by 18 b static RAM which is accessed twice each cycle: once to read a control word and once to read or write a data word. The SAC has two 16-b data streams and one 6-b address stream. A 16-b bidirectional port routes data between the 71-pin SAC and the 24-pin MAC. All major cell resources can operate concurrently. The practical details of implementing systolic array algorithms on an array of SAC/MAC cells are presented. A library of macros for commonly used program segments is described. Key issues are discussed, such as programming the MAC, scaling operands, loading RAM, synchronizing cells, delaying data, unloading results, combining the macros into a program, and pipelining a program. Two systolic algorithms are developed: matrix multiplication on a linear array, and matrix multiplication on a two-dimensional array. With a two-dimensional array, a series of pipelined matrix-matrix multiplications uses the MAC every cycle.
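To illustrate the kind of data flow the abstract refers to, the following is a minimal Python sketch of a generic output-stationary two-dimensional systolic array computing a matrix product. It is not the paper's SAC/MAC program: the function name `systolic_matmul`, the register names, and the skewed input schedule are assumptions made for illustration, and the sketch does not model the SAC control words, the 64-word RAM, or 16-b operand scaling. Each cell performs one multiply-accumulate per cycle (the role the MAC plays in the cell) and passes its operands to its right and lower neighbours.

```python
def systolic_matmul(A, B):
    """Simulate a generic output-stationary 2-D systolic array computing A x B.

    Hypothetical illustration only; this is not the SAC/MAC cell program.
    Returns the product matrix and the number of cycles simulated.
    """
    n, k, m = len(A), len(B), len(B[0])
    acc = [[0] * m for _ in range(n)]    # per-cell accumulator (MAC role)
    a_reg = [[0] * m for _ in range(n)]  # operand travelling left to right
    b_reg = [[0] * m for _ in range(n)]  # operand travelling top to bottom
    cycles = n + m + k - 2               # last product reaches cell (n-1, m-1)

    for t in range(cycles):
        # Visit cells from bottom-right to top-left so that every cell reads
        # the value its neighbour held at the end of the previous cycle.
        for i in reversed(range(n)):
            for j in reversed(range(m)):
                if j == 0:
                    s = t - i            # row i of A enters skewed by i cycles
                    a_in = A[i][s] if 0 <= s < k else 0
                else:
                    a_in = a_reg[i][j - 1]
                if i == 0:
                    s = t - j            # column j of B enters skewed by j cycles
                    b_in = B[s][j] if 0 <= s < k else 0
                else:
                    b_in = b_reg[i - 1][j]
                acc[i][j] += a_in * b_in  # one multiply-accumulate per cycle
                a_reg[i][j] = a_in
                b_reg[i][j] = b_in
    return acc, cycles


if __name__ == "__main__":
    C, cycles = systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
    print(C, cycles)
```

In this sketch a single product keeps each cell busy for only k of the n + m + k - 2 cycles. Streaming successive products back to back fills the idle cycles, which is the general idea behind the abstract's statement that a pipelined series of multiplications uses the MAC every cycle.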

Original language: English (US)
Pages (from-to): 1198-1210
Number of pages: 13
Journal: IEEE Transactions on Acoustics, Speech, and Signal Processing
Issue number: 7
State: Published - Jul 1990
