FORMS: Fine-grained polarized ReRAM-based in-situ computation for mixed-signal DNN accelerator

Geng Yuan, Payman Behnam, Zhengang Li, Ali Shafiee, Sheng Lin, Xiaolong Ma, Hang Liu, Xuehai Qian, Mahdi Nazm Bojnordi, Yanzhi Wang, Caiwen Ding

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

42 Scopus citations

Abstract

Recent work has demonstrated the promise of resistive random access memory (ReRAM) as an emerging technology for inherently parallel, analog-domain in-situ matrix-vector multiplication, the key compute-intensive operation in deep neural networks (DNNs). One key problem is that DNN weights are signed values, whereas in a ReRAM crossbar weights are stored as the conductance of the crossbar cells, and the in-situ computation assumes that all cells in each crossbar column have the same sign. Current architectures either use two ReRAM crossbars for positive and negative weights (PRIME) or add an offset to the weights so that all values become positive (ISAAC). Neither solution is ideal: they either double the cost of crossbars or incur extra offset circuitry. To better address this problem, we propose FORMS, a fine-grained ReRAM-based DNN accelerator with algorithm/hardware co-design. Instead of trying to represent the positive/negative weights, our key design principle is to enforce exactly what is assumed in the in-situ computation: that all weights in the same column of a crossbar have the same sign. This naturally avoids the cost of an additional crossbar. Such polarized weights can be generated during DNN training using alternating direction method of multipliers (ADMM) regularized optimization, which can exactly enforce certain patterns in DNN weights. To achieve high accuracy, we divide the crossbar into logical sub-arrays and only enforce this property within the fine-grained sub-array columns. Crucially, the small sub-arrays provide a unique opportunity for input zero-skipping, which avoids a significant amount of unnecessary computation and reduces computation time. At the same time, the fine-grained design also makes the hardware much easier to implement and less susceptible to non-idealities and noise than coarse-grained architectures.
Putting it all together, with the same optimized DNN models, FORMS achieves 1.50× and 1.93× throughput improvements in terms of $\frac{\text{GOPs}}{s \times mm^2}$ and $\frac{\text{GOPs}}{W}$, respectively, compared to ISAAC, and a 1.12× to 2.4× speedup in frames per second over optimized ISAAC with almost the same power/area cost. Interestingly, the FORMS optimization framework can even speed up the original ISAAC by 10.7× to 377.9×, reflecting the importance of software/hardware co-design optimizations.
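
As an illustration of the polarization constraint described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that a sub-array column can be modelled as `frag_rows` consecutive rows of one column of a weight matrix, and that the sign of each such fragment is chosen by a Euclidean projection (keep whichever sign carries more squared magnitude, zero the rest). The function name `polarize`, the parameter `frag_rows`, and the sign-selection rule are all assumptions made for this example; the paper's ADMM formulation may choose signs differently.

```python
def polarize(weights, frag_rows):
    """Force every column fragment of `weights` to hold a single sign.

    `weights` is a list of rows (a 2-D list of floats). A fragment is
    `frag_rows` consecutive rows of one column (hypothetical model of a
    sub-array column). Assumed rule: keep the sign with the larger
    squared magnitude in the fragment and set the other entries to 0.
    """
    rows, cols = len(weights), len(weights[0])
    if rows % frag_rows != 0:
        raise ValueError("row count must be a multiple of frag_rows")

    out = [row[:] for row in weights]
    for start in range(0, rows, frag_rows):
        for c in range(cols):
            frag = [weights[r][c] for r in range(start, start + frag_rows)]
            pos_energy = sum(w * w for w in frag if w > 0)
            neg_energy = sum(w * w for w in frag if w < 0)
            keep_positive = pos_energy >= neg_energy
            for r in range(start, start + frag_rows):
                w = weights[r][c]
                # Clip away the minority sign within this fragment.
                out[r][c] = max(w, 0.0) if keep_positive else min(w, 0.0)
    return out


# Usage: a 4x2 matrix split into fragments of 2 rows each.
W = [[0.5, -0.2],
     [-0.1, -0.4],
     [0.3, 0.6],
     [-0.7, 0.1]]
print(polarize(W, 2))
# → [[0.5, -0.2], [0.0, -0.4], [0.0, 0.6], [-0.7, 0.1]]
```

In an ADMM-style training loop, a projection of this kind would typically serve as the step that maps the current weights onto the constraint set, while the regularized loss pulls the trained weights toward that projected copy. Smaller fragments zero out fewer weights, which is consistent with the abstract's point that fine-grained sub-arrays help preserve accuracy.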

Original language: English (US)
Title of host publication: Proceedings - 2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture, ISCA 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 265-278
Number of pages: 14
ISBN (Electronic): 9781665433334
DOIs
State: Published - Jun 2021
Externally published: Yes
Event: 48th ACM/IEEE Annual International Symposium on Computer Architecture, ISCA 2021 - Virtual, Online, Spain
Duration: Jun 14 2021 to Jun 19 2021

Publication series

Name: Proceedings - International Symposium on Computer Architecture
Volume: 2021-June
ISSN (Print): 1063-6897

Conference

Conference: 48th ACM/IEEE Annual International Symposium on Computer Architecture, ISCA 2021
Country/Territory: Spain
City: Virtual, Online
Period: 6/14/21 to 6/19/21

Bibliographical note

Publisher Copyright:
© 2021 IEEE.
