A DNN Compression Framework for SOT-MRAM-based Processing-In-Memory Engine

Geng Yuan, Xiaolong Ma, Sheng Lin, Zhengang Li, Jieren Deng, Caiwen Ding

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Abstract

The computing wall and data-movement challenges of deep neural networks (DNNs) have exposed the limitations of conventional CMOS-based DNN accelerators. Furthermore, deep structures and large model sizes make DNNs prohibitive for embedded systems and IoT devices, where low power consumption is required. To address these challenges, spin-orbit torque magnetic random-access memory (SOT-MRAM) and SOT-MRAM-based Processing-In-Memory (PIM) engines have been used to reduce the power consumption of DNNs, since SOT-MRAM offers near-zero standby power, high density, and non-volatility. However, drawbacks of SOT-MRAM-based PIM engines, such as high write latency and the need for low bit-width data, limit their appeal as energy-efficient DNN accelerators. To mitigate these drawbacks, we propose an ultra-energy-efficient framework that applies model-compression techniques, including weight pruning and quantization, at the software level while accounting for the SOT-MRAM PIM architecture. We incorporate the alternating direction method of multipliers (ADMM) into the training phase to guarantee solution feasibility and satisfy the SOT-MRAM hardware constraints. Thus, the footprint and power consumption of the SOT-MRAM PIM are reduced while the overall system performance rate (frames per second) increases, making our proposed ADMM-based SOT-MRAM PIM more energy-efficient and suitable for embedded systems and IoT devices. Our experimental results show that the accuracy and compression rate of the proposed framework consistently outperform the reference works, while the efficiency (area and power) and performance rate of the SOT-MRAM PIM engine are significantly improved.
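This record does not include code. As a rough illustration only, the sketch below shows the general shape of ADMM-based weight pruning as the abstract describes it: training alternates between an SGD step on the loss plus a quadratic penalty, a Euclidean projection onto the sparsity constraint set, and a dual update. It assumes PyTorch; the function names, the magnitude-based projection, and all hyperparameters (rho, sparsity, admm_steps) are illustrative assumptions, not the authors' reported settings.

```python
# Minimal ADMM pruning sketch (assumed PyTorch; hyperparameters illustrative).
import torch


def project_sparse(w: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Euclidean projection onto the set of tensors with the given fraction
    of zero entries: keep only the largest-magnitude weights."""
    k = int(w.numel() * (1.0 - sparsity))  # number of weights to keep
    if k < 1:
        return torch.zeros_like(w)
    threshold = w.abs().flatten().topk(k).values.min()
    return torch.where(w.abs() >= threshold, w, torch.zeros_like(w))


def admm_prune(model, loss_fn, data_loader, sparsity=0.9, rho=1e-3,
               admm_steps=10, lr=1e-3):
    # Prune weight matrices/conv kernels only; skip biases and norm params.
    params = [p for p in model.parameters() if p.dim() > 1]
    Z = [project_sparse(p.detach().clone(), sparsity) for p in params]
    U = [torch.zeros_like(p) for p in params]  # scaled dual variables
    opt = torch.optim.SGD(model.parameters(), lr=lr)

    for _ in range(admm_steps):
        # W-update: SGD on task loss + (rho/2)||W - Z + U||^2,
        # which pulls the weights toward the sparse target Z.
        for x, y in data_loader:
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            for p, z, u in zip(params, Z, U):
                loss = loss + (rho / 2) * torch.sum((p - z + u) ** 2)
            loss.backward()
            opt.step()
        # Z-update: project W + U onto the sparsity constraint set.
        # U-update: accumulate the remaining constraint violation.
        for i, p in enumerate(params):
            Z[i] = project_sparse(p.detach() + U[i], sparsity)
            U[i] = U[i] + p.detach() - Z[i]

    # Hard-prune: commit the final sparse solution to the model.
    with torch.no_grad():
        for p in params:
            p.copy_(project_sparse(p, sparsity))
```

Under the same template, the quantization the abstract mentions would plausibly swap the magnitude-based projection for a projection onto the hardware's allowed low bit-width levels; the alternating W/Z/U structure is unchanged.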

Original language: English (US)
Title of host publication: Proceedings - 33rd IEEE International System on Chip Conference, SOCC 2020
Editors: Gang Qu, Jinjun Xiong, Danella Zhao, Venki Muthukumar, Md Farhadur Reza, Ramalingam Sridhar
Publisher: IEEE Computer Society
Pages: 37-42
Number of pages: 6
ISBN (Electronic): 9781728187457
DOIs
State: Published - Sep 8 2020
Externally published: Yes
Event: 33rd IEEE International System on Chip Conference, SOCC 2020 - Virtual, Las Vegas, United States
Duration: Sep 8 2020 - Sep 11 2020

Publication series

Name: International System on Chip Conference
Volume: 2020-September
ISSN (Print): 2164-1676
ISSN (Electronic): 2164-1706

Conference

Conference: 33rd IEEE International System on Chip Conference, SOCC 2020
Country/Territory: United States
City: Virtual, Las Vegas
Period: 9/8/20 - 9/11/20

Bibliographical note

Publisher Copyright:
© 2020 IEEE.
