TY - JOUR
T1 - In-memory processing on the spintronic CRAM
T2 - From hardware design to application mapping
AU - Zabihi, Masoud
AU - Chowdhury, Zamshed Iqbal
AU - Zhao, Zhengyang
AU - Karpuzcu, Ulya R.
AU - Wang, Jian Ping
AU - Sapatnekar, Sachin S.
N1 - Publisher Copyright: © 1968-2012 IEEE.
PY - 2019/8/1
Y1 - 2019/8/1
N2 - The Computational Random Access Memory (CRAM) is a platform that makes a small modification to a standard spintronics-based memory array to organically enable logic operations within the array. CRAM provides a true in-memory computational platform that can perform computations within the memory array, as against other methods that send computational tasks to a separate processor module or a near-memory module at the periphery of the memory array. This paper describes how the CRAM structure can be built and utilized, accounting for considerations at the device, gate, and functional levels. Techniques for constructing fundamental gates are first overviewed, accounting for electrical and noise margin considerations. Next, these logic operations are composed to schedule operations in the array that implement basic arithmetic operations such as addition and multiplication. These methods are then demonstrated on 2D convolution with multibit data, and a binary neural inference engine. The performance of the CRAM is analyzed on near-term and longer-term spintronic device technologies. Significant improvements in energy and execution time for the CRAM-based implementation over a near-memory processing system are demonstrated, and can be attributed to the ability of CRAM to overcome the memory access bottleneck, and to provide high levels of parallelism to the computation.
AB - The Computational Random Access Memory (CRAM) is a platform that makes a small modification to a standard spintronics-based memory array to organically enable logic operations within the array. CRAM provides a true in-memory computational platform that can perform computations within the memory array, as against other methods that send computational tasks to a separate processor module or a near-memory module at the periphery of the memory array. This paper describes how the CRAM structure can be built and utilized, accounting for considerations at the device, gate, and functional levels. Techniques for constructing fundamental gates are first overviewed, accounting for electrical and noise margin considerations. Next, these logic operations are composed to schedule operations in the array that implement basic arithmetic operations such as addition and multiplication. These methods are then demonstrated on 2D convolution with multibit data, and a binary neural inference engine. The performance of the CRAM is analyzed on near-term and longer-term spintronic device technologies. Significant improvements in energy and execution time for the CRAM-based implementation over a near-memory processing system are demonstrated, and can be attributed to the ability of CRAM to overcome the memory access bottleneck, and to provide high levels of parallelism to the computation.
KW - STT-MRAM
KW - Spintronics
KW - in-memory computing
KW - memory bottleneck
KW - neuromorphic computing
KW - nonvolatile memory
UR - http://www.scopus.com/inward/record.url?scp=85050388355&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85050388355&partnerID=8YFLogxK
U2 - 10.1109/TC.2018.2858251
DO - 10.1109/TC.2018.2858251
M3 - Article
AN - SCOPUS:85050388355
SN - 0018-9340
VL - 68
SP - 1159
EP - 1173
JO - IEEE Transactions on Computers
JF - IEEE Transactions on Computers
IS - 8
M1 - 8416761
ER -