Big data applications are memory-intensive, and bringing data from memory to the processor incurs large overheads in energy and processing time. This has driven a push toward specialized accelerator units that perform computation close to where the data is stored. Two approaches have been proposed: near-memory computing, which places computational units at the periphery of memory for fast data access, and true in-memory computing, which uses the memory array itself to perform computations through simple reconfigurations. Although there has been a great deal of recent interest in in-memory computing, most solutions purported to fall into this class are really near-memory processors that compute near the edge of memory arrays/subarrays rather than inside them.

This presentation discusses a true in-memory computation platform, the Computational Random Access Memory (CRAM). The CRAM enables this capability through a small modification to a standard spintronics-based memory array. Unlike prior analog-like in-memory/near-memory solutions, the CRAM-based approach is digital, which provides greater robustness to process variations than analog schemes, particularly in immature technologies. Our solution is based on spintronics technology, which is attractive for its robustness, high endurance, and trajectory of rapid improvement [2, 4]. The outline of the CRAM approach was first proposed in , operating primarily at the technology level with some exposition at the circuit level. The work was developed further in  to show system-level applications and performance estimates based on a spin-transfer-torque (STT) magnetic tunnel junction (MTJ). Next, in , a bridge was built between the two, providing an explicit link between CRAM technology, circuit implementations, and operation scheduling.
Most recently, in , the CRAM was redesigned around a new MTJ based on the spin-Hall effect (SHE), providing greatly improved energy efficiency. This talk provides an overview of several years of effort in developing the CRAM concept and surveys all of these efforts. The presentation covers alternatives at the technology level, followed by a description of how the in-memory computing array is designed, using the basic MTJ unit and a few switches, to function both as a memory and as a computational unit. This array is then used to build gates and arithmetic units by appropriately interconnecting memory cells, allowing high degrees of parallelism. Next, we show how complex arithmetic operations (adders, multipliers, dot products) can be performed through appropriate scheduling and data placement of the operands. Finally, we demonstrate how this approach can be used to implement sample applications, such as a neuromorphic inference engine and a 2D convolution, presenting results that benchmark the performance of these CRAMs against near-memory computation platforms. The performance gains can be attributed to (a) highly efficient local processing within the memory, and (b) high levels of parallelism across the rows of the memory.
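The execution model described above — building gates from memory cells and scheduling gate steps so that every row of the array computes in parallel — can be illustrated with a toy software sketch. The model below is a simplification for illustration only: it assumes NAND as the universal in-array gate (actual CRAM gate libraries depend on the MTJ technology), represents each memory column as a Python integer whose bit *i* holds the cell in row *i*, and names like `ripple_add` and `to_cols` are hypothetical helpers, not part of any CRAM toolchain.

```python
# Toy simulation of CRAM-style row-parallel, gate-at-a-time computation.
# A column is a Python int; bit i of the int is the cell in row i, so one
# "logic step" on columns applies the same gate in every row simultaneously.
# NAND is assumed as the universal gate here (an illustrative choice).

ROWS = 8
MASK = (1 << ROWS) - 1  # keep columns to ROWS bits

def nand(col_a, col_b):
    """One scheduled logic step: row-parallel NAND of two columns."""
    return ~(col_a & col_b) & MASK

def xor(a, b):
    # XOR sequenced as 4 NAND steps, as a gate scheduler would issue them.
    t = nand(a, b)
    return nand(nand(a, t), nand(b, t))

def ripple_add(a_cols, b_cols):
    """Ripple-carry add of two operands stored column-wise (LSB first).
    Every row computes its own independent sum from the same gate sequence."""
    carry = 0
    out = []
    for a, b in zip(a_cols, b_cols):
        s = xor(xor(a, b), carry)
        # Carry-out is Majority(a, b, carry), here via AND/OR column steps.
        carry = (a & b) | (a & carry) | (b & carry)
        out.append(s)
    out.append(carry)  # final carry column
    return out

def to_cols(vals, width):
    """Pack one operand value per row into 'width' bit-columns."""
    return [sum(((v >> k) & 1) << r for r, v in enumerate(vals)) & MASK
            for k in range(width)]

def from_cols(cols):
    """Unpack bit-columns back into one integer per row."""
    return [sum(((c >> r) & 1) << k for k, c in enumerate(cols))
            for r in range(ROWS)]

a_vals = [3, 5, 7, 2, 1, 0, 6, 4]
b_vals = [1, 2, 3, 4, 5, 6, 7, 0]
sums = from_cols(ripple_add(to_cols(a_vals, 3), to_cols(b_vals, 3)))
print(sums)  # one sum per row, all computed by the same gate schedule
```

The point of the sketch is the cost model: the gate-step count is fixed by the schedule (here, a ripple-carry adder), while the number of rows — and hence the degree of parallelism — scales with the array, which is the source of the throughput gains the abstract describes.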
Original language: English (US)
Title of host publication: GLSVLSI 2019 - Proceedings of the 2019 Great Lakes Symposium on VLSI
Publisher: Association for Computing Machinery
Number of pages: 1
State: Published - May 13 2019
Event: 29th Great Lakes Symposium on VLSI, GLSVLSI 2019 - Tysons Corner, United States
Duration: May 9 2019 → May 11 2019
Name: Proceedings of the ACM Great Lakes Symposium on VLSI, GLSVLSI
Conference: 29th Great Lakes Symposium on VLSI, GLSVLSI 2019
Period: 5/9/19 → 5/11/19
Bibliographical note
Funding Information:
This work was supported in part by the DARPA Non-Volatile Logic program, NSF SPX Award CCF-1725420, and by C-SPIN, one of the six SRC STARnet Centers, sponsored by MARCO and DARPA.
© 2019 ACM.
- In-memory computing
- Memory bottleneck
- Neuromorphic computing
- Nonvolatile memory
- Spin-Hall effect