Abstract
State-of-the-art in-memory computing (IMC) architectures employ an array of homogeneous tiles and severely underutilize processing elements (PEs). In this article, the authors propose an area- and energy-optimization methodology that generates a heterogeneous IMC architecture coupled with an optimized network-on-chip (NoC) for deep neural network (DNN) acceleration. They first propose an area-aware optimization technique that improves PE array utilization by generating a heterogeneous tile-based IMC architecture consisting of tiles of different sizes, i.e., with different numbers of PEs, where each PE is of the same size. They then minimize the communication energy across the large number of tiles using an NoC architecture with optimized tile-to-router mapping and scheduling.
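The utilization argument above can be illustrated with a small back-of-the-envelope sketch. This is not the paper's actual optimization algorithm; it is a minimal model, assuming hypothetical layer weight counts and tile-size options, showing why letting each layer pick from several tile sizes (number of PEs per tile, with PE size fixed) wastes fewer PEs than padding every layer to one fixed tile size.

```python
import math

PE_SIZE = 128 * 128  # weights per PE crossbar (assumed, hypothetical)

def homogeneous_utilization(layer_weights, pes_per_tile):
    """Utilization when every tile has the same fixed number of PEs."""
    used = sum(layer_weights)
    # each layer is padded up to a whole number of identical tiles
    allocated = sum(
        math.ceil(w / (pes_per_tile * PE_SIZE)) * pes_per_tile * PE_SIZE
        for w in layer_weights
    )
    return used / allocated

def heterogeneous_utilization(layer_weights, tile_options=(2, 4, 8, 16)):
    """Utilization when each layer picks the best-fitting tile size."""
    used = sum(layer_weights)
    allocated = sum(
        min(math.ceil(w / (t * PE_SIZE)) * t * PE_SIZE for t in tile_options)
        for w in layer_weights
    )
    return used / allocated

# made-up per-layer weight counts for illustration only
layers = [120_000, 800_000, 2_500_000, 400_000]

homog = homogeneous_utilization(layers, pes_per_tile=16)
het = heterogeneous_utilization(layers)
print(f"homogeneous utilization:   {homog:.2f}")   # ≈ 0.86
print(f"heterogeneous utilization: {het:.2f}")     # ≈ 0.98
```

With these numbers the heterogeneous mapping recovers most of the PE capacity that fixed-size tiles leave idle, which is the intuition behind the area-aware optimization the abstract describes.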
Original language | English (US)
---|---
Article number | 9114969
Pages (from-to) | 79-87
Number of pages | 9
Journal | IEEE Design and Test
Volume | 37
Issue number | 6
DOIs |
State | Published - Dec 2020
Externally published | Yes
Bibliographical note
Funding Information: This work was supported in part by the Center for Brain-Inspired Computing (C-BRIC), one of the six centers in JUMP; in part by the Semiconductor Research Corporation program sponsored by the Defense Advanced Research Projects Agency (DARPA); in part by the National Science Foundation (NSF) CAREER program under Award CNS-1651624; and in part by the Semiconductor Research Corporation under Grant 2938.001. Gokul Krishnan and Sumit K. Mandal contributed equally to this work.
Keywords
- Deep Neural Networks
- In-Memory Computing
- Interconnect
- Network-on-Chip
- Neural Network Accelerator
- RRAM