Abstract
As neural networks continue to infiltrate diverse application domains, computing will begin to move out of the cloud and onto edge devices, necessitating fast, reliable, and low-power (LP) solutions. To meet these requirements, we propose a time-domain core using one-shot delay measurements and a lightweight post-processing technique, dynamic threshold error correction (DTEC). This design differs from traditional digital implementations in that it uses the delay accumulated through a simple inverter chain distributed through an SRAM array to intrinsically compute resource-intensive multiply-accumulate (MAC) operations. Implemented in 65-nm LP CMOS, the design achieves an energy efficiency of 104.8 TOp/s/W at 0.7 V with 3-b resolution, at 19.1 fJ/MAC.
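The two efficiency figures quoted in the abstract can be related with a short calculation. The sketch below assumes that one MAC is counted as two operations (one multiply and one add); this is a common reporting convention, but the abstract does not state it. The function name `tops_per_watt` and the constant `OPS_PER_MAC` are illustrative only.

```python
# Assumption (not stated in the abstract): one MAC counts as two operations
# (one multiply plus one add), a common convention when reporting TOp/s/W.
OPS_PER_MAC = 2


def tops_per_watt(energy_per_mac_fj: float, ops_per_mac: int = OPS_PER_MAC) -> float:
    """Convert energy per MAC (in femtojoules) to efficiency in TOp/s/W.

    Op/s/W is the same quantity as Op/J, so the efficiency is the number of
    operations per MAC divided by the energy per MAC in joules.
    """
    joules_per_mac = energy_per_mac_fj * 1e-15
    ops_per_joule = ops_per_mac / joules_per_mac
    return ops_per_joule / 1e12  # scale to tera-operations


efficiency = tops_per_watt(19.1)
print(f"{efficiency:.1f} TOp/s/W")
```

Under that assumption, 19.1 fJ/MAC works out to roughly 104.7 TOp/s/W, which agrees with the reported 104.8 TOp/s/W to within the rounding of the energy figure.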
Original language | English (US) |
---|---|
Article number | 8718342 |
Pages (from-to) | 2777-2785 |
Number of pages | 9 |
Journal | IEEE Journal of Solid-State Circuits |
Volume | 54 |
Issue number | 10 |
State | Published - Oct 2019 |
Bibliographical note
Funding Information: Manuscript received January 14, 2019; revised March 11, 2019 and April 25, 2019; accepted April 29, 2019. Date of publication May 20, 2019; date of current version September 24, 2019. This paper was approved by Guest Editor Chen-Hao Chang. This work was supported in part by the National Science Foundation under Award CCF-1763761 and in part by IGERT under Grant DGE-1069104. (Corresponding author: Chris H. Kim.) The authors are with the Electrical and Computer Engineering Department, University of Minnesota, Minneapolis, MN 55455 USA (e-mail: [email protected]; [email protected]).
Publisher Copyright:
© 1966-2012 IEEE.
Keywords
- Machine learning (ML)
- neuromorphic computing
- time-domain computing
- time-to-digital converter (TDC)