On-Chip Sparse Learning Acceleration With CMOS and Resistive Synaptic Devices

Jae Sun Seo, Binbin Lin, Minkyu Kim, Pai Yu Chen, Deepak Kadetotad, Zihan Xu, Abinash Mohanty, Sarma Vrudhula, Shimeng Yu, Jieping Ye, Yu Cao

Research output: Contribution to journal › Article › peer-review


Abstract

Recent advances in sparse coding have led to its wide adoption in signal processing, pattern classification, and object recognition applications. Even with state-of-the-art algorithms running on CPU/GPU platforms, solving a sparse coding problem still requires expensive computation, making real-time large-scale learning very challenging. In this paper, we co-optimize algorithm, architecture, circuit, and device for real-time, energy-efficient on-chip hardware acceleration of sparse coding. The principle of hardware acceleration is to exploit the properties of the learning algorithms, which involve many parallel operations of data fetch and matrix/vector multiplication/addition. Today's von Neumann architecture, however, is ill-suited to such parallelization, because the separation of memory and the computing unit makes sequential operations inevitable. This principle drives both the selection of algorithms and the design evolution from CPU to a CMOS application-specific integrated circuit (ASIC) to the parallel architecture with resistive crosspoint array (PARCA) that we propose. The CMOS ASIC scheme implements sparse coding with SRAM dictionaries and all-digital circuits, while PARCA employs resistive-RAM dictionaries with dedicated read and write circuits. We show that 65-nm implementations of the CMOS ASIC and PARCA schemes accelerate sparse coding computation by 394× and 2140×, respectively, compared to software running on an eight-core CPU. Simulated power for both hardware schemes lies in the milliwatt range, making them viable for portable single-chip learning applications.
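The abstract notes that sparse coding is dominated by parallel matrix/vector multiplications and additions against a dictionary, which is what the SRAM and resistive-crosspoint dictionaries accelerate. As a minimal sketch of why that is, the following assumes an ISTA-style iterative shrinkage solver (the abstract does not name the exact algorithm used in the paper); each iteration reduces to two dense matrix-vector products plus an element-wise threshold:

```python
import numpy as np

def soft_threshold(v, lam):
    # Element-wise shrinkage: the only nonlinear step per iteration.
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def ista_sparse_code(D, x, lam=0.1, step=None, iters=50):
    """Solve min_z 0.5*||x - D z||^2 + lam*||z||_1 via ISTA.

    Each iteration is dominated by two matrix-vector products
    (D @ z and D.T @ r) -- exactly the dictionary operations a
    crosspoint array can evaluate in parallel in the analog domain.
    """
    if step is None:
        # Step size 1/L, with L the Lipschitz constant ||D||_2^2,
        # guarantees convergence of ISTA.
        step = 1.0 / np.linalg.norm(D, 2) ** 2
    z = np.zeros(D.shape[1])
    for _ in range(iters):
        r = x - D @ z                                   # residual: 1st matvec
        z = soft_threshold(z + step * (D.T @ r), step * lam)  # 2nd matvec
    return z
```

On a von Neumann machine every row of `D` must be fetched from memory sequentially for each product, which is the bottleneck the paper's parallel in-memory dictionaries remove.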

Original language: English (US)
Article number: 7268884
Pages (from-to): 969-979
Number of pages: 11
Journal: IEEE Transactions on Nanotechnology
Volume: 14
Issue number: 6
DOIs
State: Published - Nov 2015
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2002-2012 IEEE.

Keywords

  • Application specific integrated circuits
  • CMOS integrated circuits
  • Dictionaries
  • Hardware
  • Unsupervised learning
  • Very large scale integration
