Abstract
In this invited paper, we present deep neural network (DNN) training accelerator designs in both ASIC and FPGA. The accelerators implement a stochastic gradient descent (SGD)-based training algorithm in 16-bit fixed-point precision. A new cyclic weight storage and access scheme enables using the same off-the-shelf SRAMs for non-transpose and transpose operations during the feed-forward and feed-backward phases, respectively, of the DNN training process. Including the cyclic weight scheme, the overall DNN training processor is implemented in both 65nm CMOS ASIC and Intel Stratix-10 FPGA hardware. We collectively report the ASIC and FPGA training accelerator results.
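The cyclic weight scheme resembles a classic skewed-banking layout: spreading an N×N weight tile across N SRAM banks so that both a row (feed-forward, W) and a column (feed-backward, Wᵀ) can be fetched without bank conflicts. The sketch below is only an illustration of this idea under that assumption; the bank/address mapping shown is not claimed to be the exact mapping used in the paper.

```python
# Minimal sketch of a cyclic (skewed) weight storage scheme.
# Assumption: element W[i][j] is stored in bank (i + j) mod N at
# address i. This mapping is illustrative, not the paper's exact one.

N = 4  # tile dimension (illustrative)

def bank_of(i, j):
    # Cyclic bank assignment: each row and each column of the tile
    # touches every bank exactly once, so neither access pattern
    # causes a bank conflict.
    return (i + j) % N

def addr_of(i, j):
    # Address of W[i][j] within its bank.
    return i

def store(W):
    # Scatter the weight tile across N single-port SRAM "banks".
    banks = [[0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            banks[bank_of(i, j)][addr_of(i, j)] = W[i][j]
    return banks

def read_row(banks, i):
    # Non-transpose read (feed-forward): row i, one word per bank.
    return [banks[bank_of(i, j)][addr_of(i, j)] for j in range(N)]

def read_col(banks, j):
    # Transpose read (feed-backward): column j also hits each bank
    # once, so the same off-the-shelf SRAMs serve both phases.
    return [banks[bank_of(i, j)][addr_of(i, j)] for i in range(N)]

W = [[r * N + c for c in range(N)] for r in range(N)]
banks = store(W)
assert read_row(banks, 1) == W[1]
assert read_col(banks, 2) == [W[i][2] for i in range(N)]
```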
Original language | English (US)
---|---
Title of host publication | Proceedings - International SoC Design Conference, ISOCC 2020
Publisher | Institute of Electrical and Electronics Engineers Inc.
Pages | 21-22
Number of pages | 2
ISBN (Electronic) | 9781728183312
DOIs |
State | Published - Oct 21 2020
Externally published | Yes
Event | 17th International System-on-Chip Design Conference, ISOCC 2020 - Yeosu, Korea, Republic of. Duration: Oct 21 2020 → Oct 24 2020
Publication series
Name | Proceedings - International SoC Design Conference, ISOCC 2020
---|---
Conference
Conference | 17th International System-on-Chip Design Conference, ISOCC 2020
---|---
Country/Territory | Korea, Republic of
City | Yeosu
Period | 10/21/20 → 10/24/20
Bibliographical note
Publisher Copyright: © 2020 IEEE.
Keywords
- convolutional neural networks
- energy efficiency
- hardware accelerator
- on-device training