Abstract
Training convolutional neural networks (CNNs) on embedded platforms to support on-device learning has become increasingly important. Designing flexible training hardware is much more challenging than designing inference hardware because of its design complexity and large computation and memory requirements. In this work, we present an automatic compiler-based FPGA accelerator with 16-bit fixed-point precision for complete CNN training, including the Forward Pass (FP), Backward Pass (BP), and Weight Update (WU). We implemented an optimized RTL library to perform training-specific tasks and developed an RTL compiler that automatically generates FPGA-synthesizable RTL from user-defined constraints. We present a new cyclic weight storage/access scheme for on-chip BRAM and off-chip DRAM that efficiently implements non-transpose and transpose operations during the FP and BP phases, respectively. Representative CNNs for the CIFAR-10 dataset are implemented and trained on an Intel Stratix 10 GX FPGA using the proposed hardware architecture, demonstrating up to 479 GOPS performance.
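The abstract does not spell out the cyclic weight storage/access scheme, so the following is only a minimal sketch of the general idea, assuming a generic skewed (diagonal) bank layout rather than the authors' actual design. It illustrates why such a layout helps: FP reads weights in non-transposed order and BP reads them in transposed order, and rotating each row by its index lets both a full row and a full column be fetched with exactly one element per memory bank. The function names `store_cyclic`, `read_row`, and `read_col` and the square `n x n` matrix are hypothetical choices made for illustration.

```python
# Illustrative skewed layout; NOT the paper's actual scheme (assumption).
# Element W[r][c] is placed in bank (r + c) % n at address r.

def store_cyclic(W):
    """Distribute a square matrix over n banks with a cyclic row offset."""
    n = len(W)
    banks = [[None] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            banks[(r + c) % n][r] = W[r][c]
    return banks

def read_row(banks, r):
    """Non-transposed access (FP-style): all banks read the same address r."""
    n = len(banks)
    return [banks[(r + c) % n][r] for c in range(n)]

def read_col(banks, c):
    """Transposed access (BP-style): each bank reads a different address."""
    n = len(banks)
    return [banks[(r + c) % n][r] for r in range(n)]

def banks_touched(n, fixed, by_row):
    """Bank indices hit when reading one row (by_row=True) or one column."""
    return sorted((fixed + k) % n for k in range(n))

if __name__ == "__main__":
    W = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    banks = store_cyclic(W)
    print(read_row(banks, 1))  # one row of W
    print(read_col(banks, 2))  # one column of W, i.e. one row of W transposed
```

In this sketch a row read and a column read each touch every bank exactly once, so neither access pattern causes bank conflicts and no separate transposed copy of the weights is needed. How the published design maps this onto BRAM and DRAM is not stated in the abstract.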
Original language | English (US) |
---|---|
Title of host publication | Proceedings - 29th International Conference on Field-Programmable Logic and Applications, FPL 2019 |
Editors | Ioannis Sourdis, Christos-Savvas Bouganis, Carlos Alvarez, Leonel Antonio Toledo Diaz, Pedro Valero, Xavier Martorell |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 166-172 |
Number of pages | 7 |
ISBN (Electronic) | 9781728148847 |
State | Published - Sep 2019 |
Externally published | Yes |
Event | 29th International Conference on Field-Programmable Logic and Applications, FPL 2019 - Barcelona, Spain. Duration: Sep 9 2019 → Sep 13 2019 |
Publication series
Name | Proceedings - 29th International Conference on Field-Programmable Logic and Applications, FPL 2019 |
---|---|
Conference
Conference | 29th International Conference on Field-Programmable Logic and Applications, FPL 2019 |
---|---|
Country/Territory | Spain |
City | Barcelona |
Period | 9/9/19 → 9/13/19 |
Bibliographical note
Publisher Copyright: © 2019 IEEE.
Keywords
- Back-propagation
- Convolutional neural networks
- FPGA
- Hardware accelerator
- Neural network training