TY - JOUR
T1 - Efficient continual learning at the edge with progressive segmented training
AU - Du, Xiaocong
AU - Venkataramanaiah, Shreyas Kolala
AU - Li, Zheng
AU - Suh, Han Sok
AU - Yin, Shihui
AU - Krishnan, Gokul
AU - Liu, Frank
AU - Seo, Jae Sun
AU - Cao, Yu
N1 - Publisher Copyright:
© 2022 The Author(s). Published by IOP Publishing Ltd.
PY - 2022/12/1
Y1 - 2022/12/1
N2 - There is an increasing need for continual learning in dynamic systems at the edge, such as self-driving vehicles, surveillance drones, and robotic systems. Such a system must learn from the data stream, train the model to preserve previous information while adapting to a new task, and generate a single-headed vector for future inference, all within a limited power budget. Unlike previous continual learning algorithms with dynamic structures, this work focuses on a single network and model segmentation to mitigate the catastrophic forgetting problem. Leveraging the redundant capacity of a single network, model parameters for each task are separated into two groups: an important group, which is frozen to preserve current knowledge, and a secondary group, which is saved (not pruned) for future learning. A fixed-size memory containing a small amount of previously seen data is further adopted to assist the training. Without additional regularization, the simple yet effective approach of progressive segmented training (PST) successfully incorporates multiple tasks and achieves state-of-the-art accuracy in the single-head evaluation on the CIFAR-10 and CIFAR-100 datasets. Moreover, the segmented training significantly improves computation efficiency in continual learning, thus enabling efficient continual learning at the edge. On an Intel Stratix-10 MX FPGA, we further demonstrate the efficiency of PST with representative CNNs trained on CIFAR-10.
AB - There is an increasing need for continual learning in dynamic systems at the edge, such as self-driving vehicles, surveillance drones, and robotic systems. Such a system must learn from the data stream, train the model to preserve previous information while adapting to a new task, and generate a single-headed vector for future inference, all within a limited power budget. Unlike previous continual learning algorithms with dynamic structures, this work focuses on a single network and model segmentation to mitigate the catastrophic forgetting problem. Leveraging the redundant capacity of a single network, model parameters for each task are separated into two groups: an important group, which is frozen to preserve current knowledge, and a secondary group, which is saved (not pruned) for future learning. A fixed-size memory containing a small amount of previously seen data is further adopted to assist the training. Without additional regularization, the simple yet effective approach of progressive segmented training (PST) successfully incorporates multiple tasks and achieves state-of-the-art accuracy in the single-head evaluation on the CIFAR-10 and CIFAR-100 datasets. Moreover, the segmented training significantly improves computation efficiency in continual learning, thus enabling efficient continual learning at the edge. On an Intel Stratix-10 MX FPGA, we further demonstrate the efficiency of PST with representative CNNs trained on CIFAR-10.
KW - acquisitive learning
KW - brain inspiration
KW - continual learning
KW - deep neural network
KW - model adaptation
UR - http://www.scopus.com/inward/record.url?scp=85160950958&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85160950958&partnerID=8YFLogxK
U2 - 10.1088/2634-4386/ac9899
DO - 10.1088/2634-4386/ac9899
M3 - Article
AN - SCOPUS:85160950958
SN - 2634-4386
VL - 2
JO - Neuromorphic Computing and Engineering
JF - Neuromorphic Computing and Engineering
IS - 4
M1 - 044006
ER -