Online control basis selection by a regularized actor critic algorithm

Jianjun Yuan, Andrew Lamperski

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Scopus citations


Policy gradient algorithms are reinforcement learning methods that optimize a control policy by performing stochastic gradient descent with respect to the controller parameters. In this paper, we extend actor-critic algorithms by adding ℓ1-norm regularization to the actor, so that the algorithm automatically selects and optimizes the useful controller basis functions. Our method is closely related to existing approaches to sparse controller design and actuator selection, but unlike them it runs online and does not require a plant model. To apply ℓ1 regularization online, the actor updates are extended with an iterative soft-thresholding step. Convergence of the algorithm is proved using methods from stochastic approximation. The effectiveness of the algorithm for control basis and actuator selection is demonstrated on numerical examples.
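The soft-thresholding step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the fixed step size, and the way the gradient estimate is supplied are all assumptions made for illustration. It shows only the generic proximal-gradient pattern, a descent step followed by elementwise shrinkage, in which coordinates with small magnitude are set exactly to zero.

```python
def soft_threshold(values, tau):
    """Elementwise soft-thresholding, the proximal operator of tau * ||x||_1."""
    out = []
    for v in values:
        # Shrink the magnitude by tau; anything smaller than tau becomes zero.
        magnitude = max(abs(v) - tau, 0.0)
        out.append(magnitude if v >= 0 else -magnitude)
    return out


def regularized_actor_step(theta, grad_estimate, step_size, reg_weight):
    """One hypothetical actor update: gradient step, then soft-thresholding.

    theta         -- current actor parameters (one weight per basis function)
    grad_estimate -- stochastic gradient estimate of the cost (assumed to be
                     provided elsewhere, e.g. computed with a critic)
    step_size     -- actor step size (assumed constant here for simplicity)
    reg_weight    -- weight on the l1 penalty (hypothetical parameter name)
    """
    # Plain stochastic gradient descent step on the unregularized cost.
    theta_half = [t - step_size * g for t, g in zip(theta, grad_estimate)]
    # Proximal step for the l1 term: the threshold scales with the step size.
    return soft_threshold(theta_half, step_size * reg_weight)


if __name__ == "__main__":
    theta = [0.5, -0.02, 0.0]
    grad = [0.1, 0.1, -0.1]
    print(regularized_actor_step(theta, grad, step_size=0.1, reg_weight=0.5))
```

In this toy call, only the first weight remains nonzero after the update, which is the mechanism by which an ℓ1 penalty can deselect basis functions or actuators.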

Original language: English (US)
Title of host publication: 2017 American Control Conference, ACC 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Number of pages: 6
ISBN (Electronic): 9781509059928
State: Published - Jun 29 2017
Event: 2017 American Control Conference, ACC 2017 - Seattle, United States
Duration: May 24 2017 to May 26 2017

Publication series

Name: Proceedings of the American Control Conference
ISSN (Print): 0743-1619


Other: 2017 American Control Conference, ACC 2017
Country/Territory: United States

