Efficient and Real-Time Reinforcement Learning for Linear Quadratic Systems with Application to H-Infinity Control

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Scopus citations

Abstract

This paper presents a model-free, real-time, data-efficient Q-learning-based algorithm to solve the H∞ control of linear discrete-time systems. The computational complexity is shown to reduce from O(q^3) in the literature to O(q^2) in the proposed algorithm, where q is quadratic in the sum of the size of state variables, control inputs, and disturbance. An adaptive optimal controller is designed and the parameters of the action and critic networks are learned online without the knowledge of the system dynamics, making the proposed algorithm completely model-free. Also, a sufficient probing noise is only needed in the first iteration and does not affect the proposed algorithm. With no need for an initial stabilizing policy, the algorithm converges to the closed-form solution obtained by solving the Riccati equation. A simulation study is performed by applying the proposed algorithm to real-time control of an autonomous mobility-on-demand (AMoD) system for a real-world case study to evaluate the effectiveness of the proposed algorithm.
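
For orientation, the Riccati solution that the abstract names as the convergence target can be computed directly when the model is known. The sketch below is not the paper's model-free algorithm. It is a standard model-based value iteration on the discrete-time zero-sum game Riccati equation, under assumptions the abstract does not state: dynamics x_{k+1} = A x_k + B u_k + E w_k, stage cost x'Qx + u'Ru - gamma^2 w'w, and an illustrative two-state system. All names (q_kernel, gare_value_iteration, A, B, E, Q, R, gamma) are hypothetical.

```python
import numpy as np

def q_kernel(P, A, B, E, Q, R, gamma):
    """Kernel H of the quadratic Q-function z' H z, with z = [x; u; w]."""
    n, m, d = A.shape[0], B.shape[1], E.shape[1]
    M = np.hstack([A, B, E])              # x_{k+1} = M z_k
    G = np.zeros((n + m + d, n + m + d))  # stage-cost weights
    G[:n, :n] = Q
    G[n:n + m, n:n + m] = R
    G[n + m:, n + m:] = -gamma**2 * np.eye(d)
    return G + M.T @ P @ M

def gare_value_iteration(A, B, E, Q, R, gamma, tol=1e-10, max_iter=10000):
    """Iterate P <- H_xx - H_xa H_aa^{-1} H_ax from P = 0 (a = [u; w])."""
    n = A.shape[0]
    P = np.zeros((n, n))                  # no stabilizing policy needed to start
    for _ in range(max_iter):
        H = q_kernel(P, A, B, E, Q, R, gamma)
        Hxx, Hxa, Haa = H[:n, :n], H[:n, n:], H[n:, n:]
        P_next = Hxx - Hxa @ np.linalg.solve(Haa, Hxa.T)
        P_next = 0.5 * (P_next + P_next.T)  # keep the iterate symmetric
        done = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if done:
            break
    return P

# Illustrative system (assumed, not taken from the paper).
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
E = np.array([[0.1], [0.0]])
Q, R, gamma = np.eye(2), np.eye(1), 5.0

P = gare_value_iteration(A, B, E, Q, R, gamma)
H = q_kernel(P, A, B, E, Q, R, gamma)
# Stacked gains: [u; w] = -K x at the saddle point.
K = np.linalg.solve(H[2:, 2:], H[:2, 2:].T)
```

A model-free Q-learning method would estimate the kernel H from measured data instead of building it from A, B, and E; the fixed point P is the same quantity either way.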

Original language: English (US)
Title of host publication: 2023 62nd IEEE Conference on Decision and Control, CDC 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 6277-6282
Number of pages: 6
ISBN (Electronic): 9798350301243
State: Published - 2023
Event: 62nd IEEE Conference on Decision and Control, CDC 2023 - Singapore, Singapore
Duration: Dec 13 2023 → Dec 15 2023

Publication series

Name: Proceedings of the IEEE Conference on Decision and Control
ISSN (Print): 0743-1546
ISSN (Electronic): 2576-2370

Conference

Conference: 62nd IEEE Conference on Decision and Control, CDC 2023
Country/Territory: Singapore
City: Singapore
Period: 12/13/23 → 12/15/23

Bibliographical note

Publisher Copyright:
© 2023 IEEE.
