Probabilistically Guaranteed Satisfaction of Temporal Logic Constraints during Reinforcement Learning

Derya Aksaray, Yasin Yazicioglu, Ahmet Semi Asarkaya

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

4 Scopus citations

Abstract

We propose a novel constrained reinforcement learning method for finding optimal policies in Markov Decision Processes while satisfying temporal logic constraints with a desired probability throughout the learning process. An automata-theoretic approach is proposed to ensure the probabilistic satisfaction of the constraint in each episode, which is different from penalizing violations to achieve constraint satisfaction after a sufficiently large number of episodes. The proposed approach is based on computing a lower bound on the probability of constraint satisfaction and adjusting the exploration behavior as needed. We present theoretical results on the probabilistic constraint satisfaction achieved by the proposed approach. We also numerically demonstrate the proposed idea in a drone scenario, where the constraint is to perform periodically arriving pick-up and delivery tasks and the objective is to fly over high-reward zones to simultaneously perform aerial monitoring.
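
The idea of computing a lower bound on the satisfaction probability and restricting exploration when that bound gets too low can be illustrated with a toy sketch. This is not the authors' algorithm and involves no automaton. It assumes a hypothetical setting in which the constraint is "reach a goal within a deadline", each goal-directed move succeeds independently with probability at least `p_success`, a failed move leaves the distance unchanged, and any other action increases the distance to the goal by at most one. All function and parameter names are invented for illustration.

```python
from math import comb
import random


def reach_prob_lower_bound(distance, steps_left, p_success):
    """Lower bound on P(reach goal in time) under the stated assumptions:
    the probability of at least `distance` successes in `steps_left`
    independent trials, each succeeding with probability `p_success`."""
    if distance <= 0:
        return 1.0
    if distance > steps_left:
        return 0.0
    return sum(
        comb(steps_left, k) * p_success**k * (1 - p_success) ** (steps_left - k)
        for k in range(distance, steps_left + 1)
    )


def may_explore(distance, steps_left, p_success, required_prob):
    """Allow a free action only if, in the assumed worst case (one step
    spent and the goal one cell farther away), the bound still holds."""
    worst_case = reach_prob_lower_bound(distance + 1, steps_left - 1, p_success)
    return worst_case >= required_prob


def choose_action(q_values, goal_action, distance, steps_left,
                  p_success, required_prob, epsilon, rng):
    """Epsilon-greedy on q_values while the bound permits it; otherwise
    fall back to the goal-directed action (hypothetical interface)."""
    if not may_explore(distance, steps_left, p_success, required_prob):
        return goal_action
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)
```

In this sketch the agent is free to pursue reward whenever there is enough slack before the deadline and is forced onto the goal-directed action once the bound would otherwise fall below `required_prob`. The check runs at every step of every episode, which is different from adding a penalty that discourages violations only after many episodes.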

Original language: English (US)
Title of host publication: IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 6531-6537
Number of pages: 7
ISBN (Electronic): 9781665417143
State: Published - 2021
Event: 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2021 - Prague, Czech Republic
Duration: Sep 27 2021 - Oct 1 2021

Publication series

Name: IEEE International Conference on Intelligent Robots and Systems
ISSN (Print): 2153-0858
ISSN (Electronic): 2153-0866

Conference

Conference: 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2021
Country/Territory: Czech Republic
City: Prague
Period: 9/27/21 - 10/1/21

Bibliographical note

Publisher Copyright:
© 2021 IEEE.
