Temporal-Logic-Constrained Hybrid Reinforcement Learning to Perform Optimal Aerial Monitoring with Delivery Drones

Ahmet Semi Asarkaya, Derya Aksaray, Yasin Yazicioglu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

7 Scopus citations

Abstract

In this paper, we consider a package delivery drone that is also expected to perform aerial monitoring as a simultaneous secondary mission. To integrate this secondary mission, we use a reward function that represents the value of the information gathered via aerial monitoring. We define the pickup and delivery tasks with time window temporal logic (TWTL) specifications and use reinforcement learning (RL) to maximize the expected sum of rewards. The high-level decision-making of the drone is modeled as a Markov decision process (MDP). We extend previous work in which a model-free RL algorithm was used to solve this optimization problem, and we propose a modified Dyna-Q algorithm to address the shortage of online samples. We provide extensive simulation results that compare the performance of the model-free and hybrid RL algorithms in this application and investigate the effect of different system parameters on overall performance.
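
For readers unfamiliar with Dyna-Q, the hybrid idea mentioned in the abstract can be illustrated with a minimal sketch of the standard tabular algorithm: each real transition drives one Q-learning update, is stored in a learned model, and is then replayed in several simulated planning updates. This is not the modified algorithm proposed in the paper and it includes no TWTL constraints or monitoring rewards; the function names, the toy corridor environment, and all parameter values are assumptions chosen only for illustration.

```python
import random
from collections import defaultdict


def dyna_q(step, actions, start, episodes=200, horizon=50, alpha=0.1,
           gamma=0.95, epsilon=0.1, planning_steps=20, seed=0):
    """Standard tabular Dyna-Q for a deterministic environment (illustrative)."""
    rng = random.Random(seed)
    q = defaultdict(float)  # Q-values keyed by (state, action)
    model = {}              # learned model: (state, action) -> (reward, next_state, done)

    def greedy(s):
        # Greedy action with random tie-breaking.
        best = max(q[(s, a)] for a in actions)
        return rng.choice([a for a in actions if q[(s, a)] == best])

    def update(s, a, r, s2, done):
        # One-step Q-learning backup.
        target = r if done else r + gamma * max(q[(s2, b)] for b in actions)
        q[(s, a)] += alpha * (target - q[(s, a)])

    for _ in range(episodes):
        s = start
        for _ in range(horizon):
            a = rng.choice(actions) if rng.random() < epsilon else greedy(s)
            r, s2, done = step(s, a)
            update(s, a, r, s2, done)        # model-free update from the real sample
            model[(s, a)] = (r, s2, done)    # record the observed transition
            for _ in range(planning_steps):  # planning updates from simulated samples
                ps, pa = rng.choice(list(model))
                update(ps, pa, *model[(ps, pa)])
            s = s2
            if done:
                break
    return q, greedy


def corridor_step(s, a):
    """Hypothetical toy environment: cells 0..5, reward 1 on reaching cell 5."""
    s2 = min(max(s + a, 0), 5)
    done = s2 == 5
    return (1.0 if done else 0.0), s2, done


q, policy = dyna_q(corridor_step, actions=[-1, 1], start=0)
print([policy(s) for s in range(5)])
```

Setting `planning_steps=0` reduces the sketch to plain model-free Q-learning, which is the kind of baseline the hybrid approach is compared against; the planning loop is what lets each scarce online sample be reused many times.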

Original language: English (US)
Title of host publication: 2021 International Conference on Unmanned Aircraft Systems, ICUAS 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 285-294
Number of pages: 10
ISBN (Electronic): 9780738131153
State: Published - Jun 15 2021
Event: 2021 International Conference on Unmanned Aircraft Systems, ICUAS 2021 - Athens, Greece
Duration: Jun 15 2021 - Jun 18 2021

Publication series

Name: 2021 International Conference on Unmanned Aircraft Systems, ICUAS 2021

Conference

Conference: 2021 International Conference on Unmanned Aircraft Systems, ICUAS 2021
Country/Territory: Greece
City: Athens
Period: 6/15/21 - 6/18/21

Bibliographical note

Publisher Copyright:
© 2021 IEEE.
