Abstract
We consider a delivery drone that must complete pick-up and delivery tasks that arrive stochastically during a mission. Because a delivery drone is often equipped with a camera, it can also gather useful information by monitoring the environment while carrying out these tasks. Motivated by this multi-use of drones, we address a persistent monitoring problem in which the drone’s high-level decision making is modeled as a Markov decision process (MDP) with unknown transition probabilities. The reward function is designed to reflect the value of the information gathered about the environment, and the pick-up and delivery tasks are defined by bounded-time temporal logic specifications. We use a reinforcement learning (RL) algorithm that maximizes the expected sum of rewards while ensuring that the dynamically arriving temporal logic specifications are satisfied with a desired probability in every episode during learning. We present simulation results and discuss the quality of the proposed method.
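One way to read the problem described in the abstract is as a constrained optimization over policies. The sketch below is illustrative only: the symbols ($S$, $A$, $P$, $R$, the horizon $T$, the tasks $\varphi_k$, and the threshold $p$) are assumed notation, not taken from the paper, and the paper's actual formulation may differ.

```latex
% Illustrative notation (assumed, not from the paper):
%   MDP (S, A, P, R) with unknown transition function P
%   \pi        : policy of the drone
%   T          : episode horizon
%   \varphi_k  : k-th bounded-time temporal logic task arriving in the episode
%   p          : desired satisfaction probability
\[
\begin{aligned}
\max_{\pi} \quad & \mathbb{E}^{\pi}\!\left[ \sum_{t=0}^{T-1} R(s_t, a_t) \right] \\
\text{s.t.} \quad & \Pr\nolimits^{\pi}\!\left( s_{0:T} \models \varphi_k \right) \ge p
  \quad \text{for every task } \varphi_k \text{ arriving in the episode}
\end{aligned}
\]
```

Under this reading, the objective captures the monitoring reward, while the constraint captures the requirement that each pick-up and delivery specification is met with the desired probability in every episode, including during learning.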
| Original language | English (US) |
|---|---|
| Title of host publication | AIAA Scitech 2021 Forum |
| Publisher | American Institute of Aeronautics and Astronautics Inc, AIAA |
| Pages | 1-13 |
| Number of pages | 13 |
| ISBN (Print) | 9781624106095 |
| State | Published - Jan 4 2021 |
| Event | AIAA Science and Technology Forum and Exposition, AIAA SciTech Forum 2021 - Virtual, Online; Duration: Jan 11 2021 → Jan 15 2021 |
Publication series
| Name | AIAA Scitech 2021 Forum |
|---|---|
Conference
| Conference | AIAA Science and Technology Forum and Exposition, AIAA SciTech Forum 2021 |
|---|---|
| City | Virtual, Online |
| Period | 1/11/21 → 1/15/21 |
Bibliographical note
Publisher Copyright: © 2021, American Institute of Aeronautics and Astronautics Inc, AIAA. All Rights Reserved.