Model-free reinforcement learning (RL) algorithms are used to solve sequential decision-making problems under uncertainty. They are data-driven methods and do not require an explicit model of the studied system or environment. For this reason, they are widely used in Intelligent Transportation Systems (ITS), as real-world transportation systems are highly complex and extremely difficult to model. However, in most of the literature, decisions are made according to the expected long-term return estimated by the RL algorithm, ignoring the underlying risk. In this work, a distributional RL algorithm called the implicit quantile network (IQN) is adapted for the energy management problem of a delivery vehicle. Instead of estimating only the expected long-term return, the algorithm implicitly estimates the full return distribution. This is highly beneficial for applications in ITS, as uncertainty and randomness are intrinsic characteristics of transportation systems. In addition, risk-aware strategies are integrated into the algorithm using conditional value at risk (CVaR) as the risk measure. In this study, we demonstrate that by changing a single hyperparameter, the trade-off between fuel efficiency and the risk of running out of battery power during a delivery trip can be controlled according to different application scenarios and personal preferences.
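To illustrate how a conditional value at risk (CVaR) criterion can change a decision once a return distribution is available, the following is a minimal sketch and not the implementation from the paper. It assumes that samples of the return are already available for each action; the action names, the sample values, and the functions `cvar_from_samples` and `select_action` are hypothetical. An implicit quantile network would produce such samples from a learned quantile function rather than from fixed lists.

```python
import numpy as np


def cvar_from_samples(returns, alpha):
    """Empirical CVaR at level alpha: mean of the worst alpha-fraction of returns.

    alpha = 1.0 reduces to the ordinary mean (risk-neutral);
    smaller alpha focuses on the lower tail (more risk-averse).
    """
    ordered = np.sort(np.asarray(returns, dtype=float))  # ascending: worst first
    k = max(1, int(np.ceil(alpha * ordered.size)))       # number of tail samples
    return float(ordered[:k].mean())


def select_action(return_samples, alpha):
    """Return the action with the highest CVaR score, plus all scores."""
    scores = {a: cvar_from_samples(s, alpha) for a, s in return_samples.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical return samples for two hypothetical actions.
# "use_battery" has a higher mean but a rare, very bad outcome
# (for example, running out of battery); "use_engine" is steady.
samples = {
    "use_battery": [10.0] * 9 + [-50.0],
    "use_engine": [3.0] * 10,
}

risk_neutral_action, _ = select_action(samples, alpha=1.0)
risk_averse_action, _ = select_action(samples, alpha=0.2)
print(risk_neutral_action, risk_averse_action)
```

Here `alpha` plays the role of the single hyperparameter that shifts the decision between the higher-mean action and the lower-risk action; the actual parameterization used in the paper may differ.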
|Original language||English (US)|
|Title of host publication||2020 IEEE 16th International Conference on Automation Science and Engineering, CASE 2020|
|Publisher||IEEE Computer Society|
|Number of pages||7|
|State||Published - Aug 2020|
|Event||16th IEEE International Conference on Automation Science and Engineering, CASE 2020 - Hong Kong, Hong Kong|
Duration: Aug 20 2020 → Aug 21 2020
|Name||IEEE International Conference on Automation Science and Engineering|
|Conference||16th IEEE International Conference on Automation Science and Engineering, CASE 2020|
|Period||8/20/20 → 8/21/20|
Bibliographical note
Funding Information:
The information, data, or work presented herein was funded in part by the Advanced Research Projects Agency-Energy (ARPA-E), U.S. Department of Energy, under Award Number DE-AR0000795. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.
© 2020 IEEE.