Abstract
This study presents a robust deep reinforcement learning (RL) approach for real-time, network-wide holding control with transfer synchronization that accounts for stochastic passenger demand and vehicle running times during daily operations. The problem is formulated within a multi-agent RL framework in which each active trip is treated as an agent that interacts not only with the environment but also with the other agents in the transit network. A dedicated learning procedure is developed to learn robust policies by introducing max-min optimization into the learning objective. The agents are trained via the deep deterministic policy gradient (DDPG) algorithm using an extended actor-critic framework with a joint action approximator. The effectiveness of the proposed approach is evaluated in a simulator calibrated using data collected from a transit network in the Twin Cities, Minnesota, USA. The learned policy is compared with no control, rule-based control, and rolling horizon optimization control (RHOC). Computational results suggest that the RL approach reduces online computation time by about 50% compared with RHOC. In terms of policy performance, under the deterministic scenario, the average waiting time of the RL approach is 1.3% higher than the theoretical lower bound of average waiting time; under stochastic scenarios, the RL approach reduces average waiting time by as much as 18% compared with RHOC, and its performance relative to RHOC improves as the level of system uncertainty increases. Evaluation under a disrupted environment also suggests that the proposed RL method is more robust against short-term uncertainties. The promising results in terms of both online computational efficiency and solution effectiveness suggest that the proposed RL method is a valid candidate for real-time transit control when the dynamics cannot be modeled perfectly in the presence of system uncertainties, as is the case for the network-wide transfer synchronization problem.
Original language | English (US) |
---|---|
Pages (from-to) | 23993-24007 |
Number of pages | 15 |
Journal | IEEE Transactions on Intelligent Transportation Systems |
Volume | 23 |
Issue number | 12 |
DOIs | |
State | Published - Dec 1 2022 |
Bibliographical note
Publisher Copyright: © 2000-2011 IEEE.
Keywords
- Transit holding control
- actor-critic learning architecture
- deep deterministic policy gradient
- network-wide transfer synchronization
- robust multi-agent reinforcement learning