Abstract
Weakly coupled Markov decision processes (MDPs) are stochastic dynamic programs where decisions in independent sub-MDPs are linked via constraints. Their exact solution is computationally intractable. Numerical experiments have shown that Lagrangian relaxation can be an effective approximation technique. This paper considers two classes of weakly coupled MDPs with imperfect information. In the first case, the transition probabilities for each sub-MDP are characterized by parameters whose values are unknown. This yields a Bayes-adaptive weakly coupled MDP. In the second case, the decision-maker cannot observe the actual state and instead receives a noisy signal. This yields a weakly coupled partially observable MDP. Computationally tractable approximate dynamic programming methods combining semi-stochastic certainty equivalent control or Thompson sampling with Lagrangian relaxation are proposed. These methods are applied to a class of stochastic dynamic resource allocation problems and to restless multi-armed bandit problems with partially observable states. Insights are drawn from numerical experiments.
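The Lagrangian relaxation mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes an infinite-horizon discounted problem, a single per-period budget constraint, known transition matrices, and two actions per sub-MDP (passive and active). All names (`solve_sub_mdp`, `lagrangian_bound`, `random_sub_mdp`) and all numerical values are hypothetical. Pricing the linking constraint with a multiplier `lam >= 0` decouples the problem, so each sub-MDP can be solved separately with reward `r(s, a) - lam * d(a)`. The sum of the sub-MDP values plus `lam * budget / (1 - gamma)` is then an upper bound on the optimal value.

```python
import numpy as np

def solve_sub_mdp(P, r, d, lam, gamma=0.9, tol=1e-10, max_iter=100_000):
    """Value iteration for one sub-MDP with Lagrangian-adjusted reward.

    P: (A, S, S) transition matrices, r: (S, A) rewards,
    d: (A,) resource consumption per action, lam: multiplier >= 0.
    """
    adjusted = r - lam * d[None, :]           # price the resource use
    V = np.zeros(r.shape[0])
    for _ in range(max_iter):
        Q = adjusted + gamma * (P @ V).T      # Q[s, a]
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
    return V

def lagrangian_bound(sub_mdps, start, lam, budget, gamma=0.9):
    """Upper bound on the coupled problem's value for a fixed multiplier."""
    total = lam * budget / (1.0 - gamma)      # value of the relaxed budget
    for (P, r, d), s0 in zip(sub_mdps, start):
        total += solve_sub_mdp(P, r, d, lam, gamma)[s0]
    return total

# Hypothetical instance: three sub-MDPs, at most one active per period.
rng = np.random.default_rng(0)

def random_sub_mdp(S=3):
    P = rng.dirichlet(np.ones(S), size=(2, S))   # shape (2, S, S)
    r = rng.uniform(0.0, 1.0, size=(S, 2))
    d = np.array([0.0, 1.0])                     # only "active" uses budget
    return P, r, d

sub_mdps = [random_sub_mdp() for _ in range(3)]
start = [0, 0, 0]

# Coarse grid search over the multiplier; the bound is convex in lam,
# so a finer one-dimensional search could be used instead.
grid = np.linspace(0.0, 5.0, 51)
bounds = [lagrangian_bound(sub_mdps, start, lam, budget=1.0) for lam in grid]
best = int(np.argmin(bounds))
print("best multiplier:", grid[best], "bound:", bounds[best])
```

In the Bayes-adaptive setting described in the abstract, the matrices `P` are not known. One plausible reading is that they would be replaced by a draw from the current posterior (Thompson sampling) before the relaxation is solved; that is an interpretation of the abstract, not a statement of the paper's algorithm.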
Original language | English (US) |
---|---|
Title of host publication | 2019 Winter Simulation Conference, WSC 2019 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 3609-3620 |
Number of pages | 12 |
ISBN (Electronic) | 9781728132839 |
State | Published - Dec 2019 |
Externally published | Yes |
Event | 2019 Winter Simulation Conference, WSC 2019 - National Harbor, United States; Duration: Dec 8 2019 → Dec 11 2019 |
Publication series
Name | Proceedings - Winter Simulation Conference |
---|---|
Volume | 2019-December |
ISSN (Print) | 0891-7736 |
Conference
Conference | 2019 Winter Simulation Conference, WSC 2019 |
---|---|
Country/Territory | United States |
City | National Harbor |
Period | 12/8/19 → 12/11/19 |
Bibliographical note
Publisher Copyright: © 2019 IEEE.