Weakly Coupled Markov Decision Processes with Imperfect Information

Mahshid Salemi Parizi, Archis Ghate

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Weakly coupled Markov decision processes (MDPs) are stochastic dynamic programs where decisions in independent sub-MDPs are linked via constraints. Their exact solution is computationally intractable. Numerical experiments have shown that Lagrangian relaxation can be an effective approximation technique. This paper considers two classes of weakly coupled MDPs with imperfect information. In the first case, the transition probabilities for each sub-MDP are characterized by parameters whose values are unknown. This yields a Bayes-adaptive weakly coupled MDP. In the second case, the decision-maker cannot observe the actual state and instead receives a noisy signal. This yields a weakly coupled partially observable MDP. Computationally tractable approximate dynamic programming methods combining semi-stochastic certainty equivalent control or Thompson sampling with Lagrangian relaxation are proposed. These methods are applied to a class of stochastic dynamic resource allocation problems and to restless multi-armed bandit problems with partially observable states. Insights are drawn from numerical experiments.
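To make the Lagrangian relaxation idea mentioned in the abstract concrete, the following is a minimal sketch for the perfect-information, infinite-horizon discounted case only. It is not the authors' algorithm and does not cover the Bayes-adaptive or partially observable settings, certainty equivalent control, or Thompson sampling. It assumes a single linking constraint of the form "total resource use per period is at most `budget`". All names (`solve_sub_mdp`, `lagrangian_bound`, `P`, `r`, `d`, `lam`) and the toy instance are hypothetical. Pricing the constraint with a multiplier `lam >= 0` decouples the problem, so each sub-MDP can be solved on its own with rewards `r - lam * d`, and the sum of the sub-MDP values plus `lam * budget / (1 - gamma)` is an upper bound on the optimal value.

```python
def solve_sub_mdp(P, r, d, lam, gamma, tol=1e-10):
    """Value iteration for one sub-MDP with Lagrangian-adjusted rewards.

    P[a][s][t]: transition probability from state s to t under action a
    r[s][a]:    reward for action a in state s
    d[a]:       resource consumed by action a (enters the linking constraint)
    """
    n_states, n_actions = len(r), len(r[0])
    V = [0.0] * n_states
    while True:
        V_new = [
            max(
                r[s][a] - lam * d[a]
                + gamma * sum(P[a][s][t] * V[t] for t in range(n_states))
                for a in range(n_actions)
            )
            for s in range(n_states)
        ]
        if max(abs(x - y) for x, y in zip(V_new, V)) < tol:
            return V_new
        V = V_new


def lagrangian_bound(sub_mdps, states, budget, lam, gamma):
    """Upper bound on the coupled problem's value for a multiplier lam >= 0."""
    total = lam * budget / (1.0 - gamma)
    for (P, r, d), s in zip(sub_mdps, states):
        total += solve_sub_mdp(P, r, d, lam, gamma)[s]
    return total


# Hypothetical toy instance: two identical one-state sub-MDPs.
# Action 0 is passive (reward 0, uses 0 units); action 1 is active
# (reward 1, uses 1 unit). At most 1 unit may be used per period.
P = [[[1.0]], [[1.0]]]
r = [[0.0, 1.0]]
d = [0.0, 1.0]
toy = [(P, r, d), (P, r, d)]
bounds = {lam: lagrangian_bound(toy, [0, 0], 1.0, lam, 0.5)
          for lam in (0.0, 1.0, 2.0)}
```

In this toy instance the bound works out to 2λ + 4·max(0, 1 − λ), which is smallest at λ = 1, where it equals 2. That matches the value of activating exactly one sub-MDP in every period, so the relaxation happens to be tight here; in general it only gives an upper bound, and the multiplier would be chosen by a one-dimensional search.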

Original language: English (US)
Title of host publication: 2019 Winter Simulation Conference, WSC 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 3609-3620
Number of pages: 12
ISBN (Electronic): 9781728132839
State: Published - Dec 2019
Externally published: Yes
Event: 2019 Winter Simulation Conference, WSC 2019 - National Harbor, United States
Duration: Dec 8 2019 to Dec 11 2019

Publication series

Name: Proceedings - Winter Simulation Conference
Volume: 2019-December
ISSN (Print): 0891-7736

Conference

Conference: 2019 Winter Simulation Conference, WSC 2019
Country/Territory: United States
City: National Harbor
Period: 12/8/19 to 12/11/19

Bibliographical note

Publisher Copyright:
© 2019 IEEE.
