Learning Partially Observable Markov Decision Processes Using Coupled Canonical Polyadic Decomposition

Kejun Huang, Zhuoran Yang, Zhaoran Wang, Mingyi Hong

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We propose a new algorithm for learning the model parameters of a partially observable Markov decision process (POMDP) based on coupled canonical polyadic decomposition (CPD). Coupled CPD of a set of tensors extends CPD of individual tensors: it offers improved identifiability properties and admits an analogous simultaneous diagonalization (SD) algorithm that efficiently and uniquely recovers the latent factors. We explain how to form a set of three-way tensors from the trajectory of a POMDP under a stationary memoryless policy, so that coupled CPD can then be applied to recover the model parameters with identifiability and computational guarantees.
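
As a reading aid, here is a minimal sketch of the two-step pipeline the abstract describes: simulate a trajectory under a stationary memoryless policy, build one three-way observation tensor per action, and then decompose them. The toy POMDP, the (previous, current, next observation) indexing of the per-action tensors, and the use of tensorly's plain parafac as a stand-in for the paper's coupled CPD / simultaneous diagonalization step are assumptions made purely for illustration, not the authors' implementation.

```python
# Illustrative sketch only (not the authors' code). The toy POMDP, the
# (previous obs, current obs, next obs) indexing of the per-action tensors,
# and the use of tensorly's plain `parafac` in place of the paper's coupled
# CPD / simultaneous diagonalization step are all assumptions.
import numpy as np
from tensorly.decomposition import parafac

rng = np.random.default_rng(0)

# Hypothetical small POMDP: S hidden states, O observations, A actions.
S, O, A, R = 3, 4, 2, 3
trans = rng.dirichlet(np.ones(S), size=(A, S))   # trans[a, s, :] = P(s' | s, a)
emit = rng.dirichlet(np.ones(O), size=S)         # emit[s, :]    = P(o | s)
policy = np.full(A, 1.0 / A)                     # stationary memoryless policy

# Roll out one long trajectory under the fixed policy.
n_steps = 100_000
obs = np.empty(n_steps, dtype=int)
acts = np.empty(n_steps, dtype=int)
s = rng.integers(S)
for t in range(n_steps):
    obs[t] = rng.choice(O, p=emit[s])
    acts[t] = rng.choice(A, p=policy)
    s = rng.choice(S, p=trans[acts[t], s])

# Empirical three-way tensors: for each action a, accumulate counts of the
# triples (o_{t-1}, o_t, o_{t+1}) over steps with a_t = a, then normalize.
tensors = np.zeros((A, O, O, O))
for t in range(1, n_steps - 1):
    tensors[acts[t], obs[t - 1], obs[t], obs[t + 1]] += 1.0
tensors /= tensors.sum(axis=(1, 2, 3), keepdims=True)

# Stand-in decomposition: factor each tensor separately with plain CPD.
# Coupled CPD would instead share latent factors across the whole set of
# tensors, which is where the improved identifiability guarantees come from.
for a in range(A):
    cp = parafac(tensors[a], rank=R)
    print(f"action {a}: factor shapes", [f.shape for f in cp.factors])
```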

Original language: English (US)
Title of host publication: 2019 IEEE Data Science Workshop, DSW 2019 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 295-299
Number of pages: 5
ISBN (Electronic): 9781728107080
DOIs
State: Published - Jun 2019
Event: 2019 IEEE Data Science Workshop, DSW 2019 - Minneapolis, United States
Duration: Jun 2 2019 - Jun 5 2019

Publication series

Name: 2019 IEEE Data Science Workshop, DSW 2019 - Proceedings

Conference

Conference: 2019 IEEE Data Science Workshop, DSW 2019
Country/Territory: United States
City: Minneapolis
Period: 6/2/19 - 6/5/19

Bibliographical note

Publisher Copyright:
© 2019 IEEE.

Keywords

  • coupled CPD
  • partially observable Markov decision process
  • reinforcement learning
  • tensor decomposition
