Bisimulation for Markov decision processes through families of functional expressions

Norm Ferns, Doina Precup, Sophia Knight

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Scopus citations

Abstract

We transfer a notion of quantitative bisimilarity for labelled Markov processes [1] to Markov decision processes with continuous state spaces. This notion takes the form of a pseudometric on the system states, cast in terms of the equivalence of a family of functional expressions evaluated on those states and interpreted as a real-valued modal logic. Our proof is a slight modification of previous techniques [2,3], which were used to prove equivalence with a fixed-point pseudometric on the state space of a labelled Markov process and which make heavy use of the Kantorovich probability metric. Indeed, we again demonstrate equivalence with a fixed-point pseudometric defined on Markov decision processes [4]; what is novel is that we recast this proof in terms of integral probability metrics [5] defined through the family of functional expressions, shifting emphasis back to properties of such families. The hope is that a judicious choice of family might lead to something more computationally tractable than bisimilarity whilst maintaining its pleasing theoretical guarantees. Moreover, we use a trick from descriptive set theory to extend our results to MDPs with bounded measurable reward functions, dropping a previous continuity constraint on rewards and Markov kernels.
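
As a rough illustration of the integral probability metrics mentioned in the abstract, the sketch below takes the largest difference in expectation between two distributions over a family of functions. The finite state space, the finite family, and the names `expectation` and `ipm_distance` are assumptions made for illustration only; they are not taken from the paper, which works with continuous state spaces and families of functional expressions.

```python
from typing import Callable, Dict, Hashable, Iterable

State = Hashable
Dist = Dict[State, float]  # maps each state to its probability


def expectation(f: Callable[[State], float], dist: Dist) -> float:
    """Expected value of f under a finitely supported distribution."""
    return sum(prob * f(state) for state, prob in dist.items())


def ipm_distance(family: Iterable[Callable[[State], float]], p: Dist, q: Dist) -> float:
    """Largest gap |E_p[f] - E_q[f]| over the (finite, hypothetical) family.

    An empty family yields 0.0, since no function can separate p and q.
    """
    return max(
        (abs(expectation(f, p) - expectation(f, q)) for f in family),
        default=0.0,
    )


# Hypothetical two-state example: indicator functions plus a constant.
p = {"s0": 0.5, "s1": 0.5}
q = {"s0": 0.25, "s1": 0.75}
family = [
    lambda s: 1.0 if s == "s0" else 0.0,
    lambda s: 1.0 if s == "s1" else 0.0,
    lambda s: 0.5,  # constants never separate two probability distributions
]
distance = ipm_distance(family, p, q)
```

In this sketch the choice of family determines how fine the resulting distance is: a richer family can only increase it, which is one way to read the abstract's remark about choosing the family judiciously.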

Original language: English (US)
Title of host publication: Horizons of the Mind
Subtitle of host publication: A Tribute to Prakash Panangaden - Essays Dedicated to Prakash Panangaden on the Occasion of His 60th Birthday
Publisher: Springer-Verlag
Pages: 319-342
Number of pages: 24
ISBN (Print): 9783319068794
DOIs: https://doi.org/10.1007/978-3-319-06880-0_17
State: Published - Jan 1 2014
Externally published: Yes
Event: PrakashFest Conference - Oxford, United Kingdom
Duration: May 19 2014 to May 22 2014

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8464 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: PrakashFest Conference
Country: United Kingdom
City: Oxford
Period: 5/19/14 to 5/22/14

  • Cite this

    Ferns, N., Precup, D., & Knight, S. (2014). Bisimulation for Markov decision processes through families of functional expressions. In Horizons of the Mind: A Tribute to Prakash Panangaden - Essays Dedicated to Prakash Panangaden on the Occasion of His 60th Birthday (pp. 319-342). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 8464 LNCS). Springer-Verlag. https://doi.org/10.1007/978-3-319-06880-0_17