Recursive Constraints to Prevent Instability in Constrained Reinforcement Learning

Author:	Lee, Jaeyoung; Sedwards, Sean; Czarnecki, Krzysztof
Year of publication:	2022
Subject:	Computer Science - Machine Learning; Computer Science - Artificial Intelligence; 68T05; I.2.6
Document type:	Working Paper
Description:	We consider the challenge of finding a deterministic policy for a Markov decision process that uniformly (in all states) maximizes one reward subject to a probabilistic constraint over a different reward. Existing solutions do not fully address our precise problem definition, which nevertheless arises naturally in the context of safety-critical robotic systems. This class of problem is known to be hard, but the combined requirements of determinism and uniform optimality can create learning instability. In this work, after describing and motivating our problem with a simple example, we present a suitable constrained reinforcement learning algorithm that prevents learning instability, using recursive constraints. Our proposed approach admits an approximative form that improves efficiency and is conservative w.r.t. the constraint. Comment: Accepted at 1st Multi-Objective Decision Making Workshop (MODeM 2021). Cite as: Jaeyoung Lee, Sean Sedwards and Krzysztof Czarnecki. (2021). Recursive constraints to prevent instability in constrained reinforcement learning. In: Proc. 1st Multi-Objective Decision Making Workshop (MODeM 2021), Hayes, Mannion, Vamplew (eds). Online at http://modem2021.cs.nuigalway.ie
Database:	arXiv
External link:	http://arxiv.org/abs/2201.07958