Description: |
Policy search is a successful approach to reinforcement learning. However, policy improvements often result in the loss of information. Hence, it has been marred by premature convergence and implausible solutions. As first suggested in the context of covariant or natural policy gradients, many of these problems may be addressed by constraining the information loss. In this paper, we continue this line of reasoning and suggest two reinforcement learning methods, i.e., a model-based and a model-free algorithm, that bound the loss in relative entropy while maximizing their return. The resulting methods differ significantly from previous policy gradient approaches and yield an exact update step. They work well on typical reinforcement learning benchmark problems as well as on novel evaluations in robotics. We also show a Bayesian bound motivation for this new approach [8].
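  As a rough sketch of the idea described above (not the paper's exact formulation; the symbols q, epsilon, and R below are assumed notation), bounding the loss in relative entropy while maximizing return can be written as a KL-constrained program:
  \[
  \begin{aligned}
  \max_{\pi}\quad & \mathbb{E}_{(s,a)\sim\pi}\big[\mathcal{R}(s,a)\big] \\
  \text{s.t.}\quad & D_{\mathrm{KL}}\!\big(\pi(s,a)\,\|\,q(s,a)\big) \le \epsilon,
  \qquad \textstyle\sum_{s,a}\pi(s,a)=1,
  \end{aligned}
  \]
  where q denotes the previously observed state-action distribution and epsilon bounds the admissible information loss per update.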