Showing 1 - 10 of 40 for search: '"van Seijen, Harm"'
Inspired by human conscious planning, we propose Skipper, a model-based reinforcement learning framework utilizing spatio-temporal abstractions to generalize better in novel situations. It automatically decomposes the given task into smaller, more ma
External link:
http://arxiv.org/abs/2310.00229
Author:
Rahimi-Kalahroudi, Ali, Rajendran, Janarthanan, Momennejad, Ida, van Seijen, Harm, Chandar, Sarath
One of the key behavioral characteristics used in neuroscience to determine whether the subject of study -- be it a rodent or a human -- exhibits model-based learning is effective adaptation to local changes in the environment, a particular form of a
External link:
http://arxiv.org/abs/2303.08690
Author:
Islam, Riashat, Tomar, Manan, Lamb, Alex, Efroni, Yonathan, Zang, Hongyu, Didolkar, Aniket, Misra, Dipendra, Li, Xin, van Seijen, Harm, Combes, Remi Tachet des, Langford, John
Learning to control an agent from data collected offline in a rich pixel-based visual observation space is vital for real-world applications of reinforcement learning (RL). A major challenge in this setting is the presence of input information that i
External link:
http://arxiv.org/abs/2211.00164
Humans commonly solve complex problems by decomposing them into easier subproblems and then combining the subproblem solutions. This type of compositional reasoning permits reuse of the subproblem solutions when tackling future tasks that share part
External link:
http://arxiv.org/abs/2207.00429
Author:
Wan, Yi, Rahimi-Kalahroudi, Ali, Rajendran, Janarthanan, Momennejad, Ida, Chandar, Sarath, van Seijen, Harm
In recent years, a growing number of deep model-based reinforcement learning (RL) methods have been introduced. The interest in deep model-based RL is not surprising, given its many potential benefits, such as higher sample efficiency and the potenti
External link:
http://arxiv.org/abs/2204.11464
Author:
Weir, Nathaniel, Yuan, Xingdi, Côté, Marc-Alexandre, Hausknecht, Matthew, Laroche, Romain, Momennejad, Ida, Van Seijen, Harm, Van Durme, Benjamin
Humans have the capability, aided by the expressive compositionality of their language, to learn quickly by demonstration. They are able to describe unseen task-performing procedures and generalize their execution to other contexts. In this work, we
External link:
http://arxiv.org/abs/2203.04806
We propose the k-Shortest-Path (k-SP) constraint: a novel constraint on the agent's trajectory that improves the sample efficiency in sparse-reward MDPs. We show that any optimal policy necessarily satisfies the k-SP constraint. Notably, the k-SP con
External link:
http://arxiv.org/abs/2107.06405
Author:
Zhang, Shangtong, Laroche, Romain, van Seijen, Harm, Whiteson, Shimon, Combes, Remi Tachet des
We investigate the discounting mismatch in actor-critic algorithm implementations from a representation learning perspective. Theoretically, actor-critic algorithms usually have discounting for both actor and critic, i.e., there is a $\gamma^t$ term
External link:
http://arxiv.org/abs/2010.01069
Deep model-based Reinforcement Learning (RL) has the potential to substantially improve the sample-efficiency of deep RL. While various challenges have long held it back, a number of papers have recently come out reporting success with deep model-bas
External link:
http://arxiv.org/abs/2007.03158
In an effort to better understand the different ways in which the discount factor affects the optimization process in reinforcement learning, we designed a set of experiments to study each effect in isolation. Our analysis reveals that the common per
External link:
http://arxiv.org/abs/1906.00572