Showing 1 - 10 of 16 for search: '"Klissarov, Martin"'
Author:
Klissarov, Martin, Henaff, Mikael, Raileanu, Roberta, Sodhani, Shagun, Vincent, Pascal, Zhang, Amy, Bacon, Pierre-Luc, Precup, Doina, Machado, Marlos C., D'Oro, Pierluca
Describing skills in natural language has the potential to provide an accessible way to inject human knowledge about decision-making into an AI system. We present MaestroMotif, a method for AI-assisted skill design, which yields high-performing and a
External link:
http://arxiv.org/abs/2412.08542
Large pretrained models are showing increasingly better performance in reasoning and planning tasks across different modalities, opening the possibility to leverage them for complex sequential decision making problems. In this paper, we investigate t
External link:
http://arxiv.org/abs/2410.05656
Pre-trained Vision-Language Models (VLMs) are able to understand visual concepts, describe and decompose complex tasks into sub-tasks, and provide feedback on task completion. In this paper, we aim to leverage these capabilities to support the traini
External link:
http://arxiv.org/abs/2402.04764
Author:
Klissarov, Martin, D'Oro, Pierluca, Sodhani, Shagun, Raileanu, Roberta, Bacon, Pierre-Luc, Vincent, Pascal, Zhang, Amy, Henaff, Mikael
Exploring rich environments and evaluating one's actions without prior knowledge is immensely challenging. In this paper, we propose Motif, a general method to interface such prior knowledge from a Large Language Model (LLM) with an agent. Motif is b
External link:
http://arxiv.org/abs/2310.00166
Author:
Klissarov, Martin, Machado, Marlos C.
Selecting exploratory actions that generate a rich stream of experience for better learning is a fundamental challenge in reinforcement learning (RL). An approach to tackle this problem consists in selecting actions according to specific policies for
External link:
http://arxiv.org/abs/2301.11181
Author:
Klissarov, Martin, Precup, Doina
Temporal abstraction in reinforcement learning (RL) offers the promise of improving generalization and knowledge transfer in complex environments, by propagating information more efficiently over time. Although option learning was initially formulat
External link:
http://arxiv.org/abs/2112.03097
Author:
Klissarov, Martin, Precup, Doina
Potential-based reward shaping provides an approach for designing good reward functions, with the purpose of speeding up learning. However, automatically finding potential functions for complex environments is a difficult problem (in fact, of the sam
External link:
http://arxiv.org/abs/2010.02474
Author:
Khetarpal, Khimya, Klissarov, Martin, Chevalier-Boisvert, Maxime, Bacon, Pierre-Luc, Precup, Doina
Temporal abstraction refers to the ability of an agent to use behaviours of controllers which act for a limited, variable amount of time. The options framework describes such behaviours as consisting of a subset of states in which they can initiate,
External link:
http://arxiv.org/abs/2001.00271
We present new results on learning temporally extended actions for continuous tasks, using the options framework (Sutton et al. [1999b], Precup [2000]). In order to achieve this goal we work with the option-critic architecture (Bacon et al. [2017]) using a
External link:
http://arxiv.org/abs/1712.00004