Showing 1 - 10 of 24 for search: '"Raichuk, Anton"'
Author:
Pazos-Outón, Luis Miguel, Vasconcelos, Cristina Nader, Raichuk, Anton, Arnab, Anurag, Morris, Dan, Neumann, Maxim
Protecting and restoring forest ecosystems is critical for biodiversity conservation and carbon sequestration. Forest monitoring on a global scale is essential for prioritizing and assessing conservation efforts. Satellite-based remote sensing is the…
External link:
http://arxiv.org/abs/2406.18554
Author:
Cideron, Geoffrey, Girgin, Sertan, Raichuk, Anton, Pietquin, Olivier, Bachem, Olivier, Hussenot, Léonard
We investigate models that can generate arbitrary natural language text (e.g. all English sentences) from a bounded, convex and well-behaved control space. We call them universal vec2text models. Such models would allow making semantic decisions in t…
External link:
http://arxiv.org/abs/2209.06792
Author:
Dadashi, Robert, Hussenot, Léonard, Vincent, Damien, Girgin, Sertan, Raichuk, Anton, Geist, Matthieu, Pietquin, Olivier
In this paper, we propose a novel Reinforcement Learning (RL) framework for problems with continuous action spaces: Action Quantization from Demonstrations (AQuaDem). The proposed approach consists in learning a discretization of continuous action sp…
External link:
http://arxiv.org/abs/2110.10149
Author:
Gu, Shixiang Shane, Diaz, Manfred, Freeman, Daniel C., Furuta, Hiroki, Ghasemipour, Seyed Kamyar Seyed, Raichuk, Anton, David, Byron, Frey, Erik, Coumans, Erwin, Bachem, Olivier
The goal of continuous control is to synthesize desired behaviors. In reinforcement learning (RL)-driven approaches, this is often accomplished through careful task reward engineering for efficient exploration and running an off-the-shelf RL algorith…
External link:
http://arxiv.org/abs/2110.04686
The $Q$-function is a central quantity in many Reinforcement Learning (RL) algorithms for which RL agents behave following a (soft)-greedy policy w.r.t. $Q$. It is a powerful tool that allows action selection without a model of the environment and…
External link:
http://arxiv.org/abs/2108.07041
Author:
Freeman, C. Daniel, Frey, Erik, Raichuk, Anton, Girgin, Sertan, Mordatch, Igor, Bachem, Olivier
We present Brax, an open source library for rigid body simulation with a focus on performance and parallelism on accelerators, written in JAX. We present results on a suite of tasks inspired by the existing reinforcement learning literature, but rema…
External link:
http://arxiv.org/abs/2106.13281
Author:
Orsini, Manu, Raichuk, Anton, Hussenot, Léonard, Vincent, Damien, Dadashi, Robert, Girgin, Sertan, Geist, Matthieu, Bachem, Olivier, Pietquin, Olivier, Andrychowicz, Marcin
Adversarial imitation learning has become a popular framework for imitation in continuous control. Over the years, several variations of its components were proposed to enhance the performance of the learned policies as well as the sample complexity…
External link:
http://arxiv.org/abs/2106.00672
Author:
Hussenot, Léonard, Andrychowicz, Marcin, Vincent, Damien, Dadashi, Robert, Raichuk, Anton, Stafiniak, Lukasz, Girgin, Sertan, Marinier, Raphael, Momchev, Nikola, Ramos, Sabela, Orsini, Manu, Bachem, Olivier, Geist, Matthieu, Pietquin, Olivier
We address the issue of tuning hyperparameters (HPs) for imitation learning algorithms in the context of continuous control, when the underlying reward function of the demonstrating expert cannot be observed at any time. The vast literature in imitat…
External link:
http://arxiv.org/abs/2105.12034
Object-centric representations have recently enabled significant progress in tackling relational reasoning tasks. By building a strong object-centric inductive bias into neural architectures, recent efforts have improved generalization and data effic…
External link:
http://arxiv.org/abs/2104.09402
Author:
Andrychowicz, Marcin, Raichuk, Anton, Stańczyk, Piotr, Orsini, Manu, Girgin, Sertan, Marinier, Raphael, Hussenot, Léonard, Geist, Matthieu, Pietquin, Olivier, Michalski, Marcin, Gelly, Sylvain, Bachem, Olivier
In recent years, on-policy reinforcement learning (RL) has been successfully applied to many different continuous control tasks. While RL algorithms are often conceptually simple, their state-of-the-art implementations take numerous low- and high-lev…
External link:
http://arxiv.org/abs/2006.05990