Showing 1 - 10 of 819 for search: '"Romoff, P."'
Exploration bonuses in reinforcement learning guide long-horizon exploration by defining custom intrinsic objectives. Several exploration objectives like count-based bonuses, pseudo-counts, and state-entropy maximization are non-stationary and hence
External link:
http://arxiv.org/abs/2310.18144
Deep reinforcement learning (DRL) techniques have become increasingly used in various fields for decision-making processes. However, a challenge that often arises is the trade-off between both the computational efficiency of the decision-making proce
External link:
http://arxiv.org/abs/2308.09629
Author:
Dmitry M. Tsyganov
Published in:
Литературный факт, Vol 3, Iss 33, Pp 416-436 (2024)
The article focuses on the translation and publication of L.-F. Céline’s debut novel Voyage au bout de la nuit (1932) as an example from the history of Soviet cultural diplomacy in the 1930s. So far, the specialized literature has not given unambi
External link:
https://doaj.org/article/628b6387727149528a72d7fd846ab1b5
Author:
Beeching, Edward, Peter, Maxim, Marcotte, Philippe, Debangoye, Jilles, Simonin, Olivier, Romoff, Joshua, Wolf, Christian
We address planning and navigation in challenging 3D video games featuring maps with disconnected regions reachable by agents using special actions. In this setting, classical symbolic planners are not applicable or difficult to adapt. We introduce a
External link:
http://arxiv.org/abs/2112.11731
The standard formulation of Reinforcement Learning lacks a practical way of specifying what are admissible and forbidden behaviors. Most often, practitioners go about the task of behavior specification by manually engineering the reward function, a c
External link:
http://arxiv.org/abs/2112.12228
In video games, non-player characters (NPCs) are used to enhance the players' experience in a variety of ways, e.g., as enemies, allies, or innocent bystanders. A crucial component of NPCs is navigation, which allows them to move from one point to an
External link:
http://arxiv.org/abs/2011.04764
Author:
Romoff, Joshua, Henderson, Peter, Kanaa, David, Bengio, Emmanuel, Touati, Ahmed, Bacon, Pierre-Luc, Pineau, Joelle
We investigate whether Jacobi preconditioning, accounting for the bootstrap term in temporal difference (TD) learning, can help boost performance of adaptive optimizers. Our method, TDprop, computes a per parameter learning rate based on the diagonal
External link:
http://arxiv.org/abs/2007.02786
Accurate reporting of energy and carbon usage is essential for understanding the potential climate impacts of machine learning research. We introduce a framework that makes this easier by providing a simple interface for tracking realtime energy cons
External link:
http://arxiv.org/abs/2002.05651
Published in:
Advances in Neural Information Processing Systems (2019) 13299-13309
Multi-simulator training has contributed to the recent success of Deep Reinforcement Learning by stabilizing learning and allowing for higher training throughputs. We propose Gossip-based Actor-Learner Architectures (GALA) where several actor-learner
External link:
http://arxiv.org/abs/1906.04585