Výsledky vyhledávání - "Ostrovski, Georg"

Report

Deep Reinforcement Learning with Plasticity Injection

Autor: Nikishin, Evgenii, Oh, Junhyuk, Ostrovski, Georg, Lyle, Clare, Pascanu, Razvan, Dabney, Will, Barreto, André

A growing body of evidence suggests that neural networks employed in deep reinforcement learning (RL) gradually lose their plasticity, the ability to learn from new data; however, the analysis and mitigation of this phenomenon is hampered by the comp

Externí odkaz: http://arxiv.org/abs/2305.15555

Zobrazit plný text záznamu

Report

An Analysis of Quantile Temporal-Difference Learning

Autor: Rowland, Mark, Munos, Rémi, Azar, Mohammad Gheshlaghi, Tang, Yunhao, Ostrovski, Georg, Harutyunyan, Anna, Tuyls, Karl, Bellemare, Marc G., Dabney, Will

We analyse quantile temporal-difference learning (QTD), a distributional reinforcement learning algorithm that has proven to be a key component in several successful large-scale applications of reinforcement learning. Despite these empirical successe

Externí odkaz: http://arxiv.org/abs/2301.04462

Zobrazit plný text záznamu

Report

An Empirical Study of Implicit Regularization in Deep Offline RL

Autor: Gulcehre, Caglar, Srinivasan, Srivatsan, Sygnowski, Jakub, Ostrovski, Georg, Farajtabar, Mehrdad, Hoffman, Matt, Pascanu, Razvan, Doucet, Arnaud

Deep neural networks are the most commonly used function approximators in offline reinforcement learning. Prior works have shown that neural nets trained with TD-learning and gradient descent can exhibit implicit regularization that can be characteri

Externí odkaz: http://arxiv.org/abs/2207.02099

Zobrazit plný text záznamu

Report

The Phenomenon of Policy Churn

Autor: Schaul, Tom, Barreto, André, Quan, John, Ostrovski, Georg

We identify and study the phenomenon of policy churn, that is, the rapid change of the greedy policy in value-based reinforcement learning. Policy churn operates at a surprisingly rapid pace, changing the greedy action in a large fraction of states w

Externí odkaz: http://arxiv.org/abs/2206.00730

Zobrazit plný text záznamu

Report

The Difficulty of Passive Learning in Deep Reinforcement Learning

Autor: Ostrovski, Georg, Castro, Pablo Samuel, Dabney, Will

Learning to act from observational data without active environmental interaction is a well-known challenge in Reinforcement Learning (RL). Recent approaches involve constraints on the learned policy or conservative updates, preventing strong deviatio

Externí odkaz: http://arxiv.org/abs/2110.14020

Zobrazit plný text záznamu

Report

When should agents explore?

Autor: Pîslar, Miruna, Szepesvari, David, Ostrovski, Georg, Borsa, Diana, Schaul, Tom

Exploration remains a central challenge for reinforcement learning (RL). Virtually all existing methods share the feature of a monolithic behaviour policy that changes only gradually (at best). In contrast, the exploratory behaviours of animals and h

Externí odkaz: http://arxiv.org/abs/2108.11811

Zobrazit plný text záznamu

Report

Return-based Scaling: Yet Another Normalisation Trick for Deep RL

Autor: Schaul, Tom, Ostrovski, Georg, Kemaev, Iurii, Borsa, Diana

Scaling issues are mundane yet irritating for practitioners of reinforcement learning. Error scales vary across domains, tasks, and stages of learning; sometimes by many orders of magnitude. This can be detrimental to learning speed and stability, cr

Externí odkaz: http://arxiv.org/abs/2105.05347

Zobrazit plný text záznamu

Report

On The Effect of Auxiliary Tasks on Representation Dynamics

Autor: Lyle, Clare, Rowland, Mark, Ostrovski, Georg, Dabney, Will

While auxiliary tasks play a key role in shaping the representations learnt by reinforcement learning agents, much is still unknown about the mechanisms through which this is achieved. This work develops our understanding of the relationship between

Externí odkaz: http://arxiv.org/abs/2102.13089

Zobrazit plný text záznamu

Report

Temporally-Extended {\epsilon}-Greedy Exploration

Autor: Dabney, Will, Ostrovski, Georg, Barreto, André

Recent work on exploration in reinforcement learning (RL) has led to a series of increasingly complex solutions to the problem. This increase in complexity often comes at the expense of generality. Recent empirical studies suggest that, when applied

Externí odkaz: http://arxiv.org/abs/2006.01782

Zobrazit plný text záznamu

Report

Adapting Behaviour for Learning Progress

Autor: Schaul, Tom, Borsa, Diana, Ding, David, Szepesvari, David, Ostrovski, Georg, Dabney, Will, Osindero, Simon

Determining what experience to generate to best facilitate learning (i.e. exploration) is one of the distinguishing features and open challenges in reinforcement learning. The advent of distributed agents that interact with parallel instances of the

Externí odkaz: http://arxiv.org/abs/1912.06910

Zobrazit plný text záznamu

Vyhledávací nástroje:

Upřesnit hledání