Search results - "Van Hasselt, P."

Report

Normalization and effective learning rates in reinforcement learning

Authors: Lyle, Clare, Zheng, Zeyu, Khetarpal, Khimya, Martens, James, van Hasselt, Hado, Pascanu, Razvan, Dabney, Will

Normalization layers have recently experienced a renaissance in the deep reinforcement learning and continual learning literature, with several works highlighting diverse benefits such as improving loss landscape conditioning and combatting overestim

External link: http://arxiv.org/abs/2407.01800

Report

Disentangling the Causes of Plasticity Loss in Neural Networks

Authors: Lyle, Clare, Zheng, Zeyu, Khetarpal, Khimya, van Hasselt, Hado, Pascanu, Razvan, Martens, James, Dabney, Will

Underpinning the past decades of work on the design, initialization, and optimization of neural networks is a seemingly innocuous assumption: that the network is trained on a \textit{stationary} data distribution. In settings where this assumption is

External link: http://arxiv.org/abs/2402.18762

Report

A Survey of Temporal Credit Assignment in Deep Reinforcement Learning

Authors: Pignatelli, Eduardo, Ferret, Johan, Geist, Matthieu, Mesnard, Thomas, van Hasselt, Hado, Pietquin, Olivier, Toni, Laura

The Credit Assignment Problem (CAP) refers to the longstanding challenge of Reinforcement Learning (RL) agents to associate actions with their long-term consequences. Solving the CAP is a crucial step towards the successful deployment of RL in the re

External link: http://arxiv.org/abs/2312.01072

Academic article

QSPRpred: a Flexible Open-Source Quantitative Structure-Property Relationship Modelling Tool

Authors: Helle W. van den Maagdenberg, Martin Šícho, David Alencar Araripe, Sohvi Luukkonen, Linde Schoenmaker, Michiel Jespers, Olivier J. M. Béquignon, Marina Gorostiola González, Remco L. van den Broek, Andrius Bernatavicius, J. G. Coen van Hasselt, Piet H. van der Graaf, Gerard J. P. van Westen

Published in: Journal of Cheminformatics, Vol 16, Iss 1, Pp 1-16 (2024)

Building reliable and robust quantitative structure–property relationship (QSPR) models is a challenging task. First, the experimental data needs to be obtained, analyzed and curated. Second, the number of available methods is continuously

External link: https://doaj.org/article/9debc1b05a614b88bc3be65b20d3adda

Report

A Definition of Continual Reinforcement Learning

Authors: Abel, David, Barreto, André, Van Roy, Benjamin, Precup, Doina, van Hasselt, Hado, Singh, Satinder

In a standard view of the reinforcement learning problem, an agent's goal is to efficiently identify a policy that maximizes long-term reward. However, this perspective is based on a restricted view of learning as finding a solution, rather than trea

External link: http://arxiv.org/abs/2307.11046

Report

On the Convergence of Bounded Agents

Authors: Abel, David, Barreto, André, van Hasselt, Hado, Van Roy, Benjamin, Precup, Doina, Singh, Satinder

When has an agent converged? Standard models of the reinforcement learning problem give rise to a straightforward definition of convergence: An agent converges when its behavior or performance in each environment state stops changing. However, as we

External link: http://arxiv.org/abs/2307.11044

Report

Exploration via Epistemic Value Estimation

Authors: Schmitt, Simon, Shawe-Taylor, John, van Hasselt, Hado

How to efficiently explore in reinforcement learning is an open problem. Many exploration algorithms employ the epistemic uncertainty of their own value predictions -- for instance to compute an exploration bonus or upper confidence bound. Unfortunat

External link: http://arxiv.org/abs/2303.04012

Report

Learning How to Infer Partial MDPs for In-Context Adaptation and Exploration

Authors: Jiang, Chentian, Ke, Nan Rosemary, van Hasselt, Hado

To generalize across tasks, an agent should acquire knowledge from past tasks that facilitate adaptation and exploration in future tasks. We focus on the problem of in-context adaptation and exploration, where an agent only relies on context, i.e., h

External link: http://arxiv.org/abs/2302.04250

Report

Optimistic Meta-Gradients

Authors: Flennerhag, Sebastian, Zahavy, Tom, O'Donoghue, Brendan, van Hasselt, Hado, György, András, Singh, Satinder

We study the connection between gradient-based meta-learning and convex optimisation. We observe that gradient descent with momentum is a special case of meta-gradients, and building on recent results in optimisation, we prove convergence rates for

External link: http://arxiv.org/abs/2301.03236

Report

Human-level Atari 200x faster

Authors: Kapturowski, Steven, Campos, Víctor, Jiang, Ray, Rakićević, Nemanja, van Hasselt, Hado, Blundell, Charles, Badia, Adrià Puigdomènech

The task of building general agents that perform well over a wide range of tasks has been an important goal in reinforcement learning since its inception. The problem has been subject of research of a large body of work, with performance frequently m

External link: http://arxiv.org/abs/2209.07550
