Výsledky vyhledávání - "Harutyunyan, Anna"

Report

Autor: Abel, David, Ho, Mark K., Harutyunyan, Anna

Modern reinforcement learning has been conditioned by at least three dogmas. The first is the environment spotlight, which refers to our tendency to focus on modeling environments rather than agents. The second is our treatment of learning as finding

Externí odkaz: http://arxiv.org/abs/2407.10583

Zobrazit plný text záznamu

Report

Bootstrapped Representations in Reinforcement Learning

Autor: Lan, Charline Le, Tu, Stephen, Rowland, Mark, Harutyunyan, Anna, Agarwal, Rishabh, Bellemare, Marc G., Dabney, Will

In reinforcement learning (RL), state representations are key to dealing with large or continuous state spaces. While one of the promises of deep learning algorithms is to automatically construct features well-tuned for the task they try to solve, su

Externí odkaz: http://arxiv.org/abs/2306.10171

Zobrazit plný text záznamu

Report

DoMo-AC: Doubly Multi-step Off-policy Actor-Critic Algorithm

Autor: Tang, Yunhao, Kozuno, Tadashi, Rowland, Mark, Harutyunyan, Anna, Munos, Rémi, Pires, Bernardo Ávila, Valko, Michal

Multi-step learning applies lookahead over multiple time steps and has proved valuable in policy evaluation settings. However, in the optimal control case, the impact of multi-step learning has been relatively limited despite a number of prior effort

Externí odkaz: http://arxiv.org/abs/2305.18501

Zobrazit plný text záznamu

Report

An Analysis of Quantile Temporal-Difference Learning

Autor: Rowland, Mark, Munos, Rémi, Azar, Mohammad Gheshlaghi, Tang, Yunhao, Ostrovski, Georg, Harutyunyan, Anna, Tuyls, Karl, Bellemare, Marc G., Dabney, Will

We analyse quantile temporal-difference learning (QTD), a distributional reinforcement learning algorithm that has proven to be a key component in several successful large-scale applications of reinforcement learning. Despite these empirical successe

Externí odkaz: http://arxiv.org/abs/2301.04462

Zobrazit plný text záznamu

Report

On the Expressivity of Markov Reward

Autor: Abel, David, Dabney, Will, Harutyunyan, Anna, Ho, Mark K., Littman, Michael L., Precup, Doina, Singh, Satinder

Reward is the driving force for reinforcement-learning agents. This paper is dedicated to understanding the expressivity of reward as a way to capture tasks that we would want an agent to perform. We frame this study around three new abstract notions

Externí odkaz: http://arxiv.org/abs/2111.00876

Zobrazit plný text záznamu

Report

Counterfactual Credit Assignment in Model-Free Reinforcement Learning

Autor: Mesnard, Thomas, Weber, Théophane, Viola, Fabio, Thakoor, Shantanu, Saade, Alaa, Harutyunyan, Anna, Dabney, Will, Stepleton, Tom, Heess, Nicolas, Guez, Arthur, Moulines, Éric, Hutter, Marcus, Buesing, Lars, Munos, Rémi

Credit assignment in reinforcement learning is the problem of measuring an action's influence on future rewards. In particular, this requires separating skill from luck, i.e. disentangling the effect of an action on rewards from that of external fact

Externí odkaz: http://arxiv.org/abs/2011.09464

Zobrazit plný text záznamu

Report

Useful Policy Invariant Shaping from Arbitrary Advice

Autor: Behboudian, Paniz, Satsangi, Yash, Taylor, Matthew E., Harutyunyan, Anna, Bowling, Michael

Reinforcement learning is a powerful learning paradigm in which agents can learn to maximize sparse and delayed reward signals. Although RL has had many impressive successes in complex domains, learning can take hours, days, or even years of training

Externí odkaz: http://arxiv.org/abs/2011.01297

Zobrazit plný text záznamu

Report

Hindsight Credit Assignment

Autor: Harutyunyan, Anna, Dabney, Will, Mesnard, Thomas, Azar, Mohammad, Piot, Bilal, Heess, Nicolas, van Hasselt, Hado, Wayne, Greg, Singh, Satinder, Precup, Doina, Munos, Remi

We consider the problem of efficient credit assignment in reinforcement learning. In order to efficiently and meaningfully utilize new data, we propose to explicitly assign credit to past decisions based on the likelihood of them having led to the ob

Externí odkaz: http://arxiv.org/abs/1912.02503

Zobrazit plný text záznamu

Report

Conditional Importance Sampling for Off-Policy Learning

Autor: Rowland, Mark, Harutyunyan, Anna, van Hasselt, Hado, Borsa, Diana, Schaul, Tom, Munos, Rémi, Dabney, Will

The principal contribution of this paper is a conceptual framework for off-policy reinforcement learning, based on conditional expectations of importance sampling ratios. This framework yields new perspectives and understanding of existing off-policy

Externí odkaz: http://arxiv.org/abs/1910.07479

Zobrazit plný text záznamu

Report

The Termination Critic

Autor: Harutyunyan, Anna, Dabney, Will, Borsa, Diana, Heess, Nicolas, Munos, Remi, Precup, Doina

In this work, we consider the problem of autonomously discovering behavioral abstractions, or options, for reinforcement learning agents. We propose an algorithm that focuses on the termination condition, as opposed to -- as is common -- the policy.

Externí odkaz: http://arxiv.org/abs/1902.09996

Zobrazit plný text záznamu

Vyhledávací nástroje:

Upřesnit hledání