Showing 1 - 10 of 288 for query: '"ABEL, DAVID"'
Black swan events are statistically rare occurrences that carry extremely high risks. The typical view defines black swan events as originating from unpredictable, time-varying environments; however, the community lacks a comprehensive…
External link:
http://arxiv.org/abs/2407.18422
Modern reinforcement learning has been conditioned by at least three dogmas. The first is the environment spotlight, which refers to our tendency to focus on modeling environments rather than agents. The second is our treatment of learning as finding…
External link:
http://arxiv.org/abs/2407.10583
Humans use social context to specify preferences over behaviors, i.e., their reward functions. Yet algorithms for inferring reward models from preference data do not take this social learning view into account. Inspired by pragmatic human communication…
External link:
http://arxiv.org/abs/2405.14769
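For context on the kind of algorithm this abstract refers to, here is a minimal sketch of standard preference-based reward learning under the Bradley-Terry model, the common baseline for inferring a reward model from pairwise preference data. The feature dimensions, data, and step size are illustrative assumptions; this is not the paper's socially-aware method.

```python
# Minimal sketch of Bradley-Terry preference-based reward learning.
# All data and constants below are toy assumptions for illustration.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 5 trajectories with 3 features each, and a
# hidden "true" reward that generates the preference labels.
features = rng.normal(size=(5, 3))
true_w = np.array([1.0, -0.5, 2.0])
true_r = features @ true_w

# Preference data: pairs (i, j) meaning trajectory i beat trajectory j.
pairs = [(i, j) for i in range(5) for j in range(5)
         if i != j and true_r[i] > true_r[j]]

w = np.zeros(3)  # learned reward weights
lr = 0.1
for _ in range(500):
    grad = np.zeros(3)
    for i, j in pairs:
        # Bradley-Terry: P(i preferred to j) = sigmoid(r_i - r_j)
        p = 1.0 / (1.0 + np.exp(-(features[i] - features[j]) @ w))
        # Gradient of the pair's log-likelihood w.r.t. w
        grad += (1.0 - p) * (features[i] - features[j])
    w += lr * grad / len(pairs)  # gradient ascent on log-likelihood

# The learned reward should rank trajectories like the true reward.
print(np.argsort(features @ w), np.argsort(true_r))
```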
Author:
Abel, David, Barreto, André, Van Roy, Benjamin, Precup, Doina, van Hasselt, Hado, Singh, Satinder
In a standard view of the reinforcement learning problem, an agent's goal is to efficiently identify a policy that maximizes long-term reward. However, this perspective is based on a restricted view of learning as finding a solution, rather than treating…
External link:
http://arxiv.org/abs/2307.11046
Author:
Abel, David, Barreto, André, van Hasselt, Hado, Van Roy, Benjamin, Precup, Doina, Singh, Satinder
When has an agent converged? Standard models of the reinforcement learning problem give rise to a straightforward definition of convergence: An agent converges when its behavior or performance in each environment state stops changing. However, as we…
External link:
http://arxiv.org/abs/2307.11044
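The straightforward definition quoted above is easy to state in code. The sketch below checks it for a tabular policy: the agent has converged once its action distribution stops changing in every state. The checkpointing scheme and tolerance are illustrative assumptions, not part of the paper.

```python
# Sketch of the standard convergence notion: behavior has stopped
# changing in every state. Checkpoints and tolerance are illustrative.
import numpy as np

def has_converged(policy_history, tol=1e-8):
    """policy_history: list of (num_states, num_actions) arrays,
    the agent's policy at successive checkpoints."""
    if len(policy_history) < 2:
        return False
    # Converged if the last two policies agree in every state.
    return np.max(np.abs(policy_history[-1] - policy_history[-2])) < tol

# Usage: a policy that stops changing is flagged as converged.
p_a = np.array([[0.5, 0.5], [0.9, 0.1]])
p_b = np.array([[0.2, 0.8], [0.9, 0.1]])
print(has_converged([p_a, p_b]))       # False: behavior still changing
print(has_converged([p_a, p_b, p_b]))  # True: policy fixed everywhere
```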
The reward hypothesis posits that "all of what we mean by goals and purposes can be well thought of as maximization of the expected value of the cumulative sum of a received scalar signal (reward)." We aim to fully settle this hypothesis. This will…
External link:
http://arxiv.org/abs/2212.10420
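The hypothesis quoted in the abstract is usually formalized as the familiar return-maximization objective. One common discounted rendering, shown here only to fix notation, is:

```latex
% Expected cumulative reward; the discounted form is one common
% convention among several, not the only one.
\[
  \max_{\pi} \; \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t} R_{t+1} \right],
  \qquad \gamma \in [0, 1)
\]
```

Here R denotes the scalar reward signal and the maximization ranges over the agent's policies.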
Author:
Luketina, Jelena, Flennerhag, Sebastian, Schroecker, Yannick, Abel, David, Zahavy, Tom, Singh, Satinder
Meta-gradient methods (Xu et al., 2018; Zahavy et al., 2020) offer a promising solution to the problem of hyperparameter selection and adaptation in non-stationary reinforcement learning problems. However, the properties of meta-gradients in such environments…
External link:
http://arxiv.org/abs/2209.06159
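To make the term concrete, here is a toy sketch of the meta-gradient idea in the spirit of Xu et al. (2018): a hyperparameter, here a step size, is itself adapted by gradient descent on the loss measured after the inner update. The quadratic objective and all constants are toy assumptions, not the paper's setup.

```python
# Toy sketch of meta-gradient hyperparameter adaptation: the step
# size is tuned by differentiating the post-update loss w.r.t. it.
import numpy as np

theta = 5.0             # inner parameter
log_eta = np.log(0.05)  # meta-parameter: log step size (kept positive)

def loss(x):
    return 0.5 * x ** 2  # toy inner objective; its gradient is x

for step in range(100):
    eta = np.exp(log_eta)
    grad = theta                     # d loss / d theta
    theta_new = theta - eta * grad   # inner SGD update
    # Meta-gradient: d loss(theta_new) / d eta = theta_new * (-grad),
    # then the chain rule through eta = exp(log_eta) multiplies by eta.
    meta_grad = theta_new * (-grad) * eta
    log_eta -= 0.1 * meta_grad       # outer (meta) gradient step
    theta = theta_new

print(f"theta={theta:.4f}, adapted step size={np.exp(log_eta):.4f}")
```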
Author:
Abel, David
Published in:
Doctoral Dissertation, Department of Computer Science, Brown University, 2020
Reinforcement learning defines the problem facing agents that learn to make good decisions through action and observation alone. To be effective problem solvers, such agents must efficiently explore vast worlds, assign credit from delayed feedback, and…
External link:
http://arxiv.org/abs/2203.00397
Author:
Abel, David Lynn
Published in:
Studies in History and Philosophy of Science, October 2024, 107:54-63
Author:
Abel, David, Dabney, Will, Harutyunyan, Anna, Ho, Mark K., Littman, Michael L., Precup, Doina, Singh, Satinder
Reward is the driving force for reinforcement-learning agents. This paper is dedicated to understanding the expressivity of reward as a way to capture tasks that we would want an agent to perform. We frame this study around three new abstract notions…
External link:
http://arxiv.org/abs/2111.00876