Showing 1 - 10 of 2,003 results for the search: '"SUTTON, RICHARD"'
This paper studies asynchronous stochastic approximation (SA) algorithms and their application to reinforcement learning in semi-Markov decision processes (SMDPs) with an average-reward criterion. We first extend Borkar and Meyn's stability proof method…
External link: http://arxiv.org/abs/2409.03915
This paper analyzes reinforcement learning (RL) algorithms for Markov decision processes (MDPs) under the average-reward criterion. We focus on Q-learning algorithms based on relative value iteration (RVI), which are model-free stochastic analogues of… (a hedged sketch of an RVI-style update follows this entry)
External link: http://arxiv.org/abs/2408.16262
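The entry above mentions Q-learning based on relative value iteration. As a rough illustration only, the sketch below shows a tabular RVI-style Q-learning update in Python; the environment interface (env.reset, env.step), the epsilon-greedy behavior policy, and the choice of reference offset max_a Q[ref_state, a] are assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def rvi_q_learning(env, num_states, num_actions, steps=100_000,
                   alpha=0.05, epsilon=0.1, ref_state=0, seed=0):
    """Tabular RVI-style Q-learning sketch for average-reward problems.

    The offset f(Q) is taken here as max_a Q[ref_state, a]; other choices
    (a fixed state-action pair, a mean over Q, etc.) also appear in the
    literature.  The environment interface is assumed, not prescribed.
    """
    rng = np.random.default_rng(seed)
    Q = np.zeros((num_states, num_actions))
    s = env.reset()                          # assumed: returns an integer state index
    for _ in range(steps):
        # epsilon-greedy behavior policy (an assumption for this sketch)
        a = rng.integers(num_actions) if rng.random() < epsilon else int(np.argmax(Q[s]))
        s_next, r = env.step(a)              # assumed: returns (next state, reward)
        f_Q = np.max(Q[ref_state])           # reference offset standing in for the gain
        Q[s, a] += alpha * (r - f_Q + np.max(Q[s_next]) - Q[s, a])
        s = s_next
    return Q
```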
Author: De Asis, Kris, Sutton, Richard S.
Many reinforcement learning algorithms are built on an assumption that an agent interacts with an environment over fixed-duration, discrete time steps. However, physical systems are continuous in time, requiring a choice of time-discretization granularity…
External link: http://arxiv.org/abs/2406.14951
We show that discounted methods for solving continuing reinforcement learning problems can perform significantly better if they center their rewards by subtracting out the rewards' empirical average. The improvement is substantial at commonly used discount factors… (a minimal reward-centering sketch follows below)
External link: http://arxiv.org/abs/2405.09999
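The entry above describes reward centering: subtracting the rewards' empirical average before applying a discounted method. The sketch below shows one minimal way to do this in tabular Q-learning; the exponential running average used for r_bar and the environment interface are assumptions for illustration, and the paper's exact estimators may differ.

```python
import numpy as np

def centered_q_learning(env, num_states, num_actions, steps=100_000,
                        alpha=0.1, beta=0.01, gamma=0.99, epsilon=0.1, seed=0):
    """Discounted Q-learning with reward centering (illustrative sketch).

    A running average r_bar of observed rewards is subtracted from each
    reward before the usual discounted update.  r_bar is kept here as a
    simple exponential average with step size beta.
    """
    rng = np.random.default_rng(seed)
    Q = np.zeros((num_states, num_actions))
    r_bar = 0.0                              # running estimate of the average reward
    s = env.reset()                          # assumed environment interface
    for _ in range(steps):
        a = rng.integers(num_actions) if rng.random() < epsilon else int(np.argmax(Q[s]))
        s_next, r = env.step(a)
        # centered discounted TD error: the reward is shifted by r_bar
        Q[s, a] += alpha * ((r - r_bar) + gamma * np.max(Q[s_next]) - Q[s, a])
        r_bar += beta * (r - r_bar)          # update the empirical average of rewards
        s = s_next
    return Q, r_bar
```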
This paper addresses the challenge of optimizing meta-parameters (i.e., hyperparameters) in machine learning algorithms, a critical factor influencing training efficiency and model performance. Moving away from the computationally expensive traditional…
External link: http://arxiv.org/abs/2402.02342
In continual learning, a learner has to keep learning from the data over its whole lifetime. A key issue is deciding what knowledge to keep and what knowledge to let go. In a neural network, this can be implemented by using a step-size vector to scale… (a per-weight step-size sketch follows below)
External link: http://arxiv.org/abs/2401.17401
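The entry above mentions scaling a network's updates with a step-size vector. The fragment below is a generic illustration of that idea for plain SGD, not the paper's specific algorithm; the variable names and numbers are made up for the example.

```python
import numpy as np

def sgd_step_with_vector(w, alpha, grad):
    """One SGD update with a per-weight step-size vector (illustration).

    w, alpha, and grad are arrays of the same shape.  A large alpha[i]
    lets weight i keep adapting, while a near-zero alpha[i] effectively
    freezes it, protecting knowledge the learner should retain.
    """
    return w - alpha * grad

# illustrative usage with made-up values
w     = np.array([0.5, -1.2, 3.0])
alpha = np.array([0.1,  0.001, 0.0])   # learn, barely learn, frozen
grad  = np.array([0.4,  0.4,   0.4])
w_new = sgd_step_with_vector(w, alpha, grad)
```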
In this paper, we study asynchronous stochastic approximation algorithms without communication delays. Our main contribution is a stability proof for these algorithms that extends a method of Borkar and Meyn by accommodating more general noise conditions…
External link: http://arxiv.org/abs/2312.15091
Author: Young, Kenny, Sutton, Richard S.
Discovering useful temporal abstractions, in the form of options, is widely thought to be key to applying reinforcement learning and planning to increasingly complex domains. Building on the empirical success of the Expert Iteration approach to policy…
External link: http://arxiv.org/abs/2310.01569
Importance sampling is a central idea underlying off-policy prediction in reinforcement learning. It provides a strategy for re-weighting samples from a distribution to obtain unbiased estimates under another distribution. However, importance sampling… (a minimal re-weighting sketch follows below)
External link: http://arxiv.org/abs/2306.15625
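The entry above describes importance sampling as re-weighting samples drawn under one distribution to estimate an expectation under another. The sketch below shows the basic ordinary importance-sampling estimator for a single decision; the function name and the example numbers are illustrative only.

```python
import numpy as np

def importance_sampling_estimate(rewards, target_probs, behavior_probs):
    """Ordinary importance-sampling estimate (illustrative sketch).

    Each reward was obtained by an action sampled from the behavior
    policy; re-weighting it by target_probs[i] / behavior_probs[i]
    yields an unbiased estimate of the expected reward under the
    target policy (assuming the behavior policy covers the target).
    """
    rho = np.asarray(target_probs) / np.asarray(behavior_probs)
    return float(np.mean(rho * np.asarray(rewards)))

# illustrative usage: behavior policy acts uniformly, target policy prefers action 0
rewards        = [1.0, 0.0, 1.0, 1.0]      # observed rewards
behavior_probs = [0.5, 0.5, 0.5, 0.5]      # b(a|s) of the actions actually taken
target_probs   = [0.9, 0.1, 0.9, 0.9]      # pi(a|s) of the same actions
print(importance_sampling_estimate(rewards, target_probs, behavior_probs))
```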