Showing 1 - 10 of 56 for search: '"Genewein, Tim"'
Author:
Ruoss, Anian, Delétang, Grégoire, Medapati, Sourabh, Grau-Moya, Jordi, Wenliang, Li Kevin, Catt, Elliot, Reid, John, Genewein, Tim
The recent breakthrough successes in machine learning are mainly attributed to scale: namely large-scale attention-based architectures and datasets of unprecedented scale. This paper investigates the impact of training at scale for chess. Unlike trad…
External link:
http://arxiv.org/abs/2402.04494
Author:
Grau-Moya, Jordi, Genewein, Tim, Hutter, Marcus, Orseau, Laurent, Delétang, Grégoire, Catt, Elliot, Ruoss, Anian, Wenliang, Li Kevin, Mattern, Christopher, Aitchison, Matthew, Veness, Joel
Meta-learning has emerged as a powerful approach to train neural networks to learn new tasks quickly from limited data. Broad exposure to different tasks leads to versatile representations enabling general problem solving. But, what are the limits of…
External link:
http://arxiv.org/abs/2401.14953
Author:
Delétang, Grégoire, Ruoss, Anian, Duquenne, Paul-Ambroise, Catt, Elliot, Genewein, Tim, Mattern, Christopher, Grau-Moya, Jordi, Wenliang, Li Kevin, Aitchison, Matthew, Orseau, Laurent, Hutter, Marcus, Veness, Joel
It has long been established that predictive models can be transformed into lossless compressors and vice versa. Incidentally, in recent years, the machine learning community has focused on training increasingly large and powerful self-supervised (la…
External link:
http://arxiv.org/abs/2309.10668
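The prediction–compression equivalence this abstract opens with can be illustrated with a minimal sketch (the function name and numbers below are illustrative, not from the paper): arithmetic coding assigns roughly -log2(p) bits to a symbol the model predicted with probability p, so a model's total code length on a sequence equals its cumulative log loss.

```python
import math

def ideal_code_length_bits(probs):
    """Ideal (arithmetic-coding) code length of a sequence under a
    sequential predictive model.

    `probs` holds the probability the model assigned to each symbol
    that actually occurred; the ideal code length in bits is the
    cumulative log loss: -sum(log2 p).
    """
    return -sum(math.log2(p) for p in probs)

# A model that predicts each observed symbol with p = 0.9 compresses well:
confident = ideal_code_length_bits([0.9] * 10)  # ~1.52 bits total
# A uniform model over a binary alphabet needs 1 bit per symbol:
uniform = ideal_code_length_bits([0.5] * 10)    # exactly 10 bits
```

Better predictors therefore yield shorter codes, which is why improving a sequence model and improving a compressor are the same objective.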
Author:
Ruoss, Anian, Delétang, Grégoire, Genewein, Tim, Grau-Moya, Jordi, Csordás, Róbert, Bennani, Mehdi, Legg, Shane, Veness, Joel
Transformers have impressive generalization capabilities on tasks with a fixed context length. However, they fail to generalize to sequences of arbitrary length, even for seemingly simple tasks such as duplicating a string. Moreover, simply training…
External link:
http://arxiv.org/abs/2305.16843
Author:
Genewein, Tim, Delétang, Grégoire, Ruoss, Anian, Wenliang, Li Kevin, Catt, Elliot, Dutordoir, Vincent, Grau-Moya, Jordi, Orseau, Laurent, Hutter, Marcus, Veness, Joel
Memory-based meta-learning is a technique for approximating Bayes-optimal predictors. Under fairly general conditions, minimizing sequential prediction error, measured by the log loss, leads to implicit meta-learning. The goal of this work is to inve…
External link:
http://arxiv.org/abs/2302.03067
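For the simplest setting in which memory-based meta-learning approximates Bayes-optimal prediction — i.i.d. coin flips with an unknown bias and a uniform prior — the Bayes-optimal predictor has a closed form, Laplace's rule of succession. The sketch below is a textbook illustration of that target, not code from the paper:

```python
def laplace_predictor(history):
    """Bayes-optimal next-symbol probability for i.i.d. Bernoulli data
    under a uniform prior on the bias (Laplace's rule of succession):
    P(next = 1 | history) = (ones + 1) / (len(history) + 2).
    """
    ones = sum(history)
    return (ones + 1) / (len(history) + 2)

laplace_predictor([])         # 0.5 (no data: prior mean)
laplace_predictor([1, 1, 1])  # 0.8 (i.e. 4/5)
```

A memory-based meta-learner trained on many such sequences under log loss should converge toward this predictor, which is what makes the closed form a useful reference point.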
Author:
Grau-Moya, Jordi, Delétang, Grégoire, Kunesch, Markus, Genewein, Tim, Catt, Elliot, Li, Kevin, Ruoss, Anian, Cundy, Chris, Veness, Joel, Wang, Jane, Hutter, Marcus, Summerfield, Christopher, Legg, Shane, Ortega, Pedro
Meta-training agents with memory has been shown to culminate in Bayes-optimal agents, which casts Bayes-optimality as the implicit solution to a numerical optimization problem rather than an explicit modeling assumption. Bayes-optimal agents are risk…
External link:
http://arxiv.org/abs/2209.15618
Author:
Delétang, Grégoire, Ruoss, Anian, Grau-Moya, Jordi, Genewein, Tim, Wenliang, Li Kevin, Catt, Elliot, Cundy, Chris, Hutter, Marcus, Legg, Shane, Veness, Joel, Ortega, Pedro A.
Reliable generalization lies at the heart of safe ML and AI. However, understanding when and how neural networks generalize remains one of the most important unsolved problems in the field. In this work, we conduct an extensive empirical study (20'91…
External link:
http://arxiv.org/abs/2207.02098
Author:
Brekelmans, Rob, Genewein, Tim, Grau-Moya, Jordi, Delétang, Grégoire, Kunesch, Markus, Legg, Shane, Ortega, Pedro
Published in:
TMLR (2022) https://openreview.net/forum?id=berNQMTYWZ
Policy regularization methods such as maximum entropy regularization are widely used in reinforcement learning to improve the robustness of a learned policy. In this paper, we show how this robustness arises from hedging against worst-case perturbati…
External link:
http://arxiv.org/abs/2203.12592
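The maximum-entropy regularization this abstract mentions has a simple closed-form optimizer that makes the "hedging" intuition concrete: maximizing E[Q] + τ·H(π) over action distributions yields a softmax over Q with temperature τ, which deliberately spreads probability mass instead of committing to the argmax. A minimal sketch (function name and numbers are illustrative, not from the paper):

```python
import math

def maxent_policy(q_values, temperature=1.0):
    """Optimal policy for E[Q] + temperature * entropy(pi):
    a softmax over Q-values with the given temperature.
    Higher temperature spreads mass more uniformly, hedging
    against errors in the Q estimates."""
    m = max(q_values)  # subtract max for numerical stability
    exps = [math.exp((q - m) / temperature) for q in q_values]
    z = sum(exps)
    return [e / z for e in exps]

maxent_policy([1.0, 1.0])  # [0.5, 0.5] — ties split evenly
```

As temperature → 0 the policy approaches the greedy argmax; as temperature grows it approaches uniform, trading expected value for robustness.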
Author:
Delétang, Grégoire, Grau-Moya, Jordi, Kunesch, Markus, Genewein, Tim, Brekelmans, Rob, Legg, Shane, Ortega, Pedro A.
We extend temporal-difference (TD) learning in order to obtain risk-sensitive, model-free reinforcement learning algorithms. This extension can be regarded as a modification of the Rescorla-Wagner rule, where the (sigmoidal) stimulus is taken to be eit…
External link:
http://arxiv.org/abs/2111.02907
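One standard way to make TD(0) risk-sensitive — asymmetric weighting of positive versus negative TD errors, in the style of Mihatsch and Neuneier — can be sketched in a few lines. This is a common construction shown for illustration, not the paper's exact rule (the function name, signature, and parameter values are assumptions):

```python
def risk_sensitive_td_update(v, reward, v_next, alpha=0.1, gamma=0.99, kappa=0.0):
    """One risk-sensitive TD(0) value update.

    Positive and negative TD errors are weighted asymmetrically by
    kappa in (-1, 1): kappa > 0 down-weights good surprises
    (risk-averse), kappa < 0 down-weights bad ones (risk-seeking),
    and kappa = 0 recovers standard TD(0).
    """
    delta = reward + gamma * v_next - v        # ordinary TD error
    weight = (1 - kappa) if delta > 0 else (1 + kappa)
    return v + alpha * weight * delta
```

With kappa = 0 this is the familiar update v ← v + α(r + γv' − v); nonzero kappa biases the learned values pessimistically or optimistically.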
Author:
Ortega, Pedro A., Kunesch, Markus, Delétang, Grégoire, Genewein, Tim, Grau-Moya, Jordi, Veness, Joel, Buchli, Jonas, Degrave, Jonas, Piot, Bilal, Perolat, Julien, Everitt, Tom, Tallec, Corentin, Parisotto, Emilio, Erez, Tom, Chen, Yutian, Reed, Scott, Hutter, Marcus, de Freitas, Nando, Legg, Shane
The recent phenomenal success of language models has reinvigorated machine learning research, and large sequence models such as transformers are being applied to a variety of domains. One important problem class that has remained relatively elusive h…
External link:
http://arxiv.org/abs/2110.10819