Showing 1 - 10 of 22 for the search: '"Delétang, Grégoire"'
Author:
Phuong, Mary, Aitchison, Matthew, Catt, Elliot, Cogan, Sarah, Kaskasoli, Alexandre, Krakovna, Victoria, Lindner, David, Rahtz, Matthew, Assael, Yannis, Hodkinson, Sarah, Howard, Heidi, Lieberum, Tom, Kumar, Ramana, Raad, Maria Abi, Webson, Albert, Ho, Lewis, Lin, Sharon, Farquhar, Sebastian, Hutter, Marcus, Delétang, Grégoire, Ruoss, Anian, El-Sayed, Seliem, Brown, Sasha, Dragan, Anca, Shah, Rohin, Dafoe, Allan, Shevlane, Toby
To understand the risks posed by a new AI system, we must understand what it can and cannot do. Building on prior work, we introduce a programme of new "dangerous capability" evaluations and pilot them on Gemini 1.0 models. Our evaluations cover four…
External link:
http://arxiv.org/abs/2403.13793
Author:
Ruoss, Anian, Delétang, Grégoire, Medapati, Sourabh, Grau-Moya, Jordi, Wenliang, Li Kevin, Catt, Elliot, Reid, John, Lewis, Cannada A., Veness, Joel, Genewein, Tim
This paper uses chess, a landmark planning problem in AI, to assess transformers' performance on a planning task where memorization is futile – even at a large scale. To this end, we release ChessBench, a large-scale benchmark dataset…
External link:
http://arxiv.org/abs/2402.04494
Author:
Grau-Moya, Jordi, Genewein, Tim, Hutter, Marcus, Orseau, Laurent, Delétang, Grégoire, Catt, Elliot, Ruoss, Anian, Wenliang, Li Kevin, Mattern, Christopher, Aitchison, Matthew, Veness, Joel
Meta-learning has emerged as a powerful approach to train neural networks to learn new tasks quickly from limited data. Broad exposure to different tasks leads to versatile representations enabling general problem solving. But, what are the limits of…
External link:
http://arxiv.org/abs/2401.14953
Author:
Wenliang, Li Kevin, Delétang, Grégoire, Aitchison, Matthew, Hutter, Marcus, Ruoss, Anian, Gretton, Arthur, Rowland, Mark
We propose a novel algorithmic framework for distributional reinforcement learning, based on learning finite-dimensional mean embeddings of return distributions. We derive several new algorithms for dynamic programming and temporal-difference learning…
External link:
http://arxiv.org/abs/2312.07358
Author:
Delétang, Grégoire, Ruoss, Anian, Duquenne, Paul-Ambroise, Catt, Elliot, Genewein, Tim, Mattern, Christopher, Grau-Moya, Jordi, Wenliang, Li Kevin, Aitchison, Matthew, Orseau, Laurent, Hutter, Marcus, Veness, Joel
It has long been established that predictive models can be transformed into lossless compressors and vice versa. Incidentally, in recent years, the machine learning community has focused on training increasingly large and powerful self-supervised (language) models…
External link:
http://arxiv.org/abs/2309.10668
Author:
Ruoss, Anian, Delétang, Grégoire, Genewein, Tim, Grau-Moya, Jordi, Csordás, Róbert, Bennani, Mehdi, Legg, Shane, Veness, Joel
Transformers have impressive generalization capabilities on tasks with a fixed context length. However, they fail to generalize to sequences of arbitrary length, even for seemingly simple tasks such as duplicating a string. Moreover, simply training…
External link:
http://arxiv.org/abs/2305.16843
Author:
Genewein, Tim, Delétang, Grégoire, Ruoss, Anian, Wenliang, Li Kevin, Catt, Elliot, Dutordoir, Vincent, Grau-Moya, Jordi, Orseau, Laurent, Hutter, Marcus, Veness, Joel
Memory-based meta-learning is a technique for approximating Bayes-optimal predictors. Under fairly general conditions, minimizing sequential prediction error, measured by the log loss, leads to implicit meta-learning. The goal of this work is to investigate…
External link:
http://arxiv.org/abs/2302.03067
Author:
Grau-Moya, Jordi, Delétang, Grégoire, Kunesch, Markus, Genewein, Tim, Catt, Elliot, Li, Kevin, Ruoss, Anian, Cundy, Chris, Veness, Joel, Wang, Jane, Hutter, Marcus, Summerfield, Christopher, Legg, Shane, Ortega, Pedro
Meta-training agents with memory has been shown to culminate in Bayes-optimal agents, which casts Bayes-optimality as the implicit solution to a numerical optimization problem rather than an explicit modeling assumption. Bayes-optimal agents are risk-neutral…
External link:
http://arxiv.org/abs/2209.15618
Author:
Delétang, Grégoire, Ruoss, Anian, Grau-Moya, Jordi, Genewein, Tim, Wenliang, Li Kevin, Catt, Elliot, Cundy, Chris, Hutter, Marcus, Legg, Shane, Veness, Joel, Ortega, Pedro A.
Reliable generalization lies at the heart of safe ML and AI. However, understanding when and how neural networks generalize remains one of the most important unsolved problems in the field. In this work, we conduct an extensive empirical study (20'91…
External link:
http://arxiv.org/abs/2207.02098
Author:
Brekelmans, Rob, Genewein, Tim, Grau-Moya, Jordi, Delétang, Grégoire, Kunesch, Markus, Legg, Shane, Ortega, Pedro
Published in:
TMLR (2022) https://openreview.net/forum?id=berNQMTYWZ
Policy regularization methods such as maximum entropy regularization are widely used in reinforcement learning to improve the robustness of a learned policy. In this paper, we show how this robustness arises from hedging against worst-case perturbations…
Externí odkaz:
http://arxiv.org/abs/2203.12592