Showing 1 - 10 of 17 results for search: '"Wenliang, Li Kevin"'
We propose a new algorithm for model-based distributional reinforcement learning (RL), and prove that it is minimax-optimal for approximating return distributions with a generative model (up to logarithmic factors), resolving an open question of Zhan…
External link:
http://arxiv.org/abs/2402.07598
Author:
Ruoss, Anian, Delétang, Grégoire, Medapati, Sourabh, Grau-Moya, Jordi, Wenliang, Li Kevin, Catt, Elliot, Reid, John, Genewein, Tim
The recent breakthrough successes in machine learning are mainly attributed to scale: namely large-scale attention-based architectures and datasets of unprecedented scale. This paper investigates the impact of training at scale for chess. Unlike traditional…
External link:
http://arxiv.org/abs/2402.04494
Author:
Grau-Moya, Jordi, Genewein, Tim, Hutter, Marcus, Orseau, Laurent, Delétang, Grégoire, Catt, Elliot, Ruoss, Anian, Wenliang, Li Kevin, Mattern, Christopher, Aitchison, Matthew, Veness, Joel
Meta-learning has emerged as a powerful approach to train neural networks to learn new tasks quickly from limited data. Broad exposure to different tasks leads to versatile representations enabling general problem solving. But what are the limits of…
External link:
http://arxiv.org/abs/2401.14953
Author:
Wenliang, Li Kevin, Delétang, Grégoire, Aitchison, Matthew, Hutter, Marcus, Ruoss, Anian, Gretton, Arthur, Rowland, Mark
We propose a novel algorithmic framework for distributional reinforcement learning, based on learning finite-dimensional mean embeddings of return distributions. We derive several new algorithms for dynamic programming and temporal-difference learning…
External link:
http://arxiv.org/abs/2312.07358
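The general idea of a finite-dimensional mean embedding, as named in the abstract above, can be sketched briefly: given a feature map phi mapping returns into R^d (random Fourier features are one common choice, assumed here for illustration; this is not the paper's specific algorithm), a distribution is summarized by the expectation E[phi(Z)], estimable by averaging over samples.

```python
import numpy as np

def random_fourier_features(z, freqs, phases):
    # phi(z) in R^d: cosine features with random frequencies and phases
    return np.sqrt(2.0 / len(freqs)) * np.cos(np.outer(z, freqs) + phases)

rng = np.random.default_rng(0)
d = 8                                   # embedding dimension (illustrative)
freqs = rng.normal(size=d)              # random frequencies
phases = rng.uniform(0, 2 * np.pi, d)   # random phases

# Samples from a hypothetical return distribution Z
returns = rng.normal(loc=1.0, scale=0.5, size=10_000)

# Finite-dimensional mean embedding: m = E[phi(Z)], estimated by averaging
m = random_fourier_features(returns, freqs, phases).mean(axis=0)
print(m.shape)  # (8,)
```

Two distributions with similar embeddings are close in the corresponding kernel metric, which is what makes such a summary usable inside dynamic-programming updates.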
Author:
Wenliang, Li Kevin, Moran, Ben
How do score-based generative models (SBMs) learn the data distribution supported on a low-dimensional manifold? We investigate the score model of a trained SBM through its linear approximations and subspaces spanned by local feature vectors. During…
External link:
http://arxiv.org/abs/2311.09952
Author:
Wang, Zhe, Veličković, Petar, Hennes, Daniel, Tomašev, Nenad, Prince, Laurel, Kaisers, Michael, Bachrach, Yoram, Elie, Romuald, Wenliang, Li Kevin, Piccinini, Federico, Spearman, William, Graham, Ian, Connor, Jerome, Yang, Yi, Recasens, Adrià, Khan, Mina, Beauguerlange, Nathalie, Sprechmann, Pablo, Moreno, Pol, Heess, Nicolas, Bowling, Michael, Hassabis, Demis, Tuyls, Karl
Identifying key patterns of tactics implemented by rival teams, and developing effective responses, lies at the heart of modern football. However, doing so algorithmically remains an open research challenge. To address this unmet need, we propose Tac…
External link:
http://arxiv.org/abs/2310.10553
Author:
Delétang, Grégoire, Ruoss, Anian, Duquenne, Paul-Ambroise, Catt, Elliot, Genewein, Tim, Mattern, Christopher, Grau-Moya, Jordi, Wenliang, Li Kevin, Aitchison, Matthew, Orseau, Laurent, Hutter, Marcus, Veness, Joel
It has long been established that predictive models can be transformed into lossless compressors and vice versa. Incidentally, in recent years, the machine learning community has focused on training increasingly large and powerful self-supervised (language)…
External link:
http://arxiv.org/abs/2309.10668
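The prediction-compression equivalence this abstract opens with has a standard textbook core: a predictor that assigns probability p to a symbol can encode it in about -log2(p) bits (e.g. via arithmetic coding), so a better predictor yields a shorter code. A minimal sketch, with an empirical unigram model standing in for a learned predictive model:

```python
import math
from collections import Counter

def ideal_code_length_bits(text, model_probs):
    # A predictor assigning probability p to each symbol can, via
    # arithmetic coding, encode it in about -log2(p) bits.
    return sum(-math.log2(model_probs[c]) for c in text)

text = "abracadabra"
counts = Counter(text)
# Empirical unigram model (a stand-in for a learned predictive model)
probs = {c: n / len(text) for c, n in counts.items()}

model_bits = ideal_code_length_bits(text, probs)
uniform_bits = len(text) * math.log2(len(counts))  # uniform over the alphabet
print(f"model: {model_bits:.1f} bits, uniform: {uniform_bits:.1f} bits")
```

The better the model's next-symbol probabilities, the fewer bits the induced code needs, which is why strong language models double as strong compressors.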
Author:
Genewein, Tim, Delétang, Grégoire, Ruoss, Anian, Wenliang, Li Kevin, Catt, Elliot, Dutordoir, Vincent, Grau-Moya, Jordi, Orseau, Laurent, Hutter, Marcus, Veness, Joel
Memory-based meta-learning is a technique for approximating Bayes-optimal predictors. Under fairly general conditions, minimizing sequential prediction error, measured by the log loss, leads to implicit meta-learning. The goal of this work is to investigate…
External link:
http://arxiv.org/abs/2302.03067
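The link between log loss and Bayes-optimal prediction mentioned above can be made concrete with a classical example (an illustration, not the paper's method): under a uniform prior over Bernoulli biases, the Bayes-optimal next-bit predictor is Laplace's rule of succession, and it accrues lower cumulative log loss than an uninformed fixed predictor.

```python
import numpy as np

def laplace_predictor(seq):
    # Bayes-optimal next-bit probabilities under a uniform prior over
    # Bernoulli biases (Laplace's rule of succession):
    # p(x_{t+1} = 1 | x_1..x_t) = (#ones + 1) / (t + 2)
    preds, ones = [], 0
    for t, x in enumerate(seq):
        preds.append((ones + 1) / (t + 2))
        ones += x
    return np.array(preds)

def log_loss(preds, seq):
    # Cumulative negative log-likelihood (in nats) of the observed bits
    seq = np.asarray(seq)
    p = np.where(seq == 1, preds, 1 - preds)
    return -np.log(p).sum()

rng = np.random.default_rng(0)
seq = (rng.random(1000) < 0.8).astype(int)  # Bernoulli(0.8) source

bayes = log_loss(laplace_predictor(seq), seq)
fixed = log_loss(np.full(len(seq), 0.5), seq)  # uninformed fixed predictor
print(f"Laplace: {bayes:.1f} nats, fixed 0.5: {fixed:.1f} nats")
```

A sequence model trained to minimize exactly this kind of sequential log loss is pushed toward the same Bayes-mixture behavior, which is the "implicit meta-learning" the abstract refers to.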
Author:
Wenliang, Li Kevin
Score matching (SM) is a convenient method for training flexible probabilistic models, which is often preferred over the traditional maximum-likelihood (ML) approach. However, these models are less interpretable than normalized models; as such, train…
External link:
http://arxiv.org/abs/2210.13390
Author:
Delétang, Grégoire, Ruoss, Anian, Grau-Moya, Jordi, Genewein, Tim, Wenliang, Li Kevin, Catt, Elliot, Cundy, Chris, Hutter, Marcus, Legg, Shane, Veness, Joel, Ortega, Pedro A.
Reliable generalization lies at the heart of safe ML and AI. However, understanding when and how neural networks generalize remains one of the most important unsolved problems in the field. In this work, we conduct an extensive empirical study (20'91…
External link:
http://arxiv.org/abs/2207.02098