Výsledky vyhledávání - "Rashid, Tabish"

Report

Aligning Agents like Large Language Models

Autor: Jelley, Adam, Cao, Yuhan, Bignell, Dave, Devlin, Sam, Rashid, Tabish

Training agents to behave as desired in complex 3D environments from high-dimensional sensory information is challenging. Imitation learning from diverse human behavior provides a scalable approach for training an agent with a sensible behavioral pri

Externí odkaz: http://arxiv.org/abs/2406.04208

Zobrazit plný text záznamu

Report

Visual Encoders for Data-Efficient Imitation Learning in Modern Video Games

Autor: Schäfer, Lukas, Jones, Logan, Kanervisto, Anssi, Cao, Yuhan, Rashid, Tabish, Georgescu, Raluca, Bignell, Dave, Sen, Siddhartha, Gavito, Andrea Treviño, Devlin, Sam

Video games have served as useful benchmarks for the decision making community, but going beyond Atari games towards training agents in modern games has been prohibitively expensive for the vast majority of the research community. Recent progress in

Externí odkaz: http://arxiv.org/abs/2312.02312

Zobrazit plný text záznamu

Report

Imitating Human Behaviour with Diffusion Models

Autor: Pearce, Tim, Rashid, Tabish, Kanervisto, Anssi, Bignell, Dave, Sun, Mingfei, Georgescu, Raluca, Macua, Sergio Valcarcel, Tan, Shan Zheng, Momennejad, Ida, Hofmann, Katja, Devlin, Sam

Publikováno v: ICLR 2023

Diffusion models have emerged as powerful generative models in the text-to-image domain. This paper studies their application as observation-to-action models for imitating human behaviour in sequential environments. Human behaviour is stochastic and

Externí odkaz: http://arxiv.org/abs/2301.10677

Zobrazit plný text záznamu

Report

Regularized Softmax Deep Multi-Agent $Q$-Learning

Autor: Pan, Ling, Rashid, Tabish, Peng, Bei, Huang, Longbo, Whiteson, Shimon

Tackling overestimation in $Q$-learning is an important problem that has been extensively studied in single-agent reinforcement learning, but has received comparatively little attention in the multi-agent setting. In this work, we empirically demonst

Externí odkaz: http://arxiv.org/abs/2103.11883

Zobrazit plný text záznamu

Report

Estimating $\alpha$-Rank by Maximizing Information Gain

Autor: Rashid, Tabish, Zhang, Cheng, Ciosek, Kamil

Game theory has been increasingly applied in settings where the game is not known outright, but has to be estimated by sampling. For example, meta-games that arise in multi-agent evaluation can only be accessed by running a succession of expensive ex

Externí odkaz: http://arxiv.org/abs/2101.09178

Zobrazit plný text záznamu

Report

Weighted QMIX: Expanding Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

Autor: Rashid, Tabish, Farquhar, Gregory, Peng, Bei, Whiteson, Shimon

QMIX is a popular $Q$-learning algorithm for cooperative MARL in the centralised training and decentralised execution paradigm. In order to enable easy decentralisation, QMIX restricts the joint action $Q$-values it can represent to be a monotonic mi

Externí odkaz: http://arxiv.org/abs/2006.10800

Zobrazit plný text záznamu

Report

Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

Autor: Rashid, Tabish, Samvelyan, Mikayel, de Witt, Christian Schroeder, Farquhar, Gregory, Foerster, Jakob, Whiteson, Shimon

Publikováno v: Journal of Machine Learning Research 21(178):1-51, 2020

In many real-world settings, a team of agents must coordinate its behaviour while acting in a decentralised fashion. At the same time, it is often possible to train the agents in a centralised fashion where global state information is available and c

Externí odkaz: http://arxiv.org/abs/2003.08839

Zobrazit plný text záznamu

Report

FACMAC: Factored Multi-Agent Centralised Policy Gradients

Autor: Peng, Bei, Rashid, Tabish, de Witt, Christian A. Schroeder, Kamienny, Pierre-Alexandre, Torr, Philip H. S., Böhmer, Wendelin, Whiteson, Shimon

We propose FACtored Multi-Agent Centralised policy gradients (FACMAC), a new method for cooperative multi-agent reinforcement learning in both discrete and continuous action spaces. Like MADDPG, a popular multi-agent actor-critic method, our approach

Externí odkaz: http://arxiv.org/abs/2003.06709

Zobrazit plný text záznamu

Report

Optimistic Exploration even with a Pessimistic Initialisation

Autor: Rashid, Tabish, Peng, Bei, Böhmer, Wendelin, Whiteson, Shimon

Optimistic initialisation is an effective strategy for efficient exploration in reinforcement learning (RL). In the tabular case, all provably efficient model-free algorithms rely on it. However, model-free deep RL algorithms do not use optimistic in

Externí odkaz: http://arxiv.org/abs/2002.12174

Zobrazit plný text záznamu

Report

MAVEN: Multi-Agent Variational Exploration

Autor: Mahajan, Anuj, Rashid, Tabish, Samvelyan, Mikayel, Whiteson, Shimon

Publikováno v: Advances in Neural Information Processing Systems, 32, 2019, 7611-7622

Centralised training with decentralised execution is an important setting for cooperative deep multi-agent reinforcement learning due to communication constraints during execution and computational tractability in training. In this paper, we analyse

Externí odkaz: http://arxiv.org/abs/1910.07483

Zobrazit plný text záznamu

Vyhledávací nástroje:

Upřesnit hledání