Search results - "MATTHEWS, MICHAEL"

Report

Kinetix: Investigating the Training of General Agents through Open-Ended Physics-Based Control Tasks

Authors: Matthews, Michael; Beukman, Michael; Lu, Chris; Foerster, Jakob

While large models trained with self-supervised learning on offline datasets have shown remarkable capabilities in text and image domains, achieving the same generalisation for agents that act in sequential decision problems remains an open challenge

External link: http://arxiv.org/abs/2410.23208

Report

JaxLife: An Open-Ended Agentic Simulator

Authors: Lu, Chris; Beukman, Michael; Matthews, Michael; Foerster, Jakob

Human intelligence emerged through the process of natural selection and evolution on Earth. We investigate what it would take to re-create this process in silico. While past work has often focused on low-level processes (such as simulating physics or

External link: http://arxiv.org/abs/2409.00853

Report

Policy-Guided Diffusion

Authors: Jackson, Matthew Thomas; Matthews, Michael Tryfan; Lu, Cong; Ellis, Benjamin; Whiteson, Shimon; Foerster, Jakob

In many real-world settings, agents must learn from an offline dataset gathered by some prior behavior policy. Such a setting naturally leads to distribution shift between the behavior policy and the target policy being trained - requiring policy con

External link: http://arxiv.org/abs/2404.06356

Report

Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning

Authors: Matthews, Michael; Beukman, Michael; Ellis, Benjamin; Samvelyan, Mikayel; Jackson, Matthew; Coward, Samuel; Foerster, Jakob

Benchmarks play a crucial role in the development and analysis of reinforcement learning (RL) algorithms. We identify that existing benchmarks used for research into open-ended learning fall into one of two categories. Either they are too slow for me

External link: http://arxiv.org/abs/2402.16801

Report

Refining Minimax Regret for Unsupervised Environment Design

Authors: Beukman, Michael; Coward, Samuel; Matthews, Michael; Fellows, Mattie; Jiang, Minqi; Dennis, Michael; Foerster, Jakob

In unsupervised environment design, reinforcement learning agents are trained on environment configurations (levels) generated by an adversary that maximises some objective. Regret is a commonly used objective that theoretically results in a minimax

External link: http://arxiv.org/abs/2402.12284
