Showing 1 - 10 of 1,457 for search: '"MATTHEWS, MICHAEL"'
Kinetix: Investigating the Training of General Agents through Open-Ended Physics-Based Control Tasks
While large models trained with self-supervised learning on offline datasets have shown remarkable capabilities in text and image domains, achieving the same generalisation for agents that act in sequential decision problems remains an open challenge.
External link:
http://arxiv.org/abs/2410.23208
Human intelligence emerged through the process of natural selection and evolution on Earth. We investigate what it would take to re-create this process in silico. While past work has often focused on low-level processes (such as simulating physics or
External link:
http://arxiv.org/abs/2409.00853
Author:
Jackson, Matthew Thomas; Matthews, Michael Tryfan; Lu, Cong; Ellis, Benjamin; Whiteson, Shimon; Foerster, Jakob
In many real-world settings, agents must learn from an offline dataset gathered by some prior behavior policy. Such a setting naturally leads to distribution shift between the behavior policy and the target policy being trained - requiring policy con
External link:
http://arxiv.org/abs/2404.06356
Author:
Matthews, Michael; Beukman, Michael; Ellis, Benjamin; Samvelyan, Mikayel; Jackson, Matthew; Coward, Samuel; Foerster, Jakob
Benchmarks play a crucial role in the development and analysis of reinforcement learning (RL) algorithms. We identify that existing benchmarks used for research into open-ended learning fall into one of two categories. Either they are too slow for me
External link:
http://arxiv.org/abs/2402.16801
Author:
Beukman, Michael; Coward, Samuel; Matthews, Michael; Fellows, Mattie; Jiang, Minqi; Dennis, Michael; Foerster, Jakob
In unsupervised environment design, reinforcement learning agents are trained on environment configurations (levels) generated by an adversary that maximises some objective. Regret is a commonly used objective that theoretically results in a minimax
External link:
http://arxiv.org/abs/2402.12284