Showing 1 - 10 of 223 results for the search: '"Muldal, A"'
Author:
Abramson, Josh, Ahuja, Arun, Carnevale, Federico, Georgiev, Petko, Goldin, Alex, Hung, Alden, Landon, Jessica, Lhotka, Jirka, Lillicrap, Timothy, Muldal, Alistair, Powell, George, Santoro, Adam, Scully, Guy, Srivastava, Sanjana, von Glehn, Tamara, Wayne, Greg, Wong, Nathaniel, Yan, Chen, Zhu, Rui
An important goal in artificial intelligence is to create agents that can both interact naturally with humans and learn from their feedback. Here we demonstrate how to use reinforcement learning from human feedback (RLHF) to improve upon simulated …
External link:
http://arxiv.org/abs/2211.11602
Author:
Yan, Chen, Carnevale, Federico, Georgiev, Petko, Santoro, Adam, Guy, Aurelia, Muldal, Alistair, Hung, Chia-Chun, Abramson, Josh, Lillicrap, Timothy, Wayne, Gregory
Human language learners are exposed to a trickle of informative, context-sensitive language, but a flood of raw sensory data. Through both social language use and internal processes of rehearsal and practice, language learners are able to build …
External link:
http://arxiv.org/abs/2206.03139
Author:
Abramson, Josh, Ahuja, Arun, Carnevale, Federico, Georgiev, Petko, Goldin, Alex, Hung, Alden, Landon, Jessica, Lillicrap, Timothy, Muldal, Alistair, Richards, Blake, Santoro, Adam, von Glehn, Tamara, Wayne, Greg, Wong, Nathaniel, Yan, Chen
Creating agents that can interact naturally with humans is a common goal in artificial intelligence (AI) research. However, evaluating these interactions is challenging: collecting online human-agent interactions is slow and expensive, yet faster …
External link:
http://arxiv.org/abs/2205.13274
Author:
Humphreys, Peter C, Raposo, David, Pohlen, Toby, Thornton, Gregory, Chhaparia, Rachita, Muldal, Alistair, Abramson, Josh, Georgiev, Petko, Goldin, Alex, Santoro, Adam, Lillicrap, Timothy
Published in:
Proceedings of the 39th International Conference on Machine Learning, Baltimore, Maryland, USA, PMLR 162, 2022
It would be useful for machines to use computers as humans do so that they can aid us in everyday tasks. This is a setting in which there is also the potential to leverage large-scale expert demonstrations and human judgements of interactive behaviour …
External link:
http://arxiv.org/abs/2202.08137
Author:
DeepMind Interactive Agents Team, Abramson, Josh, Ahuja, Arun, Brussee, Arthur, Carnevale, Federico, Cassin, Mary, Fischer, Felix, Georgiev, Petko, Goldin, Alex, Gupta, Mansi, Harley, Tim, Hill, Felix, Humphreys, Peter C, Hung, Alden, Landon, Jessica, Lillicrap, Timothy, Merzic, Hamza, Muldal, Alistair, Santoro, Adam, Scully, Guy, von Glehn, Tamara, Wayne, Greg, Wong, Nathaniel, Yan, Chen, Zhu, Rui
A common vision from science fiction is that robots will one day inhabit our physical spaces, sense the world as we do, assist our physical labours, and communicate with us through natural language. Here we study how to design artificial agents that …
External link:
http://arxiv.org/abs/2112.03763
Author:
Abramson, Josh, Ahuja, Arun, Barr, Iain, Brussee, Arthur, Carnevale, Federico, Cassin, Mary, Chhaparia, Rachita, Clark, Stephen, Damoc, Bogdan, Dudzik, Andrew, Georgiev, Petko, Guy, Aurelia, Harley, Tim, Hill, Felix, Hung, Alden, Kenton, Zachary, Landon, Jessica, Lillicrap, Timothy, Mathewson, Kory, Mokrá, Soňa, Muldal, Alistair, Santoro, Adam, Savinov, Nikolay, Varma, Vikrant, Wayne, Greg, Williams, Duncan, Wong, Nathaniel, Yan, Chen, Zhu, Rui
A common vision from science fiction is that robots will one day inhabit our physical spaces, sense the world as we do, assist our physical labours, and communicate with us through natural language. Here we study how to design artificial agents that …
External link:
http://arxiv.org/abs/2012.05672
Author:
Mirza, Mehdi, Jaegle, Andrew, Hunt, Jonathan J., Guez, Arthur, Tunyasuvunakool, Saran, Muldal, Alistair, Weber, Théophane, Karkus, Peter, Racanière, Sébastien, Buesing, Lars, Lillicrap, Timothy, Heess, Nicolas
Recent work in deep reinforcement learning (RL) has produced algorithms capable of mastering challenging games such as Go, chess, or shogi. In these works the RL agent directly observes the natural state of the game and controls that state directly …
External link:
http://arxiv.org/abs/2009.05524
Author:
Tassa, Yuval, Tunyasuvunakool, Saran, Muldal, Alistair, Doron, Yotam, Trochim, Piotr, Liu, Siqi, Bohez, Steven, Merel, Josh, Erez, Tom, Lillicrap, Timothy, Heess, Nicolas
The dm_control software package is a collection of Python libraries and task suites for reinforcement learning agents in an articulated-body simulation. A MuJoCo wrapper provides convenient bindings to functions and data structures. The PyMJCF and Composer …
External link:
http://arxiv.org/abs/2006.12983
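The dm_control entry above describes a reset/step environment API built around MuJoCo. As a rough illustration of that style of interface, here is a self-contained pure-Python sketch; the `TimeStep` container and `ToyEnv` below are hypothetical stand-ins for illustration, not the actual dm_control classes:

```python
import random
from dataclasses import dataclass
from typing import Optional

@dataclass
class TimeStep:
    """Hypothetical stand-in for the container dm_control-style
    environments return from reset() and step()."""
    step_type: str            # "FIRST", "MID", or "LAST"
    reward: Optional[float]   # None on the first step of an episode
    discount: Optional[float]
    observation: float

    def last(self) -> bool:
        return self.step_type == "LAST"

class ToyEnv:
    """Toy stand-in environment exposing the reset()/step() protocol."""
    def __init__(self, horizon: int = 10):
        self.horizon = horizon
        self.t = 0

    def reset(self) -> TimeStep:
        self.t = 0
        return TimeStep("FIRST", None, None, observation=0.0)

    def step(self, action: float) -> TimeStep:
        self.t += 1
        done = self.t >= self.horizon
        return TimeStep("LAST" if done else "MID",
                        reward=1.0,
                        discount=0.0 if done else 1.0,
                        observation=float(self.t))

# A typical episode loop under this protocol, with a random policy.
env = ToyEnv()
timestep = env.reset()
total_reward = 0.0
while not timestep.last():
    action = random.uniform(-1.0, 1.0)
    timestep = env.step(action)
    total_reward += timestep.reward
print(total_reward)  # 10 steps of reward 1.0 -> 10.0
```

The episode loop (reset once, step until `timestep.last()`) is the shape of interaction such task suites expose to learning agents.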
Author:
Barth-Maron, Gabriel, Hoffman, Matthew W., Budden, David, Dabney, Will, Horgan, Dan, TB, Dhruva, Muldal, Alistair, Heess, Nicolas, Lillicrap, Timothy
This work adopts the very successful distributional perspective on reinforcement learning and adapts it to the continuous control setting. We combine this within a distributed framework for off-policy learning in order to develop what we call the Distributed Distributional Deep Deterministic Policy Gradient algorithm (D4PG).
External link:
http://arxiv.org/abs/1804.08617
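The distributional perspective mentioned in this abstract represents returns as a categorical distribution over a fixed support of atoms; a key operation is projecting the Bellman-shifted distribution back onto that support. A minimal pure-Python sketch of that categorical projection (the atom values and probabilities here are illustrative, not taken from the paper):

```python
def project_categorical(probs, atoms, reward, discount):
    """Project the shifted distribution over reward + discount * z
    back onto the fixed, evenly spaced support `atoms`."""
    v_min, v_max = atoms[0], atoms[-1]
    delta = atoms[1] - atoms[0]  # assumes evenly spaced atoms
    projected = [0.0] * len(atoms)
    for p, z in zip(probs, atoms):
        tz = min(max(reward + discount * z, v_min), v_max)  # clip to support
        b = (tz - v_min) / delta        # fractional index on the grid
        lo = int(b)                     # floor index
        hi = min(lo + 1, len(atoms) - 1)
        if lo == hi or b == lo:
            # shifted atom lands exactly on a grid point
            projected[lo] += p
        else:
            # split mass linearly between the two neighbouring atoms
            projected[lo] += p * (hi - b)
            projected[hi] += p * (b - lo)
    return projected

atoms = [-1.0, 0.0, 1.0]
probs = [0.2, 0.5, 0.3]  # current categorical return distribution
target = project_categorical(probs, atoms, reward=0.5, discount=0.9)
print(target)  # -> [0.08, 0.37, 0.55]
```

Because each shifted atom's mass is split linearly between its neighbouring grid points (and clipped at the support's edges), the projected vector remains a valid probability distribution.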
Author:
Amos, Brandon, Dinh, Laurent, Cabi, Serkan, Rothörl, Thomas, Colmenarejo, Sergio Gómez, Muldal, Alistair, Erez, Tom, Tassa, Yuval, de Freitas, Nando, Denil, Misha
We consider the setting of an agent with a fixed body interacting with an unknown and uncertain external world. We show that models trained to predict proprioceptive information about the agent's body come to represent objects in the external world.
External link:
http://arxiv.org/abs/1804.06318