Showing 1 - 10 of 129 for search: '"Wayne, Greg"'
Authors:
Abramson, Josh, Ahuja, Arun, Carnevale, Federico, Georgiev, Petko, Goldin, Alex, Hung, Alden, Landon, Jessica, Lhotka, Jirka, Lillicrap, Timothy, Muldal, Alistair, Powell, George, Santoro, Adam, Scully, Guy, Srivastava, Sanjana, von Glehn, Tamara, Wayne, Greg, Wong, Nathaniel, Yan, Chen, Zhu, Rui
An important goal in artificial intelligence is to create agents that can both interact naturally with humans and learn from their feedback. Here we demonstrate how to use reinforcement learning from human feedback (RLHF) to improve upon simulated, e
External link:
http://arxiv.org/abs/2211.11602
Authors:
Kurth-Nelson, Zeb, Behrens, Timothy, Wayne, Greg, Miller, Kevin, Luettgau, Lennart, Dolan, Ray, Liu, Yunzhe, Schwartenbeck, Philipp
Replay in the brain has been viewed as rehearsal, or, more recently, as sampling from a transition model. Here, we propose a new hypothesis: that replay is able to implement a form of compositional computation where entities are assembled into relati
External link:
http://arxiv.org/abs/2209.07453
Authors:
Abramson, Josh, Ahuja, Arun, Carnevale, Federico, Georgiev, Petko, Goldin, Alex, Hung, Alden, Landon, Jessica, Lillicrap, Timothy, Muldal, Alistair, Richards, Blake, Santoro, Adam, von Glehn, Tamara, Wayne, Greg, Wong, Nathaniel, Yan, Chen
Creating agents that can interact naturally with humans is a common goal in artificial intelligence (AI) research. However, evaluating these interactions is challenging: collecting online human-agent interactions is slow and expensive, yet faster pro
External link:
http://arxiv.org/abs/2205.13274
Authors:
DeepMind Interactive Agents Team, Abramson, Josh, Ahuja, Arun, Brussee, Arthur, Carnevale, Federico, Cassin, Mary, Fischer, Felix, Georgiev, Petko, Goldin, Alex, Gupta, Mansi, Harley, Tim, Hill, Felix, Humphreys, Peter C, Hung, Alden, Landon, Jessica, Lillicrap, Timothy, Merzic, Hamza, Muldal, Alistair, Santoro, Adam, Scully, Guy, von Glehn, Tamara, Wayne, Greg, Wong, Nathaniel, Yan, Chen, Zhu, Rui
A common vision from science fiction is that robots will one day inhabit our physical spaces, sense the world as we do, assist our physical labours, and communicate with us through natural language. Here we study how to design artificial agents that
External link:
http://arxiv.org/abs/2112.03763
Imitation learning enables agents to reuse and adapt the hard-won expertise of others, offering a solution to several key challenges in learning behavior. Although it is easy to observe behavior in the real-world, the underlying actions may not be ac
External link:
http://arxiv.org/abs/2107.03851
Authors:
Raposo, David, Ritter, Sam, Santoro, Adam, Wayne, Greg, Weber, Theophane, Botvinick, Matt, van Hasselt, Hado, Song, Francis
Since the earliest days of reinforcement learning, the workhorse method for assigning credit to actions over time has been temporal-difference (TD) learning, which propagates credit backward timestep-by-timestep. This approach suffers when delays bet
External link:
http://arxiv.org/abs/2102.12425
Authors:
Abramson, Josh, Ahuja, Arun, Barr, Iain, Brussee, Arthur, Carnevale, Federico, Cassin, Mary, Chhaparia, Rachita, Clark, Stephen, Damoc, Bogdan, Dudzik, Andrew, Georgiev, Petko, Guy, Aurelia, Harley, Tim, Hill, Felix, Hung, Alden, Kenton, Zachary, Landon, Jessica, Lillicrap, Timothy, Mathewson, Kory, Mokrá, Soňa, Muldal, Alistair, Santoro, Adam, Savinov, Nikolay, Varma, Vikrant, Wayne, Greg, Williams, Duncan, Wong, Nathaniel, Yan, Chen, Zhu, Rui
A common vision from science fiction is that robots will one day inhabit our physical spaces, sense the world as we do, assist our physical labours, and communicate with us through natural language. Here we study how to design artificial agents that
External link:
http://arxiv.org/abs/2012.05672
We propose the Gaussian Gated Linear Network (G-GLN), an extension to the recently proposed GLN family of deep neural networks. Instead of using backpropagation to learn features, GLNs have a distributed and local credit assignment mechanism based on
External link:
http://arxiv.org/abs/2006.05964
An ideal cognitively-inspired memory system would compress and organize incoming items. The Kanerva Machine (Wu et al, 2018) is a Bayesian model that naturally implements online memory compression. However, the organization of the Kanerva Machine is
External link:
http://arxiv.org/abs/2002.02385
Authors:
Harutyunyan, Anna, Dabney, Will, Mesnard, Thomas, Azar, Mohammad, Piot, Bilal, Heess, Nicolas, van Hasselt, Hado, Wayne, Greg, Singh, Satinder, Precup, Doina, Munos, Remi
We consider the problem of efficient credit assignment in reinforcement learning. In order to efficiently and meaningfully utilize new data, we propose to explicitly assign credit to past decisions based on the likelihood of them having led to the ob
External link:
http://arxiv.org/abs/1912.02503