Showing 1 - 10 of 43 for search: '"Zintgraf, Luisa"'
Author:
Beck, Jacob, Vuorio, Risto, Liu, Evan Zheran, Xiong, Zheng, Zintgraf, Luisa, Finn, Chelsea, Whiteson, Shimon
While deep reinforcement learning (RL) has fueled multiple high-profile successes in machine learning, it is held back from more widespread adoption by its often poor data efficiency and the limited generality of the policies it produces. A promising
External link:
http://arxiv.org/abs/2301.08028
Author:
Muglich, Darius, Zintgraf, Luisa, de Witt, Christian Schroeder, Whiteson, Shimon, Foerster, Jakob
Self-play is a common paradigm for constructing solutions in Markov games that can yield optimal policies in collaborative settings. However, these policies often adopt highly-specialized conventions that make playing with a novel partner difficult.
External link:
http://arxiv.org/abs/2206.12765
Author:
Alizadeh, Milad, Tailor, Shyam A., Zintgraf, Luisa M, van Amersfoort, Joost, Farquhar, Sebastian, Lane, Nicholas Donald, Gal, Yarin
Pruning neural networks at initialization would enable us to find sparse models that retain the accuracy of the original network while consuming fewer computational resources for training and inference. However, current methods are insufficient to en
External link:
http://arxiv.org/abs/2202.08132
Consistency is the theoretical property of a meta learning algorithm that ensures that, under certain assumptions, it can adapt to any task at test time. An open question is whether and how theoretical consistency translates into practice, in compari
External link:
http://arxiv.org/abs/2112.00478
Author:
Sokota, Samuel, de Witt, Christian Schroeder, Igl, Maximilian, Zintgraf, Luisa, Torr, Philip, Strohmeier, Martin, Kolter, J. Zico, Whiteson, Shimon, Foerster, Jakob
We consider the problem of communicating exogenous information by means of Markov decision process trajectories. This setting, which we call a Markov coding game (MCG), generalizes both source coding and a large class of referential games. MCGs also
External link:
http://arxiv.org/abs/2107.08295
A typical part of learning to play the piano is the progression through a series of practice units that focus on individual dimensions of the skill, such as hand coordination, correct posture, or correct timing. Ideally, a focus on a particular pract
External link:
http://arxiv.org/abs/2106.12937
In this work we explore an auxiliary loss useful for reinforcement learning in environments where strong performing agents are required to be able to navigate a spatial environment. The auxiliary loss proposed is to minimize the classification error
External link:
http://arxiv.org/abs/2104.08492
Author:
Massiceti, Daniela, Zintgraf, Luisa, Bronskill, John, Theodorou, Lida, Harris, Matthew Tobias, Cutrell, Edward, Morrison, Cecily, Hofmann, Katja, Stumpf, Simone
Object recognition has made great advances in the last decade, but predominately still relies on many high-quality training examples per object category. In contrast, learning new objects from only a few examples could enable many impactful applicati
External link:
http://arxiv.org/abs/2104.03841
Author:
Hayes, Conor F., Rădulescu, Roxana, Bargiacchi, Eugenio, Källström, Johan, Macfarlane, Matthew, Reymond, Mathieu, Verstraeten, Timothy, Zintgraf, Luisa M., Dazeley, Richard, Heintz, Fredrik, Howley, Enda, Irissappane, Athirai A., Mannion, Patrick, Nowé, Ann, Ramos, Gabriel, Restelli, Marcello, Vamplew, Peter, Roijers, Diederik M.
Published in:
Auton Agent Multi-Agent Syst 36, 26 (2022)
Real-world decision-making tasks are generally complex, requiring trade-offs between multiple, often conflicting, objectives. Despite this, the majority of research in reinforcement learning and decision-theoretic planning either assumes only a singl
External link:
http://arxiv.org/abs/2103.09568
Agents that interact with other agents often do not know a priori what the other agents' strategies are, but have to maximise their own online return while interacting with and learning about others. The optimal adaptive behaviour under uncertainty o
External link:
http://arxiv.org/abs/2101.03864