Showing 1 - 10 of 108 for search: "Abdolmaleki, Abbas"
Author:
Abdolmaleki, Abbas, Piot, Bilal, Shahriari, Bobak, Springenberg, Jost Tobias, Hertweck, Tim, Joshi, Rishabh, Oh, Junhyuk, Bloesch, Michael, Lampe, Thomas, Heess, Nicolas, Buchli, Jonas, Riedmiller, Martin
Existing preference optimization methods are mainly designed for directly learning from human feedback with the assumption that paired examples (preferred vs. dis-preferred) are available. In contrast, we propose a method that can leverage unpaired … (see the sketch below)
External link:
http://arxiv.org/abs/2410.04166
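The abstract is truncated by the catalog view, but the core idea it states, optimizing a policy from unpaired preferred and dis-preferred examples, can be illustrated with a minimal sketch. Everything below (the sigmoid bounding, `beta`, all names) is an assumption for illustration, not the paper's algorithm:
```python
import torch
import torch.nn.functional as F

def unpaired_preference_loss(logp_pref, logp_dispref, beta=0.1):
    # Push the policy's log-probability up on preferred samples and down on
    # dis-preferred ones; no (preferred, dis-preferred) pairing is needed.
    # The sigmoid bounds the "push down" term -- an assumption for
    # stability, not the paper's construction.
    loss_pref = -F.logsigmoid(beta * logp_pref).mean()
    loss_dispref = -F.logsigmoid(-beta * logp_dispref).mean()
    return loss_pref + loss_dispref

# Dummy usage with random stand-ins for policy log-probabilities.
logp_good = torch.randn(32, requires_grad=True)
logp_bad = torch.randn(16, requires_grad=True)
loss = unpaired_preference_loss(logp_good, logp_bad)
loss.backward()
```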
Author:
Zhang, Jingwei, Lampe, Thomas, Abdolmaleki, Abbas, Springenberg, Jost Tobias, Riedmiller, Martin
We propose an agent architecture that automates parts of the common reinforcement learning experiment workflow, to enable automated mastery of control domains for embodied agents. To do so, it leverages a VLM to perform some of the capabilities normally … (see the sketch below)
External link:
http://arxiv.org/abs/2409.03402
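A rough sketch of the kind of loop the abstract describes, with the VLM call stubbed out. The loop structure, the success-detector role, and every name here (`query_vlm`, `run_episode`) are hypothetical placeholders inferred from the abstract, not the paper's architecture:
```python
import random

def query_vlm(frames: list, question: str) -> str:
    """Placeholder for a VLM call that inspects rollout frames."""
    return random.choice(["success", "failure"])

def run_episode(task: str) -> list:
    """Placeholder rollout returning rendered frames."""
    return [f"frame_{i}_of_{task}" for i in range(3)]

tasks = ["reach", "grasp", "stack"]
progress = {t: 0 for t in tasks}

for step in range(10):
    task = min(tasks, key=lambda t: progress[t])   # train the weakest task first
    frames = run_episode(task)
    verdict = query_vlm(frames, f"Did the robot complete '{task}'?")
    if verdict == "success":
        progress[task] += 1                        # VLM acts as success detector
print(progress)
```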
Author:
Springenberg, Jost Tobias, Abdolmaleki, Abbas, Zhang, Jingwei, Groth, Oliver, Bloesch, Michael, Lampe, Thomas, Brakel, Philemon, Bechtle, Sarah, Kapturowski, Steven, Hafner, Roland, Heess, Nicolas, Riedmiller, Martin
We show that offline actor-critic reinforcement learning can scale to large models - such as transformers - and follows similar scaling laws as supervised learning. We find that offline actor-critic algorithms can outperform strong, supervised, behavioral cloning … (see the sketch below)
External link:
http://arxiv.org/abs/2402.05546
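As a minimal illustration of the offline actor-critic pattern the abstract builds on: both the critic target and the actor update come from a fixed dataset batch, with no environment interaction. This is a generic toy recipe (advantage-weighted regression for the actor), not the paper's large-model implementation; all shapes and coefficients are assumptions:
```python
import torch
import torch.nn as nn

obs_dim, act_dim = 8, 2
actor = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, act_dim))
critic = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam(list(actor.parameters()) + list(critic.parameters()), lr=3e-4)

# One batch from a fixed, pre-collected dataset (dummy tensors here).
s = torch.randn(32, obs_dim); a = torch.randn(32, act_dim)
r = torch.randn(32, 1); s2 = torch.randn(32, obs_dim)

with torch.no_grad():
    a2 = actor(s2)
    td_target = r + 0.99 * critic(torch.cat([s2, a2], dim=-1))
critic_loss = (critic(torch.cat([s, a], dim=-1)) - td_target).pow(2).mean()

with torch.no_grad():
    adv = td_target - critic(torch.cat([s, actor(s)], dim=-1))
# Advantage-weighted regression toward dataset actions (one common offline recipe).
actor_loss = (torch.exp(adv.clamp(max=5.0))
              * (actor(s) - a).pow(2).sum(-1, keepdim=True)).mean()

opt.zero_grad(); (critic_loss + actor_loss).backward(); opt.step()
```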
Author:
Bhardwaj, Mohak, Lampe, Thomas, Neunert, Michael, Romano, Francesco, Abdolmaleki, Abbas, Byravan, Arunkumar, Wulfmeier, Markus, Riedmiller, Martin, Buchli, Jonas
Recent advances in real-world applications of reinforcement learning (RL) have relied on the ability to accurately simulate systems at scale. However, domains such as fluid dynamical systems exhibit complex dynamic phenomena that are hard to simulate …
External link:
http://arxiv.org/abs/2402.06102
Author:
Lampe, Thomas, Abdolmaleki, Abbas, Bechtle, Sarah, Huang, Sandy H., Springenberg, Jost Tobias, Bloesch, Michael, Groth, Oliver, Hafner, Roland, Hertweck, Tim, Neunert, Michael, Wulfmeier, Markus, Zhang, Jingwei, Nori, Francesco, Heess, Nicolas, Riedmiller, Martin
Reinforcement learning solely from an agent's self-generated data is often believed to be infeasible for learning on real robots, due to the amount of data needed. However, if done right, agents learning from real data can be surprisingly efficient …
External link:
http://arxiv.org/abs/2312.11374
Author:
Mishra, Shruti, Anand, Ankit, Hoffmann, Jordan, Heess, Nicolas, Riedmiller, Martin, Abdolmaleki, Abbas, Precup, Doina
We enable reinforcement learning agents to learn successful behavior policies by utilizing relevant pre-existing teacher policies. The teacher policies are introduced as objectives, in addition to the task objective, in a multi-objective policy optimization … (see the sketch below)
External link:
http://arxiv.org/abs/2308.15470
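A minimal sketch of the stated idea: teacher policies entering as extra objectives next to the task objective. The fixed-weight scalarization below is a simplification; the paper's multi-objective policy optimization need not reduce to this, and all names and weights are assumptions:
```python
import torch
import torch.nn.functional as F

def combined_loss(student_logits, task_loss, teacher_logits_list, weights):
    loss = task_loss
    logp_student = F.log_softmax(student_logits, dim=-1)
    for w, t_logits in zip(weights, teacher_logits_list):
        p_teacher = F.softmax(t_logits, dim=-1)
        # One KL(teacher || student) term per teacher, as an auxiliary objective.
        loss = loss + w * F.kl_div(logp_student, p_teacher, reduction="batchmean")
    return loss

# Dummy usage: a batch of 4 states, 3 discrete actions, two teachers.
student = torch.randn(4, 3, requires_grad=True)
teachers = [torch.randn(4, 3), torch.randn(4, 3)]
loss = combined_loss(student, torch.tensor(0.5), teachers, weights=[0.3, 0.2])
loss.backward()
```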
Author:
Bousmalis, Konstantinos, Vezzani, Giulia, Rao, Dushyant, Devin, Coline, Lee, Alex X., Bauza, Maria, Davchev, Todor, Zhou, Yuxiang, Gupta, Agrim, Raju, Akhil, Laurens, Antoine, Fantacci, Claudio, Dalibard, Valentin, Zambelli, Martina, Martins, Murilo, Pevceviciute, Rugile, Blokzijl, Michiel, Denil, Misha, Batchelor, Nathan, Lampe, Thomas, Parisotto, Emilio, Żołna, Konrad, Reed, Scott, Colmenarejo, Sergio Gómez, Scholz, Jon, Abdolmaleki, Abbas, Groth, Oliver, Regli, Jean-Baptiste, Sushkov, Oleg, Rothörl, Tom, Chen, José Enrique, Aytar, Yusuf, Barker, Dave, Ortiz, Joy, Riedmiller, Martin, Springenberg, Jost Tobias, Hadsell, Raia, Nori, Francesco, Heess, Nicolas
The ability to leverage heterogeneous robotic experience from different robots and tasks to quickly master novel skills and embodiments has the potential to transform robot learning. Inspired by recent advances in foundation models for vision and language …
External link:
http://arxiv.org/abs/2306.11706
Author:
Zhang, Jingwei, Springenberg, Jost Tobias, Byravan, Arunkumar, Hasenclever, Leonard, Abdolmaleki, Abbas, Rao, Dushyant, Heess, Nicolas, Riedmiller, Martin
In this paper we study the problem of learning multi-step dynamics prediction models (jumpy models) from unlabeled experience and their utility for fast inference of (high-level) plans in downstream tasks. In particular we propose to learn a jumpy model … (see the sketch below)
External link:
http://arxiv.org/abs/2302.12617
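A jumpy model predicts the state several steps ahead in one call instead of rolling out step by step, which is what makes high-level planning fast. A minimal sketch of fitting such a k-step predictor on unlabeled trajectories (shapes, architecture, and the deterministic MSE objective are all assumptions, not the paper's model):
```python
import torch
import torch.nn as nn

k, obs_dim = 5, 6
jumpy_model = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, obs_dim))
opt = torch.optim.Adam(jumpy_model.parameters(), lr=1e-3)

trajectory = torch.randn(100, obs_dim)          # one unlabeled trajectory
for _ in range(20):
    t = torch.randint(0, len(trajectory) - k, (32,))
    s_t, s_tk = trajectory[t], trajectory[t + k]
    loss = (jumpy_model(s_t) - s_tk).pow(2).mean()  # predict state k steps ahead
    opt.zero_grad(); loss.backward(); opt.step()

# At plan time, chaining the model advances k steps per call instead of one.
with torch.no_grad():
    s = trajectory[0]
    plan = [s := jumpy_model(s) for _ in range(4)]   # ~4k steps of lookahead
```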
Author:
Vezzani, Giulia, Tirumala, Dhruva, Wulfmeier, Markus, Rao, Dushyant, Abdolmaleki, Abbas, Moran, Ben, Haarnoja, Tuomas, Humplik, Jan, Hafner, Roland, Neunert, Michael, Fantacci, Claudio, Hertweck, Tim, Lampe, Thomas, Sadeghi, Fereshteh, Heess, Nicolas, Riedmiller, Martin
The ability to effectively reuse prior knowledge is a key requirement when building general and flexible Reinforcement Learning (RL) agents. Skill reuse is one of the most common approaches, but current methods have considerable limitations. For example …
External link:
http://arxiv.org/abs/2211.13743
Author:
Lee, Alex X., Devin, Coline, Springenberg, Jost Tobias, Zhou, Yuxiang, Lampe, Thomas, Abdolmaleki, Abbas, Bousmalis, Konstantinos
Reinforcement learning (RL) has been shown to be effective at learning control from experience. However, RL typically requires a large amount of online interaction with the environment. This limits its applicability to real-world settings, such as … (see the sketch below)
External link:
http://arxiv.org/abs/2205.03353
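A minimal sketch of the offline alternative the abstract points to: learning entirely from logged transitions, never calling the environment. The conservative penalty here is a generic stand-in (CQL-style) for handling out-of-dataset actions; it is not necessarily what the paper uses, and all names and coefficients are assumptions:
```python
import torch
import torch.nn as nn

obs_dim, n_actions = 4, 3
q_net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, n_actions))
opt = torch.optim.Adam(q_net.parameters(), lr=1e-3)

# Logged dataset: (state, action, reward, next_state) from some prior policy.
s = torch.randn(256, obs_dim); a = torch.randint(0, n_actions, (256,))
r = torch.randn(256); s2 = torch.randn(256, obs_dim)

for _ in range(10):
    q = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = r + 0.99 * q_net(s2).max(dim=1).values
    td_loss = (q - target).pow(2).mean()
    # Penalize Q-values on all actions relative to the logged ones, so the
    # agent does not overestimate actions absent from the dataset.
    conservative = (torch.logsumexp(q_net(s), dim=1) - q).mean()
    loss = td_loss + 0.5 * conservative
    opt.zero_grad(); loss.backward(); opt.step()
```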