Showing 1 - 10
of 3,943
for search: '"Abdolmaleki A"'
Author:
Abdolmaleki, Abbas, Piot, Bilal, Shahriari, Bobak, Springenberg, Jost Tobias, Hertweck, Tim, Joshi, Rishabh, Oh, Junhyuk, Bloesch, Michael, Lampe, Thomas, Heess, Nicolas, Buchli, Jonas, Riedmiller, Martin
Existing preference optimization methods are mainly designed for directly learning from human feedback with the assumption that paired examples (preferred vs. dis-preferred) are available. In contrast, we propose a method that can leverage unpaired p
External link:
http://arxiv.org/abs/2410.04166
Author:
Alnashwan, Rabiah, Yang, Yang, Dong, Yilu, Gope, Prosanta, Abdolmaleki, Behzad, Hussain, Syed Rafiul
Consumers seeking a new mobile plan have many choices in the present mobile landscape. The Mobile Virtual Network Operator (MVNO) has recently gained considerable attention among these options. MVNOs offer various benefits, making them an appealing c
External link:
http://arxiv.org/abs/2409.04877
Author:
Zhang, Jingwei, Lampe, Thomas, Abdolmaleki, Abbas, Springenberg, Jost Tobias, Riedmiller, Martin
We propose an agent architecture that automates parts of the common reinforcement learning experiment workflow, to enable automated mastery of control domains for embodied agents. To do so, it leverages a VLM to perform some of the capabilities norma
External link:
http://arxiv.org/abs/2409.03402
Author:
Abdolmaleki, Reza, Kumashiro, Shinya
Let $A$ be a commutative Noetherian local ring with maximal ideal $\mathfrak{m}$, and let $I$ be an ideal. The fiber cone is then an image of the polynomial ring over the residue field $A/\mathfrak{m}$. The kernel of this map is called the defining i
External link:
http://arxiv.org/abs/2405.18041
Author:
Springenberg, Jost Tobias, Abdolmaleki, Abbas, Zhang, Jingwei, Groth, Oliver, Bloesch, Michael, Lampe, Thomas, Brakel, Philemon, Bechtle, Sarah, Kapturowski, Steven, Hafner, Roland, Heess, Nicolas, Riedmiller, Martin
We show that offline actor-critic reinforcement learning can scale to large models, such as transformers, and follows scaling laws similar to those of supervised learning. We find that offline actor-critic algorithms can outperform strong, supervised, behav
External link:
http://arxiv.org/abs/2402.05546
Author:
Bhardwaj, Mohak, Lampe, Thomas, Neunert, Michael, Romano, Francesco, Abdolmaleki, Abbas, Byravan, Arunkumar, Wulfmeier, Markus, Riedmiller, Martin, Buchli, Jonas
Recent advances in real-world applications of reinforcement learning (RL) have relied on the ability to accurately simulate systems at scale. However, domains such as fluid dynamical systems exhibit complex dynamic phenomena that are hard to simulate
External link:
http://arxiv.org/abs/2402.06102
Author:
Lampe, Thomas, Abdolmaleki, Abbas, Bechtle, Sarah, Huang, Sandy H., Springenberg, Jost Tobias, Bloesch, Michael, Groth, Oliver, Hafner, Roland, Hertweck, Tim, Neunert, Michael, Wulfmeier, Markus, Zhang, Jingwei, Nori, Francesco, Heess, Nicolas, Riedmiller, Martin
Reinforcement learning solely from an agent's self-generated data is often believed to be infeasible for learning on real robots, due to the amount of data needed. However, if done right, agents learning from real data can be surprisingly efficient t
External link:
http://arxiv.org/abs/2312.11374
Author:
Anataichuk, Andrii, Moch, Sven-Olaf, Abdolmaleki, Hamed, Amoroso, Simone, Britzger, Daniel, Dattola, Filippo, Fiaschi, Juri, Giuli, Francesco, Glazov, Alexander, Hautmann, Francesco, Luszczak, Agnieszka, Monfared, Sara Taheri, Olness, Fred, Vazzoler, Federico, Zenaiev, Oleksandr
Neutral current Drell-Yan (DY) lepton-pair production is considered in the framework of the Standard Model Effective Field Theory (SMEFT). Using the open-source fit platform xFitter, we investigate the impact of high-statistics measurements of the ne
External link:
http://arxiv.org/abs/2310.19638
Author:
Mishra, Shruti, Anand, Ankit, Hoffmann, Jordan, Heess, Nicolas, Riedmiller, Martin, Abdolmaleki, Abbas, Precup, Doina
We enable reinforcement learning agents to learn successful behavior policies by utilizing relevant pre-existing teacher policies. The teacher policies are introduced as objectives, in addition to the task objective, in a multi-objective policy optim
External link:
http://arxiv.org/abs/2308.15470
Author:
Abdolmaleki, Y., Kucerovsky, D.
We introduce a slight modification of the usual equivariant $KK$-theory. We use this to give a $KK$-theoretical proof of an equivariant index theorem for Dirac-Schrödinger operators on a non-compact manifold of nowhere positive curvature. We incident
External link:
http://arxiv.org/abs/2306.13987