Showing 1 - 10 of 96
for search: '"Springenberg, Jost Tobias"'
Author:
Abdolmaleki, Abbas, Piot, Bilal, Shahriari, Bobak, Springenberg, Jost Tobias, Hertweck, Tim, Joshi, Rishabh, Oh, Junhyuk, Bloesch, Michael, Lampe, Thomas, Heess, Nicolas, Buchli, Jonas, Riedmiller, Martin
Existing preference optimization methods are mainly designed for directly learning from human feedback with the assumption that paired examples (preferred vs. dis-preferred) are available. In contrast, we propose a method that can leverage unpaired p…
External link:
http://arxiv.org/abs/2410.04166
Author:
Zhang, Jingwei, Lampe, Thomas, Abdolmaleki, Abbas, Springenberg, Jost Tobias, Riedmiller, Martin
We propose an agent architecture that automates parts of the common reinforcement learning experiment workflow, to enable automated mastery of control domains for embodied agents. To do so, it leverages a VLM to perform some of the capabilities norma…
External link:
http://arxiv.org/abs/2409.03402
Author:
Wulfmeier, Markus, Bloesch, Michael, Vieillard, Nino, Ahuja, Arun, Bornschein, Jorg, Huang, Sandy, Sokolov, Artem, Barnes, Matt, Desjardins, Guillaume, Bewley, Alex, Bechtle, Sarah Maria Elisabeth, Springenberg, Jost Tobias, Momchev, Nikola, Bachem, Olivier, Geist, Matthieu, Riedmiller, Martin
The majority of language model training builds on imitation learning. It covers pretraining, supervised fine-tuning, and affects the starting conditions for reinforcement learning from human feedback (RLHF). The simplicity and scalability of maximum…
External link:
http://arxiv.org/abs/2409.01369
Author:
Springenberg, Jost Tobias, Abdolmaleki, Abbas, Zhang, Jingwei, Groth, Oliver, Bloesch, Michael, Lampe, Thomas, Brakel, Philemon, Bechtle, Sarah, Kapturowski, Steven, Hafner, Roland, Heess, Nicolas, Riedmiller, Martin
We show that offline actor-critic reinforcement learning can scale to large models - such as transformers - and follows similar scaling laws as supervised learning. We find that offline actor-critic algorithms can outperform strong, supervised, behav…
External link:
http://arxiv.org/abs/2402.05546
Author:
Lampe, Thomas, Abdolmaleki, Abbas, Bechtle, Sarah, Huang, Sandy H., Springenberg, Jost Tobias, Bloesch, Michael, Groth, Oliver, Hafner, Roland, Hertweck, Tim, Neunert, Michael, Wulfmeier, Markus, Zhang, Jingwei, Nori, Francesco, Heess, Nicolas, Riedmiller, Martin
Reinforcement learning solely from an agent's self-generated data is often believed to be infeasible for learning on real robots, due to the amount of data needed. However, if done right, agents learning from real data can be surprisingly efficient t…
External link:
http://arxiv.org/abs/2312.11374
Author:
Bousmalis, Konstantinos, Vezzani, Giulia, Rao, Dushyant, Devin, Coline, Lee, Alex X., Bauza, Maria, Davchev, Todor, Zhou, Yuxiang, Gupta, Agrim, Raju, Akhil, Laurens, Antoine, Fantacci, Claudio, Dalibard, Valentin, Zambelli, Martina, Martins, Murilo, Pevceviciute, Rugile, Blokzijl, Michiel, Denil, Misha, Batchelor, Nathan, Lampe, Thomas, Parisotto, Emilio, Żołna, Konrad, Reed, Scott, Colmenarejo, Sergio Gómez, Scholz, Jon, Abdolmaleki, Abbas, Groth, Oliver, Regli, Jean-Baptiste, Sushkov, Oleg, Rothörl, Tom, Chen, José Enrique, Aytar, Yusuf, Barker, Dave, Ortiz, Joy, Riedmiller, Martin, Springenberg, Jost Tobias, Hadsell, Raia, Nori, Francesco, Heess, Nicolas
The ability to leverage heterogeneous robotic experience from different robots and tasks to quickly master novel skills and embodiments has the potential to transform robot learning. Inspired by recent advances in foundation models for vision and lan…
External link:
http://arxiv.org/abs/2306.11706
Author:
Schubert, Ingmar, Zhang, Jingwei, Bruce, Jake, Bechtle, Sarah, Parisotto, Emilio, Riedmiller, Martin, Springenberg, Jost Tobias, Byravan, Arunkumar, Hasenclever, Leonard, Heess, Nicolas
We investigate the use of transformer sequence models as dynamics models (TDMs) for control. We find that TDMs exhibit strong generalization capabilities to unseen environments, both in a few-shot setting, where a generalist TDM is fine-tuned with sm…
External link:
http://arxiv.org/abs/2305.10912
Author:
Zhang, Jingwei, Springenberg, Jost Tobias, Byravan, Arunkumar, Hasenclever, Leonard, Abdolmaleki, Abbas, Rao, Dushyant, Heess, Nicolas, Riedmiller, Martin
In this paper we study the problem of learning multi-step dynamics prediction models (jumpy models) from unlabeled experience and their utility for fast inference of (high-level) plans in downstream tasks. In particular we propose to learn a jumpy mo…
External link:
http://arxiv.org/abs/2302.12617
Author:
Reed, Scott, Zolna, Konrad, Parisotto, Emilio, Colmenarejo, Sergio Gomez, Novikov, Alexander, Barth-Maron, Gabriel, Gimenez, Mai, Sulsky, Yury, Kay, Jackie, Springenberg, Jost Tobias, Eccles, Tom, Bruce, Jake, Razavi, Ali, Edwards, Ashley, Heess, Nicolas, Chen, Yutian, Hadsell, Raia, Vinyals, Oriol, Bordbar, Mahyar, de Freitas, Nando
Published in:
Transactions on Machine Learning Research, 11/2022, https://openreview.net/forum?id=1ikK0kHjvj
Inspired by progress in large-scale language modeling, we apply a similar approach towards building a single generalist agent beyond the realm of text outputs. The agent, which we refer to as Gato, works as a multi-modal, multi-task, multi-embodiment…
External link:
http://arxiv.org/abs/2205.06175
Author:
Lee, Alex X., Devin, Coline, Springenberg, Jost Tobias, Zhou, Yuxiang, Lampe, Thomas, Abdolmaleki, Abbas, Bousmalis, Konstantinos
Reinforcement learning (RL) has been shown to be effective at learning control from experience. However, RL typically requires a large amount of online interaction with the environment. This limits its applicability to real-world settings, such as in…
External link:
http://arxiv.org/abs/2205.03353