Search results - "Dorka, Nicolai"

Report

Quantile Regression for Distributional Reward Models in RLHF

Author: Dorka, Nicolai

Reinforcement learning from human feedback (RLHF) has become a key method for aligning large language models (LLMs) with human preferences through the use of reward models. However, traditional reward models typically generate point estimates, which

External link: http://arxiv.org/abs/2409.10164

Report

Training a Vision Language Model as Smartphone Assistant

Authors: Dorka, Nicolai; Marecki, Janusz; Anwar, Ammar

Addressing the challenge of a digital assistant capable of executing a wide array of user tasks, our research focuses on the realm of instruction-based mobile device control. We leverage recent advancements in large language models (LLMs) and present

External link: http://arxiv.org/abs/2404.08755

Report

Improving Deep Dynamics Models for Autonomous Vehicles with Multimodal Latent Mapping of Surfaces

Authors: Vertens, Johan; Dorka, Nicolai; Welschehold, Tim; Thompson, Michael; Burgard, Wolfram

The safe deployment of autonomous vehicles relies on their ability to effectively react to environmental changes. This can require maneuvering on varying surfaces which is still a difficult problem, especially for slippery terrains. To address this i

External link: http://arxiv.org/abs/2303.11756

Report

Dynamic Update-to-Data Ratio: Minimizing World Model Overfitting

Authors: Dorka, Nicolai; Welschehold, Tim; Burgard, Wolfram

Early stopping based on the validation set performance is a popular approach to find the right balance between under- and overfitting in the context of supervised learning. However, in reinforcement learning, even for supervised sub-problems such as

External link: http://arxiv.org/abs/2303.10144

Report

Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning

Authors: Dorka, Nicolai; Welschehold, Tim; Boedecker, Joschka; Burgard, Wolfram

Accurate value estimates are important for off-policy reinforcement learning. Algorithms based on temporal difference learning typically are prone to an over- or underestimation bias building up over time. In this paper, we propose a general method c

External link: http://arxiv.org/abs/2111.12673

Report

Modality-Buffet for Real-Time Object Detection

Authors: Dorka, Nicolai; Meyer, Johannes; Burgard, Wolfram

Real-time object detection in videos using lightweight hardware is a crucial component of many robotic tasks. Detectors using different modalities and with varying computational complexities offer different trade-offs. One option is to have a very li

External link: http://arxiv.org/abs/2011.08726

Report

Scaling Imitation Learning in Minecraft

Authors: Amiranashvili, Artemij; Dorka, Nicolai; Burgard, Wolfram; Koltun, Vladlen; Brox, Thomas

Imitation learning is a powerful family of techniques for learning sensorimotor coordination in immersive environments. We apply imitation learning to attain state-of-the-art performance on hard exploration problems in the Minecraft environment. We r

External link: http://arxiv.org/abs/2007.02701

Report

Scheduled Intrinsic Drive: A Hierarchical Take on Intrinsically Motivated Exploration

Authors: Zhang, Jingwei; Wetzel, Niklas; Dorka, Nicolai; Boedecker, Joschka; Burgard, Wolfram

Exploration in sparse reward reinforcement learning remains an open challenge. Many state-of-the-art methods use intrinsic motivation to complement the sparse extrinsic reward signal, giving the agent more opportunities to receive feedback during exp

External link: http://arxiv.org/abs/1903.07400
