Výsledky vyhledávání - "Levine, Sergey"

Report

What Do Learning Dynamics Reveal About Generalization in LLM Reasoning?

Autor: Kang, Katie, Setlur, Amrith, Ghosh, Dibya, Steinhardt, Jacob, Tomlin, Claire, Levine, Sergey, Kumar, Aviral

Despite the remarkable capabilities of modern large language models (LLMs), the mechanisms behind their problem-solving abilities remain elusive. In this work, we aim to better understand how the learning dynamics of LLM finetuning shapes downstream

Externí odkaz: http://arxiv.org/abs/2411.07681

Zobrazit plný text záznamu

Report

Interactive Dialogue Agents via Reinforcement Learning on Hindsight Regenerations

Autor: Hong, Joey, Lin, Jessica, Dragan, Anca, Levine, Sergey

Recent progress on large language models (LLMs) has enabled dialogue agents to generate highly naturalistic and plausible text. However, current LLM language generation focuses on responding accurately to questions and requests with a single effectiv

Externí odkaz: http://arxiv.org/abs/2411.05194

Zobrazit plný text záznamu

Report

Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning

Autor: Hong, Joey, Dragan, Anca, Levine, Sergey

Value-based reinforcement learning (RL) can in principle learn effective policies for a wide range of multi-turn problems, from games to dialogue to robotic control, including via offline RL from static previously collected datasets. However, despite

Externí odkaz: http://arxiv.org/abs/2411.05193

Zobrazit plný text záznamu

Report

Learning to Assist Humans without Inferring Rewards

Autor: Myers, Vivek, Ellis, Evan, Levine, Sergey, Eysenbach, Benjamin, Dragan, Anca

Assistive agents should make humans' lives easier. Classically, such assistance is studied through the lens of inverse reinforcement learning, where an assistive agent (e.g., a chatbot, a robot) infers a human's intention and then selects actions to

Externí odkaz: http://arxiv.org/abs/2411.02623

Zobrazit plný text záznamu

Report

$\pi_0$: A Vision-Language-Action Flow Model for General Robot Control

Robot learning holds tremendous promise to unlock the full potential of flexible, general, and dexterous robot systems, as well as to address some of the deepest questions in artificial intelligence. However, bringing robot learning to the level of g

Externí odkaz: http://arxiv.org/abs/2410.24164

Zobrazit plný text záznamu

Report

Precise and Dexterous Robotic Manipulation via Human-in-the-Loop Reinforcement Learning

Autor: Luo, Jianlan, Xu, Charles, Wu, Jeffrey, Levine, Sergey

Reinforcement learning (RL) holds great promise for enabling autonomous acquisition of complex robotic manipulation skills, but realizing this potential in real-world settings has been challenging. We present a human-in-the-loop vision-based RL syste

Externí odkaz: http://arxiv.org/abs/2410.21845

Zobrazit plný text záznamu

Report

OGBench: Benchmarking Offline Goal-Conditioned RL

Autor: Park, Seohong, Frans, Kevin, Eysenbach, Benjamin, Levine, Sergey

Offline goal-conditioned reinforcement learning (GCRL) is a major problem in reinforcement learning (RL) because it provides a simple, unsupervised, and domain-agnostic way to acquire diverse behaviors and representations from unlabeled data without

Externí odkaz: http://arxiv.org/abs/2410.20092

Zobrazit plný text záznamu

Report

GHIL-Glue: Hierarchical Control with Filtered Subgoal Images

Autor: Hatch, Kyle B., Balakrishna, Ashwin, Mees, Oier, Nair, Suraj, Park, Seohong, Wulfe, Blake, Itkina, Masha, Eysenbach, Benjamin, Levine, Sergey, Kollar, Thomas, Burchfiel, Benjamin

Image and video generative models that are pre-trained on Internet-scale data can greatly increase the generalization capacity of robot learning systems. These models can function as high-level planners, generating intermediate subgoals for low-level

Externí odkaz: http://arxiv.org/abs/2410.20018

Zobrazit plný text záznamu

Report

Prioritized Generative Replay

Autor: Wang, Renhao, Frans, Kevin, Abbeel, Pieter, Levine, Sergey, Efros, Alexei A.

Sample-efficient online reinforcement learning often uses replay buffers to store experience for reuse when updating the value function. However, uniform replay is inefficient, since certain classes of transitions can be more relevant to learning. Wh

Externí odkaz: http://arxiv.org/abs/2410.18082

Zobrazit plný text záznamu

Report

Leveraging Skills from Unlabeled Prior Data for Efficient Online Exploration

Autor: Wilcoxson, Max, Li, Qiyang, Frans, Kevin, Levine, Sergey

Unsupervised pretraining has been transformative in many supervised domains. However, applying such ideas to reinforcement learning (RL) presents a unique challenge in that fine-tuning does not involve mimicking task-specific data, but rather explori

Externí odkaz: http://arxiv.org/abs/2410.18076

Zobrazit plný text záznamu

Vyhledávací nástroje:

Upřesnit hledání