Showing 1 - 10
of 163
for search: '"Watkins, Olivia"'
Author:
Souly, Alexandra, Lu, Qingyuan, Bowen, Dillon, Trinh, Tu, Hsieh, Elvis, Pandey, Sana, Abbeel, Pieter, Svegliato, Justin, Emmons, Scott, Watkins, Olivia, Toyer, Sam
Most jailbreak papers claim the jailbreaks they propose are highly effective, often boasting near-100% attack success rates. However, it is perhaps more common than not for jailbreak developers to substantially exaggerate the effectiveness of their …
External link:
http://arxiv.org/abs/2402.10260
Author:
Toyer, Sam, Watkins, Olivia, Mendes, Ethan Adrian, Svegliato, Justin, Bailey, Luke, Wang, Tiffany, Ong, Isaac, Elmaaroufi, Karim, Abbeel, Pieter, Darrell, Trevor, Ritter, Alan, Russell, Stuart
While Large Language Models (LLMs) are increasingly being used in real-world applications, they remain vulnerable to prompt injection attacks: malicious third-party prompts that subvert the intent of the system designer. To help researchers study …
External link:
http://arxiv.org/abs/2311.01011
Author:
Lin, Jessy, Du, Yuqing, Watkins, Olivia, Hafner, Danijar, Abbeel, Pieter, Klein, Dan, Dragan, Anca
To interact with humans and act in the world, agents need to understand the range of language that people use and relate it to the visual world. While current agents can learn to execute simple language instructions, we aim to build agents that …
External link:
http://arxiv.org/abs/2308.01399
Author:
Fan, Ying, Watkins, Olivia, Du, Yuqing, Liu, Hao, Ryu, Moonkyung, Boutilier, Craig, Abbeel, Pieter, Ghavamzadeh, Mohammad, Lee, Kangwook, Lee, Kimin
Learning from human feedback has been shown to improve text-to-image models. These techniques first learn a reward function that captures what humans care about in the task and then improve the models based on the learned reward function. Even though …
External link:
http://arxiv.org/abs/2305.16381
Author:
Lee, Kimin, Liu, Hao, Ryu, Moonkyung, Watkins, Olivia, Du, Yuqing, Boutilier, Craig, Abbeel, Pieter, Ghavamzadeh, Mohammad, Gu, Shixiang Shane
Deep generative models have shown impressive results in text-to-image synthesis. However, current text-to-image models often generate images that are inadequately aligned with text prompts. We propose a fine-tuning method for aligning such models …
External link:
http://arxiv.org/abs/2302.12192
Author:
Du, Yuqing, Watkins, Olivia, Wang, Zihan, Colas, Cédric, Darrell, Trevor, Abbeel, Pieter, Gupta, Abhishek, Andreas, Jacob
Reinforcement learning algorithms typically struggle in the absence of a dense, well-shaped reward function. Intrinsically motivated exploration methods address this limitation by rewarding agents for visiting novel states or transitions, but these …
External link:
http://arxiv.org/abs/2302.06692
Training automated agents to complete complex tasks in interactive environments is challenging: reinforcement learning requires careful hand-engineering of reward functions, imitation learning requires specialized infrastructure and access to a human …
External link:
http://arxiv.org/abs/2203.11197
Author:
Frost, Julius, Watkins, Olivia, Weiner, Eric, Abbeel, Pieter, Darrell, Trevor, Plummer, Bryan, Saenko, Kate
In order for humans to confidently decide where to employ RL agents for real-world tasks, a human developer must validate that the agent will perform well at test time. Some policy interpretability methods facilitate this by capturing the policy's …
External link:
http://arxiv.org/abs/2201.12462
Policies trained in simulation often fail when transferred to the real world due to the 'reality gap', where the simulator is unable to accurately capture the dynamics and visual properties of the real world. Current approaches to tackle this problem …
External link:
http://arxiv.org/abs/2104.07662
Many challenges in natural language processing require generating text, including language translation, dialogue generation, and speech recognition. For all of these problems, text generation becomes more difficult as the text becomes longer. Current …
External link:
http://arxiv.org/abs/1810.08802