Showing 1 - 10 of 1,251
for the search: '"Dipendra, P."'
Imitation learning (IL) aims to mimic the behavior of an expert in a sequential decision making task by learning from demonstrations, and has been widely applied to robotics, autonomous driving, and autoregressive text generation. The simplest approach …
External link:
http://arxiv.org/abs/2407.15007
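The first result above describes imitation learning from demonstrations. The simplest instance, behavioral cloning, treats the demonstrations as a supervised dataset of (state, action) pairs and fits a policy by regression. A minimal sketch (toy linear policy, all names and data illustrative, not from the paper):

```python
import numpy as np

def behavioral_cloning(states: np.ndarray, actions: np.ndarray) -> np.ndarray:
    """Fit a linear policy W minimizing ||states @ W - actions||^2
    over the expert demonstrations (ordinary least squares)."""
    W, *_ = np.linalg.lstsq(states, actions, rcond=None)
    return W

def policy(W: np.ndarray, state: np.ndarray) -> np.ndarray:
    """Apply the learned linear policy to a state."""
    return state @ W

# Toy expert that acts as action = 2 * state; the cloned policy recovers it.
rng = np.random.default_rng(0)
S = rng.normal(size=(100, 3))   # demonstration states
A = 2.0 * S                     # expert actions
W = behavioral_cloning(S, A)
print(np.allclose(policy(W, S), A, atol=1e-6))  # True
```

Behavioral cloning ignores the sequential aspect entirely, which is exactly the weakness that interactive IL methods like those surveyed in the paper address.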
We study interactive learning of LLM-based language agents based on user edits made to the agent's output. In a typical setting such as writing assistants, the user interacts with a language agent to generate a response given a context, and may optionally …
External link:
http://arxiv.org/abs/2404.15269
We study interactive learning in a setting where the agent has to generate a response (e.g., an action or trajectory) given a context and an instruction. In contrast to typical approaches that train the system using reward or expert supervision on r…
External link:
http://arxiv.org/abs/2404.09123
Authors:
Chang, Jonathan D., Zhan, Wenhao, Oertell, Owen, Brantley, Kianté, Misra, Dipendra, Lee, Jason D., Sun, Wen
Reinforcement Learning (RL) from Human Preference-based feedback is a popular paradigm for fine-tuning generative models, which has produced impressive models such as GPT-4 and Claude 3 Opus. This framework often consists of two steps: learning a reward …
External link:
http://arxiv.org/abs/2404.08495
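The abstract above describes the standard two-step RLHF pipeline: first learn a reward model from human preference data, then fine-tune the policy against it with RL. The first step is commonly a Bradley-Terry model, P(a preferred over b) = sigmoid(r(a) - r(b)). A minimal sketch of that reward-learning step (linear reward, toy data, all names illustrative, not the paper's method):

```python
import numpy as np

def sigmoid(z: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-z))

def fit_reward(pref_a: np.ndarray, pref_b: np.ndarray,
               lr: float = 0.5, steps: int = 500) -> np.ndarray:
    """Fit a linear reward r(x) = w . x by gradient ascent on the
    Bradley-Terry log-likelihood of the preference pairs (a preferred over b)."""
    w = np.zeros(pref_a.shape[1])
    for _ in range(steps):
        d = pref_a - pref_b                 # feature differences a - b
        p = sigmoid(d @ w)                  # model's P(a preferred over b)
        w += lr * d.T @ (1.0 - p) / len(d)  # gradient of mean log-likelihood
    return w

# Toy preferences: the preferred samples score higher on the first feature.
rng = np.random.default_rng(1)
a = rng.normal(loc=[1.0, 0.0], size=(200, 2))  # preferred responses
b = rng.normal(loc=[0.0, 0.0], size=(200, 2))  # rejected responses
w = fit_reward(a, b)
print("learned reward weights:", w)
```

The learned weight vector places most of its mass on the feature that actually drove the preferences; the second RLHF step (not shown) would then optimize a policy against this reward with an RL algorithm such as PPO.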
We study pre-training representations for decision-making using video data, which is abundantly available for tasks such as game agents and software testing. Even though significant empirical advances have been made on this problem, a theoretical understanding …
External link:
http://arxiv.org/abs/2403.13765
We introduce Language Feedback Models (LFMs) that identify desirable behaviour - actions that help achieve tasks specified in the instruction - for imitation learning in instruction following. To train LFMs, we obtain feedback from Large Language Models …
External link:
http://arxiv.org/abs/2402.07876
Recent advancements in deep learning have led to the development of powerful language models (LMs) that excel in various tasks. Despite these achievements, there is still room for improvement, particularly in enhancing reasoning abilities and incorporating …
External link:
http://arxiv.org/abs/2312.15021
Transformer-based Large Language Models (LLMs) have become a fixture in modern machine learning. Correspondingly, significant resources are allocated towards research that aims to further advance this technology, typically resulting in models of increasing …
External link:
http://arxiv.org/abs/2312.13558
We introduce a new benchmark, LLF-Bench (Learning from Language Feedback Benchmark; pronounced as "elf-bench"), to evaluate the ability of AI agents to interactively learn from natural language feedback and instructions. Learning from language feedback …
External link:
http://arxiv.org/abs/2312.06853
Reinforcement learning (RL) has emerged as a powerful paradigm for fine-tuning Large Language Models (LLMs) for text generation. In particular, recent LLMs such as ChatGPT and GPT-4 can engage in fluent conversations with users after finetuning with …
External link:
http://arxiv.org/abs/2306.11816