Showing 1 - 10 of 58 for search: '"Biyik, Erdem"'
Learning from human feedback has gained traction in fields like robotics and natural language processing in recent years. While prior works mostly rely on human feedback in the form of comparisons, language is a preferable modality that provides more …
External link:
http://arxiv.org/abs/2410.06401
Published in:
CoRL 2024
Most reinforcement learning (RL) methods focus on learning optimal policies over low-level action spaces. While these methods can perform well in their training environments, they lack the flexibility to transfer to new tasks. Instead, RL agents that …
External link:
http://arxiv.org/abs/2406.17768
Adaptive brain stimulation can treat neurological conditions such as Parkinson's disease and post-stroke motor deficits by influencing abnormal neural activity. Because of patient heterogeneity, each patient requires a unique stimulation policy to …
External link:
http://arxiv.org/abs/2406.06714
Training robots to perform complex control tasks from high-dimensional pixel input using reinforcement learning (RL) is sample-inefficient, because image observations are comprised primarily of task-irrelevant information. By contrast, humans are able …
External link:
http://arxiv.org/abs/2403.10940
Preference-based reward learning is a popular technique for teaching robots and autonomous systems how a human user wants them to perform a task. Previous works have shown that actively synthesizing preference queries to maximize information gain about …
External link:
http://arxiv.org/abs/2403.06003
Authors:
Liang, Anthony, Tennenholtz, Guy, Hsu, Chih-wei, Chow, Yinlam, Bıyık, Erdem, Boutilier, Craig
We introduce DynaMITE-RL, a meta-reinforcement learning (meta-RL) approach to approximate inference in environments where the latent state evolves at varying rates. We model episode sessions - parts of the episode where the latent state is fixed - and …
External link:
http://arxiv.org/abs/2402.15957
Data generation and labeling are often expensive in robot learning. Preference-based learning is a concept that enables reliable labeling by querying users with preference questions. Active querying methods are commonly employed in preference-based learning …
External link:
http://arxiv.org/abs/2402.15757
Authors:
Wang, Yufei, Sun, Zhanyi, Zhang, Jesse, Xian, Zhou, Biyik, Erdem, Held, David, Erickson, Zackory
Reward engineering has long been a challenge in Reinforcement Learning (RL) research, as it often requires extensive human effort and iterative processes of trial-and-error to design effective reward functions. In this paper, we propose RL-VLM-F, a …
External link:
http://arxiv.org/abs/2402.03681
Authors:
Biyik, Erdem, Yao, Fan, Chow, Yinlam, Haig, Alex, Hsu, Chih-wei, Ghavamzadeh, Mohammad, Boutilier, Craig
Preference elicitation plays a central role in interactive recommender systems. Most preference elicitation approaches use either item queries that ask users to select preferred items from a slate, or attribute queries that ask them to express their …
External link:
http://arxiv.org/abs/2311.02085
Authors:
Sontakke, Sumedh A, Zhang, Jesse, Arnold, Sébastien M. R., Pertsch, Karl, Bıyık, Erdem, Sadigh, Dorsa, Finn, Chelsea, Itti, Laurent
Reward specification is a notoriously difficult problem in reinforcement learning, requiring extensive expert supervision to design robust reward functions. Imitation learning (IL) methods attempt to circumvent these problems by utilizing expert demonstrations …
External link:
http://arxiv.org/abs/2310.07899