Showing 1 - 10 of 2,981 results for search: '"W, Bradley"'
Autonomous agents powered by large language models (LLMs) show promising potential in assistive tasks across various domains, including mobile device control. As these agents interact directly with personal information and device settings, ensuring t
External link:
http://arxiv.org/abs/2410.17520
Large language models (LLMs) must often respond to highly ambiguous user requests. In such cases, the LLM's best response may be to ask a clarifying question to elicit more information. We observe existing LLMs often respond by presupposing a single
External link:
http://arxiv.org/abs/2410.13788
Author:
Hejna, Joey, Rafailov, Rafael, Sikchi, Harshit, Finn, Chelsea, Niekum, Scott, Knox, W. Bradley, Sadigh, Dorsa
Reinforcement Learning from Human Feedback (RLHF) has emerged as a popular paradigm for aligning models with human intent. Typically RLHF algorithms operate in two phases: first, use human preferences to learn a reward function and second, align the
External link:
http://arxiv.org/abs/2310.13639
Author:
Knox, W. Bradley, Hatgis-Kessell, Stephane, Adalgeirsson, Sigurdur Orn, Booth, Serena, Dragan, Anca, Stone, Peter, Niekum, Scott
We consider algorithms for learning reward functions from human preferences over pairs of trajectory segments, as used in reinforcement learning from human feedback (RLHF). Most recent work assumes that human preferences are generated based only upon
External link:
http://arxiv.org/abs/2310.02456
Author:
McKibben, W. Bradley, Lenz, A. Stephen, Alvero, Arianna
Published in:
Counseling Outcome Research & Evaluation. Oct. 2024, pp. 1-15. 15 pp. 1 illustration.
Author:
Knox, W. Bradley, Hatgis-Kessell, Stephane, Booth, Serena, Niekum, Scott, Stone, Peter, Allievi, Alessandro
The utility of reinforcement learning is limited by the alignment of reward functions with the interests of human stakeholders. One promising method for alignment is to learn the reward function from human-generated preferences between pairs of traje
External link:
http://arxiv.org/abs/2206.02231
Author:
Wendel, W. Bradley
Published in:
Methodology in Private Law Theory : Between New Private Law and Rechtsdogmatik, 2024.
External link:
https://doi.org/10.1093/oso/9780198885306.003.0011