Search results

Report

MobileSafetyBench: Evaluating Safety of Autonomous Agents in Mobile Device Control

Authors: Lee, Juyong; Hahm, Dongyoon; Choi, June Suk; Knox, W. Bradley; Lee, Kimin

Autonomous agents powered by large language models (LLMs) show promising potential in assistive tasks across various domains, including mobile device control. As these agents interact directly with personal information and device settings, ensuring t

External link: http://arxiv.org/abs/2410.17520

Report

Modeling Future Conversation Turns to Teach LLMs to Ask Clarifying Questions

Authors: Zhang, Michael J. Q.; Knox, W. Bradley; Choi, Eunsol

Large language models (LLMs) must often respond to highly ambiguous user requests. In such cases, the LLM's best response may be to ask a clarifying question to elicit more information. We observe existing LLMs often respond by presupposing a single

External link: http://arxiv.org/abs/2410.13788

Book

Canceling Lawyers : Case Studies of Accountability, Toleration, and Regret. [electronic resource]

Author: Wendel, W. Bradley

External link: KNAV e-book collection (Registered users: full text online 5 minutes, further access on request.)

Report

Contrastive Preference Learning: Learning from Human Feedback without RL

Authors: Hejna, Joey; Rafailov, Rafael; Sikchi, Harshit; Finn, Chelsea; Niekum, Scott; Knox, W. Bradley; Sadigh, Dorsa

Reinforcement Learning from Human Feedback (RLHF) has emerged as a popular paradigm for aligning models with human intent. Typically RLHF algorithms operate in two phases: first, use human preferences to learn a reward function and second, align the

External link: http://arxiv.org/abs/2310.13639

Report

Learning Optimal Advantage from Preferences and Mistaking it for Reward

Authors: Knox, W. Bradley; Hatgis-Kessell, Stephane; Adalgeirsson, Sigurdur Orn; Booth, Serena; Dragan, Anca; Stone, Peter; Niekum, Scott

We consider algorithms for learning reward functions from human preferences over pairs of trajectory segments, as used in reinforcement learning from human feedback (RLHF). Most recent work assumes that human preferences are generated based only upon

External link: http://arxiv.org/abs/2310.02456

Book

Ethics and law : an introduction / W. Bradley Wendel, Cornell University. [electronic resource]

Author: Wendel, W. Bradley, 1969-

External link: KNAV e-book collection

Book

Lawyers and fidelity to law [electronic resource] / W. Bradley Wendel.

Author: Wendel, W. Bradley, 1969-

External link: KNAV e-book collection

Academic article

A Meta-Analysis of the Association Between the Hold Me Tight Program and Couples’ Relationship Adjustment.

Authors: McKibben, W. Bradley (wmckibb@ju.edu); Lenz, A. Stephen; Alvero, Arianna

Published in: Counseling Outcome Research & Evaluation. Oct 2024, p. 1-15. 15 p. 1 illustration.

Report

Models of human preference for learning reward functions

Authors: Knox, W. Bradley; Hatgis-Kessell, Stephane; Booth, Serena; Niekum, Scott; Stone, Peter; Allievi, Alessandro

The utility of reinforcement learning is limited by the alignment of reward functions with the interests of human stakeholders. One promising method for alignment is to learn the reward function from human-generated preferences between pairs of traje

External link: http://arxiv.org/abs/2206.02231

E-book

How Can You Have Law Without Lawyers? : Legal Formalism, Legality, and the Law Governing Lawyers

Author: Wendel, W. Bradley

Published in: Methodology in Private Law Theory : Between New Private Law and Rechtsdogmatik, 2024.

External link: https://doi.org/10.1093/oso/9780198885306.003.0011
