Search results - "Svegliato, Justin"

Report

Authors: Souly, Alexandra; Lu, Qingyuan; Bowen, Dillon; Trinh, Tu; Hsieh, Elvis; Pandey, Sana; Abbeel, Pieter; Svegliato, Justin; Emmons, Scott; Watkins, Olivia; Toyer, Sam

Most jailbreak papers claim the jailbreaks they propose are highly effective, often boasting near-100% attack success rates. However, it is perhaps more common than not for jailbreak developers to substantially exaggerate the effectiveness of their j

External link: http://arxiv.org/abs/2402.10260

Report

Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game

Authors: Toyer, Sam; Watkins, Olivia; Mendes, Ethan Adrian; Svegliato, Justin; Bailey, Luke; Wang, Tiffany; Ong, Isaac; Elmaaroufi, Karim; Abbeel, Pieter; Darrell, Trevor; Ritter, Alan; Russell, Stuart

While Large Language Models (LLMs) are increasingly being used in real-world applications, they remain vulnerable to prompt injection attacks: malicious third party prompts that subvert the intent of the system designer. To help researchers study thi

External link: http://arxiv.org/abs/2311.01011

Report

Active teacher selection for reinforcement learning from human feedback

Authors: Freedman, Rachel; Svegliato, Justin; Wray, Kyle; Russell, Stuart

Reinforcement learning from human feedback (RLHF) enables machine learning systems to learn objectives from human feedback. A core limitation of these systems is their assumption that all feedback comes from a single human teacher, despite querying a

External link: http://arxiv.org/abs/2310.15288

Report

Active Reward Learning from Multiple Teachers

Authors: Barnett, Peter; Freedman, Rachel; Svegliato, Justin; Russell, Stuart

Reward learning algorithms utilize human feedback to infer a reward function, which is then used to train an AI system. This human feedback is often a preference comparison, in which the human teacher compares several samples of AI behavior and choos

External link: http://arxiv.org/abs/2303.00894

Report

Fairness and Sequential Decision Making: Limits, Lessons, and Opportunities

Authors: Nashed, Samer B.; Svegliato, Justin; Blodgett, Su Lin

As automated decision making and decision assistance systems become common in everyday life, research on the prevention or mitigation of potential harms that arise from decisions made by these systems has proliferated. However, various research commu

External link: http://arxiv.org/abs/2301.05753

Report

Agent-aware State Estimation in Autonomous Vehicles

Authors: Parr, Shane; Khatri, Ishan; Svegliato, Justin; Zilberstein, Shlomo

Autonomous systems often operate in environments where the behavior of multiple agents is coordinated by a shared global state. Reliable estimation of the global state is thus critical for successfully operating in a multi-agent setting. We introduce

External link: http://arxiv.org/abs/2108.00366

Report

Improving Competence for Reliable Autonomy

Authors: Basich, Connor; Svegliato, Justin; Wray, Kyle Hollins; Witwicki, Stefan J.; Zilberstein, Shlomo

Published in: EPTCS 319, 2020, pp. 37-53

Given the complexity of real-world, unstructured domains, it is often impossible or impractical to design models that include every feature needed to handle all possible scenarios that an autonomous system may encounter. For an autonomous system to b

External link: http://arxiv.org/abs/2007.11740

Report

Learning to Optimize Autonomy in Competence-Aware Systems

Authors: Basich, Connor; Svegliato, Justin; Wray, Kyle Hollins; Witwicki, Stefan; Biswas, Joydeep; Zilberstein, Shlomo

Interest in semi-autonomous systems (SAS) is growing rapidly as a paradigm to deploy autonomous systems in domains that require occasional reliance on humans. This paradigm allows service robots or autonomous vehicles to operate at varying levels of

External link: http://arxiv.org/abs/2003.07745

Academic article

Competence-aware systems

Authors: Basich, Connor; Svegliato, Justin; Wray, Kyle H.; Witwicki, Stefan; Biswas, Joydeep; Zilberstein, Shlomo

Published in: Artificial Intelligence, vol. 316, March 2023

Metareasoning for Planning and Execution in Autonomous Systems

Author: Svegliato, Justin

Metareasoning is the process by which an autonomous system optimizes, specifically monitors and controls, its own planning and execution processes in order to operate more effectively in its environment. As autonomous systems rapidly grow in sophisti

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::9a839caae3391245253ad3ca9c798406
