Showing 1 - 10 of 18 for search: '"Svegliato, Justin"'
Author:
Souly, Alexandra, Lu, Qingyuan, Bowen, Dillon, Trinh, Tu, Hsieh, Elvis, Pandey, Sana, Abbeel, Pieter, Svegliato, Justin, Emmons, Scott, Watkins, Olivia, Toyer, Sam
Most jailbreak papers claim the jailbreaks they propose are highly effective, often boasting near-100% attack success rates. However, it is perhaps more common than not for jailbreak developers to substantially exaggerate the effectiveness of their j
External link:
http://arxiv.org/abs/2402.10260
Author:
Toyer, Sam, Watkins, Olivia, Mendes, Ethan Adrian, Svegliato, Justin, Bailey, Luke, Wang, Tiffany, Ong, Isaac, Elmaaroufi, Karim, Abbeel, Pieter, Darrell, Trevor, Ritter, Alan, Russell, Stuart
While Large Language Models (LLMs) are increasingly being used in real-world applications, they remain vulnerable to prompt injection attacks: malicious third party prompts that subvert the intent of the system designer. To help researchers study thi
External link:
http://arxiv.org/abs/2311.01011
Reinforcement learning from human feedback (RLHF) enables machine learning systems to learn objectives from human feedback. A core limitation of these systems is their assumption that all feedback comes from a single human teacher, despite querying a
External link:
http://arxiv.org/abs/2310.15288
Reward learning algorithms utilize human feedback to infer a reward function, which is then used to train an AI system. This human feedback is often a preference comparison, in which the human teacher compares several samples of AI behavior and choos
External link:
http://arxiv.org/abs/2303.00894
As automated decision making and decision assistance systems become common in everyday life, research on the prevention or mitigation of potential harms that arise from decisions made by these systems has proliferated. However, various research commu
External link:
http://arxiv.org/abs/2301.05753
Autonomous systems often operate in environments where the behavior of multiple agents is coordinated by a shared global state. Reliable estimation of the global state is thus critical for successfully operating in a multi-agent setting. We introduce
External link:
http://arxiv.org/abs/2108.00366
Author:
Basich, Connor, Svegliato, Justin, Wray, Kyle Hollins, Witwicki, Stefan J., Zilberstein, Shlomo
Published in:
EPTCS 319, 2020, pp. 37-53
Given the complexity of real-world, unstructured domains, it is often impossible or impractical to design models that include every feature needed to handle all possible scenarios that an autonomous system may encounter. For an autonomous system to b
External link:
http://arxiv.org/abs/2007.11740
Author:
Basich, Connor, Svegliato, Justin, Wray, Kyle Hollins, Witwicki, Stefan, Biswas, Joydeep, Zilberstein, Shlomo
Interest in semi-autonomous systems (SAS) is growing rapidly as a paradigm to deploy autonomous systems in domains that require occasional reliance on humans. This paradigm allows service robots or autonomous vehicles to operate at varying levels of
External link:
http://arxiv.org/abs/2003.07745
Author:
Basich, Connor, Svegliato, Justin, Wray, Kyle H., Witwicki, Stefan, Biswas, Joydeep, Zilberstein, Shlomo
Published in:
Artificial Intelligence, vol. 316, March 2023
Author:
Svegliato, Justin
Metareasoning is the process by which an autonomous system optimizes, specifically monitors and controls, its own planning and execution processes in order to operate more effectively in its environment. As autonomous systems rapidly grow in sophisti
External link:
https://explore.openaire.eu/search/publication?articleId=doi_________::9a839caae3391245253ad3ca9c798406