Showing 1 - 10 of 91 for search: '"Jaques, Natasha"'
InvestESG is a novel multi-agent reinforcement learning (MARL) benchmark designed to study the impact of Environmental, Social, and Governance (ESG) disclosure mandates on corporate climate investments. Supported by both PyTorch and GPU-accelerated J
External link:
http://arxiv.org/abs/2411.09856
For AI agents to be helpful to humans, they should be able to follow natural language instructions to complete everyday cooperative tasks in human environments. However, real human instructions inherently possess ambiguity, because the human speakers
External link:
http://arxiv.org/abs/2409.18073
Reinforcement Learning from Human Feedback (RLHF) is a powerful paradigm for aligning foundation models to human values and preferences. However, current RLHF techniques cannot account for the naturally occurring differences in individual human prefe
External link:
http://arxiv.org/abs/2408.10075
Authors:
D'Ambrosio, David B., Abeyruwan, Saminda, Graesser, Laura, Iscen, Atil, Amor, Heni Ben, Bewley, Alex, Reed, Barney J., Reymann, Krista, Takayama, Leila, Tassa, Yuval, Choromanski, Krzysztof, Coumans, Erwin, Jain, Deepali, Jaitly, Navdeep, Jaques, Natasha, Kataoka, Satoshi, Kuang, Yuheng, Lazic, Nevena, Mahjourian, Reza, Moore, Sherry, Oslund, Kenneth, Shankar, Anish, Sindhwani, Vikas, Vanhoucke, Vincent, Vesom, Grace, Xu, Peng, Sanketi, Pannag R.
Achieving human-level speed and performance on real world tasks is a north star for the robotics research community. This work takes a step towards that goal and presents the first learned robot agent that reaches amateur human-level performance in c
External link:
http://arxiv.org/abs/2408.03906
Authors:
Abdulhai, Marwa, Serapio-Garcia, Gregory, Crepy, Clément, Valter, Daria, Canny, John, Jaques, Natasha
Moral foundations theory (MFT) is a psychological assessment tool that decomposes human moral reasoning into five factors, including care/harm, liberty/oppression, and sanctity/degradation (Graham et al., 2009). People vary in the weight they place o
External link:
http://arxiv.org/abs/2310.15337
Published in:
Proceedings of the National Academy of Sciences; 121(2); 2024
Despite a sea of interpretability methods that can produce plausible explanations, the field has also empirically seen many failure cases of such methods. In light of these results, it remains unclear for practitioners how to use these methods and ch
External link:
http://arxiv.org/abs/2212.11870
Authors:
Krishnan, Srivatsan, Jaques, Natasha, Omidshafiei, Shayegan, Zhang, Dan, Gur, Izzeddin, Reddi, Vijay Janapa, Faust, Aleksandra
Microprocessor architects are increasingly resorting to domain-specific customization in the quest for high-performance and energy-efficiency. As the systems grow in complexity, fine-tuning architectural parameters across multiple sub-systems (e.g.,
External link:
http://arxiv.org/abs/2211.16385
This paper addresses the problem of inverse reinforcement learning (IRL) -- inferring the reward function of an agent from observing its behavior. IRL can provide a generalizable and compact representation for apprenticeship learning, and enable accu
External link:
http://arxiv.org/abs/2208.04919
Authors:
Gur, Izzeddin, Jaques, Natasha, Miao, Yingjie, Choi, Jongwook, Tiwari, Manoj, Lee, Honglak, Faust, Aleksandra
Many real-world problems are compositional - solving them requires completing interdependent sub-tasks, either in series or in parallel, that can be represented as a dependency graph. Deep reinforcement learning (RL) agents often struggle to learn su
External link:
http://arxiv.org/abs/2201.08896
Authors:
Wang, Su, Montgomery, Ceslee, Orbay, Jordi, Birodkar, Vighnesh, Faust, Aleksandra, Gur, Izzeddin, Jaques, Natasha, Waters, Austin, Baldridge, Jason, Anderson, Peter
We study the automatic generation of navigation instructions from 360-degree images captured on indoor routes. Existing generators suffer from poor visual grounding, causing them to rely on language priors and hallucinate objects. Our MARKY-MT5 syste
External link:
http://arxiv.org/abs/2111.12872