Showing 1 - 10 of 3,284 for search: '"Faeze"'
Author:
Lambert, Nathan, Morrison, Jacob, Pyatkin, Valentina, Huang, Shengyi, Ivison, Hamish, Brahman, Faeze, Miranda, Lester James V., Liu, Alisa, Dziri, Nouha, Lyu, Shane, Gu, Yuling, Malik, Saumya, Graf, Victoria, Hwang, Jena D., Yang, Jiangjiang, Bras, Ronan Le, Tafjord, Oyvind, Wilhelm, Chris, Soldaini, Luca, Smith, Noah A., Wang, Yizhong, Dasigi, Pradeep, Hajishirzi, Hannaneh
Language model post-training is applied to refine behaviors and unlock new skills across a wide range of recent language models, but open recipes for applying these techniques lag behind proprietary ones. The underlying training data and recipes for…
External link:
http://arxiv.org/abs/2411.15124
Author:
Rezaei, Keivan, Chandu, Khyathi, Feizi, Soheil, Choi, Yejin, Brahman, Faeze, Ravichander, Abhilasha
Large language models trained on web-scale corpora can memorize undesirable datapoints such as incorrect facts, copyrighted content or sensitive data. Recently, many machine unlearning methods have been proposed that aim to 'erase' these datapoints…
External link:
http://arxiv.org/abs/2411.00204
This paper describes a linguistically-motivated approach to the 2024 edition of the BabyLM Challenge (Warstadt et al. 2023). Rather than pursuing a first language learning (L1) paradigm, we approach the challenge from a second language (L2) learning…
External link:
http://arxiv.org/abs/2410.21254
Author:
Miranda, Lester James V., Wang, Yizhong, Elazar, Yanai, Kumar, Sachin, Pyatkin, Valentina, Brahman, Faeze, Smith, Noah A., Hajishirzi, Hannaneh, Dasigi, Pradeep
Learning from human feedback has enabled the alignment of language models (LMs) with human preferences. However, directly collecting human preferences can be expensive, time-consuming, and can have high variance. An appealing alternative is to distil…
External link:
http://arxiv.org/abs/2410.19133
Author:
Zhou, Xuhui, Kim, Hyunwoo, Brahman, Faeze, Jiang, Liwei, Zhu, Hao, Lu, Ximing, Xu, Frank, Lin, Bill Yuchen, Choi, Yejin, Mireshghallah, Niloofar, Bras, Ronan Le, Sap, Maarten
AI agents are increasingly autonomous in their interactions with human users and tools, leading to increased interactional safety risks. We present HAICOSYSTEM, a framework examining AI agent safety within diverse and complex social interactions. HAICOSYSTEM…
External link:
http://arxiv.org/abs/2409.16427
Author:
Su, Zhe, Zhou, Xuhui, Rangreji, Sanketh, Kabra, Anubha, Mendelsohn, Julia, Brahman, Faeze, Sap, Maarten
To be safely and successfully deployed, LLMs must simultaneously satisfy truthfulness and utility goals. Yet, often these two goals compete (e.g., an AI agent assisting a used car salesman selling a car with flaws), partly due to ambiguous or misleading…
External link:
http://arxiv.org/abs/2409.09013
We present a principled approach to provide LLM-based evaluation with a rigorous guarantee of human agreement. We first propose that a reliable evaluation method should not uncritically rely on model preferences for pairwise evaluation, but rather…
External link:
http://arxiv.org/abs/2407.18370
In neural network (NN) security, safeguarding model integrity and resilience against adversarial attacks has become paramount. This study investigates the application of stochastic computing (SC) as a novel mechanism to fortify NN models. The primary…
External link:
http://arxiv.org/abs/2407.04861
Author:
Brahman, Faeze, Kumar, Sachin, Balachandran, Vidhisha, Dasigi, Pradeep, Pyatkin, Valentina, Ravichander, Abhilasha, Wiegreffe, Sarah, Dziri, Nouha, Chandu, Khyathi, Hessel, Jack, Tsvetkov, Yulia, Smith, Noah A., Choi, Yejin, Hajishirzi, Hannaneh
Chat-based language models are designed to be helpful, yet they should not comply with every user request. While most existing work primarily focuses on refusal of "unsafe" queries, we posit that the scope of noncompliance should be broadened…
External link:
http://arxiv.org/abs/2407.12043
Author:
Lee, Jaeyoung, Lu, Ximing, Hessel, Jack, Brahman, Faeze, Yu, Youngjae, Bisk, Yonatan, Choi, Yejin, Gabriel, Saadia
Given the growing influx of misinformation across news and social media, there is a critical need for systems that can provide effective real-time verification of news claims. Large language or multimodal model based verification has been proposed to…
External link:
http://arxiv.org/abs/2407.00369