Showing 1 - 10 of 216 for the search: '"Szlam, Arthur"'
Author:
Douillard, Arthur, Feng, Qixuan, Rusu, Andrei A., Kuncoro, Adhiguna, Donchev, Yani, Chhaparia, Rachita, Gog, Ionel, Ranzato, Marc'Aurelio, Shen, Jiajun, Szlam, Arthur
Progress in machine learning (ML) has been fueled by scaling neural network models. This scaling has been enabled by ever more heroic feats of engineering, necessary for accommodating ML approaches that require high bandwidth communication between de…
External link:
http://arxiv.org/abs/2403.10616
Author:
Douillard, Arthur, Feng, Qixuan, Rusu, Andrei A., Chhaparia, Rachita, Donchev, Yani, Kuncoro, Adhiguna, Ranzato, Marc'Aurelio, Szlam, Arthur, Shen, Jiajun
Large language models (LLMs) have become a critical component in many applications of machine learning. However, standard approaches to training LLMs require a large number of tightly interconnected accelerators, with devices exchanging gradients and o…
External link:
http://arxiv.org/abs/2311.08105
Author:
Lanchantin, Jack, Sukhbaatar, Sainbayar, Synnaeve, Gabriel, Sun, Yuxuan, Srinet, Kavya, Szlam, Arthur
Recent progress in using machine learning models for reasoning tasks has been driven by novel model architectures, large-scale pre-training protocols, and dedicated reasoning datasets for fine-tuning. In this work, to further pursue these advances, w…
External link:
http://arxiv.org/abs/2309.07974
Author:
Mohanty, Shrestha, Arabzadeh, Negar, Kiseleva, Julia, Zholus, Artem, Teruel, Milagro, Awadallah, Ahmed, Sun, Yuxuan, Srinet, Kavya, Szlam, Arthur
Human intelligence's adaptability is remarkable, allowing us to adjust to new tasks and multi-modal environments swiftly. This skill is evident from a young age as we acquire new abilities and solve problems by imitating others or following natural l…
External link:
http://arxiv.org/abs/2305.10783
Large language models have been shown to struggle with multi-step reasoning, and do not retain previous reasoning steps for future use. We propose a simple method for solving both of these problems by allowing the model to take Self-Notes. Unlike rec…
External link:
http://arxiv.org/abs/2305.00833
Current dialogue research primarily studies pairwise (two-party) conversations, and does not address the everyday setting where more than two speakers converse together. In this work, we both collect and evaluate multi-party conversations to study th…
External link:
http://arxiv.org/abs/2304.13835
While language models have become more capable of producing compelling language, we find there are still gaps in maintaining consistency, especially when describing events in a dynamically changing world. We study the setting of generating narratives…
External link:
http://arxiv.org/abs/2301.05746
Author:
Mohanty, Shrestha, Arabzadeh, Negar, Teruel, Milagro, Sun, Yuxuan, Zholus, Artem, Skrynnik, Alexey, Burtsev, Mikhail, Srinet, Kavya, Panov, Aleksandr, Szlam, Arthur, Côté, Marc-Alexandre, Kiseleva, Julia
Published in:
Interactive Learning for Natural Language Processing NeurIPS 2022 Workshop
Human intelligence adapts remarkably quickly to new tasks and environments. Starting from a very young age, humans acquire new skills and learn how to solve new tasks either by imitating the behavior of others or by following provided natural lang…
External link:
http://arxiv.org/abs/2211.06552
Author:
Shafiullah, Nur Muhammad Mahi, Paxton, Chris, Pinto, Lerrel, Chintala, Soumith, Szlam, Arthur
We propose CLIP-Fields, an implicit scene model that can be used for a variety of tasks, such as segmentation, instance identification, semantic search over space, and view localization. CLIP-Fields learns a mapping from spatial locations to semantic…
External link:
http://arxiv.org/abs/2210.05663
Author:
Shuster, Kurt, Xu, Jing, Komeili, Mojtaba, Ju, Da, Smith, Eric Michael, Roller, Stephen, Ung, Megan, Chen, Moya, Arora, Kushal, Lane, Joshua, Behrooz, Morteza, Ngan, William, Poff, Spencer, Goyal, Naman, Szlam, Arthur, Boureau, Y-Lan, Kambadur, Melanie, Weston, Jason
We present BlenderBot 3, a 175B parameter dialogue model capable of open-domain conversation with access to the internet and a long-term memory, and having been trained on a large number of user defined tasks. We release both the model weights and co…
External link:
http://arxiv.org/abs/2208.03188