Search results

Report

On the Modeling Capabilities of Large Language Models for Sequential Decision Making

Authors: Klissarov, Martin; Hjelm, Devon; Toshev, Alexander; Mazoure, Bogdan

Large pretrained models are showing increasingly better performance in reasoning and planning tasks across different modalities, opening the possibility to leverage them for complex sequential decision making problems. In this paper, we investigate t

External link: http://arxiv.org/abs/2410.05656

Report

DataComp-LM: In search of the next generation of training sets for language models

Authors: Li, Jeffrey; Fang, Alex; Smyrnis, Georgios; Ivgi, Maor; Jordan, Matt; Gadre, Samir; Bansal, Hritik; Guha, Etash; Keh, Sedrick; Arora, Kushal; Garg, Saurabh; Xin, Rui; Muennighoff, Niklas; Heckel, Reinhard; Mercat, Jean; Chen, Mayee; Gururangan, Suchin; Wortsman, Mitchell; Albalak, Alon; Bitton, Yonatan; Nezhurina, Marianna; Abbas, Amro; Hsieh, Cheng-Yu; Ghosh, Dhruba; Gardner, Josh; Kilian, Maciej; Zhang, Hanlin; Shao, Rulin; Pratt, Sarah; Sanyal, Sunny; Ilharco, Gabriel; Daras, Giannis; Marathe, Kalyani; Gokaslan, Aaron; Zhang, Jieyu; Chandu, Khyathi; Nguyen, Thao; Vasiljevic, Igor; Kakade, Sham; Song, Shuran; Sanghavi, Sujay; Faghri, Fartash; Oh, Sewoong; Zettlemoyer, Luke; Lo, Kyle; El-Nouby, Alaaeldin; Pouransari, Hadi; Toshev, Alexander; Wang, Stephanie; Groeneveld, Dirk; Soldaini, Luca; Koh, Pang Wei; Jitsev, Jenia; Kollar, Thomas; Dimakis, Alexandros G.; Carmon, Yair; Dave, Achal; Schmidt, Ludwig; Shankar, Vaishaal

We introduce DataComp for Language Models (DCLM), a testbed for controlled dataset experiments with the goal of improving language models. As part of DCLM, we provide a standardized corpus of 240T tokens extracted from Common Crawl, effective pretrai

External link: http://arxiv.org/abs/2406.11794

Report

Grounding Multimodal Large Language Models in Actions

Authors: Szot, Andrew; Mazoure, Bogdan; Agrawal, Harsh; Hjelm, Devon; Kira, Zsolt; Toshev, Alexander

Multimodal Large Language Models (MLLMs) have demonstrated a wide range of capabilities across many domains, including Embodied AI. In this work, we study how to best ground an MLLM into different embodiments and their associated action spaces, with t

External link: http://arxiv.org/abs/2406.07904

Report

MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training

In this work, we discuss building performant Multimodal Large Language Models (MLLMs). In particular, we study the importance of various architecture components and data choices. Through careful and comprehensive ablations of the image encoder, the v

External link: http://arxiv.org/abs/2403.09611

Report

JAX-SPH: A Differentiable Smoothed Particle Hydrodynamics Framework

Authors: Toshev, Artur P.; Ramachandran, Harish; Erbesdobler, Jonas A.; Galletti, Gianluca; Brandstetter, Johannes; Adams, Nikolaus A.

Particle-based fluid simulations have emerged as a powerful tool for solving the Navier-Stokes equations, especially in cases that include intricate physics and free surfaces. The recent addition of machine learning methods to the toolbox for solving

External link: http://arxiv.org/abs/2403.04750

Report

Neural SPH: Improved Neural Modeling of Lagrangian Fluid Dynamics

Authors: Toshev, Artur P.; Erbesdobler, Jonas A.; Adams, Nikolaus A.; Brandstetter, Johannes

Smoothed particle hydrodynamics (SPH) is omnipresent in modern engineering and scientific disciplines. SPH is a class of Lagrangian schemes that discretize fluid dynamics via finite material points that are tracked through the evolving velocity field

External link: http://arxiv.org/abs/2402.06275

Report

Pre-trained Language Models Do Not Help Auto-regressive Text-to-Image Generation

Authors: Zhang, Yuhui; McKinzie, Brandon; Gan, Zhe; Shankar, Vaishaal; Toshev, Alexander

Recent advances in image tokenizers, such as VQ-VAE, have enabled text-to-image generation using auto-regressive methods, similar to language modeling. However, these methods have yet to leverage pre-trained language models, despite their adaptabilit

External link: http://arxiv.org/abs/2311.16201

Report

Large Language Models as Generalizable Policies for Embodied Tasks

Authors: Szot, Andrew; Schwarzer, Max; Agrawal, Harsh; Mazoure, Bogdan; Talbott, Walter; Metcalf, Katherine; Mackraz, Natalie; Hjelm, Devon; Toshev, Alexander

We show that large language models (LLMs) can be adapted to be generalizable policies for embodied visual tasks. Our approach, called Large LAnguage model Reinforcement Learning Policy (LLaRP), adapts a pre-trained frozen LLM to take as input text in

External link: http://arxiv.org/abs/2310.17722

Report

Data Filtering Networks

Authors: Fang, Alex; Jose, Albin Madappally; Jain, Amit; Schmidt, Ludwig; Toshev, Alexander; Shankar, Vaishaal

Large training sets have become a cornerstone of machine learning and are the foundation for recent advances in language modeling and multimodal learning. While data curation for pre-training is often still ad-hoc, one common paradigm is to first col

External link: http://arxiv.org/abs/2309.17425

Report

LagrangeBench: A Lagrangian Fluid Mechanics Benchmarking Suite

Authors: Toshev, Artur P.; Galletti, Gianluca; Fritz, Fabian; Adami, Stefan; Adams, Nikolaus A.

Machine learning has been successfully applied to grid-based PDE modeling in various scientific applications. However, learned PDE solvers based on Lagrangian particle discretizations, which are the preferred approach to problems with free surfaces o

External link: http://arxiv.org/abs/2309.16342
