Showing 1 - 10 of 54 for search: '"van Steenkiste, Sjoerd"'
Large language models are increasingly trained on corpora containing both natural language and non-linguistic data like source code. Aside from aiding programming-related tasks, anecdotal evidence suggests that including code in pretraining corpora m…
External link:
http://arxiv.org/abs/2409.04556
Author:
Nayak, Shravan, Jain, Kanishk, Awal, Rabiul, Reddy, Siva, van Steenkiste, Sjoerd, Hendricks, Lisa Anne, Stańczak, Karolina, Agrawal, Aishwarya
Foundation models and vision-language pre-training have notably advanced Vision Language Models (VLMs), enabling multimodal processing of visual and linguistic data. However, their performance has typically been assessed on general scene understandin…
External link:
http://arxiv.org/abs/2407.10920
Author:
Wu, Ziyi, Rubanova, Yulia, Kabra, Rishabh, Hudson, Drew A., Gilitschenski, Igor, Aytar, Yusuf, van Steenkiste, Sjoerd, Allen, Kelsey R., Kipf, Thomas
We address the problem of multi-object 3D pose control in image diffusion models. Instead of conditioning on a sequence of text tokens, we propose to use a set of per-object representations, Neural Assets, to control the 3D pose of individual objects…
External link:
http://arxiv.org/abs/2406.09292
Author:
Eisape, Tiwalayo, Tessler, MH, Dasgupta, Ishita, Sha, Fei, van Steenkiste, Sjoerd, Linzen, Tal
A central component of rational behavior is logical inference: the process of determining which conclusions follow from a set of premises. Psychologists have documented several ways in which humans' inferences deviate from the rules of logic. Do lang…
External link:
http://arxiv.org/abs/2311.00445
Author:
Petty, Jackson, van Steenkiste, Sjoerd, Dasgupta, Ishita, Sha, Fei, Garrette, Dan, Linzen, Tal
To process novel sentences, language models (LMs) must generalize compositionally -- combine familiar elements in new ways. What aspects of a model's structure promote compositional generalization? Focusing on transformers, we test the hypothesis, mo…
External link:
http://arxiv.org/abs/2310.19956
Author:
Seitzer, Maximilian, van Steenkiste, Sjoerd, Kipf, Thomas, Greff, Klaus, Sajjadi, Mehdi S. M.
Visual understanding of the world goes beyond the semantics and flat structure of individual images. In this work, we aim to capture both the 3D structure and dynamics of real-world scenes from monocular real-world videos. Our Dynamic Scene Transform…
External link:
http://arxiv.org/abs/2310.06020
Recent progress in 3D scene understanding enables scalable learning of representations across large datasets of diverse scenes. As a consequence, generalization to unseen scenes and objects, rendering novel views from just a single or a handful of in…
External link:
http://arxiv.org/abs/2306.08068
Author:
Zimmermann, Roland S., van Steenkiste, Sjoerd, Sajjadi, Mehdi S. M., Kipf, Thomas, Greff, Klaus
Self-supervised methods for learning object-centric representations have recently been applied successfully to various datasets. This progress is largely fueled by slot-based methods, whose ability to cluster visual scenes into meaningful objects hol…
External link:
http://arxiv.org/abs/2305.18890
Author:
Dehghani, Mostafa, Djolonga, Josip, Mustafa, Basil, Padlewski, Piotr, Heek, Jonathan, Gilmer, Justin, Steiner, Andreas, Caron, Mathilde, Geirhos, Robert, Alabdulmohsin, Ibrahim, Jenatton, Rodolphe, Beyer, Lucas, Tschannen, Michael, Arnab, Anurag, Wang, Xiao, Riquelme, Carlos, Minderer, Matthias, Puigcerver, Joan, Evci, Utku, Kumar, Manoj, van Steenkiste, Sjoerd, Elsayed, Gamaleldin F., Mahendran, Aravindh, Yu, Fisher, Oliver, Avital, Huot, Fantine, Bastings, Jasmijn, Collier, Mark Patrick, Gritsenko, Alexey, Birodkar, Vighnesh, Vasconcelos, Cristina, Tay, Yi, Mensink, Thomas, Kolesnikov, Alexander, Pavetić, Filip, Tran, Dustin, Kipf, Thomas, Lučić, Mario, Zhai, Xiaohua, Keysers, Daniel, Harmsen, Jeremiah, Houlsby, Neil
The scaling of Transformers has driven breakthrough capabilities for language models. At present, the largest large language models (LLMs) contain upwards of 100B parameters. Vision Transformers (ViT) have introduced the same architecture to image an…
External link:
http://arxiv.org/abs/2302.05442
Author:
Biza, Ondrej, van Steenkiste, Sjoerd, Sajjadi, Mehdi S. M., Elsayed, Gamaleldin F., Mahendran, Aravindh, Kipf, Thomas
Automatically discovering composable abstractions from raw perceptual data is a long-standing challenge in machine learning. Recent slot-based neural networks that learn about objects in a self-supervised manner have made exciting progress in this di…
External link:
http://arxiv.org/abs/2302.04973