Showing 1 - 10 of 30 for search: '"Henaff, Olivier"'
Author:
Roth, Karsten, Udandarao, Vishaal, Dziadzio, Sebastian, Prabhu, Ameya, Cherti, Mehdi, Vinyals, Oriol, Hénaff, Olivier, Albanie, Samuel, Bethge, Matthias, Akata, Zeynep
Multimodal foundation models serve numerous applications at the intersection of vision and language. Still, despite being pretrained on extensive data, they become outdated over time. To keep models updated, research into continual pretraining mainly…
External link:
http://arxiv.org/abs/2408.14471
Author:
Beyer, Lucas, Steiner, Andreas, Pinto, André Susano, Kolesnikov, Alexander, Wang, Xiao, Salz, Daniel, Neumann, Maxim, Alabdulmohsin, Ibrahim, Tschannen, Michael, Bugliarello, Emanuele, Unterthiner, Thomas, Keysers, Daniel, Koppula, Skanda, Liu, Fangyu, Grycner, Adam, Gritsenko, Alexey, Houlsby, Neil, Kumar, Manoj, Rong, Keran, Eisenschlos, Julian, Kabra, Rishabh, Bauer, Matthias, Bošnjak, Matko, Chen, Xi, Minderer, Matthias, Voigtlaender, Paul, Bica, Ioana, Balazevic, Ivana, Puigcerver, Joan, Papalampidi, Pinelopi, Henaff, Olivier, Xiong, Xi, Soricut, Radu, Harmsen, Jeremiah, Zhai, Xiaohua
PaliGemma is an open Vision-Language Model (VLM) that is based on the SigLIP-So400m vision encoder and the Gemma-2B language model. It is trained to be a versatile and broadly knowledgeable base model that is effective to transfer. It achieves strong…
External link:
http://arxiv.org/abs/2407.07726
Data curation is an essential component of large-scale pretraining. In this work, we demonstrate that jointly selecting batches of data is more effective for learning than selecting examples independently. Multimodal contrastive objectives expose the…
External link:
http://arxiv.org/abs/2406.17711
With the advent and recent ubiquity of foundation models, continual learning (CL) has recently shifted from continual training from scratch to the continual adaptation of pretrained models, seeing particular success on rehearsal-free CL benchmarks (R…
External link:
http://arxiv.org/abs/2406.09384
Author:
Balažević, Ivana, Shi, Yuge, Papalampidi, Pinelopi, Chaabouni, Rahma, Koppula, Skanda, Hénaff, Olivier J.
Most transformer-based video encoders are limited to short temporal contexts due to their quadratic complexity. While various attempts have been made to extend this context, this has often come at the cost of both conceptual and computational complex…
External link:
http://arxiv.org/abs/2402.05861
Published in:
Transactions on Machine Learning Research, Jun 2024
Human ability to recognize complex visual patterns arises through transformations performed by successive areas in the ventral visual cortex. Deep neural networks trained end-to-end for object recognition approach human capabilities, and offer the be…
External link:
http://arxiv.org/abs/2312.11436
Author:
Evans, Talfan, Pathak, Shreya, Merzic, Hamza, Schwarz, Jonathan, Tanno, Ryutaro, Henaff, Olivier J.
Power-law scaling indicates that large-scale training with uniform sampling is prohibitively slow. Active learning methods aim to increase data efficiency by prioritizing learning on the most relevant examples. Despite their appeal, these methods hav…
External link:
http://arxiv.org/abs/2312.05328
Author:
Roth, Karsten, Thede, Lukas, Koepke, Almut Sophia, Vinyals, Oriol, Hénaff, Olivier, Akata, Zeynep
Training deep networks requires various design decisions regarding for instance their architecture, data augmentation, or optimization. In this work, we find these training variations to result in networks learning unique feature sets from the data.
External link:
http://arxiv.org/abs/2310.17653
Author:
Balažević, Ivana, Steiner, David, Parthasarathy, Nikhil, Arandjelović, Relja, Hénaff, Olivier J.
In-context learning – the ability to configure a model's behavior with different prompts – has revolutionized the field of natural language processing, alleviating the need for task-specific models and paving the way for g…
External link:
http://arxiv.org/abs/2306.01667
Author:
Arandjelović, Relja, Andonian, Alex, Mensch, Arthur, Hénaff, Olivier J., Alayrac, Jean-Baptiste, Zisserman, Andrew
The core problem in zero-shot open vocabulary detection is how to align visual and text features, so that the detector performs well on unseen classes. Previous approaches train the feature pyramid and detection head from scratch, which breaks the vi…
External link:
http://arxiv.org/abs/2303.13518