Showing 1 - 10 of 1,173 for search: '"A. Cabannes"'
Statistical learning in high-dimensional spaces is challenging without a strong underlying data structure. Recent advances with foundational models suggest that text and image data contain such hidden structures, which help mitigate the curse of dimensionality…
External link:
http://arxiv.org/abs/2411.01375
This paper introduces a visual sandbox designed to explore the training dynamics of a small-scale transformer model, with the embedding dimension constrained to $d=2$. This restriction allows for a comprehensive two-dimensional visualization of each…
External link:
http://arxiv.org/abs/2410.24050
Author:
Sobal, Vlad, Ibrahim, Mark, Balestriero, Randall, Cabannes, Vivien, Bouchacourt, Diane, Astolfi, Pietro, Cho, Kyunghyun, LeCun, Yann
Learning good representations involves capturing the diverse ways in which data samples relate. Contrastive loss - an objective matching related samples - underlies methods from self-supervised to multimodal learning. Contrastive losses, however, can…
External link:
http://arxiv.org/abs/2407.18134
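The contrastive objective mentioned in the snippet above can be sketched as follows. This is a minimal illustration assuming an InfoNCE-style formulation in NumPy; the function name and the temperature parameter are assumptions for illustration, not the paper's code:

```python
import numpy as np

def contrastive_loss(anchors, positives, temperature=0.1):
    """InfoNCE-style contrastive loss: each anchor is matched against
    its own positive and contrasted with every other positive in the batch."""
    # L2-normalize so that dot products are cosine similarities
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                 # (n, n) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))            # matched pairs sit on the diagonal
```

Minimizing this loss pulls each anchor toward its own positive while pushing it away from the other samples in the batch.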
Author:
Cabannes, Vivien, Arnal, Charles, Bouaziz, Wassim, Yang, Alice, Charton, Francois, Kempe, Julia
Chain-of-Thought (CoT) reasoning is known to improve Large Language Models both empirically and in terms of theoretical approximation power. However, our understanding of the inner workings and the conditions under which CoT capabilities emerge remains limited…
External link:
http://arxiv.org/abs/2406.02128
This work focuses on the training dynamics of one associative memory module storing outer products of token embeddings. We reduce this problem to the study of a system of particles, which interact according to properties of the data distribution and…
External link:
http://arxiv.org/abs/2402.18724
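The outer-product associative memory described above can be illustrated with a minimal NumPy sketch. The random Gaussian embeddings and the one-step retrieval rule are assumptions for illustration, not the paper's exact setup:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 128, 10                                  # embedding dimension, vocabulary size
E = rng.normal(size=(n, d)) / np.sqrt(d)        # input token embeddings (near unit norm)
U = rng.normal(size=(n, d)) / np.sqrt(d)        # output token embeddings
target = rng.permutation(n)                     # association to store: i -> target[i]

# memory matrix as a sum of outer products u_{target[i]} e_i^T
W = sum(np.outer(U[target[i]], E[i]) for i in range(n))

def recall(i):
    # retrieve: score every output embedding against W e_i, pick the best match
    return int(np.argmax(U @ (W @ E[i])))

accuracy = np.mean([recall(i) == target[i] for i in range(n)])
```

Because random high-dimensional embeddings are nearly orthogonal, the cross terms in $W e_i$ are small and retrieval recovers the stored associations with high probability.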
The combination of lightly supervised pre-training and online fine-tuning has played a key role in recent AI developments. These new learning pipelines call for new theoretical frameworks. In this paper, we formalize core aspects of weakly supervised…
External link:
http://arxiv.org/abs/2402.13079
Author:
Cabannes, Vivien, Arnal, Charles
Published in:
ICASSP, 2024
The number of sampling methods can be daunting for a practitioner looking to apply powerful machine learning methods to their specific problem. This paper takes a theoretical stance to review and organize many sampling approaches in the ``generative…
External link:
http://arxiv.org/abs/2311.13845
Learning arguably involves the discovery and memorization of abstract rules. The aim of this paper is to study associative memory mechanisms. Our model is based on high-dimensional matrices consisting of outer products of embeddings, which relates to…
External link:
http://arxiv.org/abs/2310.02984
Published in:
IEEE Big Data, 2023
The theory of statistical learning has focused on variational objectives expressed on functions. In this note, we discuss motivations to write similar objectives on measures, in particular to discuss out-of-distribution generalization and weakly-supervised…
External link:
http://arxiv.org/abs/2306.11928
Large language models based on transformers have achieved great empirical successes. However, as they are deployed more widely, there is a growing need to better understand their internal mechanisms in order to make them more reliable. These models…
External link:
http://arxiv.org/abs/2306.00802