Showing 1 - 10 of 12,703 for search: '"Kempe A"'
Author:
Charton, François, Kempe, Julia
We study the performance of transformers as a function of the number of repetitions of training examples with algorithmically generated datasets. On three problems of mathematics: the greatest common divisor, modular multiplication, and matrix eigenv
External link:
http://arxiv.org/abs/2410.07041
Within the scaling laws paradigm, which underpins the training of large neural networks like ChatGPT and Llama, we consider a supervised regression setting and establish the existence of a strong form of the model collapse phenomenon, a critical perf
External link:
http://arxiv.org/abs/2410.04840
Large language models (LLMs) are trained on a deluge of text data with limited quality control. As a result, LLMs can exhibit unintended or even harmful behaviours, such as leaking information, fake news or hate speech. Countermeasures, commonly refe
External link:
http://arxiv.org/abs/2408.01420
Synthesized data from generative models is increasingly considered as an alternative to human-annotated data for fine-tuning Large Language Models. This raises concerns about model collapse: a drop in performance of models fine-tuned on generated dat
External link:
http://arxiv.org/abs/2406.07515
We study the implicit bias of optimization in robust empirical risk minimization (robust ERM) and its connection with robust generalization. In classification settings under adversarial perturbations with linear models, we study what type of regulari
External link:
http://arxiv.org/abs/2406.04981
Author:
Cabannes, Vivien, Arnal, Charles, Bouaziz, Wassim, Yang, Alice, Charton, Francois, Kempe, Julia
Chain-of-Thought (CoT) reasoning is known to improve Large Language Models both empirically and in terms of theoretical approximation power. However, our understanding of the inner workings and conditions of apparition of CoT capabilities remains lim
External link:
http://arxiv.org/abs/2406.02128
Adversarial examples have been shown to cause neural networks to fail on a wide range of vision and language tasks, but recent work has claimed that Bayesian neural networks (BNNs) are inherently robust to adversarial perturbations. In this work, we
External link:
http://arxiv.org/abs/2404.19640
In the era of exceptionally data-hungry models, careful selection of the training data is essential to mitigate the extensive costs of deep learning. Data pruning offers a solution by removing redundant or uninformative samples from the dataset, whic
External link:
http://arxiv.org/abs/2404.05579
Machine learning models often perform poorly under subpopulation shifts in the data distribution. Developing methods that allow machine learning models to better generalize to such shifts is crucial for safe deployment in real-world settings. In this
External link:
http://arxiv.org/abs/2403.09869
Rankings are ubiquitous across many applications, from search engines to hiring committees. In practice, many rankings are derived from the output of predictors. However, when predictors trained for classification tasks have intrinsic uncertainty, it
External link:
http://arxiv.org/abs/2402.09326