Showing 1 - 10 of 3,444 results for the search '"P, Kempe"'
The Max-Flow/Min-Cut problem is a fundamental tool in graph theory, with applications in many domains, including data mining, image segmentation, transportation planning, and many types of assignment problems, in addition to being an essential building… (a toy max-flow sketch follows the link below)
External link:
http://arxiv.org/abs/2411.10484
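Since the entry names Max-Flow/Min-Cut as its subject, here is a minimal, self-contained sketch of one classical max-flow algorithm (Edmonds-Karp, breadth-first augmenting paths). It illustrates the problem the snippet names and is not code from the paper; the toy graph and node names are invented.

# Minimal Edmonds-Karp max-flow sketch (BFS augmenting paths).
# Illustrative only: the graph and capacities below are made up.
from collections import deque

def max_flow(capacity, source, sink):
    """capacity: dict of dicts, capacity[u][v] = edge capacity."""
    nodes = set(capacity)
    for u in capacity:
        nodes.update(capacity[u])
    # Residual capacities, including reverse edges initialized to 0.
    residual = {u: {v: 0 for v in nodes} for u in nodes}
    for u in capacity:
        for v, c in capacity[u].items():
            residual[u][v] += c
    flow = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow  # no augmenting path left: flow is maximum
        # Find the bottleneck along the path, then push flow along it.
        bottleneck, v = float("inf"), sink
        while parent[v] is not None:
            u = parent[v]
            bottleneck = min(bottleneck, residual[u][v])
            v = u
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck

# Toy instance: the maximum s-t flow (and minimum cut) is 3.
g = {"s": {"a": 2, "b": 2}, "a": {"t": 2}, "b": {"a": 1, "t": 1}}
print(max_flow(g, "s", "t"))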
Author:
Lee, David H., Prasad, Aditya, Vuong, Ramiro Deo-Campo, Wang, Tianyu, Han, Eric, Kempe, David
Dynamic programming (DP) is a fundamental and powerful algorithmic paradigm taught in most undergraduate (and many graduate) algorithms classes. DP problems are challenging for many computer science students because they require identifying unique… (a classic DP example follows the link below)
External link:
http://arxiv.org/abs/2411.07705
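As a concrete instance of the DP paradigm the snippet describes, here is a minimal bottom-up example: the fewest coins summing to a target amount. It is a textbook exercise chosen for illustration, not material from the paper.

# Classic DP: minimum number of coins to make `amount`.
def min_coins(coins, amount):
    # best[a] = fewest coins summing to a; build bottom-up from 0.
    INF = float("inf")
    best = [0] + [INF] * amount
    for a in range(1, amount + 1):
        for c in coins:
            if c <= a and best[a - c] + 1 < best[a]:
                best[a] = best[a - c] + 1
    return best[amount] if best[amount] < INF else -1

print(min_coins([1, 3, 4], 6))  # 2 (3 + 3)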
We study the implicit bias of the general family of steepest descent algorithms, which includes gradient descent, sign descent and coordinate descent, in deep homogeneous neural networks. We prove that an algorithm-dependent geometric margin starts… (a sketch of the three update rules follows the link below)
External link:
http://arxiv.org/abs/2410.22069
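The three algorithms the snippet names are all steepest descent under different norm geometries: the l2 norm gives gradient descent, the l_infinity norm gives sign descent, and the l1 norm gives a single-coordinate update. Below is a hedged numpy sketch of the three updates on a toy least-squares objective; the data, learning rate, and step count are invented, and the normalized l1 step shown is one common convention, not the paper's setup.

# Steepest descent in three norm geometries on 0.5*||Aw - b||^2.
import numpy as np

rng = np.random.default_rng(0)
A, b = rng.normal(size=(20, 5)), rng.normal(size=20)
grad = lambda w: A.T @ (A @ w - b)  # gradient of the quadratic loss

def step(w, g, lr, geometry):
    if geometry == "l2":      # gradient descent (Euclidean norm)
        return w - lr * g
    if geometry == "linf":    # sign descent (max norm)
        return w - lr * np.sign(g)
    if geometry == "l1":      # update only the largest-gradient coordinate
        i = np.argmax(np.abs(g))
        w = w.copy()
        w[i] -= lr * np.sign(g[i])
        return w

for geometry in ("l2", "linf", "l1"):
    w = np.zeros(5)
    for _ in range(500):
        w = step(w, grad(w), 0.01, geometry)
    print(geometry, round(0.5 * np.linalg.norm(A @ w - b) ** 2, 3))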
Regularization, whether explicit in terms of a penalty in the loss or implicit in the choice of algorithm, is a cornerstone of modern machine learning. Indeed, controlling the complexity of the model class is particularly important when data is scarce… (a small ridge-regression illustration follows the link below)
External link:
http://arxiv.org/abs/2410.16073
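To make the "penalty in the loss" concrete: ridge regression adds lam * ||w||^2 to the squared error and shrinks the solution, which matters most in exactly the scarce-data regime the snippet mentions (n much smaller than d). The toy data and lambda values are invented; this is a generic illustration, not the paper's experiment.

# Explicit regularization: ridge regression on scarce toy data.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(10, 50))            # scarce data: n=10 << d=50
y = X[:, 0] + 0.1 * rng.normal(size=10)  # only feature 0 matters

for lam in (0.01, 1.0, 10.0):
    # minimize ||Xw - y||^2 + lam * ||w||^2  (closed-form solution)
    w = np.linalg.solve(X.T @ X + lam * np.eye(50), X.T @ y)
    print(f"lambda={lam:5.2f}  ||w||={np.linalg.norm(w):.3f}")

Larger lambda shrinks ||w|| and suppresses the spurious directions that ten samples cannot pin down.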
Author:
Charton, François, Kempe, Julia
We study the performance of transformers as a function of the number of repetitions of training examples with algorithmically generated datasets. On three problems of mathematics: the greatest common divisor, modular multiplication, and matrix eigenvalues… (a data-generation sketch follows the link below)
External link:
http://arxiv.org/abs/2410.07041
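A hedged sketch of the data regime the snippet studies: algorithmically generated GCD examples drawn from a fixed pool, so the pool size controls how often each training example repeats. The pool sizes and integer ranges below are invented, not the paper's settings.

# Controlling repetition in an algorithmically generated dataset.
import math, random

random.seed(0)

def make_dataset(num_examples, pool_size):
    # Draw the whole training stream from a fixed pool of `pool_size`
    # distinct problems; smaller pools mean more repetitions per example.
    pool = [(random.randint(1, 10**6), random.randint(1, 10**6))
            for _ in range(pool_size)]
    return [(a, b, math.gcd(a, b)) for a, b in random.choices(pool, k=num_examples)]

fresh    = make_dataset(100_000, pool_size=100_000)  # few repeats
repeated = make_dataset(100_000, pool_size=1_000)    # ~100 repeats each
print(len(set(fresh)), len(set(repeated)))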
Within the scaling laws paradigm, which underpins the training of large neural networks like ChatGPT and Llama, we consider a supervised regression setting and establish the existence of a strong form of the model collapse phenomenon, a critical performance… (a toy multi-generation simulation follows the link below)
External link:
http://arxiv.org/abs/2410.04840
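Below is a toy caricature of model collapse in a regression setting, not the paper's theory: each generation is fit on labels produced by the previous generation's model plus fresh noise, and the fitted weights drift away from the ground truth as errors compound. All dimensions and noise levels are invented.

# Toy multi-generation "model collapse" simulation in regression.
import numpy as np

rng = np.random.default_rng(2)
w_true = rng.normal(size=10)

def fit(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

w = w_true
for gen in range(6):
    X = rng.normal(size=(50, 10))
    y = X @ w + rng.normal(size=50)  # labels from the current "teacher"
    w = fit(X, y)                    # next generation's model
    print(f"generation {gen}: ||w - w_true|| = {np.linalg.norm(w - w_true):.3f}")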
Large language models (LLMs) are trained on a deluge of text data with limited quality control. As a result, LLMs can exhibit unintended or even harmful behaviours, such as leaking information, fake news or hate speech. Countermeasures, commonly referred… (a rough unlearning sketch follows the link below)
External link:
http://arxiv.org/abs/2408.01420
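One countermeasure family the truncated sentence points toward is machine unlearning. Here is a deliberately crude sketch of one such recipe, gradient ascent on a forget set, applied to a toy logistic model; the data split, step counts, and rates are all invented, and this stand-in method is not presented as the paper's proposal.

# Crude unlearning sketch: train on everything, then ascend on a forget set.
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 5))
y = (X @ rng.normal(size=5) > 0).astype(float)
keep, forget = slice(0, 180), slice(180, 200)

sigmoid = lambda z: 1 / (1 + np.exp(-np.clip(z, -30, 30)))

def grad(w, Xb, yb):  # gradient of mean cross-entropy loss
    return Xb.T @ (sigmoid(Xb @ w) - yb) / len(yb)

w = np.zeros(5)
for _ in range(500):   # ordinary training on all data
    w -= 0.5 * grad(w, X, y)
for _ in range(50):    # gradient *ascent* on the forget set only
    w += 0.5 * grad(w, X[forget], y[forget])

acc = lambda s: ((sigmoid(X[s] @ w) > 0.5) == y[s]).mean()
print(f"keep acc {acc(keep):.2f}, forget acc {acc(forget):.2f}")

Note how blunt this is: ascent on the forget set can also damage accuracy on the retained data, which is one reason unlearning is an active research area.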
Large Language Models (LLMs) are increasingly trained on data generated by other LLMs, either because generated text and images become part of the pre-training corpus, or because synthesized data is used as a replacement for expensive human annotation.
External link:
http://arxiv.org/abs/2406.07515
We study the implicit bias of optimization in robust empirical risk minimization (robust ERM) and its connection with robust generalization. In classification settings under adversarial perturbations with linear models, we study what type of regularization… (a closed-form robust-loss sketch follows the link below)
External link:
http://arxiv.org/abs/2406.04981
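For linear models, the inner maximization of robust ERM under an l_infinity perturbation of radius eps has a standard closed form: the margin y * (w . x) becomes y * (w . x) - eps * ||w||_1, which is how an l1-type penalty enters the robust loss. The sketch below uses that identity on invented data with an invented eps; it is an illustration of the setting, not the paper's method.

# Robust ERM for a linear classifier under l_inf perturbations:
# minimize mean log(1 + exp(-(y*w.x - eps*||w||_1))) by gradient descent.
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(300, 20))
y = np.sign(X[:, 0] + 0.1 * rng.normal(size=300))

def robust_loss_grad(w, eps):
    margins = y * (X @ w) - eps * np.linalg.norm(w, 1)  # worst-case margins
    p = 1 / (1 + np.exp(np.clip(margins, -30, 30)))     # -dloss/dmargin
    grad_margin = -(p[:, None] * (y[:, None] * X)).mean(axis=0)
    grad_penalty = eps * p.mean() * np.sign(w)
    return grad_margin + grad_penalty

w = np.zeros(20)
for _ in range(2000):
    w -= 0.1 * robust_loss_grad(w, eps=0.05)
print("robust train accuracy:", (np.sign(X @ w) == y).mean())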
Author:
Cabannes, Vivien, Arnal, Charles, Bouaziz, Wassim, Yang, Alice, Charton, Francois, Kempe, Julia
Chain-of-Thought (CoT) reasoning is known to improve Large Language Models both empirically and in terms of theoretical approximation power. However, our understanding of the inner workings and the conditions under which CoT capabilities emerge remains limited… (a toy scratchpad illustration follows the link below)
External link:
http://arxiv.org/abs/2406.02128
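To make CoT-style supervision concrete, here is a toy illustration on an iterative task (running parity): the scratchpad format emits one intermediate result per input bit instead of only the final answer, so each step is easy in isolation. The format is invented for illustration and is not the paper's dataset.

# Direct vs chain-of-thought style targets for running parity.
def direct_example(bits):
    target = sum(bits) % 2
    return f"{''.join(map(str, bits))} -> {target}"

def cot_example(bits):
    # Emit the running parity after each bit, so the answer is reached
    # through many easy intermediate steps rather than one hard one.
    steps, parity = [], 0
    for b in bits:
        parity ^= b
        steps.append(str(parity))
    return f"{''.join(map(str, bits))} -> {' '.join(steps)} ; answer {parity}"

print(direct_example([1, 0, 1, 1]))  # 1011 -> 1
print(cot_example([1, 0, 1, 1]))     # 1011 -> 1 1 0 1 ; answer 1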