Showing 1 - 10 of 3,444 results for the search '"P, Kempe"'
The Max-Flow/Min-Cut problem is a fundamental tool in graph theory, with applications in many domains, including data mining, image segmentation, transportation planning, and many types of assignment problems, in addition to being an essential building… (a toy max-flow sketch follows the link below)
External link:
http://arxiv.org/abs/2411.10484
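Since the entry names Max-Flow/Min-Cut as its subject, here is a minimal, self-contained sketch of one classical max-flow algorithm (Edmonds-Karp, breadth-first augmenting paths). It illustrates the problem the snippet names and is not code from the paper; the toy graph and node names are invented.

# Minimal Edmonds-Karp max-flow sketch (BFS augmenting paths).
# Illustrative only: the graph and capacities below are made up.
from collections import deque

def max_flow(capacity, source, sink):
    """capacity: dict of dicts, capacity[u][v] = edge capacity."""
    nodes = set(capacity)
    for u in capacity:
        nodes.update(capacity[u])
    # Residual capacities, including reverse edges initialized to 0.
    residual = {u: {v: 0 for v in nodes} for u in nodes}
    for u in capacity:
        for v, c in capacity[u].items():
            residual[u][v] += c
    flow = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow  # no augmenting path left: flow is maximum
        # Find the bottleneck along the path, then push flow along it.
        bottleneck, v = float("inf"), sink
        while parent[v] is not None:
            u = parent[v]
            bottleneck = min(bottleneck, residual[u][v])
            v = u
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck

# Toy instance: the maximum s-t flow (and minimum cut) is 3.
g = {"s": {"a": 2, "b": 2}, "a": {"t": 2}, "b": {"a": 1, "t": 1}}
print(max_flow(g, "s", "t"))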
Author:
Lee, David H., Prasad, Aditya, Vuong, Ramiro Deo-Campo, Wang, Tianyu, Han, Eric, Kempe, David
Dynamic programming (DP) is a fundamental and powerful algorithmic paradigm taught in most undergraduate (and many graduate) algorithms classes. DP problems are challenging for many computer science students because they require identifying unique… (a classic DP example follows the link below)
External link:
http://arxiv.org/abs/2411.07705
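As a concrete instance of the DP paradigm the snippet describes, here is a minimal bottom-up example: the fewest coins summing to a target amount. It is a textbook exercise chosen for illustration, not material from the paper.

# Classic DP: minimum number of coins to make `amount`.
def min_coins(coins, amount):
    # best[a] = fewest coins summing to a; build bottom-up from 0.
    INF = float("inf")
    best = [0] + [INF] * amount
    for a in range(1, amount + 1):
        for c in coins:
            if c <= a and best[a - c] + 1 < best[a]:
                best[a] = best[a - c] + 1
    return best[amount] if best[amount] < INF else -1

print(min_coins([1, 3, 4], 6))  # 2 (3 + 3)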
We study the implicit bias of the general family of steepest descent algorithms, which includes gradient descent, sign descent and coordinate descent, in deep homogeneous neural networks. We prove that an algorithm-dependent geometric margin starts… (a sketch of the three update rules follows the link below)
External link:
http://arxiv.org/abs/2410.22069
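The three algorithms the snippet names are all steepest descent under different norm geometries: the l2 norm gives gradient descent, the l_infinity norm gives sign descent, and the l1 norm gives a single-coordinate update. Below is a hedged numpy sketch of the three updates on a toy least-squares objective; the data, learning rate, and step count are invented, and the normalized l1 step shown is one common convention, not the paper's setup.

# Steepest descent in three norm geometries on 0.5*||Aw - b||^2.
import numpy as np

rng = np.random.default_rng(0)
A, b = rng.normal(size=(20, 5)), rng.normal(size=20)
grad = lambda w: A.T @ (A @ w - b)  # gradient of the quadratic loss

def step(w, g, lr, geometry):
    if geometry == "l2":      # gradient descent (Euclidean norm)
        return w - lr * g
    if geometry == "linf":    # sign descent (max norm)
        return w - lr * np.sign(g)
    if geometry == "l1":      # update only the largest-gradient coordinate
        i = np.argmax(np.abs(g))
        w = w.copy()
        w[i] -= lr * np.sign(g[i])
        return w

for geometry in ("l2", "linf", "l1"):
    w = np.zeros(5)
    for _ in range(500):
        w = step(w, grad(w), 0.01, geometry)
    print(geometry, round(0.5 * np.linalg.norm(A @ w - b) ** 2, 3))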
Regularization, whether explicit in terms of a penalty in the loss or implicit in the choice of algorithm, is a cornerstone of modern machine learning. Indeed, controlling the complexity of the model class is particularly important when data is scarce… (a small ridge-regression illustration follows the link below)
External link:
http://arxiv.org/abs/2410.16073
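To make the "penalty in the loss" concrete: ridge regression adds lam * ||w||^2 to the squared error and shrinks the solution, which matters most in exactly the scarce-data regime the snippet mentions (n much smaller than d). The toy data and lambda values are invented; this is a generic illustration, not the paper's experiment.

# Explicit regularization: ridge regression on scarce toy data.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(10, 50))            # scarce data: n=10 << d=50
y = X[:, 0] + 0.1 * rng.normal(size=10)  # only feature 0 matters

for lam in (0.01, 1.0, 10.0):
    # minimize ||Xw - y||^2 + lam * ||w||^2  (closed-form solution)
    w = np.linalg.solve(X.T @ X + lam * np.eye(50), X.T @ y)
    print(f"lambda={lam:5.2f}  ||w||={np.linalg.norm(w):.3f}")

Larger lambda shrinks ||w|| and suppresses the spurious directions that ten samples cannot pin down.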
Author:
Charton, François, Kempe, Julia
We study the performance of transformers as a function of the number of repetitions of training examples with algorithmically generated datasets. On three problems of mathematics: the greatest common divisor, modular multiplication, and matrix eigenvalues… (a data-generation sketch follows the link below)
External link:
http://arxiv.org/abs/2410.07041
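A hedged sketch of the data regime the snippet studies: algorithmically generated GCD examples drawn from a fixed pool, so the pool size controls how often each training example repeats. The pool sizes and integer ranges below are invented, not the paper's settings.

# Controlling repetition in an algorithmically generated dataset.
import math, random

random.seed(0)

def make_dataset(num_examples, pool_size):
    # Draw the whole training stream from a fixed pool of `pool_size`
    # distinct problems; smaller pools mean more repetitions per example.
    pool = [(random.randint(1, 10**6), random.randint(1, 10**6))
            for _ in range(pool_size)]
    return [(a, b, math.gcd(a, b)) for a, b in random.choices(pool, k=num_examples)]

fresh    = make_dataset(100_000, pool_size=100_000)  # few repeats
repeated = make_dataset(100_000, pool_size=1_000)    # ~100 repeats each
print(len(set(fresh)), len(set(repeated)))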
Within the scaling laws paradigm, which underpins the training of large neural networks like ChatGPT and Llama, we consider a supervised regression setting and establish the existence of a strong form of the model collapse phenomenon, a critical performance… (a toy multi-generation simulation follows the link below)
External link:
http://arxiv.org/abs/2410.04840
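Below is a toy caricature of model collapse in a regression setting, not the paper's theory: each generation is fit on labels produced by the previous generation's model plus fresh noise, and the fitted weights drift away from the ground truth as errors compound. All dimensions and noise levels are invented.

# Toy multi-generation "model collapse" simulation in regression.
import numpy as np

rng = np.random.default_rng(2)
w_true = rng.normal(size=10)

def fit(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

w = w_true
for gen in range(6):
    X = rng.normal(size=(50, 10))
    y = X @ w + rng.normal(size=50)  # labels from the current "teacher"
    w = fit(X, y)                    # next generation's model
    print(f"generation {gen}: ||w - w_true|| = {np.linalg.norm(w - w_true):.3f}")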
Large language models (LLMs) are trained on a deluge of text data with limited quality control. As a result, LLMs can exhibit unintended or even harmful behaviours, such as leaking information, fake news or hate speech. Countermeasures, commonly referred… (a rough unlearning sketch follows the link below)
External link:
http://arxiv.org/abs/2408.01420
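One countermeasure family the truncated sentence points toward is machine unlearning. Here is a deliberately crude sketch of one such recipe, gradient ascent on a forget set, applied to a toy logistic model; the data split, step counts, and rates are all invented, and this stand-in method is not presented as the paper's proposal.

# Crude unlearning sketch: train on everything, then ascend on a forget set.
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 5))
y = (X @ rng.normal(size=5) > 0).astype(float)
keep, forget = slice(0, 180), slice(180, 200)

sigmoid = lambda z: 1 / (1 + np.exp(-np.clip(z, -30, 30)))

def grad(w, Xb, yb):  # gradient of mean cross-entropy loss
    return Xb.T @ (sigmoid(Xb @ w) - yb) / len(yb)

w = np.zeros(5)
for _ in range(500):   # ordinary training on all data
    w -= 0.5 * grad(w, X, y)
for _ in range(50):    # gradient *ascent* on the forget set only
    w += 0.5 * grad(w, X[forget], y[forget])

acc = lambda s: ((sigmoid(X[s] @ w) > 0.5) == y[s]).mean()
print(f"keep acc {acc(keep):.2f}, forget acc {acc(forget):.2f}")

Note how blunt this is: ascent on the forget set can also damage accuracy on the retained data, which is one reason unlearning is an active research area.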
Large Language Models (LLMs) are increasingly trained on data generated by other LLMs, either because generated text and images become part of the pre-training corpus, or because synthesized data is used as a replacement for expensive human annotation.
External link:
http://arxiv.org/abs/2406.07515
We study the implicit bias of optimization in robust empirical risk minimization (robust ERM) and its connection with robust generalization. In classification settings under adversarial perturbations with linear models, we study what type of regularization… (a closed-form robust-loss sketch follows the link below)
External link:
http://arxiv.org/abs/2406.04981
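For linear models, the inner maximization of robust ERM under an l_infinity perturbation of radius eps has a standard closed form: the margin y * (w . x) becomes y * (w . x) - eps * ||w||_1, which is how an l1-type penalty enters the robust loss. The sketch below uses that identity on invented data with an invented eps; it is an illustration of the setting, not the paper's method.

# Robust ERM for a linear classifier under l_inf perturbations:
# minimize mean log(1 + exp(-(y*w.x - eps*||w||_1))) by gradient descent.
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(300, 20))
y = np.sign(X[:, 0] + 0.1 * rng.normal(size=300))

def robust_loss_grad(w, eps):
    margins = y * (X @ w) - eps * np.linalg.norm(w, 1)  # worst-case margins
    p = 1 / (1 + np.exp(np.clip(margins, -30, 30)))     # -dloss/dmargin
    grad_margin = -(p[:, None] * (y[:, None] * X)).mean(axis=0)
    grad_penalty = eps * p.mean() * np.sign(w)
    return grad_margin + grad_penalty

w = np.zeros(20)
for _ in range(2000):
    w -= 0.1 * robust_loss_grad(w, eps=0.05)
print("robust train accuracy:", (np.sign(X @ w) == y).mean())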
Author:
Cabannes, Vivien, Arnal, Charles, Bouaziz, Wassim, Yang, Alice, Charton, Francois, Kempe, Julia
Chain-of-Thought (CoT) reasoning is known to improve Large Language Models both empirically and in terms of theoretical approximation power. However, our understanding of the inner workings and the conditions under which CoT capabilities emerge remains limited… (a toy scratchpad illustration follows the link below)
External link:
http://arxiv.org/abs/2406.02128
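To make CoT-style supervision concrete, here is a toy illustration on an iterative task (running parity): the scratchpad format emits one intermediate result per input bit instead of only the final answer, so each step is easy in isolation. The format is invented for illustration and is not the paper's dataset.

# Direct vs chain-of-thought style targets for running parity.
def direct_example(bits):
    target = sum(bits) % 2
    return f"{''.join(map(str, bits))} -> {target}"

def cot_example(bits):
    # Emit the running parity after each bit, so the answer is reached
    # through many easy intermediate steps rather than one hard one.
    steps, parity = [], 0
    for b in bits:
        parity ^= b
        steps.append(str(parity))
    return f"{''.join(map(str, bits))} -> {' '.join(steps)} ; answer {parity}"

print(direct_example([1, 0, 1, 1]))  # 1011 -> 1
print(cot_example([1, 0, 1, 1]))     # 1011 -> 1 1 0 1 ; answer 1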