Search results - "Sutherland, P. J."

Report

Understanding Simplicity Bias towards Compositional Mappings via Learning Dynamics

Obtaining compositional mappings is important for the model to generalize well compositionally. To better understand when and how to encourage the model to learn such mappings, we study their uniqueness through different perspectives. Specifically, w

External link: http://arxiv.org/abs/2409.09626

Report

Learning Deep Kernels for Non-Parametric Independence Testing

Authors: Xu, Nathaniel, Liu, Feng, Sutherland, Danica J.

The Hilbert-Schmidt Independence Criterion (HSIC) is a powerful tool for nonparametric detection of dependence between random variables. It crucially depends, however, on the selection of reasonable kernels; commonly-used choices like the Gaussian ke

External link: http://arxiv.org/abs/2409.06890

Report

Why Do You Grok? A Theoretical Analysis of Grokking Modular Addition

Authors: Mohamadi, Mohamad Amin, Li, Zhiyuan, Wu, Lei, Sutherland, Danica J.

We present a theoretical explanation of the "grokking" phenomenon, where a model generalizes long after overfitting, for the originally-studied problem of modular addition. First, we show that early in gradient descent, when the "kernel regime" ap

External link: http://arxiv.org/abs/2407.12332

Report

Generalized Coverage for More Robust Low-Budget Active Learning

Authors: Bae, Wonho, Noh, Junhyug, Sutherland, Danica J.

The ProbCover method of Yehuda et al. is a well-motivated algorithm for active learning in low-budget regimes, which attempts to "cover" the data distribution with balls of a given radius at selected data points. We demonstrate, however, that the per

External link: http://arxiv.org/abs/2407.12212

Report

Learning Dynamics of LLM Finetuning

Authors: Ren, Yi, Sutherland, Danica J.

Learning dynamics, which describes how the learning of specific training examples influences the model's predictions on other examples, gives us a powerful tool for understanding the behavior of deep learning systems. We study the learning dynamics o

External link: http://arxiv.org/abs/2407.10490

Report

Bias Amplification in Language Model Evolution: An Iterated Learning Perspective

Authors: Ren, Yi, Guo, Shangmin, Qiu, Linlu, Wang, Bailin, Sutherland, Danica J.

With the widespread adoption of Large Language Models (LLMs), the prevalence of iterative interactions among these models is anticipated to increase. Notably, recent advancements in multi-round self-improving methods allow LLMs to generate new exampl

External link: http://arxiv.org/abs/2404.04286

Report

Practical Kernel Tests of Conditional Independence

Authors: Pogodin, Roman, Schrab, Antonin, Li, Yazhe, Sutherland, Danica J., Gretton, Arthur

We describe a data-efficient, kernel-based approach to statistical testing of conditional independence. A major challenge of conditional independence testing, absent in tests of unconditional independence, is to obtain the correct test level (the spe

External link: http://arxiv.org/abs/2402.13196

Report

A Summary of Known Bounds on the Essential Dimension and Resolvent Degree of Finite Groups

Author: Sutherland, Alexander J.

We summarize what is currently known about ed$(G)$ and RD$(G)$ for finite groups $G$ over $\mathbb{C}$ (i.e. in characteristic 0). In Appendix A, we also give an argument which improves the known bound on RD(PSL$(2,\mathbb{F}_{11})$).
Comment: 6

External link: http://arxiv.org/abs/2312.04430

Report

AdaFlood: Adaptive Flood Regularization

Authors: Bae, Wonho, Ren, Yi, Ahmed, Mohamad Osama, Tung, Frederick, Sutherland, Danica J., Oliveira, Gabriel L.

Although neural networks are conventionally optimized towards zero training loss, it has been recently learned that targeting a non-zero training loss threshold, referred to as a flood level, often enables better test time generalization. Current app

External link: http://arxiv.org/abs/2311.02891

Report

Exploring Active Learning in Meta-Learning: Enhancing Context Set Labeling

Authors: Bae, Wonho, Wang, Jing, Sutherland, Danica J.

Most meta-learning methods assume that the (very small) context set used to establish a new task at test time is passively provided. In some settings, however, it is feasible to actively select which points to label; the potential gain from a careful

External link: http://arxiv.org/abs/2311.02879
