Search results - "Immer, Alexander"

Report

Uncertainty-Penalized Direct Preference Optimization

Authors: Houliston, Sam; Pace, Alizée; Immer, Alexander; Rätsch, Gunnar

Aligning Large Language Models (LLMs) to human preferences in content, style, and presentation is challenging, in part because preferences are varied, context-dependent, and sometimes inherently ambiguous. While successful, Reinforcement Learning fro

External link: http://arxiv.org/abs/2410.20187

Report

Influence Functions for Scalable Data Attribution in Diffusion Models

Authors: Mlodozeniec, Bruno; Eschenhagen, Runa; Bae, Juhan; Immer, Alexander; Krueger, David; Turner, Richard

Diffusion models have led to significant advancements in generative modelling. Yet their widespread adoption poses challenges regarding data attribution and interpretability. In this paper, we aim to help address such challenges in diffusion models b

External link: http://arxiv.org/abs/2410.13850

Report

Shaving Weights with Occam's Razor: Bayesian Sparsification for Neural Networks Using the Marginal Likelihood

Authors: Dhahri, Rayen; Immer, Alexander; Charpentier, Bertrand; Günnemann, Stephan; Fortuin, Vincent

Neural network sparsification is a promising avenue to save computational time and memory costs, especially in an age where many successful AI models are becoming too large to naïvely deploy on consumer hardware. While much work has focused on diff

External link: http://arxiv.org/abs/2402.15978

Report

Position: Bayesian Deep Learning is Needed in the Age of Large-Scale AI

In the current landscape of deep learning research, there is a predominant emphasis on achieving high predictive accuracy in supervised tasks involving large image and language datasets. However, a broader perspective reveals a multitude of overlooke

External link: http://arxiv.org/abs/2402.00809

Report

Uncertainty in Graph Contrastive Learning with Bayesian Neural Networks

Authors: Möllers, Alexander; Immer, Alexander; Isufi, Elvin; Fortuin, Vincent

Graph contrastive learning has shown great promise when labeled data is scarce, but large unlabeled datasets are available. However, it often does not take uncertainty estimation into account. We show that a variational Bayesian neural network approa

External link: http://arxiv.org/abs/2312.00232

Report

Kronecker-Factored Approximate Curvature for Modern Neural Network Architectures

Authors: Eschenhagen, Runa; Immer, Alexander; Turner, Richard E.; Schneider, Frank; Hennig, Philipp

The core components of many modern neural network architectures, such as transformers, convolutional, or graph neural networks, can be expressed as linear layers with weight-sharing. Kronecker-Factored Approximate Curvature (K-FAC), a seco

External link: http://arxiv.org/abs/2311.00636

Report

Learning Layer-wise Equivariances Automatically using Gradients

Authors: van der Ouderaa, Tycho F. A.; Immer, Alexander; van der Wilk, Mark

Convolutions encode equivariance symmetries into neural networks leading to better generalisation performance. However, symmetries provide fixed hard constraints on the functions a network can represent, need to be specified in advance, and can not b

External link: http://arxiv.org/abs/2310.06131

Report

Towards Training Without Depth Limits: Batch Normalization Without Gradient Explosion

Authors: Meterez, Alexandru; Joudaki, Amir; Orabona, Francesco; Immer, Alexander; Rätsch, Gunnar; Daneshmand, Hadi

Normalization layers are one of the key building blocks for deep neural networks. Several theoretical studies have shown that batch normalization improves the signal propagation, by avoiding the representations from becoming collinear across the laye

External link: http://arxiv.org/abs/2310.02012

Report

Hodge-Aware Contrastive Learning

Authors: Möllers, Alexander; Immer, Alexander; Fortuin, Vincent; Isufi, Elvin

Simplicial complexes prove effective in modeling data with multiway dependencies, such as data defined along the edges of networks or within other higher-order structures. Their spectrum can be decomposed into three interpretable subspaces via the Ho

External link: http://arxiv.org/abs/2309.07364

Report

Stochastic Marginal Likelihood Gradients using Neural Tangent Kernels

Authors: Immer, Alexander; van der Ouderaa, Tycho F. A.; van der Wilk, Mark; Rätsch, Gunnar; Schölkopf, Bernhard

Selecting hyperparameters in deep learning greatly impacts its effectiveness but requires manual effort and expertise. Recent works show that Bayesian model selection with Laplace approximations can allow to optimize such hyperparameters just like st

External link: http://arxiv.org/abs/2306.03968
