Showing 1 - 10 of 43 for search: '"Immer, Alexander"'
Aligning Large Language Models (LLMs) to human preferences in content, style, and presentation is challenging, in part because preferences are varied, context-dependent, and sometimes inherently ambiguous. While successful, Reinforcement Learning fro
External link:
http://arxiv.org/abs/2410.20187
Author:
Mlodozeniec, Bruno, Eschenhagen, Runa, Bae, Juhan, Immer, Alexander, Krueger, David, Turner, Richard
Diffusion models have led to significant advancements in generative modelling. Yet their widespread adoption poses challenges regarding data attribution and interpretability. In this paper, we aim to help address such challenges in diffusion models b
External link:
http://arxiv.org/abs/2410.13850
Neural network sparsification is a promising avenue to save computational time and memory costs, especially in an age where many successful AI models are becoming too large to naïvely deploy on consumer hardware. While much work has focused on diff
External link:
http://arxiv.org/abs/2402.15978
Author:
Papamarkou, Theodore, Skoularidou, Maria, Palla, Konstantina, Aitchison, Laurence, Arbel, Julyan, Dunson, David, Filippone, Maurizio, Fortuin, Vincent, Hennig, Philipp, Hernández-Lobato, José Miguel, Hubin, Aliaksandr, Immer, Alexander, Karaletsos, Theofanis, Khan, Mohammad Emtiyaz, Kristiadi, Agustinus, Li, Yingzhen, Mandt, Stephan, Nemeth, Christopher, Osborne, Michael A., Rudner, Tim G. J., Rügamer, David, Teh, Yee Whye, Welling, Max, Wilson, Andrew Gordon, Zhang, Ruqi
In the current landscape of deep learning research, there is a predominant emphasis on achieving high predictive accuracy in supervised tasks involving large image and language datasets. However, a broader perspective reveals a multitude of overlooke
External link:
http://arxiv.org/abs/2402.00809
Graph contrastive learning has shown great promise when labeled data is scarce, but large unlabeled datasets are available. However, it often does not take uncertainty estimation into account. We show that a variational Bayesian neural network approa
External link:
http://arxiv.org/abs/2312.00232
The core components of many modern neural network architectures, such as transformers, convolutional, or graph neural networks, can be expressed as linear layers with $\textit{weight-sharing}$. Kronecker-Factored Approximate Curvature (K-FAC), a seco
External link:
http://arxiv.org/abs/2311.00636
Convolutions encode equivariance symmetries into neural networks leading to better generalisation performance. However, symmetries provide fixed hard constraints on the functions a network can represent, need to be specified in advance, and can not b
External link:
http://arxiv.org/abs/2310.06131
Author:
Meterez, Alexandru, Joudaki, Amir, Orabona, Francesco, Immer, Alexander, Rätsch, Gunnar, Daneshmand, Hadi
Normalization layers are one of the key building blocks for deep neural networks. Several theoretical studies have shown that batch normalization improves the signal propagation, by avoiding the representations from becoming collinear across the laye
External link:
http://arxiv.org/abs/2310.02012
Simplicial complexes prove effective in modeling data with multiway dependencies, such as data defined along the edges of networks or within other higher-order structures. Their spectrum can be decomposed into three interpretable subspaces via the Ho
External link:
http://arxiv.org/abs/2309.07364
Author:
Immer, Alexander, van der Ouderaa, Tycho F. A., van der Wilk, Mark, Rätsch, Gunnar, Schölkopf, Bernhard
Selecting hyperparameters in deep learning greatly impacts its effectiveness but requires manual effort and expertise. Recent works show that Bayesian model selection with Laplace approximations can allow to optimize such hyperparameters just like st
External link:
http://arxiv.org/abs/2306.03968