Showing 1 - 10 of 2,622 for search: '"Hochreiter, A"'
Large Language Models (LLMs) are increasingly employed in real-world applications, driving the need to evaluate the trustworthiness of their generated text. To this end, reliable uncertainty estimation is essential. Since current LLMs generate text a…
External link:
http://arxiv.org/abs/2412.15176
While Transformers and other sequence-parallelizable neural network architectures seem like the current state of the art in sequence modeling, they specifically lack state-tracking capabilities. These are important for time-series tasks and logical r…
External link:
http://arxiv.org/abs/2412.07752
Author:
Schmidinger, Niklas, Schneckenreiter, Lisa, Seidl, Philipp, Schimunek, Johannes, Hoedt, Pieter-Jan, Brandstetter, Johannes, Mayr, Andreas, Luukkonen, Sohvi, Hochreiter, Sepp, Klambauer, Günter
Language models for biological and chemical sequences enable crucial applications such as drug discovery, protein engineering, and precision medicine. Currently, these language models are predominantly based on Transformer architectures. While Transf…
External link:
http://arxiv.org/abs/2411.04165
Author:
Schmied, Thomas, Adler, Thomas, Patil, Vihang, Beck, Maximilian, Pöppel, Korbinian, Brandstetter, Johannes, Klambauer, Günter, Pascanu, Razvan, Hochreiter, Sepp
In recent years, there has been a trend in the field of Reinforcement Learning (RL) towards large action models trained offline on large-scale datasets via sequence modeling. Existing models are primarily based on the Transformer architecture, which…
External link:
http://arxiv.org/abs/2410.22391
Ensembles of Deep Neural Networks, Deep Ensembles, are widely used as a simple way to boost predictive performance. However, their impact on algorithmic fairness is not yet well understood. Algorithmic fairness investigates how a model's performance…
External link:
http://arxiv.org/abs/2410.13831
Reliable estimation of predictive uncertainty is crucial for machine learning applications, particularly in high-stakes scenarios where hedging against risks is essential. Despite its significance, a consensus on the correct measurement of predictive…
External link:
http://arxiv.org/abs/2410.10786
Author:
Paischer, Fabian, Hauzenberger, Lukas, Schmied, Thomas, Alkin, Benedikt, Deisenroth, Marc Peter, Hochreiter, Sepp
Foundation models (FMs) are pre-trained on large-scale datasets and then fine-tuned on a downstream task for a specific application. The most successful and most commonly used fine-tuning method is to update the pre-trained weights via a low-rank ada…
External link:
http://arxiv.org/abs/2410.07170
Author:
Schmied, Thomas, Paischer, Fabian, Patil, Vihang, Hofmarcher, Markus, Pascanu, Razvan, Hochreiter, Sepp
In-context learning (ICL) is the ability of a model to learn a new task by observing a few exemplars in its context. While prevalent in NLP, this capability has recently also been observed in Reinforcement Learning (RL) settings. Prior in-context RL…
External link:
http://arxiv.org/abs/2410.07071
Humans excel at abstracting data and constructing reusable concepts, a capability lacking in current continual learning systems. The field of object-centric learning addresses this by developing abstract representations, or slots, from data wi…
External link:
http://arxiv.org/abs/2410.00728
Learning agents with reinforcement learning is difficult when dealing with long trajectories that involve a large number of states. To address these learning problems effectively, the number of states can be reduced by abstract representations that c…
External link:
http://arxiv.org/abs/2410.00704