Showing 1 - 10 of 2,622 for search: '"Hochreiter, A"'
Large Language Models (LLMs) are increasingly employed in real-world applications, driving the need to evaluate the trustworthiness of their generated text. To this end, reliable uncertainty estimation is essential. Since current LLMs generate text a…
External link:
http://arxiv.org/abs/2412.15176
While Transformers and other sequence-parallelizable neural network architectures seem like the current state of the art in sequence modeling, they specifically lack state-tracking capabilities. These are important for time-series tasks and logical r…
External link:
http://arxiv.org/abs/2412.07752
Author:
Schmidinger, Niklas, Schneckenreiter, Lisa, Seidl, Philipp, Schimunek, Johannes, Hoedt, Pieter-Jan, Brandstetter, Johannes, Mayr, Andreas, Luukkonen, Sohvi, Hochreiter, Sepp, Klambauer, Günter
Language models for biological and chemical sequences enable crucial applications such as drug discovery, protein engineering, and precision medicine. Currently, these language models are predominantly based on Transformer architectures. While Transf…
External link:
http://arxiv.org/abs/2411.04165
Author:
Schmied, Thomas, Adler, Thomas, Patil, Vihang, Beck, Maximilian, Pöppel, Korbinian, Brandstetter, Johannes, Klambauer, Günter, Pascanu, Razvan, Hochreiter, Sepp
In recent years, there has been a trend in the field of Reinforcement Learning (RL) towards large action models trained offline on large-scale datasets via sequence modeling. Existing models are primarily based on the Transformer architecture, which…
External link:
http://arxiv.org/abs/2410.22391
Ensembles of Deep Neural Networks, Deep Ensembles, are widely used as a simple way to boost predictive performance. However, their impact on algorithmic fairness is not yet well understood. Algorithmic fairness investigates how a model's performance…
External link:
http://arxiv.org/abs/2410.13831
Reliable estimation of predictive uncertainty is crucial for machine learning applications, particularly in high-stakes scenarios where hedging against risks is essential. Despite its significance, a consensus on the correct measurement of predictive…
External link:
http://arxiv.org/abs/2410.10786
Author:
Paischer, Fabian, Hauzenberger, Lukas, Schmied, Thomas, Alkin, Benedikt, Deisenroth, Marc Peter, Hochreiter, Sepp
Foundation models (FMs) are pre-trained on large-scale datasets and then fine-tuned on a downstream task for a specific application. The most successful and most commonly used fine-tuning method is to update the pre-trained weights via a low-rank ada…
External link:
http://arxiv.org/abs/2410.07170
Author:
Schmied, Thomas, Paischer, Fabian, Patil, Vihang, Hofmarcher, Markus, Pascanu, Razvan, Hochreiter, Sepp
In-context learning (ICL) is the ability of a model to learn a new task by observing a few exemplars in its context. While prevalent in NLP, this capability has recently also been observed in Reinforcement Learning (RL) settings. Prior in-context RL…
External link:
http://arxiv.org/abs/2410.07071
Humans excel at abstracting data and constructing reusable concepts, a capability lacking in current continual learning systems. The field of object-centric learning addresses this by developing abstract representations, or slots, from data wi…
External link:
http://arxiv.org/abs/2410.00728
Learning agents with reinforcement learning is difficult when dealing with long trajectories that involve a large number of states. To address these learning problems effectively, the number of states can be reduced by abstract representations that c…
External link:
http://arxiv.org/abs/2410.00704