Showing 1 - 10 of 62 for search: '"Udluft, Steffen"'
The analysis of variance (ANOVA) decomposition offers a systematic method to understand the interaction effects that contribute to a specific decision output. In this paper we introduce Neural-ANOVA, an approach to decompose neural networks into glassbox models …
External link:
http://arxiv.org/abs/2408.12319
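For context, the functional ANOVA decomposition referenced in this abstract expands a d-variate function into a mean, main effects, and interaction terms; a standard, paper-independent form is

f(x_1, \dots, x_d) = f_0 + \sum_i f_i(x_i) + \sum_{i<j} f_{ij}(x_i, x_j) + \dots + f_{1 \dots d}(x_1, \dots, x_d),

where f_0 is the overall mean and each higher-order component integrates to zero over each of its arguments, which makes the decomposition unique and lets interaction effects be read off term by term.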
This paper explores the use of model-based offline reinforcement learning with long model rollouts. While some literature criticizes this approach due to compounding errors, many practitioners have found success in real-world applications. …
External link:
http://arxiv.org/abs/2407.11751
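As a minimal sketch of the long-rollout mechanism this abstract discusses (names like model and policy are hypothetical stand-ins for a learned one-step dynamics model and a control policy, not the paper's code), the compounding-error concern arises because each predicted state is fed back into the model:

import numpy as np

def rollout(model, policy, s0, horizon):
    # Generate a synthetic trajectory by iterating a learned dynamics model.
    # Each step feeds the model's own prediction back in, so one-step errors
    # can compound as the horizon grows -- the criticism examined above.
    states, actions = [np.asarray(s0)], []
    s = np.asarray(s0)
    for _ in range(horizon):
        a = policy(s)        # action proposed by the policy under evaluation
        s = model(s, a)      # predicted next state replaces the true one
        actions.append(a)
        states.append(s)
    return np.stack(states), np.stack(actions)

Longer horizons trade richer synthetic data against growing model error, which is exactly the trade-off the paper weighs.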
This paper presents the first algorithm for model-based offline quantum reinforcement learning and demonstrates its functionality on the cart-pole benchmark. The model and the policy to be optimized are each implemented as variational quantum circuits. …
External link:
http://arxiv.org/abs/2404.10017
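As an illustration of the building block named here: a variational quantum circuit is a parameterized unitary whose measured expectation values are trained like network weights. A minimal two-qubit state-vector sketch (illustrative only, not the authors' implementation):

import numpy as np

def ry(theta):
    # Single-qubit rotation about the Y axis by angle theta.
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

# CNOT with qubit 0 as control, qubit 1 as target (basis order |00>,|01>,|10>,|11>).
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)

def vqc_expectation(params):
    # Expectation of Z on qubit 0 after RY(p0) x RY(p1) and a CNOT, from |00>.
    state = np.zeros(4); state[0] = 1.0
    state = np.kron(ry(params[0]), ry(params[1])) @ state
    state = CNOT @ state
    z0 = np.kron(np.diag([1.0, -1.0]), np.eye(2))   # Z on qubit 0
    return float(state @ (z0 @ state))

A classical optimizer tunes params to move vqc_expectation toward a target, which is the sense in which such circuits can stand in for a model or a policy.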
Published in:
2023 IEEE Symposium Series on Computational Intelligence
Offline reinforcement learning provides a viable approach to obtain advanced control strategies for dynamical systems, in particular when direct interaction with the environment is not available. In this paper, we introduce a conceptual extension …
External link:
http://arxiv.org/abs/2308.06127
Recently, offline RL algorithms have been proposed that remain adaptive at runtime. For example, the LION algorithm [lion] provides the user with an interface to set the trade-off between behavior cloning and optimality w.r.t. the estimated return. …
External link:
http://arxiv.org/abs/2306.09744
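A hedged sketch of the user-adjustable objective this abstract describes (hypothetical signatures; LION conditions the policy on the trade-off parameter during training so it can be changed at runtime without retraining):

def lion_style_loss(policy, est_return, states, behavior_actions, lam):
    # lam in [0, 1]: 1.0 -> pure behavior cloning, 0.0 -> pure return maximization.
    actions = policy(states, lam)                       # policy conditioned on lam
    bc = ((actions - behavior_actions) ** 2).mean()     # stay close to the dataset
    perf = -est_return(states, actions).mean()          # push up the estimated return
    return lam * bc + (1.0 - lam) * perf

The inputs are assumed to be NumPy arrays or tensors with the same interface.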
Safe Policy Improvement (SPI) is an important technique for offline reinforcement learning in safety-critical applications, as it improves the behavior policy with high probability. We classify various SPI approaches from the literature into two groups …
External link:
http://arxiv.org/abs/2208.00724
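The defining SPI guarantee mentioned above is typically stated as a high-probability bound (standard notation, not specific to this paper): with probability at least 1 - \delta, the learned policy \pi loses at most \zeta performance relative to the behavior policy \pi_b,

P( J(\pi) \ge J(\pi_b) - \zeta ) \ge 1 - \delta,

where J denotes expected return and \zeta, \delta are user-chosen tolerance parameters.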
We present a full implementation and simulation of a novel quantum reinforcement learning method. Our work is a detailed and formal proof of concept for how quantum algorithms can be used to solve reinforcement learning problems …
External link:
http://arxiv.org/abs/2206.04741
Offline reinforcement learning algorithms still lack trust in practice due to the risk that the learned policy performs worse than the original policy that generated the dataset, or behaves in an unexpected way that is unfamiliar to the user. …
External link:
http://arxiv.org/abs/2205.10629
Safe Policy Improvement (SPI) aims at provable guarantees that a learned policy is at least approximately as good as a given baseline policy. Building on SPI with Soft Baseline Bootstrapping (Soft-SPIBB) by Nadjahi et al., we identify theoretical issues …
External link:
http://arxiv.org/abs/2201.12175
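For reference, the Soft-SPIBB idea this work builds on constrains how far the learned policy \pi may move from the baseline \pi_b, weighted by a state-action error estimate e_Q (notation approximately as in Nadjahi et al.):

\sum_a |\pi(a \mid s) - \pi_b(a \mid s)| \, e_Q(s, a) \le \epsilon   for all states s,

so deviations from the baseline are only permitted where the value estimate is deemed reliable.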
Offline reinforcement learning (RL) algorithms are often designed with environments such as MuJoCo in mind, in which the planning horizon is extremely long and no noise exists. We compare model-free, model-based, as well as hybrid offline RL approaches …
External link:
http://arxiv.org/abs/2201.05433