Showing 1 - 10 of 16 for search: '"Suilen, Marnix"'
Author:
Galesloot, Maris F. L., Suilen, Marnix, Simão, Thiago D., Carr, Steven, Spaan, Matthijs T. J., Topcu, Ufuk, Jansen, Nils
Robust partially observable Markov decision processes (robust POMDPs) extend classical POMDPs to handle additional uncertainty on the transition and observation probabilities via so-called uncertainty sets. Policies for robust POMDPs must not only be…
External link:
http://arxiv.org/abs/2408.08770
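As an illustration of the uncertainty sets this abstract refers to, one common instantiation (assumed here for concreteness; the paper's formulation may be more general) is interval-valued transition probabilities. A robust policy is then evaluated against the worst-case choice of probabilities from the set:

\[
\mathcal{U}(s,a) \;=\; \Bigl\{\, P(\cdot \mid s,a) \;\Big|\; P(s' \mid s,a) \in \bigl[\underline{P}(s' \mid s,a),\, \overline{P}(s' \mid s,a)\bigr] \ \forall s',\ \textstyle\sum_{s'} P(s' \mid s,a) = 1 \,\Bigr\},
\]

with an analogous interval set for the observation function.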
Markov Decision Processes (MDPs) model systems with uncertain transition dynamics. Multiple-environment MDPs (MEMDPs) extend MDPs and intuitively reflect finite sets of MDPs that share the same state and action spaces but differ in the transition d…
External link:
http://arxiv.org/abs/2407.07006
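To make the MEMDP notion concrete, a minimal sketch in generic notation (an assumption for illustration, not the paper's exact definition) is a shared state space S and action space A equipped with one transition function per environment:

\[
\mathcal{M} \;=\; \bigl(S,\, A,\, \{P_1, \dots, P_k\}\bigr), \qquad P_i \colon S \times A \to \Delta(S), \quad i = 1, \dots, k,
\]

where each P_i describes one of the k possible environments and the agent does not know in advance which P_i governs the transitions it observes.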
Partially observable Markov decision processes (POMDPs) rely on the key assumption that probability distributions are precisely known. Robust POMDPs (RPOMDPs) alleviate this concern by defining imprecise probabilities, referred to as uncertainty sets…
External link:
http://arxiv.org/abs/2405.04941
Author:
Wienhöft, Patrick, Suilen, Marnix, Simão, Thiago D., Dubslaff, Clemens, Baier, Christel, Jansen, Nils
In an offline reinforcement learning setting, the safe policy improvement (SPI) problem aims to improve the performance of a behavior policy according to which sample data has been generated. State-of-the-art approaches to SPI require a high number o…
External link:
http://arxiv.org/abs/2305.07958
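To illustrate the sample-count issue raised above, a widely used SPI strategy (in the spirit of SPI with baseline bootstrapping; not necessarily the algorithm of this paper) lets the improved policy deviate from the behavior policy only where the historical data provides enough evidence:

\[
\pi_{\mathrm{new}}(a \mid s) \;=\;
\begin{cases}
\pi_{b}(a \mid s) & \text{if } N(s,a) < N_{\wedge},\\
\text{optimized w.r.t.\ the estimated model} & \text{otherwise},
\end{cases}
\]

where N(s,a) counts occurrences of the state-action pair in the data and N_{\wedge} is a confidence threshold; larger thresholds yield stronger improvement guarantees but demand more samples.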
This position paper reflects on the state of the art in decision-making under uncertainty. A classical assumption is that probabilities can sufficiently capture all uncertainty in a system. In this paper, the focus is on the uncertainty that goes bey…
External link:
http://arxiv.org/abs/2303.05848
We study safe policy improvement (SPI) for partially observable Markov decision processes (POMDPs). SPI is an offline reinforcement learning (RL) problem that assumes access to (1) historical data about an environment, and (2) the so-called behavior…
External link:
http://arxiv.org/abs/2301.04939
Markov decision processes (MDPs) are formal models commonly used in sequential decision-making. MDPs capture the stochasticity that may arise, for instance, from imprecise actuators via probabilities in the transition function. However, in data-drive…
External link:
http://arxiv.org/abs/2205.15827
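To illustrate the data-driven setting the truncated sentence points at, a standard (assumed, not paper-specific) way to obtain transition probabilities is the maximum-likelihood estimate from observed counts, which is only a point estimate and can be imprecise when data is scarce:

\[
\hat{P}(s' \mid s, a) \;=\; \frac{N(s,a,s')}{N(s,a)}, \qquad N(s,a) = \textstyle\sum_{s'} N(s,a,s'),
\]

where N(s,a,s') counts observed transitions from s under action a to s'; interval-valued or otherwise uncertain transition functions can account for the estimation error in \hat{P}.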
Published in:
NASA Formal Methods (NFM) 2021
We study a smart grid with wind power and battery storage. Traditionally, day-ahead planning aims to balance demand and wind power, yet actual wind conditions often deviate from forecasts. Short-term flexibility in storage and generation fills potent…
External link:
http://arxiv.org/abs/2101.12496
Author:
Cubuktepe, Murat, Jansen, Nils, Junges, Sebastian, Marandi, Ahmadreza, Suilen, Marnix, Topcu, Ufuk
Uncertain partially observable Markov decision processes (uPOMDPs) allow the probabilistic transition and observation functions of standard POMDPs to belong to a so-called uncertainty set. Such uncertainty, referred to as epistemic uncertainty, captu…
External link:
http://arxiv.org/abs/2009.11459
We study the problem of policy synthesis for uncertain partially observable Markov decision processes (uPOMDPs). The transition probability function of uPOMDPs is only known to belong to a so-called uncertainty set, for instance in the form of probab…
External link:
http://arxiv.org/abs/2001.08174
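As a generic formulation of the synthesis problem described above (notation assumed for illustration), robust policy synthesis asks for a policy that performs well against the worst-case resolution of the uncertainty set \mathcal{U}, e.g. for a discounted-reward objective:

\[
\max_{\pi} \; \min_{P \in \mathcal{U}} \; \mathbb{E}^{\pi}_{P}\!\Bigl[\, \textstyle\sum_{t \ge 0} \gamma^{t} r_t \,\Bigr],
\]

i.e., the policy \pi maximizes the expected return under the least favorable transition probabilities admitted by \mathcal{U}; the same maximin structure applies to reachability or other temporal objectives.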