Showing 1 - 10 of 6,172 results for the search: '"P. Petrik"'
Author:
Cvetković Sonja S.
Published in:
Baština, Vol 2019, Iss 48, Pp 387-397 (2019)
The paper analyzes the visual aspect of the score cover page and the musical content of the Marš Miloša S. Milojevića (Miloš S. Milojević's March) for piano (1881), composed by Vićentije Petrik, a Czech musician who worked in Serbia at the end …
External link:
https://doaj.org/article/5d478984ead34765b67818eaacdda671
In Markov decision processes (MDPs), quantile risk measures such as Value-at-Risk are a standard metric for modeling RL agents' preferences for certain outcomes. This paper proposes a new Q-learning algorithm for quantile optimization in MDPs with st…
External link:
http://arxiv.org/abs/2410.24128
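The Value-at-Risk measure mentioned in the abstract above is simply a quantile of the return distribution. As a generic illustration (not the paper's Q-learning algorithm), it can be estimated from sampled returns:

```python
import numpy as np

# Value-at-Risk at level alpha is the lower alpha-quantile of the return
# distribution; a risk-averse agent prefers policies whose low quantiles
# are higher. Generic illustration only, not the paper's method.
def value_at_risk(returns, alpha=0.05):
    """Lower alpha-quantile of an array of sampled returns."""
    return float(np.quantile(returns, alpha))

# Example: returns drawn from a normal distribution with mean 1, std 2.
rng = np.random.default_rng(0)
sampled_returns = rng.normal(loc=1.0, scale=2.0, size=10_000)
var_5 = value_at_risk(sampled_returns, alpha=0.05)
```

For a Normal(1, 2) return distribution the 5% VaR sits near 1.0 − 1.645 × 2.0 ≈ −2.3, which is what the estimator recovers from samples.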
We develop a generic policy gradient method with the global optimality guarantee for robust Markov Decision Processes (MDPs). While policy gradient methods are widely used for solving dynamic decision problems due to their scalable and efficient nature …
External link:
http://arxiv.org/abs/2410.22114
Optimizing risk-averse objectives in discounted MDPs is challenging because most models do not admit direct dynamic programming equations and require complex history-dependent policies. In this paper, we show that the risk-averse total reward criterion …
External link:
http://arxiv.org/abs/2408.17286
Author:
Su, Xihong, Petrik, Marek
Multi-model Markov decision process (MMDP) is a promising framework for computing policies that are robust to parameter uncertainty in MDPs. MMDPs aim to find a policy that maximizes the expected return over a distribution of MDP models. Because MMDP …
External link:
http://arxiv.org/abs/2407.06329
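The MMDP objective described in the abstract above, the expected return of one policy averaged over a distribution of MDP models, can be sketched for small tabular MDPs by evaluating the policy exactly in each model and weighting the results. All names below are illustrative, not from the paper:

```python
import numpy as np

def policy_return(P, R, pi, gamma=0.9):
    """Exact discounted return of deterministic policy pi in one tabular MDP.

    P: (S, A, S) transition probabilities, R: (S, A) rewards,
    pi: (S,) action index per state. Uses a uniform initial distribution.
    """
    S = R.shape[0]
    P_pi = P[np.arange(S), pi]   # (S, S) transitions under pi
    r_pi = R[np.arange(S), pi]   # (S,) rewards under pi
    # Solve (I - gamma * P_pi) v = r_pi for the value function v.
    v = np.linalg.solve(np.eye(S) - gamma * P_pi, r_pi)
    return float(v.mean())

def mmdp_objective(models, weights, pi):
    """Weighted expected return of pi over a distribution of MDP models."""
    return sum(w * policy_return(P, R, pi) for (P, R), w in zip(models, weights))

# Two one-state models that differ only in reward (1 vs. 2), equal weights:
P = np.array([[[1.0]]])
models = [(P, np.array([[1.0]])), (P, np.array([[2.0]]))]
objective = mmdp_objective(models, [0.5, 0.5], np.array([0]))
```

With gamma = 0.9 the two models yield returns 1/(1 − 0.9) = 10 and 2/(1 − 0.9) = 20, so the equal-weight objective is 15.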
Author:
Jens Soeren Torrau
Published in:
Journal of Social Science Education, Vol 20, Iss 4 (2022)
External link:
https://doaj.org/article/3e0ceccb0c9441bb85e1b67fc6895263
In reinforcement learning, robust policies for high-stakes decision-making problems with limited data are usually computed by optimizing the percentile criterion. The percentile criterion is approximately solved by constructing an ambiguity set …
External link:
http://arxiv.org/abs/2404.05055
Off-policy Evaluation (OPE) methods are a crucial tool for evaluating policies in high-stakes domains such as healthcare, where exploration is often infeasible, unethical, or expensive. However, the extent to which such methods can be trusted under a …
External link:
http://arxiv.org/abs/2404.04714
Academic article
This result cannot be displayed to users who are not signed in; sign in to view it.
Author:
Petrik, Jan, Bambach, Markus
Published in:
Journal of Manufacturing Processes 121 (2024) 193-204
This study presents a novel method for microstructure control in closed die hot forging that combines Model Predictive Control (MPC) with a developed machine learning model called DeepForge. DeepForge uses an architecture that combines 1D convolution …
External link:
http://arxiv.org/abs/2402.16119