Showing 1 - 7 of 7 for search: '"Kaledin, Maxim"'
Policy-gradient methods in Reinforcement Learning (RL) are highly general and widely applied in practice, but their performance suffers from the high variance of the gradient estimate. Several procedures have been proposed to reduce it, including actor-criti…
External link:
http://arxiv.org/abs/2206.06827
The linear two-timescale stochastic approximation (SA) scheme is an important class of algorithms which has become popular in reinforcement learning (RL), particularly for the policy evaluation problem. Recently, a number of works have been devoted to es…
External link:
http://arxiv.org/abs/2002.01268
Author:
Belomestny, Denis1,2 (AUTHOR), Kaledin, Maxim2 (AUTHOR), Schoenmakers, John3 (AUTHOR) schoenma@wias-berlin.de
Published in:
Mathematical Finance. Oct 2020, Vol. 30, Issue 4, p1591-1616. 26p.
In this article we propose a Weighted Stochastic Mesh (WSM) algorithm for approximating the value of discrete- and continuous-time optimal stopping problems. We prove that in the discrete case the WSM algorithm leads to semi-tractability of the corre…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::1c5303e57b0a64b45e1dc859cacb7843
Author:
Efimov, Alexander I.1,2 (AUTHOR) efimov@mccme.ru
Published in:
Inventiones Mathematicae. Nov 2020, Vol. 222, Issue 2, p667-694. 28p.
Author:
Efimov, Alexander I.
Published in:
Journal of the European Mathematical Society (EMS Publishing); 2020, Vol. 22, Issue 9, p2879-2942. 64p.
Author:
Ademir Hujdurović, Klavdija Kutnar, Dragan Marušič, Štefko Miklavič, Tomaž Pisanski, Primož Šparl
The European Congress of Mathematics, held every four years, is a well-established major international mathematical event. Following those in Paris (1992), Budapest (1996), Barcelona (2000), Stockholm (2004), Amsterdam (2008), Kraków (2012), and B…