Showing 1 - 4 of 4 for search: '"Kaledin, Maxim"'
Policy-gradient methods in Reinforcement Learning (RL) are very universal and widely applied in practice, but their performance suffers from the high variance of the gradient estimate. Several procedures were proposed to reduce it including actor-criti
External link:
http://arxiv.org/abs/2206.06827
Linear two-timescale stochastic approximation (SA) scheme is an important class of algorithms which has become popular in reinforcement learning (RL), particularly for the policy evaluation problem. Recently, a number of works have been devoted to es
External link:
http://arxiv.org/abs/2002.01268
Author:
Belomestny, Denis (AUTHOR), Kaledin, Maxim (AUTHOR), Schoenmakers, John (AUTHOR) schoenma@wias-berlin.de
Published in:
Mathematical Finance. Oct 2020, Vol. 30, Issue 4, pp. 1591-1616. 26 pp.
In this article we propose a Weighted Stochastic Mesh (WSM) Algorithm for approximating the value of a discrete and continuous time optimal stopping problem. We prove that in the discrete case the WSM algorithm leads to semi-tractability of the corre
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::1c5303e57b0a64b45e1dc859cacb7843