Search results - "Hong, Sungee"

Report

Distributional Off-policy Evaluation with Bellman Residual Minimization

Authors: Hong, Sungee; Qi, Zhengling; Wong, Raymond K. W.

We study distributional off-policy evaluation (OPE), whose goal is to learn the distribution of the return for a target policy using offline data generated by a different policy. The theoretical foundation of many existing works relies on the s

External link: http://arxiv.org/abs/2402.01900
