Showing 1 - 2 of 2 for search: '"Dima, Simon"'
In dynamic programming and reinforcement learning, the policy for the sequential decision making of an agent in a stochastic environment is usually determined by expressing the goal as a scalar reward function and seeking a policy that maximizes the
External link:
http://arxiv.org/abs/2408.04385
Authors:
Vlad, Jeni Laura; Vlad, Dorel; Hrehoret, Dolna; Popescu, Irinel; Constantinescu, Alexandra; Calugaroiu, Carmen; Dima, Simon
Published in:
International Journal of Infectious Diseases, 2008, 12 (Supplement 2): S8