Search results - "Saux, P. (Patrick)"

Risk-aware linear bandits with convex loss

Authors: Saux, P. (Patrick), Maillard, O-A. (Odalric-Ambrym)

In decision-making problems such as the multi-armed bandit, an agent learns sequentially by optimizing a certain feedback. While the mean reward criterion has been extensively studied, other measures that reflect an aversion to adverse outcomes, such

External link: https://explore.openaire.eu/search/publication?articleId=od______4198::e7c1bdd2f695b248ccf6b7e237ba0e21
http://hdl.handle.net/20.500.12210/79797

From Optimality to Robustness: Dirichlet Sampling Strategies in Stochastic Bandits

Authors: Baudry, D. (Dorian), Saux, P. (Patrick), Maillard, O-A. (Odalric-Ambrym)

The stochastic multi-arm bandit problem has been extensively studied under standard assumptions on the arm's distribution (e.g., bounded with known support, exponential family, etc.). These assumptions are suitable for many real-world problems but somet

External link: https://explore.openaire.eu/search/publication?articleId=od______4198::b5171ea11ffe4ca0d076a9bff57668c7
http://hdl.handle.net/20.500.12210/57852
