Showing 1 - 10 of 175 for search: '"Clement, Julien"'
In this paper we investigate the tractability of robust Markov Decision Processes (RMDPs) under various structural assumptions on the uncertainty set. Surprisingly, we show that in all generality (i.e. without any assumption on the instantaneous rewa
External link:
http://arxiv.org/abs/2411.08435
Optimizing risk-averse objectives in discounted MDPs is challenging because most models do not admit direct dynamic programming equations and require complex history-dependent policies. In this paper, we show that the risk-averse {\em total reward cr
External link:
http://arxiv.org/abs/2408.17286
Author:
Cai, Yang, Farina, Gabriele, Grand-Clément, Julien, Kroer, Christian, Lee, Chung-Wei, Luo, Haipeng, Zheng, Weiqiang
Self-play via online learning is one of the premier ways to solve large-scale two-player zero-sum games, both in theory and practice. Particularly popular algorithms include optimistic multiplicative weights update (OMWU) and optimistic gradient-desc
External link:
http://arxiv.org/abs/2406.10631
In this paper, we introduce the first algorithmic framework for Blackwell approachability on the sequence-form polytope, the class of convex polytopes capturing the strategies of players in extensive-form games (EFGs). This leads to a new class of re
External link:
http://arxiv.org/abs/2403.04680
Robust Markov Decision Processes (RMDPs) are a widely used framework for sequential decision-making under parameter uncertainty. RMDPs have been extensively studied when the objective is to maximize the discounted return, but little is known for aver
External link:
http://arxiv.org/abs/2312.03618
Author:
Cai, Yang, Farina, Gabriele, Grand-Clément, Julien, Kroer, Christian, Lee, Chung-Wei, Luo, Haipeng, Zheng, Weiqiang
Algorithms based on regret matching, specifically regret matching$^+$ (RM$^+$), and its variants are the most popular approaches for solving large-scale two-player zero-sum games in practice. Unlike algorithms such as optimistic gradient descent asce
External link:
http://arxiv.org/abs/2311.00676
Regret Matching+ (RM+) and its variants are important algorithms for solving large-scale games. However, a theoretical understanding of their success in practice is still a mystery. Moreover, recent advances on fast convergence in games are limited t
External link:
http://arxiv.org/abs/2305.14709
Author:
Grand-Clément, Julien, Petrik, Marek
Published in:
Advances in Neural Information Processing Systems (NeurIPS), 2023
We introduce the Blackwell discount factor for Markov Decision Processes (MDPs). Classical objectives for MDPs include discounted, average, and Blackwell optimality. Many existing approaches to computing average-optimal policies solve for discounted
External link:
http://arxiv.org/abs/2302.00036
Author:
Clément, Julien, Genitrini, Antoine
For three decades binary decision diagrams, a data structure efficiently representing Boolean functions, have been widely used in many distinct contexts like model verification, machine learning, cryptography and also resolution of combinatorial prob
External link:
http://arxiv.org/abs/2211.04938
Author:
Grand-Clément, Julien, Petrik, Marek
Robust Markov decision processes (MDPs) are used for applications of dynamic optimization in uncertain environments and have been studied extensively. Many of the main properties and algorithms of MDPs, such as value iteration and policy iteration, e
External link:
http://arxiv.org/abs/2209.10187