Showing 1 - 10 of 36 results for search: '"Ramponi, Giorgia"'
In this paper, we establish the global convergence of the actor-critic algorithm with a significantly improved sample complexity of $O(\epsilon^{-3})$, advancing beyond the existing local convergence results. Previous works provide local convergence…
External link:
http://arxiv.org/abs/2410.08868
Applying reinforcement learning (RL) to real-world problems is often made challenging by the inability to interact with the environment and the difficulty of designing reward functions. Offline RL addresses the first challenge by considering access to…
External link:
http://arxiv.org/abs/2406.18450
In various applications, the optimal policy in a strategic decision-making problem depends both on the environmental configuration and exogenous events. For these settings, we introduce Bilevel Optimization with Contextual Markov Decision Processes (…)
External link:
http://arxiv.org/abs/2406.01575
Constrained Markov decision processes (CMDPs) are a common way to model safety constraints in reinforcement learning. State-of-the-art methods for efficiently solving CMDPs are based on primal-dual algorithms. For these algorithms, all currently known…
External link:
http://arxiv.org/abs/2402.15776
Posterior sampling allows exploitation of prior knowledge on the environment's transition dynamics to improve the sample efficiency of reinforcement learning. The prior is typically specified as a class of parametric distributions, the design of which…
External link:
http://arxiv.org/abs/2310.07518
Author:
Ramponi, Giorgia, Kolev, Pavel, Pietquin, Olivier, He, Niao, Laurière, Mathieu, Geist, Matthieu
We explore the problem of imitation learning (IL) in the context of mean-field games (MFGs), where the goal is to imitate the behavior of a population of agents following a Nash equilibrium policy according to some unknown payoff function. IL in MFGs…
External link:
http://arxiv.org/abs/2306.14799
Multi-agent reinforcement learning (MARL) addresses sequential decision-making problems with multiple agents, where each agent optimizes its own objective. In many real-world instances, the agents may not only want to optimize their objectives, but also…
External link:
http://arxiv.org/abs/2306.07749
Constrained Markov Decision Processes (CMDPs) are one of the common ways to model safe reinforcement learning problems, where constraint functions model the safety objectives. Lagrangian-based dual or primal-dual algorithms provide efficient methods…
External link:
http://arxiv.org/abs/2306.07001
Policy Optimization (PO) algorithms have proven particularly suited to handle the high dimensionality of real-world continuous control tasks. In this context, Trust Region Policy Optimization methods represent a popular approach to stabilize the…
External link:
http://arxiv.org/abs/2210.11137
Author:
Sanyal, Amartya, Ramponi, Giorgia
Online learning, in the mistake-bound model, is one of the most fundamental concepts in learning theory. Differential privacy, instead, is the most widely used statistical notion of privacy in the machine learning community. It is thus clear that de…
External link:
http://arxiv.org/abs/2210.04817