Showing 1 - 10 of 102 results for search: '"Mitra, Aritra"'
We consider a setting involving $N$ agents, where each agent interacts with an environment modeled as a Markov Decision Process (MDP). The agents' MDPs differ in their reward functions, capturing heterogeneous objectives/tasks. The collective goal of …
External link:
http://arxiv.org/abs/2409.05291
Author:
Maity, Sreejeet, Mitra, Aritra
Recently, there has been a surge of interest in analyzing the non-asymptotic behavior of model-free reinforcement learning algorithms. However, the performance of such algorithms in non-ideal environments, such as in the presence of corrupted rewards …
External link:
http://arxiv.org/abs/2409.03237
Recent research endeavours have theoretically shown the beneficial effect of cooperation in multi-agent reinforcement learning (MARL). In a setting involving $N$ agents, this beneficial effect usually comes in the form of an $N$-fold linear convergence …
External link:
http://arxiv.org/abs/2407.20441
Author:
Fabbro, Nicolò Dal, Adibi, Arman, Poor, H. Vincent, Kulkarni, Sanjeev R., Mitra, Aritra, Pappas, George J.
We consider a setting in which $N$ agents aim to speed up a common Stochastic Approximation (SA) problem by acting in parallel and communicating with a central server. We assume that the up-link transmissions to the server are subject to asynchronous …
External link:
http://arxiv.org/abs/2403.17247
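The parallel-speedup idea in the abstract above can be sketched in a few lines. The toy below is a synchronous, idealized version (not the paper's asynchronous, delay-tolerant algorithm): $N$ agents run the same noisy stochastic-approximation updates on a quadratic objective and a server averages their iterates, which cuts the noise variance roughly by a factor of $N$. All constants and the quadratic objective are illustrative assumptions.

```python
import numpy as np

# Hedged sketch: N agents run identical noisy SA updates (here, gradient
# steps toward x_star on a quadratic) and a central server periodically
# averages their iterates. Synchronous communication is assumed for
# simplicity; the paper's setting has asynchronous, delayed up-links.
rng = np.random.default_rng(2)
x_star = np.array([1.0, 3.0])          # unknown target of the SA scheme
N, rounds, local_steps, alpha = 8, 200, 5, 0.05

x = np.zeros(2)                        # server model
for _ in range(rounds):
    agents = np.tile(x, (N, 1))        # broadcast current server iterate
    for _ in range(local_steps):
        noise = rng.standard_normal((N, 2))
        agents -= alpha * ((agents - x_star) + 0.5 * noise)
    x = agents.mean(axis=0)            # server averages agent iterates

print(x)  # close to x_star; averaging over N agents shrinks the noise
```

Averaging $N$ independent noisy iterates reduces the variance of the server's update by about $1/N$, which is the source of the linear speedup these abstracts refer to.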
Author:
Mitra, Aritra
We study the finite-time convergence of TD learning with linear function approximation under Markovian sampling. Existing proofs for this setting either assume a projection step in the algorithm to simplify the analysis, or require a fairly intricate …
External link:
http://arxiv.org/abs/2403.02476
Author:
Adibi, Arman, Fabbro, Nicolo Dal, Schenato, Luca, Kulkarni, Sanjeev, Poor, H. Vincent, Pappas, George J., Hassani, Hamed, Mitra, Aritra
Motivated by applications in large-scale and multi-agent reinforcement learning, we study the non-asymptotic performance of stochastic approximation (SA) schemes with delayed updates under Markovian sampling. While the effect of delays has been extensively …
External link:
http://arxiv.org/abs/2402.11800
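The delayed-update mechanism studied in this abstract can be illustrated with a minimal sketch: the update applied at step $t$ uses a stochastic gradient computed $\tau$ steps earlier. I.i.d. noise and a quadratic objective are simplifying assumptions here; the paper's setting has Markovian sampling.

```python
from collections import deque

import numpy as np

# Hedged sketch: stochastic approximation with a fixed update delay tau.
# We minimize f(x) = 0.5 * ||x - x_star||^2 with noisy gradients that
# only arrive (and are applied) tau iterations after being computed.
rng = np.random.default_rng(1)
x_star = np.array([2.0, -1.0])
tau, alpha = 5, 0.05

x = np.zeros(2)
in_flight = deque()                  # gradients computed but not yet applied
for t in range(3000):
    g = (x - x_star) + 0.1 * rng.standard_normal(2)  # noisy gradient at x_t
    in_flight.append(g)
    if len(in_flight) > tau:
        x -= alpha * in_flight.popleft()  # apply the tau-step-old gradient

print(x)  # still converges near x_star despite stale gradients
```

Convergence survives the staleness as long as the step size is small relative to the delay (roughly `alpha * tau < 1` for this quadratic), which is the regime the non-asymptotic analyses quantify.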
Federated reinforcement learning (FRL) has emerged as a promising paradigm for reducing the sample complexity of reinforcement learning tasks by exploiting information from different agents. However, when each agent interacts with a potentially different …
External link:
http://arxiv.org/abs/2401.15273
Consider a linear quadratic regulator (LQR) problem being solved in a model-free manner using the policy gradient approach. If the gradient of the quadratic cost is being transmitted across a rate-limited channel, both the convergence and the rate of …
External link:
http://arxiv.org/abs/2401.01258
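The interplay between policy gradient for LQR and a rate-limited channel can be sketched on a scalar system. This is an illustrative toy, not the paper's algorithm: the channel is modeled as a coarse uniform quantizer applied to the gradient, the gradient is obtained by finite differences on the closed-form cost, and all constants are invented for the example.

```python
import numpy as np

# Hedged sketch: policy gradient descent on a scalar LQR cost where the
# transmitted gradient passes through a coarse quantizer, standing in
# for a rate-limited channel.
a, b, q, r = 1.2, 1.0, 1.0, 0.1      # scalar system x' = a x + b u, open loop unstable

def cost(k):
    """Infinite-horizon LQR cost of the gain u = -k x (x0 = 1)."""
    ac = a - b * k                   # closed-loop pole
    assert abs(ac) < 1, "closed loop must be stable"
    return (q + r * k * k) / (1 - ac * ac)

def quantize(g, step=0.05):
    """Uniform quantizer: what the rate-limited channel delivers."""
    return step * np.round(g / step)

k, eta, eps = 1.0, 0.02, 1e-4        # stabilizing initial gain
for _ in range(500):
    grad = (cost(k + eps) - cost(k - eps)) / (2 * eps)  # finite-difference gradient
    k -= eta * quantize(grad)        # descend on the quantized gradient

print(k)  # near the optimal LQR gain, up to the quantizer's dead zone
```

The quantizer's resolution caps how close the iterate can get to the optimum: once the true gradient falls below half a quantization step, the transmitted gradient is zero and the gain stops moving, which is one way rate limits affect both convergence and its rate.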
We study a model-free federated linear quadratic regulator (LQR) problem where $M$ agents with unknown, distinct yet similar dynamics collaboratively learn an optimal policy to minimize an average quadratic cost while keeping their data private. To exp…
External link:
http://arxiv.org/abs/2308.11743
Delays and asynchrony are inevitable in large-scale machine-learning problems where communication plays a key role. As such, several works have extensively analyzed stochastic optimization with delayed gradients. However, as far as we are aware, no a…
External link:
http://arxiv.org/abs/2307.06886