Showing 1 - 10
of 330
for search: '"Borkar, P S"'
Author:
Borkar, Vivek S
Two time scale stochastic approximation is analyzed when the iterates on either or both time scales do not necessarily converge.
Comment: 6 pages
External link:
http://arxiv.org/abs/2412.19872
We study the Lagrangian Index Policy (LIP) for restless multi-armed bandits with long-run average reward. In particular, we compare the performance of LIP with the performance of the Whittle Index Policy (WIP), both heuristic policies known to be asy
External link:
http://arxiv.org/abs/2412.12641
We address the problem of user association in a dense millimeter wave (mmWave) network, in which each arriving user brings a file containing a random number of packets and each time slot is divided into multiple mini-slots. This problem is an instanc
External link:
http://arxiv.org/abs/2403.09279
Author:
Chandak, Siddharth, Borkar, Vivek S.
We derive a concentration bound of the type `for all $n \geq n_0$ for some $n_0$' for TD(0) with linear function approximation. We work with online TD learning with samples from a single sample path of the underlying Markov chain. This makes our anal
External link:
http://arxiv.org/abs/2312.10424
Author:
Borkar, Vivek S., Akarsh, Adit
Oberman gave a stochastic control formulation of the problem of estimating the convex envelope of a non-convex function. Based on this, we develop a reinforcement learning scheme to approximate the convex envelope, using a variant of Q-learning for c
External link:
http://arxiv.org/abs/2311.14421
Author:
Keval, Keshav P., Borkar, Vivek S.
In this paper, we propose a reinforcement learning algorithm to solve a multi-agent Markov decision process (MMDP). The goal, inspired by Blackwell's Approachability Theorem, is to lower the time average cost of each agent to below a pre-specified ag
External link:
http://arxiv.org/abs/2311.12613
The Internet of Things (IoT) is emerging as a critical technology to connect resource-constrained devices such as sensors and actuators as well as appliances to the Internet. In this paper, we propose a novel methodology for node cardinality estimati
External link:
http://arxiv.org/abs/2310.18664
In this article we prove under suitable assumptions that the marginals of any solution to a relaxed controlled martingale problem on a Polish space $E$ can be mimicked by a Markovian solution of a Markov-relaxed controlled martingale problem. We also
External link:
http://arxiv.org/abs/2309.00488
Author:
Biswas, Anup, Borkar, Vivek S.
Risk-sensitive control has received considerable interest since the seminal work of Howard and Matheson [120] because of its ability to account for fluctuations about the mean, its connection with $H_\infty$ control, and its application to financial
External link:
http://arxiv.org/abs/2301.00224
Motivated by the novel paradigm developed by Van Roy and coauthors for reinforcement learning in arbitrary non-Markovian environments, we propose a related formulation and explicitly pin down the error caused by non-Markovianity of observations when
External link:
http://arxiv.org/abs/2211.01595