Search results - "Shalabh Bhatnagar"

Analyzing Approximate Value Iteration Algorithms

Authors: Arunselvan Ramaswamy, Shalabh Bhatnagar

Published in: Mathematics of Operations Research. 47:2138-2159

In this paper, we consider the stochastic iterative counterpart of the value iteration scheme wherein only noisy and possibly biased approximations of the Bellman operator are available. We call this counterpart the approximate value iteration (AVI)

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::17e73047748b4ead1e00b7ed6ce5b054
https://doi.org/10.1287/moor.2021.1202

Generalized Second-Order Value Iteration in Markov Decision Processes

Authors: Raghuram Bharadwaj Diddigi, Chandramouli Kamanchi, Shalabh Bhatnagar

Published in: IEEE Transactions on Automatic Control. 67:4241-4247

Value iteration is a fixed point iteration technique utilized to obtain the optimal value function and policy in a discounted reward Markov Decision Process (MDP). Here, a contraction operator is constructed and applied repeatedly to arrive at the op

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::c8da96240508ffeedc4ef888677af7cd
https://doi.org/10.1109/tac.2021.3112851

Stochastic Approximation With Iterate-Dependent Markov Noise Under Verifiable Conditions in Compact State Space With the Stability of Iterates Not Ensured

Authors: Prasenjit Karmakar, Shalabh Bhatnagar

Published in: IEEE Transactions on Automatic Control. 66:5941-5954

This paper compiles several aspects of the dynamics of stochastic approximation algorithms with Markov iterate-dependent noise when the iterates are not known to be stable beforehand. We achieve the same by extending the lock-in probability (i.e. the

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::c6e973b8a044db43c5a3db68f884ef68
https://doi.org/10.1109/tac.2021.3057299

Data Efficient Safe Reinforcement Learning

Authors: Sindhu Padakandla, Prabuchandran K J, Sourav Ganguly, Shalabh Bhatnagar

Published in: 2022 IEEE International Conference on Systems, Man, and Cybernetics (SMC).

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::56657d038f0035cbd387077f6620298e
https://doi.org/10.1109/smc53654.2022.9945313

Memory-Based Deep Reinforcement Learning for Obstacle Avoidance in UAV With Limited Environment Knowledge

Authors: Shalabh Bhatnagar, Sindhu Padakandla, Abhik Singla

Published in: IEEE Transactions on Intelligent Transportation Systems. 22:107-118

This paper presents our method for enabling a UAV quadrotor, equipped with a monocular camera, to autonomously avoid collisions with obstacles in unstructured and unknown indoor environments. When compared to obstacle avoidance in ground vehicular ro

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::daba209a03f9a80b6b187e3158086a7c
https://doi.org/10.1109/tits.2019.2954952

Deep Reinforcement Learning for Optimal Traffic Control

Authors: Rajasekhar Nannapaneni, Raghavendra V. Kulkarni, Shalabh Bhatnagar

Published in: Algorithms for Intelligent Systems. ISBN: 9789811696497

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::fe0fdbe048e45588edeeab386747560e
https://doi.org/10.1007/978-981-16-9650-3_4

Co-operative Multi-agent Twin Delayed DDPG for Robust Phase Duration Optimization of Large Road Networks

Authors: Priya Shanmugasundaram, Shalabh Bhatnagar

Published in: Lecture Notes in Computer Science. ISBN: 9783031229527

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::dc8384f4865dff819f36ee5b42154e84
https://doi.org/10.1007/978-3-031-22953-4_6

Robust Traffic Signal Timing Control using Multiagent Twin Delayed Deep Deterministic Policy Gradients

Authors: Priya Shanmugasundaram, Shalabh Bhatnagar

Published in: Proceedings of the 14th International Conference on Agents and Artificial Intelligence.

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::75fd0d9947c6bddfd3e615f69aac12ad
https://doi.org/10.5220/0010889300003116

Novel First Order Bayesian Optimization with an Application to Reinforcement Learning

Authors: Chandramouli Kamanchi, K J Prabuchandran, Santosh Penubothula, Shalabh Bhatnagar

Published in: Applied Intelligence. 51:1565-1579

Zeroth Order Bayesian Optimization (ZOBO) methods optimize an unknown function based on its black-box evaluations at the query locations. Unlike most optimization procedures, ZOBO methods fail to utilize gradient information even when it is available

External link: https://explore.openaire.eu/search/publication?articleId=doi_________::7cc90bd08dfbae7c96d8374a0db3401e
https://doi.org/10.1007/s10489-020-01896-w

Generalized Speedy Q-Learning

Authors: Chandramouli Kamanchi, Indu John, Shalabh Bhatnagar

Published in: IEEE Control Systems Letters. 4:524-529

In this paper, we derive a generalization of the Speedy Q-learning (SQL) algorithm that was proposed in the Reinforcement Learning (RL) literature to handle slow convergence of Watkins' Q-learning. In most RL algorithms such as Q-learning, the Bellma

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::530771884fb67920790853d968525205
https://doi.org/10.1109/lcsys.2020.2970555
