Showing 1 - 10 of 40 for search: '"Kara, Ali Devran"'
We study an approximation method for partially observed Markov decision processes (POMDPs) with continuous spaces. Belief MDP reduction, which has been the standard approach to studying POMDPs, requires rigorous approximation methods for practical applications …
External link:
http://arxiv.org/abs/2410.02895
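For context, the belief MDP reduction mentioned above replaces the hidden state with its conditional law z_t = Pr(X_t ∈ · | y_0, …, y_t, u_0, …, u_{t-1}), which evolves via the nonlinear filter; in generic notation (transition kernel \mathcal{T}, observation likelihood Q; not copied from the paper):

    z_{t+1}(dx') = \frac{ Q(y_{t+1} \mid x') \int_{\mathsf{X}} \mathcal{T}(dx' \mid x, u_t)\, z_t(dx) }{ \int_{\mathsf{X}} Q(y_{t+1} \mid x'') \int_{\mathsf{X}} \mathcal{T}(dx'' \mid x, u_t)\, z_t(dx) }

The resulting fully observed MDP lives on the space of probability measures, which is why rigorous approximation methods are needed in practice.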
Finding optimal policies for Partially Observable Markov Decision Processes (POMDPs) is challenging due to their uncountable state spaces when transformed into fully observable Markov Decision Processes (MDPs) using belief states. Traditional methods …
External link:
http://arxiv.org/abs/2409.04351
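As a concrete illustration of the belief-state construction for a finite-model POMDP (a minimal sketch; the array names and layout are assumptions for illustration, not this paper's notation):

    import numpy as np

    def belief_update(belief, action, obs, T, O):
        """One Bayes-filter step: push the belief through the controlled
        transition kernel, then reweight by the observation likelihood.
        belief: (S,) probability vector; T: (A, S, S); O: (S, num_obs)."""
        predicted = belief @ T[action]        # predict: P(x' | history, action)
        unnormalized = predicted * O[:, obs]  # correct: weight by P(obs | x')
        return unnormalized / unnormalized.sum()

Even in this finite case the belief lives on the probability simplex, an uncountable set, which is exactly the difficulty the abstract points to.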
Average cost optimality is known to be a challenging problem for partially observable stochastic control, with few results available beyond the finite state, action, and measurement setup, for which somewhat restrictive conditions are available.
External link:
http://arxiv.org/abs/2312.14111
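The object at the heart of the difficulty is the average cost optimality equation (ACOE), which for a fully observed model with kernel \mathcal{T} and stage cost c reads, in generic form:

    \rho^* + h(x) = \min_{u \in \mathsf{U}} \left[ c(x, u) + \int_{\mathsf{X}} h(x')\, \mathcal{T}(dx' \mid x, u) \right]

where ρ* is the optimal average cost and h a relative value function. For partially observed problems this equation must be posed on the belief space, which is where the few available results require restrictive conditions.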
Author:
Kara, Ali Devran, Yuksel, Serdar
As a primary contribution, we present a convergence theorem for stochastic iterations, and in particular, Q-learning iterates, under a general, possibly non-Markovian, stochastic environment. Our conditions for convergence involve an ergodicity and a positivity criterion …
External link:
http://arxiv.org/abs/2311.00123
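For reference, the Q-learning iterates in question have the standard form (step sizes α_t, discount β; the notation is generic rather than the paper's):

    Q_{t+1}(s_t, a_t) = (1 - \alpha_t)\, Q_t(s_t, a_t) + \alpha_t \left[ c_t + \beta \min_{b} Q_t(s_{t+1}, b) \right]

The point of the cited result is that the data stream (s_t, c_t, s_{t+1}) need not come from a controlled Markov chain for these iterates to converge, provided the stated ergodicity-type conditions hold.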
Author:
Kara, Ali Devran, Yuksel, Serdar
For infinite-horizon average-cost criterion problems, there exist relatively few rigorous approximation and reinforcement learning results. In this paper, for such problems, we present several approximation and reinforcement learning results for Markov decision processes …
External link:
http://arxiv.org/abs/2308.07591
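One standard learning scheme in the average-cost setting is relative-value-iteration Q-learning in the style of Abounadi, Bertsekas, and Borkar; a minimal sketch, not necessarily the algorithm analyzed in this paper:

    import numpy as np

    def rvi_q_step(Q, s, a, cost, s_next, alpha, ref=(0, 0)):
        """Average-cost Q-learning step: like discounted Q-learning but
        undiscounted, subtracting a fixed reference entry Q[ref] so the
        iterates stay bounded (Q[ref] tracks the optimal average cost)."""
        target = cost + Q[s_next].min() - Q[ref]
        Q[s, a] += alpha * (target - Q[s, a])
        return Q

The subtraction of Q[ref] is what replaces the contraction that discounting would otherwise provide.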
We study a multi-agent mean field type control problem in discrete time where the agents aim to find a socially optimal strategy and where the state and action spaces for the agents are assumed to be continuous. The agents are only weakly coupled through …
External link:
http://arxiv.org/abs/2211.09633
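The "weak coupling" referred to here is typically coupling through the empirical distribution of the agents' states; in generic notation (not the paper's exact model), with N agents:

    \mu^N_t = \frac{1}{N} \sum_{i=1}^{N} \delta_{x^i_t}, \qquad x^i_{t+1} = f\!\left(x^i_t, u^i_t, \mu^N_t, w^i_t\right)

Each agent's dynamics and cost depend on the others only through the measure μ^N_t, which is what makes the mean-field limit tractable as N grows.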
Author:
Bayraktar, Erhan, Kara, Ali Devran
We study a Q-learning algorithm for continuous time stochastic control problems. The proposed algorithm uses the sampled state process by discretizing the state and control action spaces under piecewise constant control processes. We show that the algorithm …
External link:
http://arxiv.org/abs/2203.07499
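A sketch of the kind of sampling scheme described here, assuming scalar diffusion-like dynamics `dyn`, a running cost, and a hold time over which the action is kept constant (all names are illustrative assumptions):

    import numpy as np

    def piecewise_constant_rollout(x, a, dyn, running_cost, dt, hold, rng):
        """Hold action `a` fixed for `hold` seconds while integrating the
        noisy dynamics with Euler-Maruyama steps of size dt; return the
        next sampled state and the cost accumulated over the interval."""
        cost = 0.0
        for _ in range(int(hold / dt)):
            cost += running_cost(x, a) * dt
            x = x + dyn(x, a) * dt + np.sqrt(dt) * rng.standard_normal()
        return x, cost

The sampled tuple (quantized x, a, cost, quantized next x) can then feed a standard tabular Q-update with effective discount exp(-rho * hold) for a discount rate rho, mirroring at sketch level the piecewise-constant-control discretization the abstract describes.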
Reinforcement learning algorithms often require finiteness of state and action spaces in Markov decision processes (MDPs) (also called controlled Markov chains), and various efforts have been made in the literature towards the applicability of such algorithms …
External link:
http://arxiv.org/abs/2111.06781
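The usual bridge from continuous to finite models is quantization (state aggregation); a minimal uniform quantizer for a scalar state, with illustrative names:

    import numpy as np

    def make_quantizer(low, high, bins):
        """Return a map from a scalar state in [low, high] to one of
        `bins` cells; running tabular Q-learning on cell indices turns
        the continuous MDP into a finite one."""
        def quantize(x):
            ratio = (x - low) / (high - low)
            return int(np.clip(int(ratio * bins), 0, bins - 1))
        return quantize

    # e.g. q = make_quantizer(-1.0, 1.0, 20); q(0.37) gives a cell index in 0..19

Near-optimality of the quantized model as the number of cells grows, under continuity conditions on the transition kernel, is the kind of guarantee this line of work establishes.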
Author:
Kara, Ali Devran, Yuksel, Serdar
In this paper, for POMDPs, we provide the convergence of a Q-learning algorithm for control policies using a finite history of past observations and control actions, and, consequently, we establish near optimality of such limit Q functions under explicit filter stability conditions …
External link:
http://arxiv.org/abs/2103.12158
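A sketch of finite-history Q-learning, where the tabular "state" is the window of the last n observations and last n-1 actions; the `env` interface (reset() -> obs, step(a) -> (obs, cost, done), num_actions) is an assumption for illustration:

    from collections import defaultdict
    import random

    def finite_memory_q(env, n=2, episodes=500, alpha=0.1, beta=0.95, eps=0.1):
        """Tabular Q-learning over finite windows of past observations and
        actions, used in place of the hidden state or the full belief."""
        Q = defaultdict(lambda: [0.0] * env.num_actions)
        key = lambda obs, acts: (tuple(obs[-n:]),
                                 tuple(acts[-(n - 1):]) if n > 1 else ())
        for _ in range(episodes):
            obs, acts, done = [env.reset()], [], False
            while not done:
                s = key(obs, acts)
                if random.random() < eps:          # epsilon-greedy exploration
                    a = random.randrange(env.num_actions)
                else:                               # greedy w.r.t. cost: minimize
                    a = min(range(env.num_actions), key=lambda b: Q[s][b])
                o, cost, done = env.step(a)
                obs.append(o)
                acts.append(a)
                target = cost + beta * min(Q[key(obs, acts)])
                Q[s][a] += alpha * (target - Q[s][a])
        return Q

Filter stability is what makes such a truncated window an adequate surrogate for the full belief, which is the content of the near-optimality claim.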
Author:
Kara, Ali Devran, Yuksel, Serdar
In the theory of Partially Observed Markov Decision Processes (POMDPs), existence of optimal policies has in general been established via converting the original partially observed stochastic control problem to a fully observed one on the belief space …
External link:
http://arxiv.org/abs/2010.07452