A reinforcement learning neural network for adaptive control of Markov chains

Authors:	G. Santharam, P.S. Sastry
Year of publication:	1997
Subject:	Mathematical optimization; Adaptive control; Markov chain; Learning automata; Artificial neural network; Computer science; Variable-order Markov model; Q-learning; Markov process; Markov model; Computer Science Applications; Human-Computer Interaction; Control and Systems Engineering; Control theory; Reinforcement learning; Markov decision process; Electrical and Electronic Engineering; Stochastic neural network; Electrical Engineering; Software
Source:	IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans. 27:588-600
ISSN:	1083-4427
DOI:	10.1109/3468.618258
Description:	In this paper we consider the problem of reinforcement learning in a dynamically changing environment. In this context, we study the problem of adaptive control of finite-state Markov chains with a finite number of controls. The transition and payoff structures are unknown. The objective is to find an optimal policy which maximizes the expected total discounted payoff over the infinite horizon. A stochastic neural network model is suggested for the controller. The parameters of the neural net, which determine a random control strategy, are updated at each instant using a simple learning scheme. This learning scheme involves estimation of some relevant parameters using an adaptive critic. It is proved that the controller asymptotically chooses an optimal action in each state of the Markov chain with high probability.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::02333ebdf762e237b45bef92bf9500d7 https://doi.org/10.1109/3468.618258
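
The setting summarized in the description (a randomized control strategy tuned online with the help of an adaptive critic, under a discounted-payoff objective) can be illustrated with a minimal sketch. This is a generic tabular actor-critic on a hypothetical two-state, two-action chain, not the authors' neural network or their update rule; the payoff table, the uniform transitions, the step sizes, and all names below are assumptions chosen only for illustration.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into a probability vector (numerically stable)."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train(steps=20000, gamma=0.9, critic_lr=0.1, actor_lr=0.05, seed=0):
    """Generic actor-critic on a toy chain; the controller never reads the model."""
    rng = random.Random(seed)
    # Hypothetical environment: payoff 1 for the "good" action in each state,
    # 0 otherwise; the next state is drawn uniformly at random.
    good_action = [1, 0]
    n_states, n_actions = 2, 2

    values = [0.0] * n_states                             # critic estimates
    prefs = [[0.0] * n_actions for _ in range(n_states)]  # actor parameters

    state = 0
    for _ in range(steps):
        # Randomized control: sample an action from the current strategy.
        probs = softmax(prefs[state])
        action = rng.choices(range(n_actions), weights=probs)[0]

        payoff = 1.0 if action == good_action[state] else 0.0
        next_state = rng.randrange(n_states)

        # Temporal-difference error under the discounted-payoff criterion.
        td_error = payoff + gamma * values[next_state] - values[state]

        values[state] += critic_lr * td_error        # critic update
        prefs[state][action] += actor_lr * td_error  # actor update

        state = next_state

    return [softmax(p) for p in prefs], values


if __name__ == "__main__":
    policy, values = train()
    for s, probs in enumerate(policy):
        print(f"state {s}: action probabilities {probs}")
```

In this toy problem the learned strategy should concentrate on action 1 in state 0 and action 0 in state 1; the convergence guarantee stated in the description applies to the authors' scheme, not to this sketch.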