Author: |
Phansalkar, VV, Thathachar, MAL |
Language: |
English |
Year of publication: |
1992 |
Subject: |
|
Source: |
IndraStra Global. |
ISSN: |
2381-3652 |
Description: |
A feedforward network composed of units of teams of parametrised learning automata is considered as a model of a reinforcement learning system. The parameters of each learning automaton are updated using an algorithm consisting of a gradient following term and a random perturbation term. The algorithm is approximated by the Langevin equation and it is shown that it converges to the global maximum. The algorithm is decentralised and the units do not have any information exchange during updating. Simulation results on a pattern recognition problem show that reasonable rates of convergence can be obtained. |
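The update rule summarised above (a gradient following term plus a random perturbation term) can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a generic Euler-Maruyama discretisation of a Langevin-type equation on a single scalar parameter, and the names `langevin_ascent` and `toy_grad`, the toy objective, and all default values are assumptions made for illustration only.

```python
import math
import random


def langevin_ascent(grad, theta0, step=0.01, noise_scale=1.0, n_steps=1000, seed=0):
    """Gradient ascent with an added Gaussian perturbation.

    Hypothetical helper: one scalar parameter, fixed step size,
    fixed noise scale. None of these choices come from the record.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    theta = float(theta0)
    for _ in range(n_steps):
        # Gradient following term: move uphill on the objective.
        drift = step * grad(theta)
        # Random perturbation term: scaled by sqrt(step), as in an
        # Euler-Maruyama step for d(theta) = grad dt + noise_scale dW.
        kick = math.sqrt(step) * noise_scale * rng.gauss(0.0, 1.0)
        theta += drift + kick
    return theta


def toy_grad(theta):
    # Gradient of the assumed toy objective f(theta) = -(theta - 2)^2,
    # whose only maximum is at theta = 2.
    return -2.0 * (theta - 2.0)


if __name__ == "__main__":
    # With the noise switched off this reduces to plain gradient ascent.
    print(langevin_ascent(toy_grad, 0.0, step=0.1, noise_scale=0.0, n_steps=200))
    # With noise, the iterate fluctuates around the maximum.
    print(langevin_ascent(toy_grad, 0.0, step=0.01, noise_scale=1.0, n_steps=1000))
```

Setting `noise_scale=0.0` recovers plain gradient ascent, which on an objective with several maxima can stall at a local one. The perturbation is what lets the iterate escape such points, and reducing its scale over time is one common way to make it settle; the sketch keeps the scale constant for simplicity.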
Database: |
OpenAIRE |
External link: |
|