A Nearer Optimal and Faster Trained Value Iteration ADP for Discrete-Time Nonlinear Systems

Authors:	Junping Hu, Gen Yang, Zhicheng Hou, Gong Zhang, Wenlin Yang, Weijun Wang
Language:	English
Publication year:	2021
Subject:	ADP; value iteration; genetic algorithm; trigger mechanism; Electrical engineering. Electronics. Nuclear engineering; TK1-9971
Source:	IEEE Access, Vol 9, Pp 14933-14944 (2021)
Document type:	article
ISSN:	2169-3536
DOI:	10.1109/ACCESS.2021.3051984
Description:	Adaptive dynamic programming (ADP) is generally implemented using three neural networks: a model network, an action network, and a critic network. In conventional value iteration ADP, the model network is initialized randomly and trained by the backpropagation algorithm, which is prone to getting trapped in a local minimum; in addition, both the critic network and the action network are trained in every outer loop, which is time-consuming. To approximate the optimal control policy more accurately and to reduce the training time of value iteration ADP, we propose a nearer optimal and faster trained value iteration ADP for discrete-time nonlinear systems. First, before training the model network with the backpropagation algorithm, we use a global search method, namely a genetic algorithm, to evolve the weights and biases of the neural network for a few generations. Second, in the outer-loop training process, we propose a trigger mechanism that decides whether or not to train the action network, which saves considerable training time. Examples of both linear and nonlinear systems are used to verify the superiority of the proposed method over conventional value iteration ADP. The simulation results show that the proposed algorithm provides a nearer optimal control policy and requires less training time than conventional value iteration ADP.
Database:	Directory of Open Access Journals
External link:	https://doaj.org/article/0d244ef891684464b56abf209f38778d
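
The trigger idea in the description can be illustrated with a minimal sketch. This is not the authors' implementation. The record does not state the trigger condition, so the rule used here (refresh the policy only when the value estimate has drifted by more than a relative tolerance since the last refresh) is an assumption. The function name `value_iteration` and the parameter `trigger_tol` are hypothetical. To keep the sketch self-contained, it uses a scalar linear system x[k+1] = a*x[k] + b*u[k] with stage cost q*x^2 + r*u^2, so closed-form updates stand in for training the critic and action networks.

```python
def value_iteration(a, b, q, r, trigger_tol=None, eps=1e-10, max_iter=10_000):
    """Scalar linear-quadratic value iteration with an optional policy trigger.

    The value function is V_i(x) = p * x**2 and the policy is u = -k * x.
    Returns (p, k, policy_updates, iterations).

    trigger_tol=None refreshes the policy in every outer loop (conventional
    scheme). Otherwise the policy is refreshed only when p has moved by more
    than trigger_tol (relative) since the last refresh. This rule is an
    assumption made for illustration, not the rule from the paper.
    """
    p = 0.0                  # V_0(x) = 0, the usual value-iteration start
    k = 0.0                  # feedback gain of the current policy
    p_at_last_update = None  # value estimate when the policy was last refreshed
    policy_updates = 0

    for i in range(1, max_iter + 1):
        if p_at_last_update is None:
            drift = float("inf")  # force a refresh on the first pass
        else:
            drift = abs(p - p_at_last_update)

        # "Action network" step: greedy gain with respect to the current p.
        if trigger_tol is None or drift > trigger_tol * max(1.0, abs(p)):
            k = a * b * p / (r + b * b * p)
            p_at_last_update = p
            policy_updates += 1

        # "Critic network" step: one backup under the (possibly stale) gain.
        p_next = q + r * k * k + p * (a - b * k) ** 2

        if abs(p_next - p) < eps:
            return p_next, k, policy_updates, i
        p = p_next

    return p, k, policy_updates, max_iter


if __name__ == "__main__":
    full = value_iteration(0.9, 1.0, 1.0, 1.0)
    triggered = value_iteration(0.9, 1.0, 1.0, 1.0, trigger_tol=0.05)
    print("every loop:", full)
    print("triggered: ", triggered)
```

With a stale gain the backup evaluates a slightly suboptimal policy, so the triggered run trades a small loss in accuracy for fewer policy refreshes. How large that loss is depends on the tolerance chosen.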