Accelerating Fuzzy Actor–Critic Learning via Suboptimal Knowledge for a Multi-Agent Tracking Problem

Authors:	Xiao Wang, Zhe Ma, Lei Mao, Kewu Sun, Xuhui Huang, Changchao Fan, Jiake Li
Year of publication:	2023
Subject:	Computer Networks and Communications; Hardware and Architecture; Control and Systems Engineering; Signal Processing; suboptimal knowledge; fuzzy system; actor–critic; Apollonius circle; Electrical and Electronic Engineering
Source:	Electronics; Volume 12; Issue 8; Pages: 1852
ISSN:	2079-9292
DOI:	10.3390/electronics12081852
Description:	Multi-agent differential games usually include tracking policies and escaping policies. To obtain proper policies in unknown environments, agents can learn through reinforcement learning. This typically requires a large amount of interaction with the environment, which is time-consuming and inefficient. However, if an estimated model can be built from prior knowledge, a control policy can be derived from this suboptimal knowledge. Although the estimated model differs from the true environment, the suboptimal guided policy avoids unnecessary exploration, so the learning process can be significantly accelerated. To optimize the tracking policy for multiple pursuers, this study proposes a new form of fuzzy actor–critic learning algorithm based on suboptimal knowledge (SK-FACL). In SK-FACL, the available information about the environment is abstracted as an estimated model, and the suboptimal guided policy is calculated based on the Apollonius circle. The guided policy is combined with the fuzzy actor–critic learning algorithm, improving learning efficiency. In a ground game with two pursuers and one evader, the experimental results verify the advantages of SK-FACL in reducing tracking error, adapting to model error, and adapting to sudden changes made by the evader, compared with pure knowledge control and the pure fuzzy actor–critic learning algorithm.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::1688aca2ce9091b1584f4c4911e12752 https://doi.org/10.3390/electronics12081852
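
The description states that the suboptimal guided policy is calculated from the Apollonius circle but gives no formulas. As a rough illustration only, the sketch below computes the standard Apollonius circle for a single pursuer and evader: the set of points X with |XE| / |XP| = k, where k is the evader-to-pursuer speed ratio. The function name `apollonius_circle`, the 2-D tuple representation, and the assumption 0 < k < 1 (evader slower than pursuer) are illustrative choices, not taken from the paper, and this is not the paper's guided policy itself.

```python
import math


def apollonius_circle(pursuer, evader, speed_ratio):
    """Return (center, radius) of the Apollonius circle in the plane.

    pursuer, evader: (x, y) positions.
    speed_ratio: evader speed / pursuer speed, assumed to lie in (0, 1).
    Every point X on the circle satisfies |X - evader| = speed_ratio * |X - pursuer|.
    """
    if not 0.0 < speed_ratio < 1.0:
        raise ValueError("speed_ratio must be strictly between 0 and 1")
    px, py = pursuer
    ex, ey = evader
    k2 = speed_ratio ** 2
    denom = 1.0 - k2
    # Center lies on the pursuer-evader line, beyond the evader.
    center = ((ex - k2 * px) / denom, (ey - k2 * py) / denom)
    radius = speed_ratio * math.hypot(ex - px, ey - py) / denom
    return center, radius


if __name__ == "__main__":
    # Hypothetical positions: pursuer at the origin, evader 3 units away.
    print(apollonius_circle((0.0, 0.0), (3.0, 0.0), 0.5))  # ((4.0, 0.0), 2.0)
```

Under these assumptions, points inside the circle are those the evader can reach before the pursuer, which is why the circle is a common geometric basis for pursuit heuristics.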