Reducing Q-Value Estimation Bias via Mutual Estimation and Softmax Operation in MADRL

Authors:	Zheng Li, Xinkai Chen, Jiaqing Fu, Ning Xie, Tingting Zhao
Language:	English
Year of publication:	2024
Subject:	reinforcement learning; game AI; multi-agent Q-network mutual estimation; softmax Bellman operation; reinforcement learning environment; Industrial engineering. Management engineering (T55.4-60.8); Electronic computers. Computer science (QA75.5-76.95)
Source:	Algorithms, Vol 17, Iss 1, p 36 (2024)
Document type:	article
ISSN:	1999-4893
DOI:	10.3390/a17010036
Description:	As electronic game technology has developed, games have come to feature more units, richer unit attributes, more complex mechanics, and more diverse team strategies. Multi-agent deep reinforcement learning (MADRL) has performed strongly in team-based games of this kind, achieving results that surpass professional human players. Reinforcement learning algorithms based on Q-value estimation often suffer from Q-value overestimation, which can seriously degrade AI performance in multi-agent scenarios. We propose a multi-agent mutual evaluation method and a multi-agent softmax method to reduce Q-value estimation bias in multi-agent scenarios, and we test both in the particle multi-agent environment and in a multi-agent tank environment that we constructed. The tank environment strikes a good balance between experimental verification efficiency and the simulation of multi-agent game tasks, and it can be easily extended to different multi-agent cooperation or competition tasks. We hope it will be adopted in multi-agent deep reinforcement learning research.
Database:	Directory of Open Access Journals
External link:	https://doaj.org/article/d43e94fb9d9b4d298262709578cb83e5
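
The overestimation problem named in the description can be illustrated with a toy experiment. The sketch below is not the authors' multi-agent method; it is a single-agent illustration under stated assumptions: every action has a true value of zero, the estimates carry Gaussian noise, and the function names (`softmax_value`, `estimate_bias`) and parameters (`beta`, `noise_std`, `n_actions`) are hypothetical choices made for this example. It compares the average output of the max operator with that of a softmax-weighted average over the same noisy estimates.

```python
import math
import random


def softmax_value(q_values, beta):
    """Softmax-weighted average of Q-values with inverse temperature beta."""
    m = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp(beta * (q - m)) for q in q_values]
    total = sum(weights)
    return sum(w * q for w, q in zip(weights, q_values)) / total


def estimate_bias(n_actions=10, noise_std=1.0, beta=1.0, trials=20000, seed=0):
    """Average the max and softmax operators over noisy estimates.

    Assumption: all true action values are 0, so any nonzero average
    returned here is pure estimation bias.
    """
    rng = random.Random(seed)
    max_sum = 0.0
    soft_sum = 0.0
    for _ in range(trials):
        noisy_q = [rng.gauss(0.0, noise_std) for _ in range(n_actions)]
        max_sum += max(noisy_q)
        soft_sum += softmax_value(noisy_q, beta)
    return max_sum / trials, soft_sum / trials


max_bias, soft_bias = estimate_bias()
print(f"max operator bias:     {max_bias:.3f}")
print(f"softmax operator bias: {soft_bias:.3f}")
```

Because a weighted average of the estimates can never exceed their maximum, the softmax figure is always the smaller of the two; with `beta = 0` it reduces to the plain mean, and as `beta` grows it approaches the max operator. How the paper extends this idea to the multi-agent setting is not described in this record.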