MP-TD3: Multi-Pool Prioritized Experience Replay-Based Asynchronous Twin Delayed Deep Deterministic Policy Gradient Algorithm

Author:	Wenwen Tan, Detian Huang
Language:	English
Publication year:	2024
Subject:	Asynchronous reinforcement learning; twin delayed deep deterministic algorithm; prioritized experience replay; TD-error; Electrical engineering. Electronics. Nuclear engineering; TK1-9971
Source:	IEEE Access, Vol 12, Pp 105268-105280 (2024)
Document type:	article
ISSN:	2169-3536
DOI:	10.1109/ACCESS.2024.3435949
Description:	Prioritized experience replay mechanisms have achieved remarkable success in accelerating the convergence of reinforcement learning algorithms. However, applying traditional prioritized experience replay mechanisms directly to asynchronous reinforcement learning leads to slow convergence, because an agent has difficulty exploiting the high-quality experiences that other agents obtain while interacting with the environment. To address this issue, we propose a Multi-pool Prioritized experience replay-based asynchronous Twin Delayed Deep Deterministic policy gradient algorithm (MP-TD3). Specifically, a multi-pool prioritized experience replay mechanism is proposed to strengthen the exchange of experience among different agents and thereby accelerate network convergence. Then, two global-pool self-cleaning mechanisms, one based on sample diversity and one based on TD-errors, are designed to address the high redundancy and the low information content, respectively, of the samples in the global pool. Finally, a multi-batch sampling mechanism is introduced to further reduce the training time. Extensive experiments validate that the proposed MP-TD3 significantly improves convergence speed and performance compared with state-of-the-art methods.
Database:	Directory of Open Access Journals
External link:	https://doaj.org/article/c592122041094bcba1a144b711138711
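
Illustrative note: the description above names a multi-pool prioritized replay with TD-error-based cleaning of a shared pool, but the record gives no implementation details. The following is a minimal sketch of the general idea only, not the authors' method. It assumes each pool is a list of (transition, td_error) pairs, samples in proportion to (|TD-error| + eps) ** alpha as in standard proportional prioritized replay, and uses a hypothetical cleaning rule that simply keeps the highest-|TD-error| entries. All function names, parameters, and default values are assumptions made for illustration.

```python
import random


def sample_prioritized(td_errors, batch_size, alpha=0.6, eps=1e-6, rng=None):
    """Draw indices with probability proportional to (|TD-error| + eps) ** alpha."""
    rng = rng or random.Random()
    weights = [(abs(e) + eps) ** alpha for e in td_errors]
    return rng.choices(range(len(td_errors)), weights=weights, k=batch_size)


def clean_by_td_error(pool, capacity):
    """Hypothetical cleaning rule: keep the `capacity` entries with the largest |TD-error|."""
    return sorted(pool, key=lambda item: abs(item[1]), reverse=True)[:capacity]


def merge_into_global(global_pool, local_pools, top_k, capacity):
    """Copy each agent's top_k highest-|TD-error| entries into the shared pool, then clean it."""
    merged = list(global_pool)
    for pool in local_pools:
        merged.extend(sorted(pool, key=lambda item: abs(item[1]), reverse=True)[:top_k])
    return clean_by_td_error(merged, capacity)


if __name__ == "__main__":
    # Two agents, each with a small local pool of (transition, td_error) pairs.
    local_pools = [[("a1", 0.1), ("a2", 2.0)], [("b1", -3.0), ("b2", 0.5)]]
    shared = merge_into_global([], local_pools, top_k=1, capacity=2)
    print(shared)
    batch = sample_prioritized([e for _, e in shared], batch_size=4)
    print(batch)
```

In this sketch every agent would sample from both its own pool and the shared pool, so that informative transitions found by one agent become available to the others. A diversity-based cleaning step, which the description also mentions, is not shown because the record does not say how sample diversity is measured.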