Approximating Nash equilibrium for anti-UAV jamming Markov game using a novel event-triggered multi-agent reinforcement learning.
Authors: | Feng Z; School of Information and Communication Engineering, Hainan University, Haikou, 570228, China; State Key Laboratory of Marine Resource Utilization in South China Sea, Haikou, 570228, China. Electronic address: 13523011824@163.com., Huang M; School of Information and Communication Engineering, Hainan University, Haikou, 570228, China; State Key Laboratory of Marine Resource Utilization in South China Sea, Haikou, 570228, China. Electronic address: huangmx09@163.com., Wu Y; School of Information and Communication Engineering, Hainan University, Haikou, 570228, China. Electronic address: wyuanyuan82@163.com., Wu D; Department of Automation, Shanghai Jiao Tong University, Shanghai, 200240, China; School of Information and Communication Engineering, Hainan University, Haikou, 570228, China. Electronic address: hainuwudi@163.com., Cao J; School of Mathematics, Southeast University, Nanjing, 210096, China; Yonsei Frontier Lab, Yonsei University, Seoul 03722, South Korea. Electronic address: jdcao@seu.edu.cn., Korovin I; Scientific Research Institute of Multiprocessor Computer Systems, Southern Federal University, 2, Chekhov st., Taganrog, 347928, Russia. Electronic address: korovin_yakov@mail.ru., Gorbachev S; Russian Academy of Engineering, 9, building 4, Gazetny pereulok, Moscow, 125009, Russia. Electronic address: hanuman1000@mail.ru., Gorbacheva N; Scientific Research Institute of Multiprocessor Computer Systems, Southern Federal University, 2, Chekhov st., Taganrog, 347928, Russia. Electronic address: nadia7@sibmail.com. |
---|---|
Language: | English |
Source: | Neural networks : the official journal of the International Neural Network Society [Neural Netw] 2023 Apr; Vol. 161, pp. 330-342. Date of Electronic Publication: 2023 Feb 02. |
DOI: | 10.1016/j.neunet.2022.12.022 |
Abstract: | In downlink communication, it is currently challenging for ground users to cope with uncertain interference from intelligent aerial jammers. The cooperation and competition between ground users and unmanned aerial vehicle (UAV) jammers lead to a Markov game problem of anti-UAV jamming. Therefore, a model-free method based on multi-agent reinforcement learning (MARL) is adopted to handle the Markov game. However, benchmark MARL strategies suffer from dimension explosion and convergence to local optima. To solve these issues, this paper proposes a novel event-triggered multi-agent proximal policy optimization algorithm with a Beta strategy (ETMAPPO), which aims to reduce the dimension of the transmitted information and improve the efficiency of policy convergence. With the event-triggering mechanism, agents learn to obtain appropriate observations at different moments, thereby reducing the transmission of valueless information. A Beta operator is used to optimize the action search, expanding the search scope of the policy space. Ablation simulations show that the proposed strategy achieves better global benefits with fewer dimensions of information than benchmark algorithms. In addition, the convergence performance verifies that the well-trained ETMAPPO can achieve stable jamming strategies and stable anti-jamming strategies, which approximately constitute the Nash equilibrium of the anti-jamming Markov game. Competing Interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. (Copyright © 2023 Elsevier Ltd. All rights reserved.) |
Database: | MEDLINE |
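
The two ideas named in the abstract, event-triggered observation and a Beta-distributed action policy, can be illustrated with a minimal Python sketch. This is not the authors' ETMAPPO implementation: the record gives no algorithmic details, so the function names, the scalar observations, and the fixed `threshold` are all hypothetical. In particular, the abstract says agents learn when to obtain observations, whereas this sketch uses a hand-set threshold purely for illustration.

```python
import random


def sample_beta_action(alpha, beta, low, high, rng):
    """Draw a bounded action by rescaling a Beta(alpha, beta) sample.

    A Beta sample always lies in [0, 1], so the rescaled action stays
    inside [low, high] without clipping. (Hypothetical helper.)
    """
    u = rng.betavariate(alpha, beta)
    return low + (high - low) * u


def event_triggered_observations(observations, threshold):
    """Return the observation used at each step and the transmit count.

    A new observation is transmitted only when it differs from the last
    transmitted value by more than `threshold`; otherwise the held value
    is reused. The fixed threshold is an assumption of this sketch.
    """
    held = observations[0]
    used = [held]
    sent = 1  # the first observation is always transmitted
    for obs in observations[1:]:
        if abs(obs - held) > threshold:
            held = obs
            sent += 1
        used.append(held)
    return used, sent


if __name__ == "__main__":
    rng = random.Random(0)
    # Example: a transmit-power-like action bounded to [-1, 1].
    print(sample_beta_action(2.0, 5.0, -1.0, 1.0, rng))
    print(event_triggered_observations([0.0, 0.05, 0.3, 0.32, 0.9], 0.1))
```

In this toy run only three of the five observations are transmitted, which is the kind of reduction in transmitted information the abstract describes; how ETMAPPO actually decides when to trigger is specified in the paper, not here.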