Abstrakt: |
With the continuous development of space rendezvous technology, more and more attention has been paid to the study of spacecraft orbital pursuit-evasion differential game. Therefore, we propose a pursuit-evasion game algorithm based on branching improved Deep Q Networks to obtain a space rendezvous strategy with non-cooperative target. Firstly, we transform the optimal control of space rendezvous between spacecraft and non-cooperative target into a survivable differential game problem. Next, in order to solve this game problem, we construct Nash equilibrium strategy and test its existence and uniqueness. Then, in order to avoid the dimensional disaster of Deep Q Networks in the continuous behavior space, we construct a TSK fuzzy inference model to represent the continuous space. Finally, in order to solve the complex and timeconsuming self-learning problem of discrete action sets, we improve Deep Q Networks algorithm, and propose a branching architecture with multiple groups of parallel neural Networks and shared decision modules. The simulation results show that the algorithm achieves the combination of optimal control and game theory, and further improves the learning ability of discrete behaviors. The algorithm has the comparative advantage of continuous space behavior decision, can effectively deal with the continuous space chase game problem, and provides a new idea for the solution of spacecraft orbit pursuit-evasion strategy. [ABSTRACT FROM AUTHOR] |