Designing Policy Network with Deep Learning in Turn-Based Strategy Games
Author: | Kokolo Ikeda, Tomihiro Kimura |
---|---|
Year of publication: | 2020 |
Subject: | 050101 languages & linguistics; Computer science; business.industry; Deep learning; 05 social sciences; Monte Carlo tree search; Inference; 02 engineering and technology; Field (computer science); Turns, rounds and time-keeping systems in games; Recurrent neural network; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; 0501 psychology and cognitive sciences; State (computer science); Artificial intelligence; Computational problem; business |
Source: | Lecture Notes in Computer Science, ISBN: 9783030658823, ACG |
DOI: | 10.1007/978-3-030-65883-0_12 |
Description: | Research on artificial intelligence (AI) has made substantial strides since the advent of AlphaGo, advancing the application of deep learning techniques to games. However, little significant research has been published in the field of turn-based strategy games, owing to the complexity of the game structure and the associated computational problem. To apply deep learning to turn-based strategy games, a policy network was developed by learning from game records. The neural network used as the policy network is adapted to turn-based strategy games: it uses a recurrent neural network to reduce the number of output neurons and divides the output structure into original positions, destinations, and attack positions. The state and action data that make up the training database are generated on the learning maps from matches played with the Monte Carlo Tree Search (MCTS) algorithm. The resulting policy network performs strongly against the MCTS algorithm, with a winning rate of over \(50\%\) on the learning maps and over \(40\%\) on the validation maps. In the game, the thinking time for the deep learning player is extremely short because it performs inference only, whereas the MCTS thinking time is approximately 5 to 10 s per move. |
Database: | OpenAIRE |
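
The description states that the output is divided into original positions, destinations, and attack positions in order to reduce the number of output neurons. The record gives neither the map size nor the exact layer layout, so the following is only a rough sketch of why such a split helps. It assumes, hypothetically, that each of the three outputs is a distribution over the cells of a rectangular map; the function names and the 8 x 8 map are illustrative and do not come from the paper.

```python
def joint_output_size(height: int, width: int) -> int:
    """Outputs needed if a single layer scored every
    (origin, destination, attack) triple of map cells at once."""
    cells = height * width
    return cells ** 3


def factored_output_size(height: int, width: int) -> int:
    """Outputs needed if the network emits three separate
    distributions over map cells, one per step (assumed layout)."""
    cells = height * width
    return 3 * cells


# Hypothetical 8 x 8 map, chosen only for illustration.
joint = joint_output_size(8, 8)        # 64 ** 3 = 262144
factored = factored_output_size(8, 8)  # 3 * 64  = 192
print(joint, factored)
```

Under these assumptions the output count grows linearly with the number of map cells instead of cubically, which is one plausible reading of the reduction the description refers to.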