Multi-Agent Cooperation Based on Reinforcement Learning with Internal Reward in Maze Problem

Authors: Fumito Uwano, Naoki Tatebe, Yusuke Tajima, Masaya Nakata, Tim Kovacs, Keiki Takadama
Language: English
Publication year: 2018
Subject:
Source: SICE Journal of Control, Measurement, and System Integration, Vol 11, Iss 4, Pp 321-330 (2018)
Document type: article
ISSN: 1884-9970
DOI: 10.9746/jcmsi.11.321
Description: This paper introduces a reinforcement learning technique with an internal reward for a multi-agent cooperation task. The proposed method is an extension of Q-learning that replaces the ordinary (external) reward with an internal reward for agent cooperation. Specifically, we propose two Q-learning methods, both of which use the internal reward to achieve cooperation with little or no communication. To guarantee the effectiveness of the proposed methods, we theoretically derive mechanisms that answer the following questions: (1) how the internal rewards should be set to guarantee cooperation among the agents under the conditions of little and no communication; and (2) how the values of the cooperative behavior types (i.e., the varieties of cooperative behavior of the agents) should be updated under the condition of no communication. Intensive simulations on the maze problem for the agent-cooperation task revealed that the two proposed methods successfully enable the agents to acquire cooperative behaviors even with little or no communication, while the conventional method (Q-learning) always fails to acquire such behaviors.
Database: Directory of Open Access Journals
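
Note: The description says the method extends Q-learning by replacing the external reward with an internal one. The following is a minimal sketch of that general idea only, not the paper's algorithm. The record does not give the paper's formula for setting internal rewards, so `internal_reward`, its fixed `value`, the notion of an `assigned_goal`, and the grid coordinates are all hypothetical; only the tabular Q-learning update itself is standard.

```python
from collections import defaultdict

ACTIONS = ("up", "down", "left", "right")


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update; returns the updated Q-value."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


def internal_reward(next_state, assigned_goal, value=10.0):
    """Hypothetical internal reward (an assumption, not the paper's rule):
    the agent rewards itself only on reaching the goal assigned to it."""
    return value if next_state == assigned_goal else 0.0


# One learning step for a single agent in a grid maze (coordinates are illustrative).
q = defaultdict(float)  # unseen (state, action) pairs start at 0.0
r = internal_reward(next_state=(0, 1), assigned_goal=(0, 1))
new_q = q_update(q, state=(0, 0), action="right", reward=r, next_state=(0, 1))
```

In this sketch the only change from ordinary Q-learning is the source of `reward`: it comes from the agent's own `internal_reward` function rather than from the environment, which is where a cooperation-oriented reward design would plug in.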