A comparison between UCB and UCB-Tuned as selection policies in GGP.

Authors:	Francisco-Valencia, Iván; Marcial-Romero, José Raymundo; Valdovinos-Rosas, Rosa María; Pinto, David; Singh, Vivek
Subject:	GOVERNMENT policy; CONFIDENCE; COMPARATIVE studies
Source:	Journal of Intelligent & Fuzzy Systems; 2019, Vol. 36 Issue 5, p5073-5079, 7p
Abstract:	In this paper, we present a comparative analysis of two selection policies in the General Game Playing (GGP) context: Upper Confidence Bound (UCB) and Upper Confidence Bound Tuned (UCB-Tuned). The aim of the analysis is to identify which policy performs best in terms of victories in the GGP domain, a measure used in most of the literature to evaluate other policies. To carry out the comparison, two agents were programmed using the GGP-base framework and the Monte Carlo Tree Search (MCTS) method. The games Breakthrough, Knightthrough and Connect Four were used as experimental scenarios; to the best of our knowledge, the two policies had not previously been compared on these games. The results show that UCB-Tuned is better when fewer than 100 simulations are used in MCTS; however, when 1000 simulations are used, both policies have similar performance. [ABSTRACT FROM AUTHOR]
Database:	Complementary Index
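Note:	The abstract names the two selection policies but does not state their formulas. As a rough illustration only, the sketch below follows the standard UCB1 and UCB1-Tuned definitions from the multi-armed bandit literature, assuming rewards in [0, 1] (for example, 1 for a win and 0 for a loss). The function names, the argument names, and the exploration constant `c` are assumptions made here and are not taken from the paper or from the GGP-base framework.

```python
import math


def ucb1(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score of a child node: mean reward plus an exploration bonus."""
    if visits == 0:
        return math.inf  # unvisited children are tried first
    mean = total_reward / visits
    return mean + c * math.sqrt(math.log(parent_visits) / visits)


def ucb1_tuned(total_reward, total_sq_reward, visits, parent_visits):
    """UCB1-Tuned score: the bonus is scaled by a variance estimate capped at 1/4."""
    if visits == 0:
        return math.inf
    mean = total_reward / visits
    log_term = math.log(parent_visits) / visits
    # Upper bound on the reward variance: sample variance plus a confidence term.
    variance_bound = total_sq_reward / visits - mean ** 2 + math.sqrt(2 * log_term)
    # 1/4 is the largest possible variance of a reward confined to [0, 1].
    return mean + math.sqrt(log_term * min(0.25, variance_bound))


# Example: a child with 5 wins in 10 simulations under a parent with 100 visits.
score_ucb = ucb1(5, 10, 100)
score_tuned = ucb1_tuned(5, 5, 10, 100)
```

With identical statistics the UCB1-Tuned bonus is never larger than the UCB1 bonus with c = sqrt(2), so in this sketch UCB1-Tuned leans more toward exploitation; whether this accounts for the behavior reported in the abstract is not stated there.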