Showing 1 - 10 of 47 for search: '"Wei, Ting Han"'
Author:
Tsai, Yun-Jui, Wei, Ting Han, Lin, Chi-Huang, Shih, Chung-Chin, Guei, Hung, Wu, I-Chen, Wu, Ti-Rong
Game solving is the process of finding the theoretical outcome for a game, assuming that all player choices are optimal. This paper focuses on a technique that can reduce the heuristic search space significantly for 7x7 Killall-Go. In Go and Killall-
External link:
http://arxiv.org/abs/2411.05565
We propose Expected Work Search (EWS), a new game solving algorithm. EWS combines win rate estimation, as used in Monte Carlo Tree Search, with proof size estimation, as used in Proof Number Search. The search efficiency of EWS stems from minimizing
External link:
http://arxiv.org/abs/2405.05594
Author:
Kohankhaki, Farnaz, Aghakasiri, Kiarash, Zhang, Hongming, Wei, Ting-Han, Gao, Chao, Müller, Martin
Monte Carlo Tree Search (MCTS) is an immensely popular search-based framework used for decision making. It is traditionally applied to domains where a perfect simulation model of the environment is available. We study and improve MCTS in the context
External link:
http://arxiv.org/abs/2312.11348
Game solving is a similar, yet more difficult task than mastering a game. Solving a game typically means to find the game-theoretic value (outcome given optimal play), and optionally a full strategy to follow in order to achieve that outcome. The Alp
External link:
http://arxiv.org/abs/2311.07178
Author:
Wu, Ti-Rong, Guei, Hung, Peng, Pei-Chiun, Huang, Po-Wei, Wei, Ting Han, Shih, Chung-Chin, Tsai, Yun-Jui
This paper presents MiniZero, a zero-knowledge learning framework that supports four state-of-the-art algorithms, including AlphaZero, MuZero, Gumbel AlphaZero, and Gumbel MuZero. While these algorithms have demonstrated super-human performance in ma
External link:
http://arxiv.org/abs/2310.11305
This paper describes a Relevance-Zone pattern table (RZT) that can be used to replace a traditional transposition table. An RZT stores exact game values for patterns that are discovered during a Relevance-Zone-Based Search (RZS), which is the current
External link:
http://arxiv.org/abs/2212.13922
Goal-achieving problems are puzzles that set up a specific situation with a clear objective. An example that is well-studied is the category of life-and-death (L&D) problems for Go, which helps players hone their skill of identifying region safety. M
External link:
http://arxiv.org/abs/2112.02563
Sim-to-real, a term that describes where a model is trained in a simulator then transferred to the real world, is a technique that enables faster deep reinforcement learning (DRL) training. However, differences between the simulator and the real worl
External link:
http://arxiv.org/abs/2011.05617
AlphaZero has been very successful in many games. Unfortunately, it still consumes a huge amount of computing resources, the majority of which is spent in self-play. Hyperparameter tuning exacerbates the training cost since each hyperparameter config
External link:
http://arxiv.org/abs/2003.06212
Many of the strongest game playing programs use a combination of Monte Carlo tree search (MCTS) and deep neural networks (DNN), where the DNNs are used as policy or value evaluators. Given a limited budget, such as online playing or during the self-p
External link:
http://arxiv.org/abs/1905.13521