Reinforcement Learning: Playing Tic-Tac-Toe

Authors:	Zoe Liu, Allison Liu, Benjamin Chang, Jeffrey Huang, Jocelyn Ho
Year of publication:	2022
Subject:	General Medicine; General Chemistry
DOI:	10.36227/techrxiv.20407575
Description:	Machine learning constructs computer systems that develop through experience. Applications span many areas of daily life, from malware filtering to image recognition. Recent research has shifted towards maximizing efficiency in decision-making, creating algorithms that quickly and accurately process patterns to generate insight. This research focuses on reinforcement learning, a paradigm of machine learning that makes decisions by maximizing reward. Specifically, we use Q-learning, a model-free reinforcement learning algorithm, to assign scores to different decisions given the unique states of the problem. Widyantoro et al. (2009) studied the effect of Q-learning on learning to play Tic-Tac-Toe. However, the study yielded a win/tie rate of less than 50 percent, which we believe does not represent an algorithm that fully exploits the benefits of Q-learning. In the same environment, this research aims to close the gaps in the effectiveness of Q-learning while minimizing human input. The epsilon value was initially set to 0.9 to ensure randomness and was then decreased at a constant rate as the number of possible states increased. The program played 300,000 games against its previous version, eventually securing a win/tie rate of approximately 90 percent. Future directions include improving the efficiency of Q-learning algorithms and applying the research in practical fields.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::052c5a9beba61d730cfcb75448732136 https://doi.org/10.36227/techrxiv.20407575
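
The Q-learning setup summarized in the description (epsilon-greedy move selection starting at epsilon = 0.9 and decaying at a constant rate) might be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the learning rate, discount factor, epsilon floor, reward values, the per-game linear decay schedule, and the random opponent are all assumptions chosen for brevity, whereas the record states that the program played against its previous version. All function names are hypothetical.

```python
import random
from collections import defaultdict

ALPHA, GAMMA = 0.5, 0.9          # assumed learning rate and discount factor
EPS_START, EPS_MIN = 0.9, 0.05   # 0.9 comes from the abstract; the floor is assumed
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if a line is complete, 'T' for a tie, else None."""
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return 'T' if ' ' not in board else None

def legal_moves(board):
    return [i for i, cell in enumerate(board) if cell == ' ']

def epsilon_at(game, total_games):
    """Assumed schedule: linear decay from EPS_START to EPS_MIN over training."""
    frac = game / max(1, total_games - 1)
    return max(EPS_MIN, EPS_START - (EPS_START - EPS_MIN) * frac)

def choose(q, board, eps, rng):
    """Epsilon-greedy: explore with probability eps, otherwise pick the best Q."""
    moves = legal_moves(board)
    if rng.random() < eps:
        return rng.choice(moves)
    state = ''.join(board)
    return max(moves, key=lambda m: q[(state, m)])

def q_update(q, state, action, reward, next_board):
    """Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    if next_board is None:       # terminal transition: no future value
        best_next = 0.0
    else:
        ns = ''.join(next_board)
        best_next = max((q[(ns, m)] for m in legal_moves(next_board)), default=0.0)
    q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])

def train(total_games=2000, seed=0):
    """Train player 'X' against a uniformly random 'O' (a simplification)."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for game in range(total_games):
        eps = epsilon_at(game, total_games)
        board = [' '] * 9
        while True:
            state = ''.join(board)
            action = choose(q, board, eps, rng)
            board[action] = 'X'
            result = winner(board)
            if result is None:   # opponent replies with a random legal move
                board[rng.choice(legal_moves(board))] = 'O'
                result = winner(board)
            if result is None:
                q_update(q, state, action, 0.0, board)
            else:                # assumed rewards: win 1, tie 0.5, loss -1
                reward = {'X': 1.0, 'T': 0.5, 'O': -1.0}[result]
                q_update(q, state, action, reward, None)
                break
    return q
```

Calling `train()` returns a table mapping (state, move) pairs to learned scores. Replacing the random opponent with a frozen copy of an earlier table would bring the sketch closer to the self-play arrangement the description mentions.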