On-chip trainable hardware-based deep Q-networks approximating a backpropagation algorithm

Authors:	Soochang Lee, Seongbin Oh, Sung Yun Woo, Jangsaeng Kim, Jong-Ho Lee, Won-Mook Kang, Jong-Ho Bae, Byung-Gook Park, Dongseok Kwon, Chul-Heung Kim
Publication year:	2021
Subject:	010302 applied physics; Spiking neural network; Dependency (UML); Artificial neural network; Computer science; business.industry; 02 engineering and technology; 01 natural sciences; Backpropagation; Flash memory; Artificial Intelligence; 0103 physical sciences; 0202 electrical engineering, electronic engineering, information engineering; Reinforcement learning; 020201 artificial intelligence & image processing; business; Software; Computer hardware
Source:	Neural Computing and Applications. 33:9391-9402
ISSN:	1433-3058 0941-0643
DOI:	10.1007/s00521-021-05699-z
Description:	Reinforcement learning (RL) using deep Q-networks (DQNs) has shown performance beyond the human level in a number of complex problems. In addition, many studies have focused on bio-inspired hardware-based spiking neural networks (SNNs) because these technologies can realize both parallel operation and low power consumption. Here, we propose an on-chip training method for DQNs applicable to hardware-based SNNs. The method approximates the conventional backpropagation (BP) algorithm, and a performance evaluation based on two simple games shows that the proposed system achieves performance similar to that of a software-based system. The proposed training method can minimize memory usage and reduce power consumption and area occupation. In particular, for simple problems, the memory dependency can be significantly reduced because high performance is achieved without using replay memory. Furthermore, we investigate how the nonlinearity characteristics and two types of variation of non-ideal synaptic devices affect performance. In this work, thin-film transistor (TFT)-type flash memory cells are used as synaptic devices. A simulation is also conducted using a fully connected neural network with non-leaky integrate-and-fire (I&F) neurons. The proposed system shows strong immunity to device variations because an on-chip training scheme is adopted.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::7e72d0b27eeafeedd6c212d07e10d637 https://doi.org/10.1007/s00521-021-05699-z
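Note:	The description mentions training without replay memory. As a rough illustration only, the sketch below applies a one-step Q-learning update to each transition as it arrives, with no stored experience buffer. It is a tabular stand-in, not the record's hardware method: the paper's network, its BP approximation, and its device model are not reproduced here, and the names `td_target`, `online_q_update`, `lr`, and `gamma`, along with all numeric values, are assumptions made for this example.

```python
def td_target(reward, next_q_values, gamma, done):
    """One-step Q-learning target: r + gamma * max over next-state Q-values."""
    if done:
        # Terminal transition: no bootstrapping from the next state.
        return reward
    return reward + gamma * max(next_q_values)


def online_q_update(q_table, state, action, reward, next_state, done,
                    lr=0.1, gamma=0.9):
    """Update Q(state, action) in place from a single transition.

    The transition is used once and then discarded, so no replay
    memory is needed. Returns the temporal-difference error.
    """
    target = td_target(reward, q_table[next_state], gamma, done)
    td_error = target - q_table[state][action]
    q_table[state][action] += lr * td_error
    return td_error


# Hypothetical 2-state, 2-action Q-table (values chosen for illustration).
q = [[0.0, 0.0],
     [0.0, 1.0]]
err = online_q_update(q, state=0, action=1, reward=0.5,
                      next_state=1, done=False)
```

In a DQN the table lookup would be replaced by a network forward pass and the in-place increment by a weight update; the target computation stays the same.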