A self-learning cognitive architecture exploiting causality from rewards.
Authors: Li H; Computational NeuroEngineering Laboratory, Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL 32601, United States of America. Electronic address: hongmingli@ufl.edu. Dou R; Computational NeuroEngineering Laboratory, Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL 32601, United States of America. Keil A; Department of Psychology and Center for the Study of Emotion & Attention, University of Florida, Gainesville, FL 32611, United States of America. Principe JC; Computational NeuroEngineering Laboratory, Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL 32601, United States of America.
Language: English
Source: Neural Networks: the official journal of the International Neural Network Society [Neural Netw], 2022 Jun; Vol. 150, pp. 274-292. Date of Electronic Publication: 2022 Mar 08.
DOI: 10.1016/j.neunet.2022.02.029
Abstract: Inspired by the human vision system and human learning, we propose a novel cognitive architecture that understands the content of raw videos in terms of objects without using labels. The architecture achieves four objectives: (1) decomposing raw frames into objects by exploiting foveal vision and memory; (2) describing the world by projecting objects onto an internal canvas; (3) extracting relevant objects from the canvas by analyzing the causal relation between objects and rewards; and (4) exploiting the information of relevant objects to facilitate the reinforcement learning (RL) process. To speed up learning and better identify the objects that produce rewards, the architecture implements learning by causality, in the sense of Wiener and Granger, using object trajectories stored in working memory and the time series of external rewards. A novel non-parametric estimator of directed information based on Rényi's entropy is designed and tested. Experiments on three environments show that our architecture extracts most of the relevant objects; it can be thought of as 'understanding' the world in an object-oriented way. As a consequence, our architecture outperforms state-of-the-art deep reinforcement learning in terms of training speed and transfer learning. Competing Interests: Declaration of Competing Interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. (Copyright © 2022 Elsevier Ltd. All rights reserved.)
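To make the Wiener-Granger notion of object relevance concrete, the sketch below scores how much one object's past trajectory improves prediction of the reward time series beyond the reward's own past. This is only a minimal illustration under stated assumptions: it is not the paper's Rényi-entropy-based directed information estimator, and the function names, lag length, least-squares autoregressive fits, and log variance-ratio score are stand-ins chosen for brevity.

```python
# Illustrative sketch only (not code from the paper): a minimal Wiener-Granger style
# relevance score between one object's trajectory (as if stored in working memory)
# and the external reward time series. The paper uses a non-parametric directed
# information estimator based on Renyi's entropy; the OLS autoregressive fits and
# the log variance-ratio score below are stand-in assumptions.
import numpy as np

def lagged_design(series, lags):
    """Stack `lags` past values of every column of `series` into a design matrix."""
    T = series.shape[0]
    past = [series[lags - k:T - k] for k in range(1, lags + 1)]
    return np.hstack(past)                      # shape: (T - lags, lags * n_features)

def _residual_variance(X, y):
    """Variance of the residuals of an ordinary least-squares fit with intercept."""
    Xi = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(Xi, y, rcond=None)
    return float(np.var(y - Xi @ beta))

def granger_score(reward, trajectory, lags=5):
    """log(var_restricted / var_full): larger values suggest the object's past
    trajectory helps predict future rewards beyond the reward's own past."""
    reward = np.asarray(reward, dtype=float).reshape(-1, 1)
    trajectory = np.atleast_2d(np.asarray(trajectory, dtype=float).T).T     # (T, d)
    y = reward[lags:, 0]
    X_restricted = lagged_design(reward, lags)                           # reward past only
    X_full = np.hstack([X_restricted, lagged_design(trajectory, lags)])  # + object past
    return float(np.log(_residual_variance(X_restricted, y) /
                        _residual_variance(X_full, y)))

# Hypothetical usage: a trajectory whose first coordinate drives the reward two steps
# later should score clearly higher than an unrelated trajectory.
# rng = np.random.default_rng(0); T = 500
# traj = rng.normal(size=(T, 2))
# reward = np.roll(traj[:, 0], 2) + 0.1 * rng.normal(size=T)
# print(granger_score(reward, traj), granger_score(reward, rng.normal(size=(T, 2))))
```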
Database: MEDLINE