Reservoir computing model of prefrontal cortex creates novel combinations of previous navigation sequences from hippocampal place-cell replay with spatial reward propagation

Author: Cazin, Nicolas, Llofriu Alonso, Martin, Scleidorovich Chiodi, Pablo, Pelc, Tatiana, Harland, Bruce, Weitzenfeld, Alfredo, Fellous, Jean-Marc, Dominey, Peter Ford
Contributors: Cognition, Action, et Plasticité Sensorimotrice [Dijon - U1093] (CAPS), Université de Bourgogne (UB)-Institut National de la Santé et de la Recherche Médicale (INSERM), Robot Cognition Laboratory [Dijon] (RCL), Université de Bourgogne (UB)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Université de Bourgogne (UB)-Institut National de la Santé et de la Recherche Médicale (INSERM), Department of Computer Science and Engineering [Tampa, FL, États-Unis], University of South Florida [Tampa] (USF), Department of Psychology [Tucson, AZ, États-Unis], University of Arizona, All authors received support from the CRCNS NFS-ANR grant Spaquence, N˚1429929., ANR-14-NEUC-0005,SPAQUENCE,A replay-driven model of spatial sequence learning in the Hippocampus-PFC network using reservoir computing(2014), Bodescot, Myriam, Recherche collaborative en neurosciences computationnelles - A replay-driven model of spatial sequence learning in the Hippocampus-PFC network using reservoir computing - - SPAQUENCE2014 - ANR-14-NEUC-0005 - CRCNS - VALID
Language: English
Year of publication: 2019
Subject:
Social Sciences
Neocortex
Hippocampus
Learning and Memory
Animal Cells
Medicine and Health Sciences
Psychology
Biology (General)
Problem Solving
Projections
Mammals
Neurons
Behavior, Animal
Applied Mathematics
Simulation and Modeling
Brain
Eukaryota
Animal Models
Reactivation
Experimental Organism Systems
Vertebrates
Physical Sciences
[SDV.NEU]Life Sciences [q-bio]/Neurons and Cognition [q-bio.NC]
Anatomy
Cellular Types
Algorithms
State
Research Article
Midline Thalamus
Reverse Replay
QH301-705.5
Neural Computation
Prefrontal Cortex
Research and Analysis Methods
Rodents
Model Organisms
Reward
Animals
Learning
Computer Simulation
Experience
Organisms
Cognitive Psychology
Systems
Biology and Life Sciences
Cell Biology
Rats
Neostriatum
Cellular Neuroscience
Amniotes
Animal Studies
Cognitive Science
Mathematics
Neuroscience
Source: PLoS Computational Biology
PLoS Computational Biology, Public Library of Science, 2019, 15 (7), pp.e1006624. ⟨10.1371/journal.pcbi.1006624⟩
PLoS Computational Biology, Vol 15, Iss 7, p e1006624 (2019)
ISSN: 1553-734X
1553-7358
DOI: 10.1371/journal.pcbi.1006624
Description: As rats learn to search for multiple sources of food or water in a complex environment, they generate increasingly efficient trajectories between reward sites. Such spatial navigation capacity involves the replay of hippocampal place-cells during awake states, generating small sequences of spatially related place-cell activity that we call “snippets”. These snippets occur primarily during sharp-wave-ripples (SWRs). Here we focus on the role of such replay events, as the animal is learning a traveling salesperson task (TSP) across multiple trials. We hypothesize that snippet replay generates synthetic data that can substantially expand and restructure the experience available and make learning more optimal. We developed a model of snippet generation that is modulated by reward, propagated in the forward and reverse directions. This implements a form of spatial credit assignment for reinforcement learning. We use a biologically motivated computational framework known as ‘reservoir computing’ to model prefrontal cortex (PFC) in sequence learning, in which large pools of prewired neural elements process information dynamically through reverberations. This PFC model consolidates snippets into larger spatial sequences that may be later recalled by subsets of the original sequences. Our simulation experiments provide neurophysiological explanations for two pertinent observations related to navigation. Reward modulation allows the system to reject non-optimal segments of experienced trajectories, and reverse replay allows the system to “learn” trajectories that it has not physically experienced, both of which significantly contribute to the TSP behavior.
Author summary: As rats search for multiple sources of food in a complex environment, they generate increasingly efficient trajectories between reward sites, across multiple trials. This spatial navigation optimization behavior can be measured in the laboratory using a traveling salesperson task (TSP). This likely involves the coordinated replay of place-cell “snippets” between successive trials. We hypothesize that “snippets” can be used by the prefrontal cortex (PFC) to implement a form of reward-modulated reinforcement learning. Our simulation experiments provide neurophysiological explanations for two pertinent observations related to navigation. Reward modulation allows the system to reject non-optimal segments of experienced trajectories, and reverse replay allows the system to “learn” trajectories that it has not physically experienced, both of which significantly contribute to the TSP behavior.
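Illustrative note: the reward-modulated snippet replay described in the abstract can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' model: the record does not give the paper's equations, so the exponential `decay`, the snippet-weighting rule, and the names `propagate_reward`, `sample_snippet`, and `p_reverse` are all hypothetical choices made for illustration. Reward spreads along a trajectory in both directions, snippets near reward are replayed more often, and some snippets are replayed in reverse order.

```python
import random


def propagate_reward(trajectory, reward_sites, decay=0.8):
    """Give each step a value that decays with distance from a reward site.

    Assumption: exponential decay, applied in both directions along the path.
    """
    n = len(trajectory)
    values = [0.0] * n
    # Backward sweep: steps leading up to a reward inherit decayed value.
    carry = 0.0
    for i in range(n - 1, -1, -1):
        carry = 1.0 if trajectory[i] in reward_sites else carry * decay
        values[i] = carry
    # Forward sweep: steps following a reward inherit decayed value.
    carry = 0.0
    for i in range(n):
        carry = 1.0 if trajectory[i] in reward_sites else carry * decay
        values[i] = max(values[i], carry)
    return values


def sample_snippet(trajectory, values, length, rng, p_reverse=0.5):
    """Draw one contiguous snippet, favoring high-value segments.

    Assumption: a snippet's sampling weight is the sum of its step values.
    With probability p_reverse the snippet is replayed in reverse order.
    """
    starts = list(range(len(trajectory) - length + 1))
    weights = [sum(values[s:s + length]) for s in starts]
    if sum(weights) <= 0:
        # No reward information anywhere: fall back to uniform sampling.
        weights = [1.0] * len(starts)
    start = rng.choices(starts, weights=weights, k=1)[0]
    snippet = trajectory[start:start + length]
    if rng.random() < p_reverse:
        snippet = snippet[::-1]
    return snippet


# Hypothetical trajectory over place fields A..E with a reward at E.
path = ["A", "B", "C", "D", "E"]
vals = propagate_reward(path, {"E"}, decay=0.5)
replayed = [sample_snippet(path, vals, 3, random.Random(seed)) for seed in range(5)]
```

In a pipeline of the kind the abstract describes, snippets such as `replayed` would serve as training sequences for the reservoir model of PFC; that stage is not sketched here.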
Database: OpenAIRE