RLMViz: Interpreting Deep Reinforcement Learning Memory

Author:	Jaunet, Theo, Vuillemot, Romain, Wolf, Christian
Contributors:	Jaunet, Théo, Situated Interaction, Collaboration, Adaptation and Learning (SICAL), Laboratoire d'InfoRmatique en Image et Systèmes d'information (LIRIS), Institut National des Sciences Appliquées de Lyon (INSA Lyon), Université de Lyon-Institut National des Sciences Appliquées (INSA)-Université de Lyon-Institut National des Sciences Appliquées (INSA)-Centre National de la Recherche Scientifique (CNRS)-Université Claude Bernard Lyon 1 (UCBL), Université de Lyon-École Centrale de Lyon (ECL), Université de Lyon-Université Lumière - Lyon 2 (UL2)-Institut National des Sciences Appliquées de Lyon (INSA Lyon), Université de Lyon-Université Lumière - Lyon 2 (UL2), Extraction de Caractéristiques et Identification (imagine)
Language:	English
Publication year:	2019
Subject:	[INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI]; [INFO] Computer Science [cs]
Source:	Journée Visu 2019, May 2019, Paris, France
Description:	National audience; We present RLMViz, a visual analytics interface for interpreting the internal memory of an agent (e.g., a robot) trained using deep reinforcement learning (DRL). This memory is composed of large temporal vectors that are updated before each action the agent takes as it moves through an environment. The memory is not trivial to understand and is often referred to as a black box: only its inputs (images) and outputs (actions) are understood, not its inner workings. Using RLMViz, experts can form hypotheses about this memory, derive rules from the agent's decisions to interpret them, understand why errors were made, and improve future training. We report on the main features of RLMViz, which are memory navigation and contextualization techniques based on juxtaposed timelines. We also present our early findings using the VizDoom simulator, a standard benchmark for DRL navigation scenarios.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=dedup_wf_001::2a9c1981889f7143580895ac9b712e56 https://hal.archives-ouvertes.fr/hal-02140902