Autonomous Sensorimotor Learning for Sound Source Localization by a Humanoid Robot

Authors: Nguyen, Quan; Girin, Laurent; Bailly, Gérard; Elisei, Frédéric; Nguyen, Duc-Canh
Contributors: GIPSA - Cognitive Robotics, Interactive Systems, & Speech Processing (GIPSA-CRISSP); Département Parole et Cognition (GIPSA-DPC); Grenoble Images Parole Signal Automatique (GIPSA-lab); Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP); Centre National de la Recherche Scientifique (CNRS); Université Grenoble Alpes [2016-2019] (UGA [2016-2019]); GIPSA-Services (GIPSA-Services); Bailly, Gérard
Language: English
Publication year: 2018
Subject:
Source: IROS 2018-Workshop on Crossmodal Learning for Intelligent Robotics in conjunction with IEEE/RSJ IROS, Oct 2018, Madrid, Spain
Description: We consider the problem of learning to localize a speech source using a humanoid robot equipped with a binaural hearing system. We aim to map binaural audio features to the relative angle between the robot's head direction and the target source direction, based on a sensorimotor training framework. To this end, we make the following contributions: (i) a procedure to automatically collect and label audio and motor data for sensorimotor training; (ii) the use of a convolutional neural network (CNN) trained with white-noise signals and ground-truth relative source directions. Experimental evaluation with speech signals shows that the CNN can localize the speech source even without an explicit algorithm for dealing with missing spectral features.
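The training label named in the description, the relative angle between the head direction and the source direction, can be illustrated with a minimal sketch. This is not the authors' procedure: the function names, the use of degrees, the restriction to azimuth, and the assumption that head yaw and source azimuth are expressed in the same fixed frame are all hypothetical choices made for illustration.

```python
# Hypothetical sketch: label each recording with the signed angle from the
# robot's head direction to the source direction (azimuth only, in degrees).

def relative_angle_deg(head_yaw_deg: float, source_azimuth_deg: float) -> float:
    """Return source azimuth minus head yaw, wrapped to [-180, 180)."""
    return (source_azimuth_deg - head_yaw_deg + 180.0) % 360.0 - 180.0


def label_recordings(head_yaws_deg, source_azimuth_deg):
    """Assumed setup: one recording per head yaw, with a fixed source position."""
    return [relative_angle_deg(yaw, source_azimuth_deg) for yaw in head_yaws_deg]


if __name__ == "__main__":
    # Head turned to -90, 0 and 90 degrees while the source stays at 0 degrees.
    print(label_recordings([-90.0, 0.0, 90.0], 0.0))
```

Wrapping to a half-open interval keeps the label continuous when the head yaw and the source azimuth lie on opposite sides of the +/-180 degree boundary.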
Database: OpenAIRE