The fifth 'CHiME' Speech Separation and Recognition Challenge: Dataset, task and baselines

Author:	Jon Barker, Jan Trmal, Shinji Watanabe, Emmanuel Vincent
Contributors:	University of Sheffield [Sheffield], Center for Language and Speech Processing [Baltimore] (CLSP), Johns Hopkins University (JHU), Speech Modeling for Facilitating Oral-Based Communication (MULTISPEECH), Inria Nancy - Grand Est, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD), Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS), ANR-16-CE33-0006, VOCADOM, Commande vocale robuste adaptée à la personne et au contexte pour l'autonomie à domicile (2016)
Language:	English
Year of publication:	2018
Subject:	FOS: Computer and information sciences; Sound (cs.SD); Reverberation; Microphone array; noise; Computer science; Microphone; Computer Science - Artificial Intelligence; reverberation; Speech recognition; 02 engineering and technology; robust ASR; Computer Science - Sound; microphone array; [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]; 030507 speech-language pathology & audiology; 03 medical and health sciences; Audio and Speech Processing (eess.AS); conversational speech; FOS: Electrical engineering, electronic engineering, information engineering; 0202 electrical engineering, electronic engineering, information engineering; Signal processing; 020206 networking & telecommunications; Speech enhancement; Artificial Intelligence (cs.AI); 13. Climate action; 'CHiME' challenge; Language model; 0305 other medical science; Binaural recording; Electrical Engineering and Systems Science - Audio and Speech Processing
Source:	Interspeech 2018 - 19th Annual Conference of the International Speech Communication Association, Sep 2018, Hyderabad, India
Description:	International audience; The CHiME challenge series aims to advance robust automatic speech recognition (ASR) technology by promoting research at the interface of speech and language processing, signal processing, and machine learning. This paper introduces the 5th CHiME Challenge, which considers the task of distant multi-microphone conversational ASR in real home environments. Speech material was elicited using a dinner party scenario, with efforts taken to capture data that is representative of natural conversational speech, and was recorded by 6 Kinect microphone arrays and 4 binaural microphone pairs. The challenge features a single-array track and a multiple-array track and, for each track, distinct rankings will be produced for systems focusing on robustness with respect to distant-microphone capture vs. systems attempting to address all aspects of the task, including conversational language modeling. We discuss the rationale for the challenge and provide a detailed description of the data collection procedure, the task, and the baseline systems for array synchronization, speech enhancement, and conventional and end-to-end ASR.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::40f3c5f936ca2eea2faffe993aa09f4a https://hal.inria.fr/hal-01744021/document