The fifth 'CHiME' Speech Separation and Recognition Challenge: Dataset, task and baselines
Author: | Jon Barker, Jan Trmal, Shinji Watanabe, Emmanuel Vincent |
---|---|
Contributors: | University of Sheffield [Sheffield]; Center for Language and Speech Processing [Baltimore] (CLSP), Johns Hopkins University (JHU); Speech Modeling for Facilitating Oral-Based Communication (MULTISPEECH), Inria Nancy - Grand Est, Institut National de Recherche en Informatique et en Automatique (Inria), Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD), Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Université de Lorraine (UL), Centre National de la Recherche Scientifique (CNRS); ANR-16-CE33-0006, VOCADOM, Commande vocale robuste adaptée à la personne et au contexte pour l'autonomie à domicile (2016) |
Language: | English |
Year of publication: | 2018 |
Subject: |
FOS: Computer and information sciences
Sound (cs.SD); Reverberation; Microphone array; noise; Computer science; Microphone; Computer Science - Artificial Intelligence; Speech recognition; 02 engineering and technology; robust ASR; Computer Science - Sound; [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]; 030507 speech-language pathology & audiology; 03 medical and health sciences; Audio and Speech Processing (eess.AS); conversational speech; FOS: Electrical engineering, electronic engineering, information engineering; 0202 electrical engineering, electronic engineering, information engineering; Signal processing; 020206 networking & telecommunications; Speech enhancement; Artificial Intelligence (cs.AI); 13. Climate action; 'CHiME' challenge; Language model; 0305 other medical science; Binaural recording; Electrical Engineering and Systems Science - Audio and Speech Processing |
Source: | Interspeech 2018 - 19th Annual Conference of the International Speech Communication Association, Sep 2018, Hyderabad, India |
Description: | International audience; The CHiME challenge series aims to advance robust automatic speech recognition (ASR) technology by promoting research at the interface of speech and language processing, signal processing, and machine learning. This paper introduces the 5th CHiME Challenge, which considers the task of distant multi-microphone conversational ASR in real home environments. Speech material was elicited using a dinner party scenario, with efforts taken to capture data representative of natural conversational speech, and was recorded by 6 Kinect microphone arrays and 4 binaural microphone pairs. The challenge features a single-array track and a multiple-array track. For each track, distinct rankings will be produced for systems focusing on robustness with respect to distant-microphone capture and for systems attempting to address all aspects of the task, including conversational language modeling. We discuss the rationale for the challenge and provide a detailed description of the data collection procedure, the task, and the baseline systems for array synchronization, speech enhancement, and conventional and end-to-end ASR. |
Database: | OpenAIRE |
External link: |