The fifth 'CHiME' Speech Separation and Recognition Challenge: Dataset, task and baselines

Authors: Jon Barker, Jan Trmal, Shinji Watanabe, Emmanuel Vincent
Contributors: University of Sheffield [Sheffield]; Center for Language and Speech Processing [Baltimore] (CLSP), Johns Hopkins University (JHU); Speech Modeling for Facilitating Oral-Based Communication (MULTISPEECH), Inria Nancy - Grand Est, Institut National de Recherche en Informatique et en Automatique (Inria); Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD), Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA); Université de Lorraine (UL); Centre National de la Recherche Scientifique (CNRS); ANR-16-CE33-0006, VOCADOM, Commande vocale robuste adaptée à la personne et au contexte pour l'autonomie à domicile (2016)
Language: English
Year of publication: 2018
Subject:
FOS: Computer and information sciences
Sound (cs.SD)
Reverberation
Microphone array
noise
Computer science
Microphone
Computer Science - Artificial Intelligence
Speech recognition
02 engineering and technology
robust ASR
Computer Science - Sound
[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]
030507 speech-language pathology & audiology
03 medical and health sciences
Audio and Speech Processing (eess.AS)
conversational speech
FOS: Electrical engineering, electronic engineering, information engineering
0202 electrical engineering, electronic engineering, information engineering
Signal processing
020206 networking & telecommunications
Speech enhancement
Artificial Intelligence (cs.AI)
13. Climate action
'CHiME' challenge
Language model
0305 other medical science
Binaural recording
Electrical Engineering and Systems Science - Audio and Speech Processing
Source: Interspeech 2018 - 19th Annual Conference of the International Speech Communication Association, Sep 2018, Hyderabad, India
Description: The CHiME challenge series aims to advance robust automatic speech recognition (ASR) technology by promoting research at the interface of speech and language processing, signal processing, and machine learning. This paper introduces the 5th CHiME Challenge, which considers the task of distant multi-microphone conversational ASR in real home environments. Speech material was elicited using a dinner party scenario, with efforts taken to capture data that is representative of natural conversational speech, and was recorded by 6 Kinect microphone arrays and 4 binaural microphone pairs. The challenge features a single-array track and a multiple-array track. For each track, distinct rankings will be produced for systems focusing on robustness with respect to distant-microphone capture vs. systems attempting to address all aspects of the task, including conversational language modeling. We discuss the rationale for the challenge and provide a detailed description of the data collection procedure, the task, and the baseline systems for array synchronization, speech enhancement, and conventional and end-to-end ASR.
Database: OpenAIRE