Wiki-En-ASR-Adapt: Large-scale synthetic dataset for English ASR Customization

Author:	Antonova, Alexandra
Publication year:	2023
Subject:	Electrical Engineering and Systems Science - Audio and Speech Processing; Computer Science - Computation and Language; Computer Science - Sound
Document type:	Working Paper
Description:	We present a first large-scale public synthetic dataset for contextual spellchecking customization of automatic speech recognition (ASR), with a focus on diverse rare and out-of-vocabulary (OOV) phrases such as proper names or terms. The proposed approach allows creating millions of realistic examples of corrupted ASR hypotheses and simulating non-trivial biasing lists for the customization task. Furthermore, we propose injecting two types of "hard negatives" into the simulated biasing lists in training examples and describe our procedures for mining them automatically. We report experiments with training an open-source customization model on the proposed dataset and show that injecting hard negative biasing phrases decreases the word error rate (WER) and the number of false alarms. Comment: Accepted to IEEE ASRU 2023
Database:	arXiv
External link:	http://arxiv.org/abs/2309.17267