Spot keywords from very noisy and mixed speech

Authors: Shi, Ying; Wang, Dong; Li, Lantian; Han, Jiqing; Yin, Shi
Publication year: 2023
Subject:
Document type: Working Paper
Description: Most existing keyword spotting research focuses on conditions with slight or moderate noise. In this paper, we tackle a more challenging task: detecting keywords buried under strong interfering speech (with amplitude 10 times that of the keyword) and, even worse, mixed with other keywords. We propose a novel Mix Training (MT) strategy that encourages the model to discover low-energy keywords in noisy and mixed speech. Experiments were conducted with a vanilla CNN and two EfficientNet architectures (B0 and B2). Results on the Google Speech Commands dataset show that the proposed mix training approach is highly effective and outperforms both standard data augmentation and mixup training.
Comment: Interspeech 2023
Database: arXiv
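
Note on the noise condition: the description states that the interfering speech is 10 times higher in amplitude than the keyword, which corresponds to a signal-to-noise ratio of 20·log10(1/10) = -20 dB. The minimal sketch below illustrates only that mixing condition, not the Mix Training strategy itself, whose details are not given in this record. It assumes that "amplitude" means RMS level (the record does not say), and the helper names `rms`, `mix_with_interference`, and `snr_db` are hypothetical. A sine tone and uniform noise stand in for real keyword and interfering speech.

```python
import math
import random


def rms(signal):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def mix_with_interference(keyword, interference, amplitude_ratio=10.0):
    """Scale `interference` so its RMS is `amplitude_ratio` times the
    keyword's RMS, then add the two signals sample by sample.

    Assumption: "amplitude" is taken to mean RMS level.
    Returns the mixture and the scaled interference.
    """
    gain = amplitude_ratio * rms(keyword) / rms(interference)
    scaled = [gain * x for x in interference]
    mixture = [k + s for k, s in zip(keyword, scaled)]
    return mixture, scaled


def snr_db(signal, noise):
    """Signal-to-noise ratio in decibels, computed from RMS levels."""
    return 20.0 * math.log10(rms(signal) / rms(noise))


# Stand-in signals: one second at 16 kHz (not real speech).
rng = random.Random(0)
keyword = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(16000)]
interference = [rng.uniform(-1.0, 1.0) for _ in range(16000)]

mixture, scaled = mix_with_interference(keyword, interference)
print(f"SNR of keyword vs. interference: {snr_db(keyword, scaled):.2f} dB")
```

With the default ratio of 10, the computed SNR is -20 dB up to floating-point rounding, regardless of the particular stand-in signals used.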