Autor: |
Jung, Jaemin, Kim, Youkyum, Park, Jihwan, Lim, Youshin, Kim, Byeong-Yeol, Jang, Youngjoon, Chung, Joon Son |
Rok vydání: |
2022 |
Předmět: |
|
Druh dokumentu: |
Working Paper |
Popis: |
The goal of this work is to detect new spoken terms defined by users. While most previous works address Keyword Spotting (KWS) as a closed-set classification problem, this limits their transferability to unseen terms. The ability to define custom keywords has advantages in terms of user experience. In this paper, we propose a metric learning-based training strategy for user-defined keyword spotting. In particular, we make the following contributions: (1) we construct a large-scale keyword dataset with an existing speech corpus and propose a filtering method to remove data that degrade model training; (2) we propose a metric learning-based two-stage training strategy, and demonstrate that the proposed method improves the performance on the user-defined keyword spotting task by enriching their representations; (3) to facilitate the fair comparison in the user-defined KWS field, we propose unified evaluation protocol and metrics. Our proposed system does not require an incremental training on the user-defined keywords, and outperforms previous works by a significant margin on the Google Speech Commands dataset using the proposed as well as the existing metrics. |
Databáze: |
arXiv |
Externí odkaz: |
|