Personalization Strategies for End-to-End Speech Recognition Systems

Authors:	Yile Gu, Gautam Tiwari, Ivan Bulyko, Shashank Kalmane, Linda Liu, Andreas Stolcke, Guitang Lan, Xiangyang Huang, Denis Filimonov, Ankur Gandhe, Ariya Rastrow, Aditya Gourav
Year of publication:	2021
Subject:	FOS: Computer and information sciences; Signal processing; Computer Science - Computation and Language; End-to-end principle; Symbol table; Computer science; Speech recognition; Scalable algorithms; Computation and Language (cs.CL); Decoding methods; Oracle; Personalization; Conjunction (grammar)
Source:	ICASSP
DOI:	10.48550/arxiv.2102.07739
Description:	The recognition of personalized content, such as contact names, remains a challenging problem for end-to-end speech recognition systems. In this work, we demonstrate how first- and second-pass rescoring strategies can be leveraged together to improve the recognition of such words. Following previous work, we use a shallow fusion approach to bias towards recognition of personalized content in the first-pass decoding. We show that such an approach can improve personalized content recognition by up to 16% with minimal degradation on the general use case. We describe a fast and scalable algorithm that enables our biasing models to remain at the word level while the biasing is applied at the subword level. This has the advantage that the biasing models do not depend on any subword symbol table. We also describe a novel second-pass de-biasing approach: used in conjunction with a first-pass shallow fusion that optimizes on oracle WER, we can achieve an additional 14% improvement on personalized content recognition, and even improve accuracy for the general use case by up to 2.5%. Comment: 5 pages, 5 tables, 1 figure
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::b478fe456c395d84b6ae8a16f4a1ab7a
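
The description above mentions shallow fusion with word-level biasing models applied to subword hypotheses. The following is a minimal illustrative sketch of that general idea, not the authors' algorithm: it rescores complete hypotheses for brevity, whereas first-pass biasing would be applied incrementally during beam search. The function names, the `bias_weights` dictionary, the `lam` interpolation weight, and the SentencePiece-style word-boundary marker are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Assumed word-boundary marker (SentencePiece-style); not taken from the record.
WORD_START = "\u2581"

Hypothesis = Tuple[List[str], float]  # (subword pieces, ASR log-probability)


def pieces_to_words(pieces: List[str]) -> List[str]:
    """Join subword pieces into words; a piece starting with WORD_START opens a new word."""
    words: List[str] = []
    for piece in pieces:
        if piece.startswith(WORD_START) or not words:
            words.append(piece.lstrip(WORD_START))
        else:
            words[-1] += piece
    return words


def bias_score(pieces: List[str], bias_weights: Dict[str, float]) -> float:
    """Sum word-level bias bonuses; the bias list never refers to subword units."""
    return sum(bias_weights.get(w.lower(), 0.0) for w in pieces_to_words(pieces))


def shallow_fusion(
    hyps: List[Hypothesis], bias_weights: Dict[str, float], lam: float = 1.0
) -> List[Hypothesis]:
    """Rank hypotheses by ASR log-probability plus lam times the bias score."""
    fused = [(p, logp + lam * bias_score(p, bias_weights)) for p, logp in hyps]
    return sorted(fused, key=lambda h: h[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical two-best list and a one-word personalized bias list.
    hyps = [
        (["\u2581call", "\u2581a", "lex"], -4.0),
        (["\u2581call", "\u2581al", "ex", "a"], -4.5),
    ]
    for pieces, score in shallow_fusion(hyps, {"alexa": 1.0}):
        print(" ".join(pieces_to_words(pieces)), score)
```

Because the bias list is keyed by whole words and words are rebuilt by string concatenation, the same list can be reused with a different subword segmentation, which mirrors the symbol-table independence the description highlights.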