Personalized speech recognition on mobile devices

Authors:	Raziel Alvarez, Francoise Beaufays, Rohit Prabhavalkar, Alexander H. Gruenstein, Kanishka Rao, Carolina Parada, Ian McGraw, David Rybach, Ouais Alsharif, Montse Gonzalez Arenas, Hasim Sak
Year of publication:	2016
Subject:	FOS: Computer and information sciences; Sound (cs.SD); Machine Learning (cs.LG); Computation and Language (cs.CL); Vocabulary; Computer science; Speech recognition; Word error rate; Machine learning; Voice activity detection; Acoustic model; Memory footprint; Language model; Artificial intelligence; 02 engineering and technology; 0202 electrical engineering, electronic engineering, information engineering; 020204 information systems; 03 medical and health sciences; 0305 other medical science; 030507 speech-language pathology & audiology
Source:	ICASSP
DOI:	10.1109/icassp.2016.7472820
Description:	We describe a large vocabulary speech recognition system that is accurate, has low latency, and yet has a small enough memory and computational footprint to run faster than real-time on a Nexus 5 Android smartphone. We employ a quantized Long Short-Term Memory (LSTM) acoustic model trained with connectionist temporal classification (CTC) to directly predict phoneme targets, and further reduce its memory footprint using an SVD-based compression scheme. Additionally, we minimize our memory footprint by using a single language model for both dictation and voice command domains, constructed using Bayesian interpolation. Finally, in order to properly handle device-specific information, such as proper names and other context-dependent information, we inject vocabulary items into the decoder graph and bias the language model on-the-fly. Our system achieves 13.5% word error rate on an open-ended dictation task, running with a median speed that is seven times faster than real-time.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::dae983dd89cc96eaf0287640b8ced70d https://doi.org/10.1109/icassp.2016.7472820
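
Note on the SVD-based compression mentioned in the description: the record does not give the details of the authors' scheme, so the following is only a generic sketch of the underlying idea, a truncated SVD that replaces one weight matrix with two thinner factors. The function name svd_compress, the matrix sizes, and the chosen rank are illustrative assumptions, not values from the paper.

```python
import numpy as np


def svd_compress(weight, rank):
    """Return factors (a, b) such that a @ b is a rank-`rank` approximation of weight.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    a = u[:, :rank] * s[:rank]  # shape (m, rank), singular values folded into the left factor
    b = vt[:rank, :]            # shape (rank, n)
    return a, b


# Build a toy 64 x 48 matrix whose rank is at most 8 (sizes are assumptions).
rng = np.random.default_rng(0)
weight = rng.standard_normal((64, 8)) @ rng.standard_normal((8, 48))

a, b = svd_compress(weight, rank=8)

params_before = weight.size          # 64 * 48 stored values
params_after = a.size + b.size       # 64 * 8 + 8 * 48 stored values
max_abs_error = float(np.max(np.abs(weight - a @ b)))
```

Storing the two factors pays off whenever rank * (m + n) is smaller than m * n. On a real acoustic model the weight matrices are not exactly low rank, so the truncation introduces an approximation error that has to be weighed against the memory saved.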