High quality agreement-based semi-supervised training data for acoustic modeling

Authors:	Felix de Chaumont Quitry, Eugene Weinstein, Asa Oines, Pedro J. Moreno
Publication year:	2016
Subject:	Correctness; Computer science; Speech recognition; SIGNAL (programming language); Networking & telecommunications; Engineering and technology; Natural sciences; Agreement; Data modeling; Computing methodologies: pattern recognition; Physical sciences; Electrical engineering, electronic engineering, information engineering; Quality (business); Artificial intelligence; Heuristics; Acoustics; Supervised training; Utterance; Natural language processing
Source:	SLT
Description:	This paper describes a new technique to automatically obtain large, high-quality training speech corpora for acoustic modeling. Traditional approaches select utterances based on confidence thresholds and other heuristics. We propose instead to use an ensemble approach: we transcribe each utterance using several recognizers and keep only those utterances on which the recognizers agree. The recognizers we use are trained on data from different dialects of the same language, and this diversity leads them to make different mistakes in transcribing speech utterances. In this work we show, however, that when they agree, this is an extremely strong signal that the transcript is correct. This allows us to produce automatically transcribed speech corpora that are superior in transcript correctness even to those manually transcribed by humans. Furthermore, we show that using the resulting semi-supervised data sets, we can train new acoustic models that outperform those trained solely on previously available data sets.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::66da1d4a5a69b299a352f6654f4662bf https://doi.org/10.1109/slt.2016.7846323
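
The agreement-based selection summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Recognizer` type, the `normalize` helper, and the `select_agreed` function are hypothetical names introduced here, and the exact-match criterion after light text normalization, as well as the choice to drop empty transcripts, are assumptions. The record does not say how the paper defines agreement.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

# Hypothetical type: a recognizer maps an utterance identifier to a transcript.
Recognizer = Callable[[str], str]


def normalize(transcript: str) -> str:
    """Lowercase and collapse whitespace (assumed normalization, not from the paper)."""
    return " ".join(transcript.lower().split())


def select_agreed(
    utterances: Iterable[str],
    recognizers: Sequence[Recognizer],
) -> List[Tuple[str, str]]:
    """Keep only utterances on which every recognizer yields the same transcript."""
    if len(recognizers) < 2:
        raise ValueError("agreement needs at least two recognizers")

    kept: List[Tuple[str, str]] = []
    for utt in utterances:
        # Collect the distinct normalized hypotheses across the ensemble.
        hypotheses = {normalize(rec(utt)) for rec in recognizers}
        if len(hypotheses) == 1:
            text = next(iter(hypotheses))
            # Assumption: an empty transcript is not useful training data.
            if text:
                kept.append((utt, text))
    return kept
```

In this sketch, each recognizer would stand in for a model trained on a different dialect of the same language. The returned (utterance, transcript) pairs form the automatically transcribed corpus, and utterances with any disagreement are discarded.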