Word Characters and Phone Pronunciation Embedding for ASR Confidence Classifier

Authors: Kshitiz Kumar, Yifan Gong, Tasos Anastasakos
Year of publication: 2019
Subject:
Source: ICASSP
Description: A confidence classifier is an integral component of an automatic speech recognition (ASR) system. These classifiers predict the accuracy of an ASR hypothesis by assigning it a confidence score in the [0,1] range, where a larger score implies a higher probability that the hypothesis is correct. Confidence scores have significant applications in ASR system design, training data selection, model adaptation, and other ASR tasks. In this work we focus on word embedding features to improve the confidence classifier, and introduce character and phone embeddings as confidence features. We motivate these features in the context of representing and factorizing acoustic scores along the proposed features. We evaluate our work on large-scale ASR tasks and demonstrate significant improvement in confidence performance with the proposed features. At our typical operating point, we report an 8% relative reduction in false alarms (FA) for a limited-vocabulary enUS Xbox task, and a 9.9% relative reduction in FA for a large-vocabulary enUS server task. We also conducted server experiments with our proposed features in combination with natural language GloVe embeddings, which improved the overall relative reduction in FA to 16%.
Database: OpenAIRE