Morphosyntactic Processing of N-Best Lists for Improved Recognition and Confidence Measure Computation
Author: | Stéphane Huet, Pascale Sébillot, Guillaume Gravier |
---|---|
Contributors: | Multimedia content-based indexing (TEXMEX); Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA); Université de Rennes (UR); Université de Rennes 1 (UR1); Institut National des Sciences Appliquées - Rennes (INSA Rennes); Institut National des Sciences Appliquées (INSA); Institut National de Recherche en Informatique et en Automatique (Inria); Centre National de la Recherche Scientifique (CNRS); Inria Rennes – Bretagne Atlantique; Huet, Stéphane |
Language: | English |
Year of publication: | 2007 |
Subject: | Computer science; [INFO.INFO-TS] Computer Science [cs]/Signal and Image Processing; [INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL]; speech recognition; parts of speech; confidence measures; word error rate; transcription (linguistics); language model; sentence; natural language processing; artificial intelligence; natural language generation; recurrent neural network; adversarial bandit; online learning; user adaptation; 01 natural sciences; 0103 physical sciences; 010301 acoustics; 02 engineering and technology; 0202 electrical engineering, electronic engineering, information engineering; 020206 networking & telecommunications |
Source: | 8th Annual Conference of the International Speech Communication Association (Interspeech), 2007, Antwerp, Belgium, pp. 1741-1744 |
Description: | International audience; We study the use of morphosyntactic knowledge to process N-best lists. We propose a new score function that combines part-of-speech (POS), language model, and acoustic scores at the sentence level. Experimental results, obtained for French broadcast news transcription, show a significant improvement in word error rate under various decoding criteria commonly used in speech recognition. Interestingly, the resulting transcriptions are more grammatical, which translates into a better sentence error rate. Finally, we show that POS knowledge also improves posterior-based confidence measures. |
Database: | OpenAIRE |
External link: |