Combining acoustic name spotting and continuous context models to improve spoken person name recognition in speech

Author: Richard Dufour, Corinne Fredouille, Georges Linarès, Benjamin Bigot, Gregory Senay
Contributors: Laboratoire Informatique d'Avignon (LIA), Avignon Université (AU)-Centre d'Enseignement et de Recherche en Informatique - CERI, Fredouille, Corinne
Year of publication: 2013
Source: Interspeech 2013, Aug 2013, Lyon, France
DOI: 10.21437/interspeech.2013-572
Description: International audience; Retrieving pronounced person names in spoken documents is a critical problem in the context of audiovisual content indexing. In this paper, we present a cascading strategy that combines two methods dedicated to spoken name recognition in speech. The first is acoustic name spotting in phoneme confusion networks, based on a phonetic edit-distance criterion that exploits the phoneme probabilities held in the confusion networks. The second is a continuous context modelling approach applied to the 1-best transcription output, relying on a probabilistic model of name-to-context dependencies. We assume that combining these methods, which draw on different types of information, can improve spoken name recognition performance. This assumption is evaluated through experiments on a set of audiovisual documents from the development set of the REPERE challenge. Results show that combining the acoustic and linguistic methods yields an absolute gain of 3% in F-measure over the best system taken alone.
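The acoustic method scores a name's phoneme sequence against a phoneme confusion network using an edit-distance criterion weighted by phoneme probabilities. A minimal Python sketch of one way such a probability-weighted edit distance could work (the function name, cost values, and example network are illustrative assumptions, not the authors' implementation):

```python
# Hypothetical sketch: Levenshtein-style edit distance between a target
# name's phoneme string and a phoneme confusion network, where each slot
# of the network is a phoneme -> posterior-probability dict. Matching a
# phoneme against a slot costs 1 - P(phoneme | slot), so confident
# matches are nearly free and unlikely ones approach a full substitution.

def cn_edit_distance(name_phones, confusion_slots,
                     ins_cost=1.0, del_cost=1.0):
    """Dynamic programming over (len(name_phones)+1) x (len(slots)+1)."""
    n, m = len(name_phones), len(confusion_slots)
    # dp[i][j]: cost of aligning the first i phonemes with the first j slots
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + del_cost
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + ins_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Substitution cost shrinks with the slot's posterior for
            # the expected phoneme (absent phonemes get probability 0).
            sub = 1.0 - confusion_slots[j - 1].get(name_phones[i - 1], 0.0)
            dp[i][j] = min(dp[i - 1][j - 1] + sub,   # match / substitute
                           dp[i - 1][j] + del_cost,  # phoneme unmatched
                           dp[i][j - 1] + ins_cost)  # extra network slot
    return dp[n][m]

# Toy example: spotting the phoneme string "d u f u r" in a noisy
# 5-slot confusion network (probabilities are made up).
slots = [{"d": 0.9, "t": 0.1}, {"u": 0.8, "o": 0.2},
         {"f": 0.7, "v": 0.3}, {"u": 0.6, "i": 0.4}, {"r": 0.95}]
score = cn_edit_distance(list("dufur"), slots)
```

A name would then be accepted when this score falls below a tuned threshold; the cascade described above passes such acoustic hypotheses on for confirmation by the linguistic context model.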
Database: OpenAIRE