Description: |
This paper summarizes recent activities at LIMSI in speech recognition and its applications. The LIMSI recognizer, which has state‐of‐the‐art performance, uses phone‐based continuous density HMMs for acoustic modeling, and word and class backoff n‐grams for language modeling. Acoustic model adaptation techniques (PMC, MAP, MLLR) are used to reduce the mismatch between test and training conditions, in particular through unsupervised adaptation. While the main goal of speech recognition is to transcribe the speech signal as a sequence of words, the same basic technology can be applied to other areas. Recent work is oriented towards automatic systems for information access and automatic indexing of audiovisual data. For information access, the speaker‐independent, continuous speech recognizer is embedded in a spoken language dialog system that provides travel information at an information kiosk (ESPRIT MASK) or by telephone (LE ARISE). User trials highlight the importance of appropriate dialog strategies for satisfactory performance. Indexing audiovisual data is a challenging application for LVCSR, as the data contain segments of varied acoustic and linguistic nature (prepared/spontaneous speech, studio/telephone quality, background noise/music). Word error rates on these data (about 30%) appear to be sufficient for indexing and retrieval purposes. |