Extraction of visual features for lipreading

Author:	Richard P. Harvey, Stephen Cox, Iain Matthews, Timothy F. Cootes, J.A. Bangham
Year of publication:	2002
Subject:	Computer science; Applied Mathematics; Speech recognition; Feature extraction; Pattern recognition; Audio-visual speech recognition; Intelligibility (communication); Active appearance model; Computational Theory and Mathematics; Artificial Intelligence; Principal component analysis; Computer Vision and Pattern Recognition; Hidden Markov model; Software
Source:	IEEE Transactions on Pattern Analysis and Machine Intelligence. 24:198-213
ISSN:	0162-8828
DOI:	10.1109/34.982900
Description:	The multimodal nature of speech is often ignored in human-computer interaction, but lip deformations and other body motion, such as those of the head, convey additional information. We integrate speech cues from many sources and this improves intelligibility, especially when the acoustic signal is degraded. The paper shows how this additional, often complementary, visual speech information can be used for speech recognition. Three methods for parameterizing lip image sequences for recognition using hidden Markov models are compared. Two of these are top-down approaches that fit a model of the inner and outer lip contours and derive lipreading features from a principal component analysis of shape or shape and appearance, respectively. The third, bottom-up, method uses a nonlinear scale-space analysis to form features directly from the pixel intensity. All methods are compared on a multitalker visual speech recognition task of isolated letters.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::9bde887c45b4ecb6acc7517d63ed8557 https://doi.org/10.1109/34.982900