Search results

Multimodal speaker localization in a probabilistic framework

Published in: Scopus-Elsevier

A multimodal probabilistic framework is proposed for the problem of finding the active speaker in a video sequence. We localize the current speaker's mouth in the image by using the video and the audio channels together. We propose a novel visual fea

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::ac7decee33dab96c523ad418a538b271

AUDIO-VISUAL SPEECH RECOGNITION WITH A HYBRID SVM-HMM SYSTEM

Authors: Gurban, M., Jean-Philippe Thiran

Published in: Scopus-Elsevier

Traditional speech recognition systems use Gaussian mixture models to obtain the likelihoods of individual phonemes, which are then used as state emission probabilities in hidden Markov models representing the words. In hybrid systems, the Gaussian m

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::fa920825f703f2ece007bbcb61e99f0a

Conference

Selecting relevant visual features for speechreading.

Authors: Estellers, V., Gurban, M., Thiran, J.P.

Published in: 2009 16th IEEE International Conference on Image Processing (ICIP); 2009, pp. 1433-1436, 4 pp.

Conference

Relevant Feature Selection for Audio-Visual Speech Recognition.

Authors: Drugman, T., Gurban, M., Thiran, J.-P.

Published in: 2007 IEEE 9th Workshop on Multimedia Signal Processing; 2007, pp. 179-182, 4 pp.

Using entropy as a stream reliability estimate for audio-visual speech recognition

Authors: Gurban, M., Jean-Philippe Thiran

Published in: Scopus-Elsevier

We present a method for dynamically integrating audio-visual information for speech recognition, based on the estimated reliability of the audio and visual streams. Our method uses an information theoretic measure, the entropy derived from the state

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::919bc362a2b589bae7f6a7d87cc0f65d
http://www.scopus.com/inward/record.url?eid=2-s2.0-77956479500&partnerID=MN8TOARS

An information theoretic perspective on multimodal signal processing

Authors: Gurban, M., Thiran, J.

Multimodal signals can be defined in general as signals originating from the same physical source, but acquired through different devices, techniques or protocols. This applies for example to audio-visual signals, medical or satellite images. Underst

External link: https://explore.openaire.eu/search/publication?articleId=od_______185::be1974146aa8eb9853d5a3626e9d64e4
https://infoscience.epfl.ch/record/87194

Multimodal Speaker Localization from Omnidirectional Videos

Authors: Reuse, P., Gurban, M., Austvoll, I., Jean-Philippe Thiran

Published in: Scopus-Elsevier

The use of omnidirectional cameras for videoconferencing promises to simplify the hardware setup necessary for large groups of participants. We investigate the use of a multimodal speaker detection algorithm on audio-visual sequences captured with su

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::bcf0f307a2956a88a2c33f261869b7ac
https://infoscience.epfl.ch/record/140633

Low-Dimensional Motion Features for Audio-Visual Speech Recognition

Authors: Carboneras, A. V., Gurban, M., Jean-Philippe Thiran

Published in: Scopus-Elsevier

Audio-visual speech recognition promises to improve the performance of speech recognizers, especially when the audio is corrupted, by adding information from the visual modality, more specifically, from the video of the speaker. However, the number o

External link: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::906d1d98301d95db977193a46320f03c
https://infoscience.epfl.ch/record/109488

Définition et sélection d’attributs visuels pour la reconnaissance audio-visuelle de la parole

Authors: Thiran, Jean-Philippe, Valles, A., Drugman, T., Gurban, M.

External link: https://explore.openaire.eu/search/publication?articleId=od_______185::918d08bf85735d5a98a517da57460a09
https://infoscience.epfl.ch/record/109494
