Showing 1 - 10 of 23 for the search: "Ahmed Hussen Abdelaziz"
Author:
Yannis Stylianou, Chloe Seivwright, Anushree Prasanna Kumar, Gabriele Fanelli, Sachin Kajareker, Justin G. Binder, Ahmed Hussen Abdelaziz
Published in:
ICMI
Audiovisual speech synthesis is the problem of synthesizing a talking face while maximizing the coherency of the acoustic and visual speech. In this paper, we propose and compare two audiovisual speech synthesis systems for 3D face models. The first …
Author:
Ahmed Hussen Abdelaziz
Published in:
IEEE/ACM Transactions on Audio, Speech, and Language Processing. 26:475-484
Audiovisual fusion is one of the most challenging tasks that continues to attract substantial research interest in the field of audiovisual automatic speech recognition (AV-ASR). In the last few decades, many approaches for integrating the audio and …
Author:
Zakaria Aldeneh, Ahmed Hussen Abdelaziz, Devang Naik, Erik Marchi, Barry-John Theobald, Anushree Prasanna Kumar, Sachin S. Kajarekar
Published in:
ICASSP
We present an introspection of an audiovisual speech enhancement model. In particular, we focus on interpreting how a neural audiovisual speech enhancement model uses visual cues to improve the quality of the target speech signal. We show that visual …
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::363bfbf343bf136a979b620c28c30afb
Author:
Reinhard Knothe, Ahmed Hussen Abdelaziz, Paul R. Dixon, Sachin Kajareker, Barry-John Theobald, Nicholas Apostoloff
Published in:
ICMI
We describe our novel deep learning approach for driving animated faces using both acoustic and visual information. In particular, speech-related facial movements are generated using audiovisual information, and non-speech facial movements are genera…
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::d67fff2efd9a0040c35647cbbc38ebfa
Author:
Ahmed Hussen Abdelaziz, Sachin Kajareker, Nicholas Apostoloff, Barry-John Theobald, Justin G. Binder, Gabriele Fanelli, Thibaut Weise, Paul R. Dixon
Published in:
ICMI
Speech-driven visual speech synthesis involves mapping features extracted from acoustic speech to the corresponding lip animation controls for a face model. This mapping can take many forms, but a powerful approach is to use deep neural networks (DNN…
Published in:
Speech Communication. 79:1-13
Uncertainty decoding has recently been successful in improving automatic speech recognition performance in noisy environments by considering the pre-processed feature vectors not as deterministic but rather as random variables containing estimation e…
Author:
Ahmed Hussen Abdelaziz
Published in:
INTERSPEECH
Author:
Ahmed Hussen Abdelaziz
Published in:
INTERSPEECH
Author:
Ahmed Hussen Abdelaziz
Published in:
ICME
Reliable visual features that encode the articulator movements of speakers can dramatically improve the decoding accuracy of automatic speech recognition systems when combined with the corresponding acoustic signals. In this paper, a novel framework …
Published in:
INTERSPEECH