Showing 1 - 8 of 8 for search: '"Zexu Pan"'
Published in:
IEEE Signal Processing Letters. 30:110-114
The speaker extraction technique seeks to single out the voice of a target speaker from the interfering voices in a speech mixture. Typically, an auxiliary reference of the target speaker is used to form voluntary attention. Either a pre-recorded utterance …
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::fc62c1f2234c7147ac7863106176422a
http://arxiv.org/abs/2211.00109
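To make the idea in the record above concrete, here is a minimal PyTorch sketch of speaker extraction conditioned on an auxiliary reference: a mask estimator is steered toward the target speaker by a FiLM-style modulation derived from the reference signal. The layer sizes, the FiLM conditioning, and all names are illustrative assumptions, not the architecture described in the paper.

```python
import torch
import torch.nn as nn

class ToySpeakerExtractor(nn.Module):
    """Minimal sketch: a mask estimator conditioned on a target-speaker reference."""
    def __init__(self, feat_dim=256, spk_dim=128):
        super().__init__()
        self.mix_encoder = nn.Conv1d(1, feat_dim, kernel_size=16, stride=8)   # waveform encoder
        self.spk_encoder = nn.GRU(feat_dim, spk_dim, batch_first=True)        # reference encoder
        self.film = nn.Linear(spk_dim, 2 * feat_dim)                          # scale/shift from the speaker cue
        self.mask_net = nn.Sequential(
            nn.Conv1d(feat_dim, feat_dim, 3, padding=1), nn.ReLU(),
            nn.Conv1d(feat_dim, feat_dim, 1), nn.Sigmoid())
        self.decoder = nn.ConvTranspose1d(feat_dim, 1, kernel_size=16, stride=8)

    def forward(self, mixture, reference):
        # mixture, reference: (batch, 1, samples)
        mix_feat = torch.relu(self.mix_encoder(mixture))            # (B, F, T)
        ref_feat = torch.relu(self.mix_encoder(reference))          # reuse the encoder on the reference
        _, spk_emb = self.spk_encoder(ref_feat.transpose(1, 2))     # (1, B, spk_dim)
        scale, shift = self.film(spk_emb[-1]).chunk(2, dim=-1)      # FiLM-style conditioning
        cond = mix_feat * scale.unsqueeze(-1) + shift.unsqueeze(-1)
        mask = self.mask_net(cond)                                  # attend to the target speaker
        return self.decoder(mix_feat * mask)                        # estimated target speech

# est = ToySpeakerExtractor()(torch.randn(2, 1, 16000), torch.randn(2, 1, 16000))
```

In this sketch, a pre-recorded utterance from the target speaker plays the role of `reference`.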
Speaker extraction seeks to extract the clean speech of a target speaker from a multi-talker speech mixture. Prior studies have used a pre-recorded speech sample or a face image of the target speaker as the speaker cue. In human communication, …
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::f9db3bad2aaa3705964781f15ec4851e
A speaker extraction algorithm seeks to extract the target speaker's speech from a multi-talker speech mixture. Prior studies focus mostly on speaker extraction from highly overlapped multi-talker speech mixtures. However, the target-interference …
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::e4139bcaca11b265827bba82a4bae5b1
http://arxiv.org/abs/2109.14831
A speaker extraction algorithm seeks to extract the speech of a target speaker from a multi-talker speech mixture when given a cue that represents the target speaker, such as a pre-enrolled speech utterance or an accompanying video track. Visual cues …
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::53f0c617e2ffe81e6c435fc2aff9697c
http://arxiv.org/abs/2106.07150
Published in:
ICASSP
Most of the prior studies in the spatial direction-of-arrival (DoA) domain focus on a single modality. However, humans use auditory and visual senses to detect the presence of sound sources. With this motivation, we propose to use neural networks with audio and visual …
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::065f264fcbee317d34f6d335462d23a8
http://arxiv.org/abs/2105.06107
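As a toy illustration of audio-visual DoA estimation in the spirit of this abstract, the snippet below fuses an audio feature vector and a visual feature vector by concatenation and classifies the direction of arrival into discrete azimuth bins. The feature choices, dimensions, and concatenation fusion are assumptions for illustration, not the network from the paper.

```python
import torch
import torch.nn as nn

class ToyAudioVisualDoA(nn.Module):
    """Minimal sketch: fuse audio and visual features to classify DoA into azimuth bins."""
    def __init__(self, audio_dim=257, visual_dim=512, n_bins=36):
        super().__init__()
        self.audio_branch = nn.Sequential(nn.Linear(audio_dim, 256), nn.ReLU())
        self.visual_branch = nn.Sequential(nn.Linear(visual_dim, 256), nn.ReLU())
        self.classifier = nn.Sequential(nn.Linear(512, 256), nn.ReLU(),
                                        nn.Linear(256, n_bins))   # one logit per azimuth bin

    def forward(self, audio_feat, visual_feat):
        # audio_feat: (B, audio_dim), e.g. inter-channel phase features pooled over time
        # visual_feat: (B, visual_dim), e.g. a pooled embedding of the camera frame
        a = self.audio_branch(audio_feat)
        v = self.visual_branch(visual_feat)
        fused = torch.cat([a, v], dim=-1)        # simple concatenation fusion
        return self.classifier(fused)            # DoA posterior over azimuth bins

# logits = ToyAudioVisualDoA()(torch.randn(4, 257), torch.randn(4, 512))  # (4, 36)
```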
Published in:
ICASSP
A speaker extraction algorithm relies on a speech sample from the target speaker as the reference point to focus its attention. Such reference speech is typically pre-recorded. On the other hand, the temporal synchronization between speech and lip movements …
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::f745a12dfc999ceb73baf63dc2166519
http://arxiv.org/abs/2010.07775
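The record above replaces the pre-recorded reference with a visual cue that is time-synchronized with the speech. Below is a minimal sketch of that idea, assuming a lip-embedding sequence from some upstream lip encoder and simple concatenation fusion; both choices, and all dimensions, are assumptions rather than the paper's design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyLipCuedExtractor(nn.Module):
    """Minimal sketch: condition the mask estimator on time-aligned lip embeddings
    instead of a pre-recorded reference utterance."""
    def __init__(self, feat_dim=256, lip_dim=512):
        super().__init__()
        self.encoder = nn.Conv1d(1, feat_dim, kernel_size=16, stride=8)
        self.lip_proj = nn.Linear(lip_dim, feat_dim)
        self.mask_net = nn.Sequential(
            nn.Conv1d(2 * feat_dim, feat_dim, 3, padding=1), nn.ReLU(),
            nn.Conv1d(feat_dim, feat_dim, 1), nn.Sigmoid())
        self.decoder = nn.ConvTranspose1d(feat_dim, 1, kernel_size=16, stride=8)

    def forward(self, mixture, lip_emb):
        # mixture: (B, 1, samples); lip_emb: (B, frames, lip_dim) from a lip encoder
        mix_feat = torch.relu(self.encoder(mixture))                         # (B, F, T)
        lips = self.lip_proj(lip_emb).transpose(1, 2)                        # (B, F, frames)
        lips = F.interpolate(lips, size=mix_feat.shape[-1], mode='nearest')  # align time steps
        mask = self.mask_net(torch.cat([mix_feat, lips], dim=1))             # visual-cued mask
        return self.decoder(mix_feat * mask)                                 # estimated target speech
```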
Published in:
INTERSPEECH
Emotion represents an essential aspect of human speech that is manifested in speech prosody. Speech, visual, and textual cues are complementary in human communication. In this paper, we study a hybrid fusion method, referred to as multi-modal attention …
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::8c63d3cdb8ca724eae47819e38dd50a4
http://arxiv.org/abs/2009.04107
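A rough sketch of attention-based multi-modal fusion along the lines of this abstract: utterance-level speech, visual, and text embeddings are weighted by learned attention scores and summed before emotion classification. The embedding size, number of emotion classes, and scoring function are illustrative assumptions, not the fusion scheme from the paper.

```python
import torch
import torch.nn as nn

class ToyMultiModalAttentionFusion(nn.Module):
    """Minimal sketch: attention-weighted fusion of speech, visual, and text embeddings."""
    def __init__(self, dim=256, n_emotions=4):
        super().__init__()
        self.score = nn.Linear(dim, 1)               # one scalar attention score per modality
        self.classifier = nn.Linear(dim, n_emotions)

    def forward(self, speech_emb, visual_emb, text_emb):
        # each embedding: (B, dim), produced by modality-specific encoders
        stacked = torch.stack([speech_emb, visual_emb, text_emb], dim=1)   # (B, 3, dim)
        weights = torch.softmax(self.score(stacked), dim=1)                # (B, 3, 1)
        fused = (weights * stacked).sum(dim=1)                             # attention-weighted sum
        return self.classifier(fused)                                      # emotion logits

# logits = ToyMultiModalAttentionFusion()(*[torch.randn(8, 256) for _ in range(3)])
```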