Showing 1 - 10 of 10 for search: '"Yousefi, Midia"'
Author:
Le, Chenyang, Qian, Yao, Wang, Dongmei, Zhou, Long, Liu, Shujie, Wang, Xiaofei, Yousefi, Midia, Qian, Yanmin, Li, Jinyu, Zhao, Sheng, Zeng, Michael
There is a rising interest and trend in research towards directly translating speech from one language to another, known as end-to-end speech-to-speech translation. However, most end-to-end models struggle to outperform cascade models, i.e., a pipeli
External link:
http://arxiv.org/abs/2405.17809
Author:
Zhang, Leying, Qian, Yao, Zhou, Long, Liu, Shujie, Wang, Dongmei, Wang, Xiaofei, Yousefi, Midia, Qian, Yanmin, Li, Jinyu, He, Lei, Zhao, Sheng, Zeng, Michael
Recent advancements in zero-shot text-to-speech (TTS) modeling have led to significant strides in generating high-fidelity and diverse speech. However, dialogue generation, along with achieving human-like naturalness in speech, continues to be a chal
External link:
http://arxiv.org/abs/2404.06690
Target-Speaker Voice Activity Detection (TS-VAD) utilizes a set of speaker profiles alongside an input audio signal to perform speaker diarization. While its superiority over conventional methods has been demonstrated, the method can suffer from erro
External link:
http://arxiv.org/abs/2309.12521
Author:
Yousefi, Midia, Hansen, John H. L.
The goal of speech separation is to extract multiple speech sources from a single microphone recording. Recently, with the advancement of deep learning and availability of large datasets, speech separation has been formulated as a supervised learning
External link:
http://arxiv.org/abs/2111.08635
Author:
Yousefi, Midia, Hansen, John H. L.
This study addresses the problem of single-channel Automatic Speech Recognition of a target speaker within an overlap speech scenario. In the proposed method, the hidden representations in the acoustic model are modulated by speaker auxiliary informa
External link:
http://arxiv.org/abs/2111.00320
Author:
Yousefi, Midia, Hansen, John H. L.
Most current speech technology systems are designed to operate well even in the presence of multiple active speakers. However, most solutions assume that the number of concurrent speakers is known. Unfortunately, this information might not always be
External link:
http://arxiv.org/abs/2111.00316
Author:
Yousefi, Midia, Hansen, John H. L.
Naturalistic speech recordings usually contain speech signals from multiple speakers. This phenomenon can degrade the performance of speech technologies due to the complexity of tracing and recognizing individual speakers. In this study, we investiga
External link:
http://arxiv.org/abs/2001.09937
Single-microphone, speaker-independent speech separation is normally performed through two steps: (i) separating the specific speech sources, and (ii) determining the best output-label assignment to find the separation error. The second step is the m
External link:
http://arxiv.org/abs/1908.01768
Author:
Yousefi, Midia, Hansen, John H. L.
Published in:
In Speech Communication June 2023 151:76-85
Academic article