Showing 1 - 10 of 4,778 for search: '"Wang, Dongmei"'
Author:
Li, Jiaqi, Wang, Dongmei, Wang, Xiaofei, Qian, Yao, Zhou, Long, Liu, Shujie, Yousefi, Midia, Li, Canrun, Tsai, Chung-Hsien, Xiao, Zhen, Liu, Yanqing, Chen, Junkun, Zhao, Sheng, Li, Jinyu, Wu, Zhizheng, Zeng, Michael
Neural audio codec tokens serve as the fundamental building blocks for speech language model (SLM)-based speech generation. However, there is no systematic understanding of how the codec system affects the speech generation performance of the SLM. In
External link:
http://arxiv.org/abs/2409.04016
Author:
Le, Chenyang, Qian, Yao, Wang, Dongmei, Zhou, Long, Liu, Shujie, Wang, Xiaofei, Yousefi, Midia, Qian, Yanmin, Li, Jinyu, Zhao, Sheng, Zeng, Michael
There is a rising interest and trend in research towards directly translating speech from one language to another, known as end-to-end speech-to-speech translation. However, most end-to-end models struggle to outperform cascade models, i.e., a pipeli
External link:
http://arxiv.org/abs/2405.17809
Author:
Zhang, Leying, Qian, Yao, Zhou, Long, Liu, Shujie, Wang, Dongmei, Wang, Xiaofei, Yousefi, Midia, Qian, Yanmin, Li, Jinyu, He, Lei, Zhao, Sheng, Zeng, Michael
Recent advancements in zero-shot text-to-speech (TTS) modeling have led to significant strides in generating high-fidelity and diverse speech. However, dialogue generation, along with achieving human-like naturalness in speech, continues to be a chal
External link:
http://arxiv.org/abs/2404.06690
Target-Speaker Voice Activity Detection (TS-VAD) utilizes a set of speaker profiles alongside an input audio signal to perform speaker diarization. While its superiority over conventional methods has been demonstrated, the method can suffer from erro
External link:
http://arxiv.org/abs/2309.12521
Author:
Li, Chenda, Qian, Yao, Chen, Zhuo, Kanda, Naoyuki, Wang, Dongmei, Yoshioka, Takuya, Qian, Yanmin, Zeng, Michael
State-of-the-art large-scale universal speech models (USMs) show a decent automatic speech recognition (ASR) performance across multiple domains and languages. However, it remains a challenge for these models to recognize overlapped speech, which is
External link:
http://arxiv.org/abs/2305.18747
Author:
Li, Chenda, Qian, Yao, Chen, Zhuo, Wang, Dongmei, Yoshioka, Takuya, Liu, Shujie, Qian, Yanmin, Zeng, Michael
Automatic target sound extraction (TSE) is a machine learning approach to mimic the human auditory perception capability of attending to a sound source of interest from a mixture of sources. It often uses a model conditioned on a fixed form of target
External link:
http://arxiv.org/abs/2303.08372
This paper describes a speaker diarization model based on target speaker voice activity detection (TS-VAD) using transformers. To overcome the original TS-VAD model's drawback of being unable to handle an arbitrary number of speakers, we investigate
External link:
http://arxiv.org/abs/2208.13085
Published in:
Gongye shui chuli, Vol 44, Iss 3, Pp 206-210 (2024)
Reclaimed water from a municipal sewage plant, after advanced treatment in a power plant, is used as makeup water for the circulating water system. Due to problems in the advanced treatment system for the reclaimed water, such as the output of the mechanical acce
External link:
https://doaj.org/article/611b4e5b0d9e439cacc5c9020a919270
Author:
Wang, Dongmei, Bowman, Dwight, Brown, Heidi, Harrington, Laura, Kaufman, Phillip, McKay, Tanja, Nelson, Charles, Sharp, Julia, Lund, Robert
BACKGROUND: This paper examines the individual factors that influence prevalence rates of canine heartworm in the contiguous United States. A data set provided by the Companion Animal Parasite Council, which contains county-by-county results of over n
External link:
http://hdl.handle.net/10150/610230
http://arizona.openrepository.com/arizona/handle/10150/610230
Existing multi-channel continuous speech separation (CSS) models are heavily dependent on supervised data - either simulated data which causes data mismatch between the training and real-data testing, or the real transcribed overlapping data, which i
External link:
http://arxiv.org/abs/2204.03232