Search results - "Wang, Dongmei"

Report

Investigating Neural Audio Codecs for Speech Language Model-Based Speech Generation

Authors: Li, Jiaqi, Wang, Dongmei, Wang, Xiaofei, Qian, Yao, Zhou, Long, Liu, Shujie, Yousefi, Midia, Li, Canrun, Tsai, Chung-Hsien, Xiao, Zhen, Liu, Yanqing, Chen, Junkun, Zhao, Sheng, Li, Jinyu, Wu, Zhizheng, Zeng, Michael

Neural audio codec tokens serve as the fundamental building blocks for speech language model (SLM)-based speech generation. However, there is no systematic understanding on how the codec system affects the speech generation performance of the SLM. In

External link: http://arxiv.org/abs/2409.04016

Report

TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation

Authors: Le, Chenyang, Qian, Yao, Wang, Dongmei, Zhou, Long, Liu, Shujie, Wang, Xiaofei, Yousefi, Midia, Qian, Yanmin, Li, Jinyu, Zhao, Sheng, Zeng, Michael

There is a rising interest and trend in research towards directly translating speech from one language to another, known as end-to-end speech-to-speech translation. However, most end-to-end models struggle to outperform cascade models, i.e., a pipeli

External link: http://arxiv.org/abs/2405.17809

Report

CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations

Authors: Zhang, Leying, Qian, Yao, Zhou, Long, Liu, Shujie, Wang, Dongmei, Wang, Xiaofei, Yousefi, Midia, Qian, Yanmin, Li, Jinyu, He, Lei, Zhao, Sheng, Zeng, Michael

Recent advancements in zero-shot text-to-speech (TTS) modeling have led to significant strides in generating high-fidelity and diverse speech. However, dialogue generation, along with achieving human-like naturalness in speech, continues to be a chal

External link: http://arxiv.org/abs/2404.06690

Report

Profile-Error-Tolerant Target-Speaker Voice Activity Detection

Authors: Wang, Dongmei, Xiao, Xiong, Kanda, Naoyuki, Yousefi, Midia, Yoshioka, Takuya, Wu, Jian

Target-Speaker Voice Activity Detection (TS-VAD) utilizes a set of speaker profiles alongside an input audio signal to perform speaker diarization. While its superiority over conventional methods has been demonstrated, the method can suffer from erro

External link: http://arxiv.org/abs/2309.12521

Report

Adapting Multi-Lingual ASR Models for Handling Multiple Talkers

Authors: Li, Chenda, Qian, Yao, Chen, Zhuo, Kanda, Naoyuki, Wang, Dongmei, Yoshioka, Takuya, Qian, Yanmin, Zeng, Michael

State-of-the-art large-scale universal speech models (USMs) show a decent automatic speech recognition (ASR) performance across multiple domains and languages. However, it remains a challenge for these models to recognize overlapped speech, which is

External link: http://arxiv.org/abs/2305.18747

Report

Target Sound Extraction with Variable Cross-modality Clues

Authors: Li, Chenda, Qian, Yao, Chen, Zhuo, Wang, Dongmei, Yoshioka, Takuya, Liu, Shujie, Qian, Yanmin, Zeng, Michael

Automatic target sound extraction (TSE) is a machine learning approach to mimic the human auditory perception capability of attending to a sound source of interest from a mixture of sources. It often uses a model conditioned on a fixed form of target

External link: http://arxiv.org/abs/2303.08372

Report

Target Speaker Voice Activity Detection with Transformers and Its Integration with End-to-End Neural Diarization

Authors: Wang, Dongmei, Xiao, Xiong, Kanda, Naoyuki, Yoshioka, Takuya, Wu, Jian

This paper describes a speaker diarization model based on target speaker voice activity detection (TS-VAD) using transformers. To overcome the original TS-VAD model's drawback of being unable to handle an arbitrary number of speakers, we investigate

External link: http://arxiv.org/abs/2208.13085

Academic article

Problem diagnosis and analysis of advanced treatment system for reclaimed water in a power plant

Authors: HAN Lin, CHENG Yongming, WANG Dongmei, LI Yajuan, ZHANG Jie

Published in: Gongye shui chuli, Vol 44, Iss 3, Pp 206-210 (2024)

The reclaimed water of municipal sewage plant after advanced treatment in a power plant is used as makeup of circulating water system. Due to the problems in the advanced treatment system of reclaimed water, such as the output of the mechanical acce

External link: https://doaj.org/article/611b4e5b0d9e439cacc5c9020a919270

Academic article

Factors influencing U.S. canine heartworm (Dirofilaria immitis) prevalence

Authors: Wang, Dongmei, Bowman, Dwight, Brown, Heidi, Harrington, Laura, Kaufman, Phillip, McKay, Tanja, Nelson, Charles, Sharp, Julia, Lund, Robert

BACKGROUND: This paper examines the individual factors that influence prevalence rates of canine heartworm in the contiguous United States. A data set provided by the Companion Animal Parasite Council, which contains county-by-county results of over n

External link: http://hdl.handle.net/10150/610230
http://arizona.openrepository.com/arizona/handle/10150/610230

Report

Leveraging Real Conversational Data for Multi-Channel Continuous Speech Separation

Authors: Wang, Xiaofei, Wang, Dongmei, Kanda, Naoyuki, Eskimez, Sefik Emre, Yoshioka, Takuya

Existing multi-channel continuous speech separation (CSS) models are heavily dependent on supervised data - either simulated data which causes data mismatch between the training and real-data testing, or the real transcribed overlapping data, which i

External link: http://arxiv.org/abs/2204.03232
