Showing 1 - 10 of 62 for search: '"Ke, Dengfeng"'
Author:
Deng, Yayue, Xue, Jinlong, Jia, Yukang, Li, Qifei, Han, Yichen, Wang, Fengping, Gao, Yingming, Ke, Dengfeng, Li, Ya
Conversational speech synthesis (CSS) incorporates historical dialogue as supplementary information with the aim of generating speech that has dialogue-appropriate prosody. While previous methods have already delved into enhancing context comprehensi
External link:
http://arxiv.org/abs/2312.10358
Author:
Ke, Dengfeng, Deng, Yayue, Jia, Yukang, Xue, Jinlong, Luo, Qi, Li, Ya, Sun, Jianqing, Liang, Jiaen, Lin, Binghuai
Regressive Text-to-Speech (TTS) system utilizes attention mechanism to generate alignment between text and acoustic feature sequence. Alignment determines synthesis robustness (e.g., the occurrence of skipping, repeating, and collapse) and rhythm via d
External link:
http://arxiv.org/abs/2306.02593
Mispronunciation detection and diagnosis (MDD) technology is a key component of computer-assisted pronunciation training (CAPT) systems. In the field of assessing the pronunciation quality of constrained speech, the given transcriptions can play the r
External link:
http://arxiv.org/abs/2206.07289
With the rapid development of neural network architectures and speech processing models, singing voice synthesis with neural networks is becoming the cutting-edge technique of digital music production. In this work, in order to explore how to improve
External link:
http://arxiv.org/abs/2108.03008
Single-channel speech enhancement is a challenging task in the speech community. Recently, various neural-network-based methods have been applied to speech enhancement. Among these models, PHASEN and T-GSA achieve state-of-the-art performances on the pu
External link:
http://arxiv.org/abs/2105.02509
Recently, end-to-end mispronunciation detection and diagnosis (MD&D) systems have become a popular alternative to greatly simplify the model-building process of conventional hybrid DNN-HMM systems by representing complicated modules with a single deep
External link:
http://arxiv.org/abs/2104.08428
Published in:
In Speech Communication March 2024 158
It becomes urgent to design effective anti-spoofing algorithms for vulnerable automatic speaker verification systems due to the advancement of high-quality playback devices. Current studies mainly treat anti-spoofing as a binary classification proble
External link:
http://arxiv.org/abs/2006.14563
Formant Tracking Using Dilated Convolutional Networks Through Dense Connection with Gating Mechanism
Formant tracking is one of the most fundamental problems in speech processing. Traditionally, formants are estimated using signal processing methods. Recent studies showed that generic convolutional architectures can outperform recurrent networks on
External link:
http://arxiv.org/abs/2005.10803
Sentiment analysis, mostly based on text, has been rapidly developing in the last decade and has attracted widespread attention in both academia and industry. However, the information in the real world usually comes from multiple modalities, such as
External link:
http://arxiv.org/abs/1904.08138