Search results - "Ke, Dengfeng"

Report

CONCSS: Contrastive-based Context Comprehension for Dialogue-appropriate Prosody in Conversational Speech Synthesis

Authors: Deng, Yayue; Xue, Jinlong; Jia, Yukang; Li, Qifei; Han, Yichen; Wang, Fengping; Gao, Yingming; Ke, Dengfeng; Li, Ya

Conversational speech synthesis (CSS) incorporates historical dialogue as supplementary information with the aim of generating speech that has dialogue-appropriate prosody. While previous methods have already delved into enhancing context comprehensi

External link: http://arxiv.org/abs/2312.10358

Report

Rhythm-controllable Attention with High Robustness for Long Sentence Speech Synthesis

Authors: Ke, Dengfeng; Deng, Yayue; Jia, Yukang; Xue, Jinlong; Luo, Qi; Li, Ya; Sun, Jianqing; Liang, Jiaen; Lin, Binghuai

Regressive text-to-speech (TTS) systems use an attention mechanism to generate the alignment between text and the acoustic feature sequence. Alignment determines synthesis robustness (e.g., the occurrence of skipping, repeating, and collapse) and rhythm via d

External link: http://arxiv.org/abs/2306.02593

Report

Text-Aware End-to-end Mispronunciation Detection and Diagnosis

Authors: Peng, Linkai; Gao, Yingming; Lin, Binghuai; Ke, Dengfeng; Xie, Yanlu; Zhang, Jinsong

Mispronunciation detection and diagnosis (MDD) technology is a key component of computer-assisted pronunciation training (CAPT) systems. In the field of assessing the pronunciation quality of constrained speech, the given transcriptions can play the r

External link: http://arxiv.org/abs/2206.07289

Report

An Empirical Study on End-to-End Singing Voice Synthesis with Encoder-Decoder Architectures

Authors: Ke, Dengfeng; Lu, Yuxing; Liu, Xudong; Xu, Yanyan; Sun, Jing; Cai, Cheng-Hao

With the rapid development of neural network architectures and speech processing models, singing voice synthesis with neural networks is becoming the cutting-edge technique of digital music production. In this work, in order to explore how to improve

External link: http://arxiv.org/abs/2108.03008

Report

Speech Enhancement using Separable Polling Attention and Global Layer Normalization followed with PReLU

Authors: Ke, Dengfeng; Zhang, Jinsong; Xie, Yanlu; Xu, Yanyan; Lin, Binghuai

Single-channel speech enhancement is a challenging task in the speech community. Recently, various neural-network-based methods have been applied to speech enhancement. Among these models, PHASEN and T-GSA achieve state-of-the-art performance on the pu

External link: http://arxiv.org/abs/2105.02509

Report

A Full Text-Dependent End to End Mispronunciation Detection and Diagnosis with Easy Data Augmentation Techniques

Authors: Fu, Kaiqi; Lin, Jones; Ke, Dengfeng; Xie, Yanlu; Zhang, Jinsong; Lin, Binghuai

Recently, end-to-end mispronunciation detection and diagnosis (MD&D) systems have become a popular alternative to greatly simplify the model-building process of conventional hybrid DNN-HMM systems by representing complicated modules with a single deep

External link: http://arxiv.org/abs/2104.08428

Academic article

PLDE: A lightweight pooling layer for spoken language recognition

Authors: Li, Zimu; Xu, Yanyan; Ke, Dengfeng; Su, Kaile

Published in: Speech Communication, March 2024, 158

Report

Dynamically Mitigating Data Discrepancy with Balanced Focal Loss for Replay Attack Detection

Authors: Dou, Yongqiang; Yang, Haocheng; Yang, Maolin; Xu, Yanyan; Ke, Dengfeng

It has become urgent to design effective anti-spoofing algorithms for vulnerable automatic speaker verification systems due to the advancement of high-quality playback devices. Current studies mainly treat anti-spoofing as a binary classification proble

External link: http://arxiv.org/abs/2006.14563

Report

Formant Tracking Using Dilated Convolutional Networks Through Dense Connection with Gating Mechanism

Authors: Dai, Wang; Zhang, Jinsong; Gao, Yingming; Wei, Wei; Ke, Dengfeng; Lin, Binghuai; Xie, Yanlu

Formant tracking is one of the most fundamental problems in speech processing. Traditionally, formants are estimated using signal processing methods. Recent studies showed that generic convolutional architectures can outperform recurrent networks on

External link: http://arxiv.org/abs/2005.10803

Report

Complementary Fusion of Multi-Features and Multi-Modalities in Sentiment Analysis

Authors: Chen, Feiyang; Luo, Ziqian; Xu, Yanyan; Ke, Dengfeng

Sentiment analysis, mostly based on text, has been rapidly developing in the last decade and has attracted widespread attention in both academia and industry. However, the information in the real world usually comes from multiple modalities, such as

External link: http://arxiv.org/abs/1904.08138
