Showing 1 - 9 of 9 for search: '"Kyungguen Byun"'
Published in:
IEEE Access, Vol 10, Pp 60362-60372 (2022)
In this study, we proposed a length-normalized representation learning method for speech and text to address the inherent problem of sequence-to-sequence models when the input and output sequences exhibit different lengths. To this end, the represent
External link:
https://doaj.org/article/8c9522ecca2b413fb43fa1f72e291214
Published in:
ACSSC
In this paper, we propose an effective way of providing conditional features for a flow-based neural vocoder. Most conventional approaches utilize mel-spectrograms for conditioning neural vocoders, but this significantly increases the size of neural
Published in:
MMSP
This paper proposes speaker-adaptive neural vocoders for parametric text-to-speech (TTS) systems. Recently proposed WaveNet-based neural vocoding systems successfully generate a time sequence of speech signal with an autoregressive framework. However
Published in:
ICASSP
This paper proposes an effective emotion control method for an end-to-end text-to-speech (TTS) system. To flexibly control the distinct characteristic of a target emotion category, it is essential to determine embedding vectors representing the TTS i
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::772663f4b13d07428c1e3775feee7a18
http://arxiv.org/abs/1911.01635
Published in:
2019 34th International Technical Conference on Circuits/Systems, Computers and Communications (ITC-CSCC).
In this paper, we propose a neural vocoder-based text-to-speech (TTS) system that effectively utilizes a source-filter modeling framework. Although neural vocoder algorithms such as SampleRNN and WaveNet are well-known to generate high-quality speech
Published in:
ICASSP
This paper proposes a novel noise compensation algorithm for a glottal excitation model in a deep learning (DL)-based speech synthesis system. To generate high-quality speech synthesis outputs, the balance between harmonic and noise components of the
Published in:
EUSIPCO
This paper proposes a WaveNet-based neural excitation model (ExcitNet) for statistical parametric speech synthesis systems. Conventional WaveNet-based neural vocoding systems significantly improve the perceptual quality of synthesized speech by stati
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::497be883594178a2a72dffe693c92617
Published in:
DSP
This paper proposes a deep neural network (DNN) based non-intrusive speech quality estimation method in real-time voice communication systems. Since the proposed method only utilizes real-time control protocol (RTCP) information in the receiver side
Published in:
EMBC
This paper proposes a constrained two-layer compression technique for electrocardiogram (ECG) waves, of which encoded parameters can be directly used for the diagnosis of arrhythmia. In the first layer, a single ECG beat is represented by one of the