Showing 1 - 10 of 11 results for search: '"Min-Jae Hwang"'
Published in:
2022 International Conference on Electronics, Information, and Communication (ICEIC).
Published in:
2022 International Conference on Electronics, Information, and Communication (ICEIC).
Published in:
Interspeech 2021.
Published in:
Interspeech 2021.
Published in:
ICASSP
This paper proposes voicing-aware conditional discriminators for Parallel WaveGAN-based waveform synthesis systems. In this framework, we adopt a projection-based conditioning method that can significantly improve the discriminator's performance. …
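The projection-based conditioning named in this abstract can be illustrated with a toy sketch (a minimal numpy illustration of the projection idea, not the paper's discriminator; all dimensions, weights, and names here are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 64-dim discriminator features, 2 voicing classes.
feat_dim, num_classes = 64, 2
w_out = rng.standard_normal(feat_dim)                     # unconditional output head
class_emb = rng.standard_normal((num_classes, feat_dim))  # condition embeddings

def projection_score(phi, voicing_class):
    """Projection conditioning: score = w . phi + embed(c) . phi.
    The condition enters through an inner product with the features
    rather than by concatenating a label vector to the input."""
    return w_out @ phi + class_emb[voicing_class] @ phi

phi = rng.standard_normal(feat_dim)   # features of one waveform frame
s_voiced, s_unvoiced = projection_score(phi, 0), projection_score(phi, 1)
```

The same features receive different scores under the voiced and unvoiced conditions, which is what lets a conditional discriminator specialize per voicing state.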
Published in:
ICASSP
In this paper, we propose a text-to-speech (TTS)-driven data augmentation method for improving the quality of a non-autoregressive (AR) TTS system. Recently proposed non-AR models, such as FastSpeech 2, have successfully achieved fast speech synthesis …
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::867cb3922f4ca8e8c0e17fa067f5f51b
http://arxiv.org/abs/2010.13421
Published in:
ICASSP
In this paper, we propose an improved LPCNet vocoder using a linear prediction (LP)-structured mixture density network (MDN). The recently proposed LPCNet vocoder has successfully achieved high-quality and lightweight speech synthesis systems by combining …
External link:
https://explore.openaire.eu/search/publication?articleId=doi_dedup___::95fafdd73a792901939d26bd68afc12d
http://arxiv.org/abs/2001.11686
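The LPCNet-style combination of an LP filter with a neurally generated excitation can be sketched as follows (a minimal numpy illustration of the standard all-pole LP synthesis recursion; in the paper the excitation would be sampled from the proposed MDN, which is only indicated in comments here):

```python
import numpy as np

def lp_synthesize(excitation, a):
    """All-pole LP synthesis: s[t] = e[t] + sum_k a[k] * s[t-k].
    In an LPCNet-style vocoder, e[t] comes from a neural model
    (an MDN would output mixture parameters for each sample)."""
    order = len(a)
    s = np.zeros(len(excitation))
    hist = np.zeros(order)          # [s[t-1], s[t-2], ...]
    for t, e in enumerate(excitation):
        s[t] = e + a @ hist
        hist = np.concatenate(([s[t]], hist[:-1]))
    return s

# An impulse through a first-order filter a = [0.5] decays geometrically.
out = lp_synthesize(np.array([1.0, 0.0, 0.0]), np.array([0.5]))
```

Splitting the spectral-envelope (LP) part from the excitation is what keeps such vocoders lightweight: the network only has to model the residual signal.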
Published in:
IEEE Transactions on Multimedia. 20:45-54
This paper proposes a blind digital audio watermarking algorithm that utilizes the quantization index modulation (QIM) and the singular value decomposition (SVD) of stereo audio signals. Conventional SVD-based blind audio watermarking algorithms …
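The QIM step referenced in this abstract can be shown in isolation (a toy numpy sketch of dithered-quantizer embedding on a single scalar, such as one singular value; the step size `DELTA` and the function names are assumptions, not the paper's parameters):

```python
import numpy as np

DELTA = 0.5  # quantization step (hypothetical value)

def qim_embed(value, bit, delta=DELTA):
    """QIM: quantize with one of two interleaved quantizers,
    offset by delta/2, selected by the watermark bit."""
    offset = (delta / 2) * bit
    return delta * np.round((value - offset) / delta) + offset

def qim_detect(value, delta=DELTA):
    """Blind detection: pick the quantizer whose lattice is closer."""
    d0 = abs(value - qim_embed(value, 0, delta))
    d1 = abs(value - qim_embed(value, 1, delta))
    return 0 if d0 <= d1 else 1
```

Detection needs only the step size, not the original signal, which is what makes the scheme blind.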
Author:
Min-Jae Hwang, Hong-Goo Kang
Published in:
INTERSPEECH
In this paper, we propose a deep learning (DL)-based parameter enhancement method for a mixed excitation linear prediction (MELP) speech codec in a noisy communication environment. Unlike conventional speech enhancement modules that are designed to obtain …
Published in:
INTERSPEECH