Showing 1 - 10 of 97 results for search: '"Cho, Hoon Young"'
Authors:
Jung, Youngmoon; Lee, Jinyoung; Lee, Seungjin; Jung, Myunghun; Lee, Yong-Hyeok; Cho, Hoon-Young
Recent advances in flexible keyword spotting (KWS) with text enrollment allow users to personalize keywords without uttering them during enrollment. However, there is still room for improvement in target keyword performance. In this work, we propose
External link:
http://arxiv.org/abs/2412.18142
Authors:
Babaev, Nicholas; Tamogashev, Kirill; Saginbaev, Azat; Shchekotov, Ivan; Bae, Hanbin; Sung, Hosang; Lee, WonJun; Cho, Hoon-Young; Andreev, Pavel
In this paper, we address the challenge of speech enhancement in real-world recordings, which often contain various forms of distortion, such as background noise, reverberation, and microphone artifacts. We revisit the use of Generative Adversarial N
External link:
http://arxiv.org/abs/2410.05920
Authors:
Bae, Hanbin; Andreev, Pavel; Saginbaev, Azat; Babaev, Nicholas; Lee, Won-Jun; Sung, Hosang; Cho, Hoon-Young
This paper introduces a speech enhancement solution tailored for true wireless stereo (TWS) earbuds on-device usage. The solution was specifically designed to support conversations in noisy environments, with active noise cancellation (ANC) activated
External link:
http://arxiv.org/abs/2409.18705
We propose a novel two-stage text-to-speech (TTS) framework with two types of discrete tokens, i.e., semantic and acoustic tokens, for high-fidelity speech synthesis. It features two core components: the Interpreting module, which processes text and
External link:
http://arxiv.org/abs/2406.17310
Authors:
Jung, Youngmoon; Lee, Seungjin; Yang, Joon-Young; Roh, Jaeyoung; Han, Chang Woo; Cho, Hoon-Young
In recent years, there has been an increasing focus on user convenience, leading to increased interest in text-based keyword enrollment systems for keyword spotting (KWS). Since the system utilizes text input during the enrollment phase and audio inp
External link:
http://arxiv.org/abs/2406.05314
Authors:
Bae, Jae-Sung; Lee, Joun Yeop; Lee, Ji-Hyun; Mun, Seongkyu; Kang, Taehwa; Cho, Hoon-Young; Kim, Chanwoo
Previous works in zero-shot text-to-speech (ZS-TTS) have attempted to enhance its systems by enlarging the training data through crowd-sourcing or augmenting existing speech data. However, the use of low-quality data has led to a decline in the overa
External link:
http://arxiv.org/abs/2310.03538
Authors:
Lee, Jihwan; Bae, Jae-Sung; Mun, Seongkyu; Choi, Heejin; Lee, Joun Yeop; Cho, Hoon-Young; Kim, Chanwoo
With the recent developments in cross-lingual Text-to-Speech (TTS) systems, L2 (second-language, or foreign) accent problems arise. Moreover, running a subjective evaluation for such cross-lingual TTS systems is troublesome. The vowel space analysis,
External link:
http://arxiv.org/abs/2211.03078
Recently, end-to-end Korean singing voice systems have been designed to generate realistic singing voices. However, these systems still suffer from a lack of robustness in terms of pronunciation accuracy. In this paper, we propose N-Singer, a non-aut
External link:
http://arxiv.org/abs/2106.15205
Recent advances in neural multi-speaker text-to-speech (TTS) models have enabled the generation of reasonably good speech quality with a single model and made it possible to synthesize the speech of a speaker with limited training data. Fine-tuning t
External link:
http://arxiv.org/abs/2106.15153
Methods for modeling and controlling prosody with acoustic features have been proposed for neural text-to-speech (TTS) models. Prosodic speech can be generated by conditioning acoustic features. However, synthesized speech with a large pitch-shift sc
External link:
http://arxiv.org/abs/2106.15123