Search results - "Cho, Hoon Young"

Report

Text-Aware Adapter for Few-Shot Keyword Spotting

Authors: Jung, Youngmoon; Lee, Jinyoung; Lee, Seungjin; Jung, Myunghun; Lee, Yong-Hyeok; Cho, Hoon-Young

Recent advances in flexible keyword spotting (KWS) with text enrollment allow users to personalize keywords without uttering them during enrollment. However, there is still room for improvement in target keyword performance. In this work, we propose

External link: http://arxiv.org/abs/2412.18142

Report

FINALLY: fast and universal speech enhancement with studio-like quality

Authors: Babaev, Nicholas; Tamogashev, Kirill; Saginbaev, Azat; Shchekotov, Ivan; Bae, Hanbin; Sung, Hosang; Lee, WonJun; Cho, Hoon-Young; Andreev, Pavel

In this paper, we address the challenge of speech enhancement in real-world recordings, which often contain various forms of distortion, such as background noise, reverberation, and microphone artifacts. We revisit the use of Generative Adversarial N

External link: http://arxiv.org/abs/2410.05920

Report

Speech Boosting: Low-Latency Live Speech Enhancement for TWS Earbuds

Authors: Bae, Hanbin; Andreev, Pavel; Saginbaev, Azat; Babaev, Nicholas; Lee, Won-Jun; Sung, Hosang; Cho, Hoon-Young

This paper introduces a speech enhancement solution tailored for true wireless stereo (TWS) earbuds on-device usage. The solution was specifically designed to support conversations in noisy environments, with active noise cancellation (ANC) activated

External link: http://arxiv.org/abs/2409.18705

Report

High Fidelity Text-to-Speech Via Discrete Tokens Using Token Transducer and Group Masked Language Model

Authors: Lee, Joun Yeop; Jeong, Myeonghun; Kim, Minchan; Lee, Ji-Hyun; Cho, Hoon-Young; Kim, Nam Soo

We propose a novel two-stage text-to-speech (TTS) framework with two types of discrete tokens, i.e., semantic and acoustic tokens, for high-fidelity speech synthesis. It features two core components: the Interpreting module, which processes text and

External link: http://arxiv.org/abs/2406.17310

Report

Relational Proxy Loss for Audio-Text based Keyword Spotting

Authors: Jung, Youngmoon; Lee, Seungjin; Yang, Joon-Young; Roh, Jaeyoung; Han, Chang Woo; Cho, Hoon-Young

In recent years, there has been an increasing focus on user convenience, leading to increased interest in text-based keyword enrollment systems for keyword spotting (KWS). Since the system utilizes text input during the enrollment phase and audio inp

External link: http://arxiv.org/abs/2406.05314

Report

Latent Filling: Latent Space Data Augmentation for Zero-shot Speech Synthesis

Authors: Bae, Jae-Sung; Lee, Joun Yeop; Lee, Ji-Hyun; Mun, Seongkyu; Kang, Taehwa; Cho, Hoon-Young; Kim, Chanwoo

Previous works in zero-shot text-to-speech (ZS-TTS) have attempted to enhance its systems by enlarging the training data through crowd-sourcing or augmenting existing speech data. However, the use of low-quality data has led to a decline in the overa

External link: http://arxiv.org/abs/2310.03538

Report

An Empirical Study on L2 Accents of Cross-lingual Text-to-Speech Systems via Vowel Space

Authors: Lee, Jihwan; Bae, Jae-Sung; Mun, Seongkyu; Choi, Heejin; Lee, Joun Yeop; Cho, Hoon-Young; Kim, Chanwoo

With the recent developments in cross-lingual Text-to-Speech (TTS) systems, L2 (second-language, or foreign) accent problems arise. Moreover, running a subjective evaluation for such cross-lingual TTS systems is troublesome. The vowel space analysis,

External link: http://arxiv.org/abs/2211.03078

Report

N-Singer: A Non-Autoregressive Korean Singing Voice Synthesis System for Pronunciation Enhancement

Authors: Lee, Gyeong-Hoon; Kim, Tae-Woo; Bae, Hanbin; Lee, Min-Ji; Kim, Young-Ik; Cho, Hoon-Young

Recently, end-to-end Korean singing voice systems have been designed to generate realistic singing voices. However, these systems still suffer from a lack of robustness in terms of pronunciation accuracy. In this paper, we propose N-Singer, a non-aut

External link: http://arxiv.org/abs/2106.15205

Report

GANSpeech: Adversarial Training for High-Fidelity Multi-Speaker Speech Synthesis

Authors: Yang, Jinhyeok; Bae, Jae-Sung; Bak, Taejun; Kim, Youngik; Cho, Hoon-Young

Recent advances in neural multi-speaker text-to-speech (TTS) models have enabled the generation of reasonably good speech quality with a single model and made it possible to synthesize the speech of a speaker with limited training data. Fine-tuning t

External link: http://arxiv.org/abs/2106.15153

Report

FastPitchFormant: Source-filter based Decomposed Modeling for Speech Synthesis

Authors: Bak, Taejun; Bae, Jae-Sung; Bae, Hanbin; Kim, Young-Ik; Cho, Hoon-Young

Methods for modeling and controlling prosody with acoustic features have been proposed for neural text-to-speech (TTS) models. Prosodic speech can be generated by conditioning acoustic features. However, synthesized speech with a large pitch-shift sc

External link: http://arxiv.org/abs/2106.15123
