Search results - "Lee, JunHyeok"

Report

Improving Factuality of 3D Brain MRI Report Generation with Paired Image-domain Retrieval and Text-domain Augmentation

Authors: Lee, Junhyeok; Oh, Yujin; Lee, Dahyoun; Joh, Hyon Keun; Sohn, Chul-Ho; Baik, Sung Hyun; Jung, Cheol Kyu; Park, Jung Hyun; Choi, Kyu Sung; Kim, Byung-Hoon; Ye, Jong Chul

Acute ischemic stroke (AIS) requires time-critical management, with hours of delayed intervention leading to an irreversible disability of the patient. Since diffusion weighted imaging (DWI) using the magnetic resonance image (MRI) plays a crucial ro

External link: http://arxiv.org/abs/2411.15490

Report

Super Monotonic Alignment Search

Authors: Lee, Junhyeok; Kim, Hyeongju

Monotonic alignment search (MAS), introduced by Glow-TTS, is one of the most popular algorithms in TTS to estimate unknown alignments between text and speech. Since this algorithm needs to search for the most probable alignment with dynamic programmin

External link: http://arxiv.org/abs/2409.07704

Report

DualSpeech: Enhancing Speaker-Fidelity and Text-Intelligibility Through Dual Classifier-Free Guidance

Authors: Yang, Jinhyeok; Lee, Junhyeok; Choi, Hyeong-Seok; Ji, Seunghun; Kim, Hyeongju; Lee, Juheon

Text-to-Speech (TTS) models have advanced significantly, aiming to accurately replicate human speech's diversity, including unique speaker identities and linguistic nuances. Despite these advancements, achieving an optimal balance between speaker-fid

External link: http://arxiv.org/abs/2408.14423

Report

JenGAN: Stacked Shifted Filters in GAN-Based Speech Synthesis

Authors: Cho, Hyunjae; Lee, Junhyeok; Jung, Wonbin

Non-autoregressive GAN-based neural vocoders are widely used due to their fast inference speed and high perceptual quality. However, they often suffer from audible artifacts such as tonal artifacts in their generated results. Therefore, we propose Je

External link: http://arxiv.org/abs/2406.06111

Report

Diversifying and Expanding Frequency-Adaptive Convolution Kernels for Sound Event Detection

Authors: Nam, Hyeonuk; Kim, Seong-Hu; Min, Deokki; Lee, Junhyeok; Park, Yong-Hwa

Frequency dynamic convolution (FDY conv) has shown state-of-the-art performance in sound event detection (SED) using frequency-adaptive kernels obtained by a frequency-varying combination of basis kernels. However, FDY conv lacks an explicit mean t

External link: http://arxiv.org/abs/2406.05341

Report

REVECA: Adaptive Planning and Trajectory-based Validation in Cooperative Language Agents using Information Relevance and Relative Proximity

Authors: Seo, SeungWon; Noh, SeongRae; Lee, Junhyeok; Lim, SooBin; Lee, Won Hee; Kang, HyeongYeop

We address the challenge of multi-agent cooperation, where agents achieve a common goal by cooperating with decentralized agents under complex partial observations. Existing cooperative agent systems often struggle with efficiently processing continu

External link: http://arxiv.org/abs/2405.16751

Report

LatentSwap: An Efficient Latent Code Mapping Framework for Face Swapping

Authors: Choi, Changho; Kim, Minho; Lee, Junhyeok; Song, Hyoung-Kyu; Kim, Younggeun; Kim, Seungryong

We propose LatentSwap, a simple face swapping framework generating a face swap latent code of a given generator. Utilizing randomly sampled latent codes, our framework is light and does not require datasets besides employing the pre-trained models, w

External link: http://arxiv.org/abs/2402.18351

Report

VIFS: An End-to-End Variational Inference for Foley Sound Synthesis

Authors: Lee, Junhyeok; Nam, Hyeonuk; Park, Yong-Hwa

The goal of DCASE 2023 Challenge Task 7 is to generate various sound clips for Foley sound synthesis (FSS) using a "category-to-sound" approach. "Category" is expressed by a single index while the corresponding "sound" covers diverse and different sound examp

External link: http://arxiv.org/abs/2306.05004

Report

PITS: Variational Pitch Inference without Fundamental Frequency for End-to-End Pitch-controllable TTS

Authors: Lee, Junhyeok; Jung, Wonbin; Cho, Hyunjae; Kim, Jaeyeon; Kim, Jaehwan

Previous pitch-controllable text-to-speech (TTS) models rely on directly modeling fundamental frequency, leading to low variance in synthesized speech. To address this issue, we propose PITS, an end-to-end pitch-controllable TTS model that utilizes v

External link: http://arxiv.org/abs/2302.12391

Report

Direct Preference-based Policy Optimization without Reward Modeling

Authors: An, Gaon; Lee, Junhyeok; Zuo, Xingdong; Kosaka, Norio; Kim, Kyung-Min; Song, Hyun Oh

Preference-based reinforcement learning (PbRL) is an approach that enables RL agents to learn from preference, which is particularly useful when formulating a reward function is challenging. Existing PbRL methods generally involve a two-step procedur

External link: http://arxiv.org/abs/2301.12842
