Search results - "Mun, Sung Hwan"

Report

EEND-DEMUX: End-to-End Neural Speaker Diarization via Demultiplexed Speaker Embeddings

Authors: Mun, Sung Hwan; Han, Min Hyun; Moon, Canyeong; Kim, Nam Soo

In recent years, several studies have sought to further improve end-to-end neural speaker diarization (EEND) systems. This letter proposes the EEND-DEMUX model, a novel framework utilizing demultiplexed speaker embeddings. In this work, we focus on d

External link: http://arxiv.org/abs/2312.06065

Report

Towards single integrated spoofing-aware speaker verification embeddings

Authors: Mun, Sung Hwan; Shim, Hye-jin; Tak, Hemlata; Wang, Xin; Liu, Xuechen; Sahidullah, Md; Jeong, Myeonghun; Han, Min Hyun; Todisco, Massimiliano; Lee, Kong Aik; Yamagishi, Junichi; Evans, Nicholas; Kinnunen, Tomi; Kim, Nam Soo; Jung, Jee-weon

This study aims to develop single integrated spoofing-aware speaker verification (SASV) embeddings that satisfy two aspects. First, rejecting non-target speakers' input as well as target speakers' spoofed inputs should be addressed. Second, competi

External link: http://arxiv.org/abs/2305.19051

Report

Adversarial Speaker-Consistency Learning Using Untranscribed Speech Data for Zero-Shot Multi-Speaker Text-to-Speech

Authors: Choi, Byoung Jin; Jeong, Myeonghun; Kim, Minchan; Mun, Sung Hwan; Kim, Nam Soo

Several recently proposed text-to-speech (TTS) models have succeeded in generating speech samples of human-level quality in single-speaker and multi-speaker TTS scenarios with a set of pre-defined speakers. However, synthesizing a new speaker's

External link: http://arxiv.org/abs/2210.05979

Report

Fully Unsupervised Training of Few-shot Keyword Spotting

Authors: Lee, Dongjune; Kim, Minchan; Mun, Sung Hwan; Han, Min Hyun; Kim, Nam Soo

For training a few-shot keyword spotting (FS-KWS) model, a large labeled dataset containing a massive number of target keywords has been known to be essential to generalize to arbitrary target keywords with only a few enrollment samples. To alleviate the expensive da

External link: http://arxiv.org/abs/2210.02732

Report

Disentangled Speaker Representation Learning via Mutual Information Minimization

Authors: Mun, Sung Hwan; Han, Min Hyun; Kim, Minchan; Lee, Dongjune; Kim, Nam Soo

The domain mismatch problem caused by speaker-unrelated features has been a major topic in speaker recognition. In this paper, we propose an explicit disentanglement framework to unravel speaker-relevant features from speaker-unrelated features via mutual

External link: http://arxiv.org/abs/2208.08012

Report

Frequency and Multi-Scale Selective Kernel Attention for Speaker Verification

Authors: Mun, Sung Hwan; Jung, Jee-weon; Han, Min Hyun; Kim, Nam Soo

The majority of recent state-of-the-art speaker verification architectures adopt multi-scale processing and frequency-channel attention mechanisms. Convolutional layers of these models typically have a fixed kernel size, e.g., 3 or 5. In this study,

External link: http://arxiv.org/abs/2204.01005

Report

Bootstrap Equilibrium and Probabilistic Speaker Representation Learning for Self-supervised Speaker Verification

Authors: Mun, Sung Hwan; Han, Min Hyun; Lee, Dongjune; Kim, Jihwan; Kim, Nam Soo

In this paper, we propose self-supervised speaker representation learning strategies, which comprise bootstrap equilibrium speaker representation learning in the front-end and uncertainty-aware probabilistic speaker embedding training in the

External link: http://arxiv.org/abs/2112.08929

Report

Unsupervised Representation Learning for Speaker Recognition via Contrastive Equilibrium Learning

Authors: Mun, Sung Hwan; Kang, Woo Hyun; Han, Min Hyun; Kim, Nam Soo

In this paper, we propose a simple but powerful unsupervised learning method for speaker recognition, namely Contrastive Equilibrium Learning (CEL), which increases the uncertainty on nuisance factors latent in the embeddings by employing the uniform

External link: http://arxiv.org/abs/2010.11433

Report

Robust Text-Dependent Speaker Verification via Character-Level Information Preservation for the SdSV Challenge 2020

Authors: Mun, Sung Hwan; Kang, Woo Hyun; Han, Min Hyun; Kim, Nam Soo

This paper describes our submission to Task 1 of the Short-duration Speaker Verification (SdSV) challenge 2020. Task 1 is a text-dependent speaker verification task, where both the speaker and phrase are required to be verified. The submitted systems

External link: http://arxiv.org/abs/2010.11408

Report

Disentangled speaker and nuisance attribute embedding for robust speaker verification

Authors: Kang, Woo Hyun; Mun, Sung Hwan; Han, Min Hyun; Kim, Nam Soo

In recent years, various deep learning-based embedding methods have been proposed and have shown impressive performance in speaker verification. However, as in most of the classical embedding techniques, the deep learning-based methods are know

External link: http://arxiv.org/abs/2008.03024
