Showing 1 - 5 of 5 for search: '"Nam, KiHyun"'
This work presents a framework based on feature disentanglement to learn speaker embeddings that are robust to environmental variations. Our framework utilises an auto-encoder as a disentangler, dividing the input speaker embedding into components re
External link:
http://arxiv.org/abs/2406.14559
Authors:
Heo, Hee-Soo; Nam, KiHyun; Lee, Bong-Jin; Kwon, Youngki; Lee, Minjae; Kim, You Jin; Chung, Joon Son
In the field of speaker verification, session or channel variability poses a significant challenge. While many contemporary methods aim to disentangle session information from speaker embeddings, we introduce a novel approach using an additional embe
External link:
http://arxiv.org/abs/2309.14741
Authors:
Jung, Chaeyoung; Lee, Suyeon; Nam, Kihyun; Rho, Kyeongha; Kim, You Jin; Jang, Youngjoon; Chung, Joon Son
The goal of this work is Active Speaker Detection (ASD), a task to determine whether a person is speaking or not in a series of video frames. Previous works have dealt with the task by exploring network architectures while learning effective represen
External link:
http://arxiv.org/abs/2309.12306
The goal of this paper is to learn robust speaker representation for bilingual speaking scenario. The majority of the world's population speak at least two languages; however, most speaker recognition systems fail to recognise the same speaker when s
External link:
http://arxiv.org/abs/2211.00437
Authors:
Ha, Jung-Woo; Nam, Kihyun; Kang, Jingu; Lee, Sang-Woo; Yang, Sohee; Jung, Hyunhoon; Kim, Eunmi; Kim, Hyeji; Kim, Soojin; Kim, Hyun Ah; Doh, Kyoungtae; Lee, Chan Kyu; Sung, Nako; Kim, Sunghun
Automatic speech recognition (ASR) via call is essential for various applications, including AI for contact center (AICC) services. Despite the advancement of ASR, however, most publicly available call-based speech corpora such as Switchboard are old
External link:
http://arxiv.org/abs/2004.09367