Výsledky vyhledávání

Report

Multi-objective Progressive Clustering for Semi-supervised Domain Adaptation in Speaker Verification

Autor: Li, Ze, Lin, Yuke, Jiang, Ning, Qin, Xiaoyi, Zhao, Guoqing, Wu, Haiying, Li, Ming

Utilizing the pseudo-labeling algorithm with large-scale unlabeled data becomes crucial for semi-supervised domain adaptation in speaker verification tasks. In this paper, we propose a novel pseudo-labeling method named Multi-objective Progressive Cl

Externí odkaz: http://arxiv.org/abs/2310.04760

Zobrazit plný text záznamu

Report

Haha-Pod: An Attempt for Laughter-based Non-Verbal Speaker Verification

Autor: Lin, Yuke, Qin, Xiaoyi, Jiang, Ning, Zhao, Guoqing, Li, Ming

It is widely acknowledged that discriminative representation for speaker verification can be extracted from verbal speech. However, how much speaker information that non-verbal vocalization carries is still a puzzle. This paper explores speaker verif

Externí odkaz: http://arxiv.org/abs/2309.14109

Zobrazit plný text záznamu

Report

The DKU-MSXF Speaker Verification System for the VoxCeleb Speaker Recognition Challenge 2023

Autor: Li, Ze, Lin, Yuke, Qin, Xiaoyi, Jiang, Ning, Zhao, Guoqing, Li, Ming

This paper is the system description of the DKU-MSXF System for the track1, track2 and track3 of the VoxCeleb Speaker Recognition Challenge 2023 (VoxSRC-23). For Track 1, we utilize a network structure based on ResNet for training. By constructing a

Externí odkaz: http://arxiv.org/abs/2308.08766

Zobrazit plný text záznamu

Report

The DKU-MSXF Diarization System for the VoxCeleb Speaker Recognition Challenge 2023

Autor: Cheng, Ming, Wang, Weiqing, Qin, Xiaoyi, Lin, Yuke, Jiang, Ning, Zhao, Guoqing, Li, Ming

This paper describes the DKU-MSXF submission to track 4 of the VoxCeleb Speaker Recognition Challenge 2023 (VoxSRC-23). Our system pipeline contains voice activity detection, clustering-based diarization, overlapped speech detection, and target-speak

Externí odkaz: http://arxiv.org/abs/2308.07595

Zobrazit plný text záznamu

Report

VoxBlink: A Large Scale Speaker Verification Dataset on Camera

Autor: Lin, Yuke, Qin, Xiaoyi, Zhao, Guoqing, Cheng, Ming, Jiang, Ning, Wu, Haiyang, Li, Ming

In this paper, we introduce a large-scale and high-quality audio-visual speaker verification dataset, named VoxBlink. We propose an innovative and robust automatic audio-visual data mining pipeline to curate this dataset, which contains 1.45M utteran

Externí odkaz: http://arxiv.org/abs/2308.07056

Zobrazit plný text záznamu

Report

Source Tracing: Detecting Voice Spoofing

Autor: Zhu, Tinglong, Wang, Xingming, Qin, Xiaoyi, Li, Ming

Recent anti-spoofing systems focus on spoofing detection, where the task is only to determine whether the test audio is fake. However, there are few studies putting attention to identifying the methods of generating fake speech. Common spoofing attac

Externí odkaz: http://arxiv.org/abs/2212.08601

Zobrazit plný text záznamu

Report

Target-Speaker Voice Activity Detection via Sequence-to-Sequence Prediction

Autor: Cheng, Ming, Wang, Weiqing, Zhang, Yucong, Qin, Xiaoyi, Li, Ming

Target-speaker voice activity detection is currently a promising approach for speaker diarization in complex acoustic environments. This paper presents a novel Sequence-to-Sequence Target-Speaker Voice Activity Detection (Seq2Seq-TSVAD) method that c

Externí odkaz: http://arxiv.org/abs/2210.16127

Zobrazit plný text záznamu

Report

Laugh Betrays You? Learning Robust Speaker Representation From Speech Containing Non-Verbal Fragments

Autor: Lin, Yuke, Qin, Xiaoyi, Cui, Huahua, Zhu, Zhenyi, Li, Ming

The success of automatic speaker verification shows that discriminative speaker representations can be extracted from neutral speech. However, as a kind of non-verbal voice, laughter should also carry speaker information intuitively. Thus, this paper

Externí odkaz: http://arxiv.org/abs/2210.16028

Zobrazit plný text záznamu

Report

The DKU-Tencent System for the VoxCeleb Speaker Recognition Challenge 2022

Autor: Qin, Xiaoyi, Li, Na, Lin, Yuke, Ding, Yiwei, Weng, Chao, Su, Dan, Li, Ming

This paper is the system description of the DKU-Tencent System for the VoxCeleb Speaker Recognition Challenge 2022 (VoxSRC22). In this challenge, we focus on track1 and track3. For track1, multiple backbone networks are adopted to extract frame-level

Externí odkaz: http://arxiv.org/abs/2210.05092

Zobrazit plný text záznamu

Report

The DKU-DukeECE Diarization System for the VoxCeleb Speaker Recognition Challenge 2022

Autor: Wang, Weiqing, Qin, Xiaoyi, Cheng, Ming, Zhang, Yucong, Wang, Kangyue, Li, Ming

This paper discribes the DKU-DukeECE submission to the 4th track of the VoxCeleb Speaker Recognition Challenge 2022 (VoxSRC-22). Our system contains a fused voice activity detection model, a clustering-based diarization model, and a target-speaker vo

Externí odkaz: http://arxiv.org/abs/2210.01677

Zobrazit plný text záznamu

Vyhledávací nástroje:

Upřesnit hledání