Search results

Report

HLTCOE JHU Submission to the Voice Privacy Challenge 2024

Authors: Xinyuan, Henry Li; Cai, Zexin; Garg, Ashi; Duh, Kevin; García-Perera, Leibny Paola; Khudanpur, Sanjeev; Andrews, Nicholas; Wiesner, Matthew

We present a number of systems for the Voice Privacy Challenge, including voice conversion based systems such as the kNN-VC method and the WavLM voice conversion method, and text-to-speech (TTS) based systems including Whisper-VITS. We found that whi

External link: http://arxiv.org/abs/2409.08913

Report

Privacy versus Emotion Preservation Trade-offs in Emotion-Preserving Speaker Anonymization

Authors: Cai, Zexin; Xinyuan, Henry Li; Garg, Ashi; García-Perera, Leibny Paola; Duh, Kevin; Khudanpur, Sanjeev; Andrews, Nicholas; Wiesner, Matthew

Advances in speech technology now allow unprecedented access to personally identifiable information through speech. To protect such information, the differential privacy field has explored ways to anonymize speech while preserving its utility, includ

External link: http://arxiv.org/abs/2409.03655

Report

The Database and Benchmark for the Source Speaker Tracing Challenge 2024

Authors: Li, Ze; Lin, Yuke; Yao, Tian; Suo, Hongbin; Zhang, Pengyuan; Ren, Yanzhen; Cai, Zexin; Nishizaki, Hiromitsu; Li, Ming

Voice conversion (VC) systems can transform audio to mimic another speaker's voice, thereby attacking speaker verification (SV) systems. However, ongoing studies on source speaker verification (SSV) are hindered by limited data availability and metho

External link: http://arxiv.org/abs/2406.04951

Report

Self-supervised Reflective Learning through Self-distillation and Online Clustering for Speaker Representation Learning

Authors: Cai, Danwei; Cai, Zexin; Li, Ming

Speaker representation learning is critical for modern voice recognition systems. While supervised learning techniques require extensive labeled data, unsupervised methodologies can leverage vast unlabeled corpora, offering a scalable solution. This

External link: http://arxiv.org/abs/2401.01473

Report

The DKU-DUKEECE System for the Manipulation Region Location Task of ADD 2023

Authors: Cai, Zexin; Wang, Weiqing; Wang, Yikang; Li, Ming

This paper introduces our system designed for Track 2, which focuses on locating manipulated regions, in the second Audio Deepfake Detection Challenge (ADD 2023). Our approach involves the utilization of multiple detection systems to identify splicin

External link: http://arxiv.org/abs/2308.10281

Report

Waveform Boundary Detection for Partially Spoofed Audio

Authors: Cai, Zexin; Wang, Weiqing; Li, Ming

The present paper proposes a waveform boundary detection system for audio spoofing attacks containing partially manipulated segments. Partially spoofed/fake audio, where part of the utterance is replaced, either with synthetic or natural audio clips,

External link: http://arxiv.org/abs/2211.00226

Report

Identifying Source Speakers for Voice Conversion based Spoofing Attacks on Speaker Verification Systems

Authors: Cai, Danwei; Cai, Zexin; Li, Ming

An automatic speaker verification system aims to verify the speaker identity of a speech signal. However, a voice conversion system could manipulate a person's speech signal to make it sound like another speaker's voice and deceive the speaker verifi

External link: http://arxiv.org/abs/2206.09103

Report

Invertible Voice Conversion

Authors: Cai, Zexin; Li, Ming

In this paper, we propose an invertible deep learning framework called INVVC for voice conversion. It is designed against the possible threats that inherently come along with voice conversion systems. Specifically, we develop an invertible framework

External link: http://arxiv.org/abs/2201.10687

Report

SIG-VC: A Speaker Information Guided Zero-shot Voice Conversion System for Both Human Beings and Machines

Authors: Zhang, Haozhe; Cai, Zexin; Qin, Xiaoyi; Li, Ming

Nowadays, as more and more systems achieve good performance in traditional voice conversion (VC) tasks, people's attention gradually turns to VC tasks under extreme conditions. In this paper, we propose a novel method for zero-shot voice conversion.

External link: http://arxiv.org/abs/2111.03811
