Showing 1 - 10 of 62 for search: '"Cai Zexin"'
Author:
Xinyuan, Henry Li, Cai, Zexin, Garg, Ashi, Duh, Kevin, García-Perera, Leibny Paola, Khudanpur, Sanjeev, Andrews, Nicholas, Wiesner, Matthew
We present a number of systems for the Voice Privacy Challenge, including voice conversion based systems such as the kNN-VC method and the WavLM voice conversion method, and text-to-speech (TTS) based systems including Whisper-VITS. We found that whi
External link:
http://arxiv.org/abs/2409.08913
Author:
Cai, Zexin, Xinyuan, Henry Li, Garg, Ashi, García-Perera, Leibny Paola, Duh, Kevin, Khudanpur, Sanjeev, Andrews, Nicholas, Wiesner, Matthew
Advances in speech technology now allow unprecedented access to personally identifiable information through speech. To protect such information, the differential privacy field has explored ways to anonymize speech while preserving its utility, includ
External link:
http://arxiv.org/abs/2409.03655
Author:
Li, Ze, Lin, Yuke, Yao, Tian, Suo, Hongbin, Zhang, Pengyuan, Ren, Yanzhen, Cai, Zexin, Nishizaki, Hiromitsu, Li, Ming
Voice conversion (VC) systems can transform audio to mimic another speaker's voice, thereby attacking speaker verification (SV) systems. However, ongoing studies on source speaker verification (SSV) are hindered by limited data availability and metho
External link:
http://arxiv.org/abs/2406.04951
Speaker representation learning is critical for modern voice recognition systems. While supervised learning techniques require extensive labeled data, unsupervised methodologies can leverage vast unlabeled corpora, offering a scalable solution. This
External link:
http://arxiv.org/abs/2401.01473
This paper introduces our system designed for Track 2, which focuses on locating manipulated regions, in the second Audio Deepfake Detection Challenge (ADD 2023). Our approach involves the utilization of multiple detection systems to identify splicin
External link:
http://arxiv.org/abs/2308.10281
The present paper proposes a waveform boundary detection system for audio spoofing attacks containing partially manipulated segments. Partially spoofed/fake audio, where part of the utterance is replaced, either with synthetic or natural audio clips,
External link:
http://arxiv.org/abs/2211.00226
An automatic speaker verification system aims to verify the speaker identity of a speech signal. However, a voice conversion system could manipulate a person's speech signal to make it sound like another speaker's voice and deceive the speaker verifi
External link:
http://arxiv.org/abs/2206.09103
Author:
Cai, Zexin, Li, Ming
In this paper, we propose an invertible deep learning framework called INVVC for voice conversion. It is designed against the possible threats that inherently come along with voice conversion systems. Specifically, we develop an invertible framework
External link:
http://arxiv.org/abs/2201.10687
Nowadays, as more and more systems achieve good performance in traditional voice conversion (VC) tasks, people's attention gradually turns to VC tasks under extreme conditions. In this paper, we propose a novel method for zero-shot voice conversion.
External link:
http://arxiv.org/abs/2111.03811
Published in:
In Engineering Structures 1 May 2024 306