Showing 1 - 10 of 278 for search: '"Wang, Yikang"'
Current research in synthesized speech detection primarily focuses on the generalization of detection systems to unknown spoofing methods of noise-free speech. However, anti-spoofing countermeasure (CM) systems often don't work
External link:
http://arxiv.org/abs/2407.20111
Few-shot action recognition is an emerging field in computer vision, primarily focused on meta-learning within the same domain. However, challenges arise in real-world scenario deployment, as gathering extensive labeled data within a specific domain
External link:
http://arxiv.org/abs/2407.05657
Recently, few-shot action recognition has significantly progressed by learning the feature discriminability and designing suitable comparison methods. Still, there are the following restrictions. (a) Previous works are mainly based on visual mono-mod
External link:
http://arxiv.org/abs/2312.01083
This paper introduces our system designed for Track 2, which focuses on locating manipulated regions, in the second Audio Deepfake Detection Challenge (ADD 2023). Our approach involves the utilization of multiple detection systems to identify splicin
External link:
http://arxiv.org/abs/2308.10281
Finding synthetic artifacts of spoofing data will help the anti-spoofing countermeasures (CMs) system discriminate between spoofed and real speech. The Conformer combines the best of convolutional neural network and the Transformer, allowing it to ag
External link:
http://arxiv.org/abs/2307.01546
A reliable voice anti-spoofing countermeasure system needs to robustly protect automatic speaker verification (ASV) systems in various kinds of spoofing scenarios. However, the performance of countermeasure systems could be degraded by channel effect
External link:
http://arxiv.org/abs/2211.06546
This paper describes our DKU-OPPO system for the 2022 Spoofing-Aware Speaker Verification (SASV) Challenge. First, we split the joint task into speaker verification (SV) and spoofing countermeasure (CM), these two tasks which are optimized separately
External link:
http://arxiv.org/abs/2207.07510
Author:
Wang, Yikang; Nishizaki, Hiromitsu
In speech-related classification tasks, frequency-domain acoustic features such as logarithmic Mel-filter bank coefficients (FBANK) and cepstral-domain acoustic features such as Mel-frequency cepstral coefficients (MFCC) are often used. However, time
External link:
http://arxiv.org/abs/2203.16085
Published in:
In Neurocomputing 14 September 2024 598
Published in:
In Computers and Geotechnics September 2024 173