ConvNeXt Based Neural Network for Audio Anti-Spoofing

Author:	Ma, Qiaowei; Zhong, Jinghui; Yang, Yitao; Liu, Weiheng; Gao, Ying; Ng, Wing W. Y.
Year of publication:	2022
Subject:	Computer Science - Sound; Computer Science - Computation and Language; Electrical Engineering and Systems Science - Audio and Speech Processing
Document type:	Working Paper
Description:	With the rapid development of speech conversion and speech synthesis algorithms, automatic speaker verification (ASV) systems are vulnerable to spoofing attacks. In recent years, researchers have proposed a number of anti-spoofing methods based on hand-crafted features. However, using hand-crafted features rather than the raw waveform loses implicit information that is useful for anti-spoofing. Inspired by the promising performance of ConvNeXt in image classification tasks, we revise the ConvNeXt network architecture and propose a lightweight end-to-end anti-spoofing model. By integrating a channel attention block and using the focal loss function, the proposed model can focus on the most informative sub-bands of speech representations and on the difficult samples that are hard to classify. Experiments show that our proposed system achieves an equal error rate of 0.64% and a min-tDCF of 0.0187 on the ASVspoof 2019 LA evaluation dataset, which outperforms the state-of-the-art systems. Comment: 6 pages
Database:	arXiv
External link:	http://arxiv.org/abs/2209.06434
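
Note on the focal loss mentioned in the description: the record does not give the paper's exact formulation or hyperparameters, so the following is only a minimal sketch of the standard binary focal loss, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t). The function name `binary_focal_loss` and the values gamma = 2.0 and alpha = 0.25 are assumptions chosen for illustration, not settings taken from the paper.

```python
import math

def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Focal loss for a single sample (illustrative sketch only).

    p: predicted probability of class 1, y: true label (0 or 1).
    gamma and alpha are assumed defaults, not values from the paper.
    """
    # Probability assigned to the true class, and its class weight.
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # Clamp to avoid log(0).
    p_t = min(max(p_t, eps), 1.0)
    # The (1 - p_t)^gamma factor shrinks the loss of well-classified samples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

# A confidently correct sample contributes far less than a hard one.
easy = binary_focal_loss(0.95, 1)
hard = binary_focal_loss(0.30, 1)
print(easy < hard)
```

With gamma set to 0 the modulating factor disappears and the expression reduces to class-weighted cross-entropy, which is why larger gamma values shift training emphasis toward samples that are hard to classify.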