Showing 1 - 10 of 151 for search: '"Wen, Zhengqi"'
Author:
Wang, Zhiyong, Fu, Ruibo, Wen, Zhengqi, Tao, Jianhua, Wang, Xiaopeng, Xie, Yuankun, Qi, Xin, Shi, Shuchen, Lu, Yi, Liu, Yukun, Li, Chenxing, Liu, Xuefei, Li, Guanjun
Speech synthesis technology has posed a serious threat to speaker verification systems. Currently, the most effective fake audio detection methods utilize pretrained models, and integrating features from various layers of pretrained model further enh
External link:
http://arxiv.org/abs/2409.11909
Author:
Qi, Xin, Fu, Ruibo, Wen, Zhengqi, Wang, Tao, Qiang, Chunyu, Tao, Jianhua, Li, Chenxing, Lu, Yi, Shi, Shuchen, Wang, Zhiyong, Wang, Xiaopeng, Xie, Yuankun, Liu, Yukun, Liu, Xuefei, Li, Guanjun
In recent years, speech diffusion models have advanced rapidly. Alongside the widely used U-Net architecture, transformer-based models such as the Diffusion Transformer (DiT) have also gained attention. However, current DiT speech models treat Mel sp
External link:
http://arxiv.org/abs/2409.11835
Author:
Xiong, Chenxu, Fu, Ruibo, Shi, Shuchen, Wen, Zhengqi, Tao, Jianhua, Wang, Tao, Li, Chenxing, Qiang, Chunyu, Xie, Yuankun, Qi, Xin, Li, Guanjun, Yang, Zizheng
Current mainstream audio generation methods primarily rely on simple text prompts, often failing to capture the nuanced details necessary for multi-style audio generation. To address this limitation, the Sound Event Enhanced Prompt Adapter is propose
External link:
http://arxiv.org/abs/2409.09381
With the rapid development of deepfake technology, especially the deep audio fake technology, misinformation detection on the social media scene meets a great challenge. Social media data often contains multimodal information which includes audio, vi
External link:
http://arxiv.org/abs/2408.12558
Author:
Xie, Yuankun, Xiong, Chenxu, Wang, Xiaopeng, Wang, Zhiyong, Lu, Yi, Qi, Xin, Fu, Ruibo, Liu, Yukun, Wen, Zhengqi, Tao, Jianhua, Li, Guanjun, Ye, Long
Currently, Audio Language Models (ALMs) are rapidly advancing due to the developments in large language models and audio neural codecs. These ALMs have significantly lowered the barrier to creating deepfake audio, generating highly realistic and dive
External link:
http://arxiv.org/abs/2408.10853
Author:
Qi, Xin, Fu, Ruibo, Wen, Zhengqi, Tao, Jianhua, Shi, Shuchen, Lu, Yi, Wang, Zhiyong, Wang, Xiaopeng, Xie, Yuankun, Liu, Yukun, Li, Guanjun, Liu, Xuefei, Li, Yongwei
In the current era of Artificial Intelligence Generated Content (AIGC), a Low-Rank Adaptation (LoRA) method has emerged. It uses a plugin-based approach to learn new knowledge with lower parameter quantities and computational costs, and it can be plu
External link:
http://arxiv.org/abs/2408.10852
Author:
Wang, Zhiyong, Wang, Xiaopeng, Xie, Yuankun, Fu, Ruibo, Wen, Zhengqi, Tao, Jianhua, Liu, Yukun, Li, Guanjun, Qi, Xin, Lu, Yi, Liu, Xuefei, Li, Yongwei
In the field of deepfake detection, previous studies focus on using reconstruction or mask and prediction methods to train pre-trained models, which are then transferred to fake audio detection training where the encoder is used to extract features,
External link:
http://arxiv.org/abs/2408.10849
Author:
Xie, Yuankun, Wang, Xiaopeng, Wang, Zhiyong, Fu, Ruibo, Wen, Zhengqi, Cheng, Haonan, Ye, Long
ASVspoof5, the fifth edition of the ASVspoof series, is one of the largest global audio security challenges. It aims to advance the development of countermeasure (CM) to discriminate bonafide and spoofed speech utterances. In this paper, we focus on
External link:
http://arxiv.org/abs/2408.06922
Author:
Qiang, Chunyu, Geng, Wang, Zhao, Yi, Fu, Ruibo, Wang, Tao, Gong, Cheng, Wang, Tianrui, Liu, Qiuyu, Yi, Jiangyan, Wen, Zhengqi, Zhang, Chen, Che, Hao, Wang, Longbiao, Dang, Jianwu, Tao, Jianhua
Deep learning has brought significant improvements to the field of cross-modal representation learning. For tasks such as text-to-speech (TTS), voice conversion (VC), and automatic speech recognition (ASR), a cross-modal fine-grained (frame-level) se
External link:
http://arxiv.org/abs/2408.05758
Author:
Cai, Cong, Liang, Shan, Liu, Xuefei, Zhu, Kang, Wen, Zhengqi, Tao, Jianhua, Xie, Heng, Cui, Jizhou, Ma, Yiming, Cheng, Zhenhua, Xu, Hanzhe, Fu, Ruibo, Liu, Bin, Li, Yongwei
Deception detection has garnered increasing attention in recent years due to the significant growth of digital media and heightened ethical and security concerns. It has been extensively studied using multimodal methods, including video, audio, and t
External link:
http://arxiv.org/abs/2407.12274