Výsledky vyhledávání

Akademický článek

Progress in relationship between chronotype and technology addiction and its mechanism

Autor: WU Yujing, GUO Qian, LIU Xiaohua, LI Guanjun

Publikováno v: Shanghai Jiaotong Daxue xuebao. Yixue ban, Vol 43, Iss 4, Pp 487-494 (2023)

In the era of digitalization, the Internet has changed people's lifestyle and circadian rhythm, and has also brought the global problem of technology addiction. Many studies have shown that chronotype is significantly related to specific technology a

Externí odkaz: https://doaj.org/article/9c45bf7c5f87483eb223f433e1b38812

Zobrazit plný text záznamu

Report

Reject Threshold Adaptation for Open-Set Model Attribution of Deepfake Audio

Autor: Yan, Xinrui, Yi, Jiangyan, Tao, Jianhua, Chen, Yujie, Gu, Hao, Li, Guanjun, Zhou, Junzuo, Ren, Yong, Xu, Tao

Open environment oriented open set model attribution of deepfake audio is an emerging research topic, aiming to identify the generation models of deepfake audio. Most previous work requires manually setting a rejection threshold for unknown classes t

Externí odkaz: http://arxiv.org/abs/2412.01425

Zobrazit plný text záznamu

Report

Mixture of Experts Fusion for Fake Audio Detection Using Frozen wav2vec 2.0

Autor: Wang, Zhiyong, Fu, Ruibo, Wen, Zhengqi, Tao, Jianhua, Wang, Xiaopeng, Xie, Yuankun, Qi, Xin, Shi, Shuchen, Lu, Yi, Liu, Yukun, Li, Chenxing, Liu, Xuefei, Li, Guanjun

Speech synthesis technology has posed a serious threat to speaker verification systems. Currently, the most effective fake audio detection methods utilize pretrained models, and integrating features from various layers of pretrained model further enh

Externí odkaz: http://arxiv.org/abs/2409.11909

Zobrazit plný text záznamu

Report

DPI-TTS: Directional Patch Interaction for Fast-Converging and Style Temporal Modeling in Text-to-Speech

Autor: Qi, Xin, Fu, Ruibo, Wen, Zhengqi, Wang, Tao, Qiang, Chunyu, Tao, Jianhua, Li, Chenxing, Lu, Yi, Shi, Shuchen, Wang, Zhiyong, Wang, Xiaopeng, Xie, Yuankun, Liu, Yukun, Liu, Xuefei, Li, Guanjun

In recent years, speech diffusion models have advanced rapidly. Alongside the widely used U-Net architecture, transformer-based models such as the Diffusion Transformer (DiT) have also gained attention. However, current DiT speech models treat Mel sp

Externí odkaz: http://arxiv.org/abs/2409.11835

Zobrazit plný text záznamu

Report

Text Prompt is Not Enough: Sound Event Enhanced Prompt Adapter for Target Style Audio Generation

Autor: Xiong, Chenxu, Fu, Ruibo, Shi, Shuchen, Wen, Zhengqi, Tao, Jianhua, Wang, Tao, Li, Chenxing, Qiang, Chunyu, Xie, Yuankun, Qi, Xin, Li, Guanjun, Yang, Zizheng

Current mainstream audio generation methods primarily rely on simple text prompts, often failing to capture the nuanced details necessary for multi-style audio generation. To address this limitation, the Sound Event Enhanced Prompt Adapter is propose

Externí odkaz: http://arxiv.org/abs/2409.09381

Zobrazit plný text záznamu

Report

Exploring the Role of Audio in Multimodal Misinformation Detection

Autor: Liu, Moyang, Liu, Yukun, Fu, Ruibo, Wen, Zhengqi, Tao, Jianhua, Liu, Xuefei, Li, Guanjun

With the rapid development of deepfake technology, especially the deep audio fake technology, misinformation detection on the social media scene meets a great challenge. Social media data often contains multimodal information which includes audio, vi

Externí odkaz: http://arxiv.org/abs/2408.12558

Zobrazit plný text záznamu

Report

Does Current Deepfake Audio Detection Model Effectively Detect ALM-based Deepfake Audio?

Autor: Xie, Yuankun, Xiong, Chenxu, Wang, Xiaopeng, Wang, Zhiyong, Lu, Yi, Qi, Xin, Fu, Ruibo, Liu, Yukun, Wen, Zhengqi, Tao, Jianhua, Li, Guanjun, Ye, Long

Currently, Audio Language Models (ALMs) are rapidly advancing due to the developments in large language models and audio neural codecs. These ALMs have significantly lowered the barrier to creating deepfake audio, generating highly realistic and dive

Externí odkaz: http://arxiv.org/abs/2408.10853

Zobrazit plný text záznamu

Report

EELE: Exploring Efficient and Extensible LoRA Integration in Emotional Text-to-Speech

Autor: Qi, Xin, Fu, Ruibo, Wen, Zhengqi, Tao, Jianhua, Shi, Shuchen, Lu, Yi, Wang, Zhiyong, Wang, Xiaopeng, Xie, Yuankun, Liu, Yukun, Li, Guanjun, Liu, Xuefei, Li, Yongwei

In the current era of Artificial Intelligence Generated Content (AIGC), a Low-Rank Adaptation (LoRA) method has emerged. It uses a plugin-based approach to learn new knowledge with lower parameter quantities and computational costs, and it can be plu

Externí odkaz: http://arxiv.org/abs/2408.10852

Zobrazit plný text záznamu

Report

A Noval Feature via Color Quantisation for Fake Audio Detection

Autor: Wang, Zhiyong, Wang, Xiaopeng, Xie, Yuankun, Fu, Ruibo, Wen, Zhengqi, Tao, Jianhua, Liu, Yukun, Li, Guanjun, Qi, Xin, Lu, Yi, Liu, Xuefei, Li, Yongwei

In the field of deepfake detection, previous studies focus on using reconstruction or mask and prediction methods to train pre-trained models, which are then transferred to fake audio detection training where the encoder is used to extract features,

Externí odkaz: http://arxiv.org/abs/2408.10849

Zobrazit plný text záznamu

Report

ICAGC 2024: Inspirational and Convincing Audio Generation Challenge 2024

Autor: Fu, Ruibo, Liu, Rui, Qiang, Chunyu, Gao, Yingming, Lu, Yi, Shi, Shuchen, Wang, Tao, Li, Ya, Wen, Zhengqi, Zhang, Chen, Bu, Hui, Liu, Yukun, Qi, Xin, Li, Guanjun

The Inspirational and Convincing Audio Generation Challenge 2024 (ICAGC 2024) is part of the ISCSLP 2024 Competitions and Challenges track. While current text-to-speech (TTS) technology can generate high-quality audio, its ability to convey complex e

Externí odkaz: http://arxiv.org/abs/2407.12038

Zobrazit plný text záznamu

Vyhledávací nástroje:

Upřesnit hledání