Showing 1 - 10 of 1,558 results for search: '"Wang, Wenwu"'
Author:
Yuan, Yi, Jia, Dongya, Zhuang, Xiaobin, Chen, Yuanzhe, Liu, Zhengxi, Chen, Zhuo, Wang, Yuping, Wang, Yuxuan, Liu, Xubo, Plumbley, Mark D., Wang, Wenwu
Generative models have shown significant achievements in audio generation tasks. However, existing models struggle with complex and detailed prompts, leading to potential performance degradation. We hypothesize that this problem stems from the low qu
External link:
http://arxiv.org/abs/2407.04416
Personalized dialogue generation, focusing on generating highly tailored responses by leveraging persona profiles and dialogue context, has gained significant attention in conversational AI applications. However, persona profiles, a prevalent setting
External link:
http://arxiv.org/abs/2406.18847
In conversational AI, personalizing dialogues with persona profiles and contextual understanding is essential. Despite large language models' (LLMs) improved response coherence, effective persona integration remains a challenge. In this work, we firs
External link:
http://arxiv.org/abs/2406.18187
Sound event localization and detection (SELD) aims to determine the appearance of sound classes, together with their Direction of Arrival (DOA). However, current SELD systems can only predict the activities of specific classes, for example, 13 classe
External link:
http://arxiv.org/abs/2406.16058
Digital aquaculture leverages advanced technologies and data-driven methods, providing substantial benefits over traditional aquaculture practices. Fish tracking, counting, and behaviour analysis are crucial components of digital aquaculture, which a
External link:
http://arxiv.org/abs/2406.17800
Author:
Hu, Tao, Shao, Xianzhou, Bai, Mingkai, Jia, Xinpei, Dai, Saifei, Sun, Xiaoqing, Han, Runhao, Yang, Jia, Ke, Xiaoyu, Tian, Fengbin, Yang, Shuai, Chai, Junshuai, Xu, Hao, Wang, Xiaolei, Wang, Wenwu, Ye, Tianchun
We study the impact of top SiO2 interlayer thickness on the memory window (MW) of Si channel ferroelectric field-effect transistor (FeFET) with TiN/SiO2/Hf0.5Zr0.5O2/SiOx/Si (MIFIS) gate structure. We find that the MW increases with the increasing th
External link:
http://arxiv.org/abs/2406.15478
Author:
Zhang, Yiming, Xu, Xuenan, Du, Ruoyi, Liu, Haohe, Dong, Yuan, Tan, Zheng-Hua, Wang, Wenwu, Ma, Zhanyu
In traditional audio captioning methods, a model is usually trained in a fully supervised manner using a human-annotated dataset containing audio-text pairs and then evaluated on the test sets from the same dataset. Such methods have two limitations.
External link:
http://arxiv.org/abs/2406.06295
Author:
Hou, Yuanbo, Ren, Qiaoqiao, Mitchell, Andrew, Wang, Wenwu, Kang, Jian, Belpaeme, Tony, Botteldooren, Dick
We live in a rich and varied acoustic world, which is experienced by individuals or communities as a soundscape. Computational auditory scene analysis, disentangling acoustic scenes by detecting and classifying events, focuses on objective attributes
External link:
http://arxiv.org/abs/2406.05914
Differentiable particle filters are an emerging class of models that combine sequential Monte Carlo techniques with the flexibility of neural networks to perform state space inference. This paper concerns the case where the system may switch between
External link:
http://arxiv.org/abs/2405.04865
Large language models (LLMs) have significantly advanced audio processing through audio codecs that convert audio into discrete tokens, enabling the application of language modelling techniques to audio data. However, traditional codecs often operate
External link:
http://arxiv.org/abs/2405.00233