Showing 1 - 10 of 531 for search: '"Huang, Qiushi"'
Personalized dialogue generation, focusing on generating highly tailored responses by leveraging persona profiles and dialogue context, has gained significant attention in conversational AI applications. However, persona profiles, a prevalent setting…
External link:
http://arxiv.org/abs/2406.18847
In conversational AI, personalizing dialogues with persona profiles and contextual understanding is essential. Despite large language models' (LLMs) improved response coherence, effective persona integration remains a challenge. In this work, we first…
External link:
http://arxiv.org/abs/2406.18187
Author:
Wu, Bo, Liu, Peiye, Cheng, Wen-Huang, Liu, Bei, Zeng, Zhaoyang, Wang, Jia, Huang, Qiushi, Luo, Jiebo
Social Media Popularity Prediction (SMPP) is a crucial task that involves automatically predicting future popularity values of online posts, leveraging vast amounts of multimodal data available on social media platforms. Studying and investigating social…
External link:
http://arxiv.org/abs/2405.10497
Knowledge Graph Completion (KGC) is crucial for addressing knowledge graph incompleteness and supporting downstream applications. Many models have been proposed for KGC. They can be categorized into two main classes: triple-based and text-based approaches…
External link:
http://arxiv.org/abs/2402.02389
Despite recent progress in text-to-audio (TTA) generation, we show that the state-of-the-art models, such as AudioLDM, trained on datasets with an imbalanced class distribution, such as AudioCaps, are biased in their generation performance. Specifically…
External link:
http://arxiv.org/abs/2309.08051
Author:
Liu, Xubo, Zhu, Zhongkai, Liu, Haohe, Yuan, Yi, Cui, Meng, Huang, Qiushi, Liang, Jinhua, Cao, Yin, Kong, Qiuqiang, Plumbley, Mark D., Wang, Wenwu
Large Language Models (LLMs) have shown great promise in integrating diverse expert models to tackle intricate language and vision tasks. Despite their significance in advancing the field of Artificial Intelligence Generated Content (AIGC), their potential…
External link:
http://arxiv.org/abs/2307.14335
Author:
Bai, Xianyong, Tian, Hui, Deng, Yuanyong, Wang, Zhanshan, Yang, Jianfeng, Zhang, Xiaofeng, Zhang, Yonghe, Qi, Runze, Wang, Nange, Gao, Yang, Yu, Jun, He, Chunling, Shen, Zhengxiang, Shen, Lun, Guo, Song, Hou, Zhenyong, Ji, Kaifan, Bi, Xingzi, Duan, Wei, Yang, Xiao, Lin, Jiaben, Hu, Ziyao, Song, Qian, Yang, Zihao, Chen, Yajie, Qiao, Weidong, Ge, Wei, Li, Fu, Jin, Lei, He, Jiawei, Chen, Xiaobo, Zhu, Xiaocheng, He, Junwang, Shi, Qi, Liu, Liu, Li, Jinsong, Xu, Dongxiao, Liu, Rui, Li, Taijie, Feng, Zhenggong, Wang, Yamin, Fan, Chengcheng, Liu, Shuo, Guo, Sifan, Sun, Zheng, Wu, Yuchuan, Li, Haiyu, Yang, Qi, Ye, Yuyang, Gu, Weichen, Wu, Jiali, Zhang, Zhe, Yu, Yue, Ye, Zeyi, Sheng, Pengfeng, Wang, Yifan, Li, Wenbin, Huang, Qiushi, Zhang, Zhong
The Solar Upper Transition Region Imager (SUTRI) onboard the Space Advanced Technology demonstration satellite (SATech-01), which was launched to a sun-synchronous orbit at a height of 500 km in July 2022, aims to test the on-orbit performance of our…
External link:
http://arxiv.org/abs/2303.03669
Author:
Liu, Xubo, Huang, Qiushi, Mei, Xinhao, Liu, Haohe, Kong, Qiuqiang, Sun, Jianyuan, Li, Shengchen, Ko, Tom, Zhang, Yu, Tang, Lilian H., Plumbley, Mark D., Kılıç, Volkan, Wang, Wenwu
Audio captioning aims to generate text descriptions of audio clips. In the real world, many objects produce similar sounds. How to accurately recognize ambiguous sounds is a major challenge for audio captioning. In this work, inspired by inherent human…
External link:
http://arxiv.org/abs/2210.16428
Persona-based dialogue systems aim to generate consistent responses based on historical context and predefined persona. Unlike conventional dialogue generation, the persona-based dialogue needs to consider both dialogue context and persona, posing a…
External link:
http://arxiv.org/abs/2210.15088
Author:
Liu, Xubo, Liu, Haohe, Kong, Qiuqiang, Mei, Xinhao, Zhao, Jinzheng, Huang, Qiushi, Plumbley, Mark D., Wang, Wenwu
In this paper, we introduce the task of language-queried audio source separation (LASS), which aims to separate a target source from an audio mixture based on a natural language query of the target source (e.g., "a man tells a joke followed by people laughing")…
External link:
http://arxiv.org/abs/2203.15147