Showing 1 - 10 of 159
for the search: '"Peng, Xiaojiang"'
Author:
Zheng, Yilun, Zhang, Zhuofan, Wang, Ziming, Li, Xiang, Luan, Sitao, Peng, Xiaojiang, Chen, Lihui
To improve the performance of Graph Neural Networks (GNNs), Graph Structure Learning (GSL) has been extensively applied to reconstruct or refine original graph structures, effectively addressing issues like heterophily, over-squashing, and noisy structures …
External link:
http://arxiv.org/abs/2411.07672
Graph Neural Networks (GNNs) have demonstrated strong capabilities in processing structured data. While traditional GNNs typically treat each feature dimension equally during graph convolution, we raise an important question: Is the graph convolution …
External link:
http://arxiv.org/abs/2411.07663
Emotion Recognition in Conversations (ERC) is a vital area within multimodal interaction research, dedicated to accurately identifying and classifying the emotions expressed by speakers throughout a conversation. Traditional ERC approaches predominantly …
External link:
http://arxiv.org/abs/2409.05243
Fashion image editing is a crucial tool for designers to convey their creative ideas by visualizing design concepts interactively. Current fashion image editing techniques, though advanced with multimodal prompts and powerful diffusion models, often …
External link:
http://arxiv.org/abs/2409.01086
Author:
Niu, Fuqiang, Cheng, Zebang, Fu, Xianghua, Peng, Xiaojiang, Dai, Genan, Chen, Yin, Huang, Hu, Zhang, Bowen
Stance detection, which aims to identify public opinion towards specific targets using social media data, is an important yet challenging task. With the proliferation of diverse multimodal social media content, including text and images, multimodal stance detection …
External link:
http://arxiv.org/abs/2409.00597
Author:
Wang, Jue, Lin, Yuxiang, Yuan, Tianshuo, Cheng, Zhi-Qi, Wang, Xiaolong, GH, Jiao, Chen, Wei, Peng, Xiaojiang
Combining Vision Large Language Models (VLLMs) with diffusion models offers a powerful method for executing image editing tasks based on human language instructions. However, language instructions alone often fall short in accurately conveying user requirements …
External link:
http://arxiv.org/abs/2408.12429
Author:
Cheng, Zebang, Tu, Shuyuan, Huang, Dawei, Li, Minghan, Peng, Xiaojiang, Cheng, Zhi-Qi, Hauptmann, Alexander G.
This paper presents our winning approach for the MER-NOISE and MER-OV tracks of the MER2024 Challenge on multimodal emotion recognition. Our system leverages the advanced emotional understanding capabilities of Emotion-LLaMA to generate high-quality …
External link:
http://arxiv.org/abs/2408.10500
Image quality assessment (IQA) has long been a fundamental challenge in image understanding. In recent years, deep learning-based IQA methods have shown promising performance. However, the lack of large amounts of labeled data in the IQA field has hindered …
External link:
http://arxiv.org/abs/2407.03886
Author:
Cheng, Zebang, Cheng, Zhi-Qi, He, Jun-Yan, Sun, Jingdong, Wang, Kai, Lin, Yuxiang, Lian, Zheng, Peng, Xiaojiang, Hauptmann, Alexander
Accurate emotion perception is crucial for various applications, including human-computer interaction, education, and counseling. However, traditional single-modality approaches often fail to capture the complexity of real-world emotional expressions …
External link:
http://arxiv.org/abs/2406.11161
Author:
Qin, Ziheng, Xu, Zhaopan, Zhou, Yukun, Zheng, Zangwei, Cheng, Zebang, Tang, Hao, Shang, Lei, Sun, Baigui, Peng, Xiaojiang, Timofte, Radu, Yao, Hongxun, Wang, Kai, You, Yang
Deep learning benefits from the growing abundance of available data. Meanwhile, efficiently dealing with the growing data scale has become a challenge. Publicly available data come from different sources with varying quality, and it is impractical to …
External link:
http://arxiv.org/abs/2405.18347