Zobrazeno 1 - 10
of 3 090
pro vyhledávání: '"Guo,Dan"'
Autor:
Zhou, Sheng, Xiao, Junbin, Yang, Xun, Song, Peipei, Guo, Dan, Yao, Angela, Wang, Meng, Chua, Tat-Seng
Existing efforts in text-based video question answering (TextVideoQA) are criticized for their opaque decisionmaking and heavy reliance on scene-text recognition. In this paper, we propose to study Grounded TextVideoQA by forcing models to answer que
Externí odkaz:
http://arxiv.org/abs/2409.14319
In this paper, we briefly introduce the solution developed by our team, HFUT-VUT, for the track of Micro-gesture Classification in the MiGA challenge at IJCAI 2024. The task of micro-gesture classification task involves recognizing the category of a
Externí odkaz:
http://arxiv.org/abs/2408.03097
Audio-Visual Video Parsing (AVVP) task aims to detect and temporally locate events within audio and visual modalities. Multiple events can overlap in the timeline, making identification challenging. While traditional methods usually focus on improvin
Externí odkaz:
http://arxiv.org/abs/2407.08126
Mental health has attracted substantial attention in recent years and LLM can be an effective technology for alleviating this problem owing to its capability in text understanding and dialogue. However, existing research in this domain often suffers
Externí odkaz:
http://arxiv.org/abs/2407.05721
Human body actions are an important form of non-verbal communication in social interactions. This paper focuses on a specific subset of body actions known as micro-actions, which are subtle, low-intensity body movements that provide a deeper understa
Externí odkaz:
http://arxiv.org/abs/2407.05311
Autor:
Liu, Pengyu, Wang, Fei, Li, Kun, Chen, Guoliang, Wei, Yanyan, Tang, Shengeng, Wu, Zhiliang, Guo, Dan
In this paper, we briefly introduce the solution developed by our team, HFUT-VUT, for the Micro-gesture Online Recognition track in the MiGA challenge at IJCAI 2024. The Micro-gesture Online Recognition task involves identifying the category and loca
Externí odkaz:
http://arxiv.org/abs/2407.04490
This paper briefly introduces the solutions developed by our team, HFUT-VUT, for Track 1 of self-supervised heart rate measurement in the 3rd Vision-based Remote Physiological Signal Sensing (RePSS) Challenge hosted at IJCAI 2024. The goal is to deve
Externí odkaz:
http://arxiv.org/abs/2406.04942
The Audio-Visual Video Parsing task aims to identify and temporally localize the events that occur in either or both the audio and visual streams of audible videos. It often performs in a weakly-supervised manner, where only video event labels are pr
Externí odkaz:
http://arxiv.org/abs/2406.00919
Autor:
Ren, Bin, Li, Yawei, Mehta, Nancy, Timofte, Radu, Yu, Hongyuan, Wan, Cheng, Hong, Yuxin, Han, Bingnan, Wu, Zhuoyuan, Zou, Yajun, Liu, Yuqing, Li, Jizhe, He, Keji, Fan, Chao, Zhang, Heng, Zhang, Xiaolin, Yin, Xuanwu, Zuo, Kunlong, Liao, Bohao, Xia, Peizhe, Peng, Long, Du, Zhibo, Di, Xin, Li, Wangkai, Wang, Yang, Zhai, Wei, Pei, Renjing, Guo, Jiaming, Xu, Songcen, Cao, Yang, Zha, Zhengjun, Wang, Yan, Liu, Yi, Wang, Qing, Zhang, Gang, Zhang, Liou, Zhao, Shijie, Sun, Long, Pan, Jinshan, Dong, Jiangxin, Tang, Jinhui, Liu, Xin, Yan, Min, Wang, Qian, Zhou, Menghan, Yan, Yiqiang, Liu, Yixuan, Chan, Wensong, Tang, Dehua, Zhou, Dong, Wang, Li, Tian, Lu, Emad, Barsoum, Jia, Bohan, Qiao, Junbo, Zhou, Yunshuai, Zhang, Yun, Li, Wei, Lin, Shaohui, Zhou, Shenglong, Chen, Binbin, Liao, Jincheng, Zhao, Suiyi, Zhang, Zhao, Wang, Bo, Luo, Yan, Wei, Yanyan, Li, Feng, Wang, Mingshen, Guan, Jinhan, Hu, Dehua, Yu, Jiawei, Xu, Qisheng, Sun, Tao, Lan, Long, Xu, Kele, Lin, Xin, Yue, Jingtong, Yang, Lehan, Du, Shiyi, Qi, Lu, Ren, Chao, Han, Zeyu, Wang, Yuhan, Chen, Chaolin, Li, Haobo, Zheng, Mingjun, Yang, Zhongbao, Song, Lianhong, Yan, Xingzhuo, Fu, Minghan, Zhang, Jingyi, Li, Baiang, Zhu, Qi, Xu, Xiaogang, Guo, Dan, Guo, Chunle, Chen, Jiadi, Long, Huanhuan, Duanmu, Chunjiang, Lei, Xiaoyan, Liu, Jie, Jia, Weilin, Cao, Weifeng, Zhang, Wenlong, Mao, Yanyu, Guo, Ruilong, Zhang, Nihao, Pandey, Manoj, Chernozhukov, Maksym, Le, Giang, Cheng, Shuli, Wang, Hongyuan, Wei, Ziyan, Tang, Qingting, Wang, Liejun, Li, Yongming, Guo, Yanhui, Xu, Hao, Khatami-Rizi, Akram, Mahmoudi-Aznaveh, Ahmad, Hsu, Chih-Chung, Lee, Chia-Ming, Chou, Yi-Shiuan, Joshi, Amogh, Akalwadi, Nikhil, Malagi, Sampada, Yashaswini, Palani, Desai, Chaitra, Tabib, Ramesh Ashok, Patil, Ujwala, Mudenagudi, Uma
This paper provides a comprehensive review of the NTIRE 2024 challenge, focusing on efficient single-image super-resolution (ESR) solutions and their outcomes. The task of this challenge is to super-resolve an input image with a magnification factor
Externí odkaz:
http://arxiv.org/abs/2404.10343
Inspired by the activity-silent and persistent activity mechanisms in human visual perception biology, we design a Unified Static and Dynamic Network (UniSDNet), to learn the semantic association between the video and text/audio queries in a cross-mo
Externí odkaz:
http://arxiv.org/abs/2403.14174