Showing 1 - 10 of 3,022
for search: '"Guo, Dan"'
This technical report presents our team's solution for the WeatherProof Dataset Challenge: Semantic Segmentation in Adverse Weather at CVPR'24 UG2+. We propose a two-stage deep learning framework for this task. In the first stage, we preprocess the p…
External link:
http://arxiv.org/abs/2406.05513
This paper briefly introduces the solutions developed by our team, HFUT-VUT, for Track 1 of self-supervised heart rate measurement in the 3rd Vision-based Remote Physiological Signal Sensing (RePSS) Challenge hosted at IJCAI 2024. The goal is to deve…
External link:
http://arxiv.org/abs/2406.04942
The Audio-Visual Video Parsing task aims to identify and temporally localize the events that occur in either or both the audio and visual streams of audible videos. It is often performed in a weakly-supervised manner, where only video event labels are pr…
External link:
http://arxiv.org/abs/2406.00919
Author:
Ren, Bin; Li, Yawei; Mehta, Nancy; Timofte, Radu; Yu, Hongyuan; Wan, Cheng; Hong, Yuxin; Han, Bingnan; Wu, Zhuoyuan; Zou, Yajun; Liu, Yuqing; Li, Jizhe; He, Keji; Fan, Chao; Zhang, Heng; Zhang, Xiaolin; Yin, Xuanwu; Zuo, Kunlong; Liao, Bohao; Xia, Peizhe; Peng, Long; Du, Zhibo; Di, Xin; Li, Wangkai; Wang, Yang; Zhai, Wei; Pei, Renjing; Guo, Jiaming; Xu, Songcen; Cao, Yang; Zha, Zhengjun; Wang, Yan; Liu, Yi; Wang, Qing; Zhang, Gang; Zhang, Liou; Zhao, Shijie; Sun, Long; Pan, Jinshan; Dong, Jiangxin; Tang, Jinhui; Liu, Xin; Yan, Min; Wang, Qian; Zhou, Menghan; Yan, Yiqiang; Liu, Yixuan; Chan, Wensong; Tang, Dehua; Zhou, Dong; Wang, Li; Tian, Lu; Emad, Barsoum; Jia, Bohan; Qiao, Junbo; Zhou, Yunshuai; Zhang, Yun; Li, Wei; Lin, Shaohui; Zhou, Shenglong; Chen, Binbin; Liao, Jincheng; Zhao, Suiyi; Zhang, Zhao; Wang, Bo; Luo, Yan; Wei, Yanyan; Li, Feng; Wang, Mingshen; Guan, Jinhan; Hu, Dehua; Yu, Jiawei; Xu, Qisheng; Sun, Tao; Lan, Long; Xu, Kele; Lin, Xin; Yue, Jingtong; Yang, Lehan; Du, Shiyi; Qi, Lu; Ren, Chao; Han, Zeyu; Wang, Yuhan; Chen, Chaolin; Li, Haobo; Zheng, Mingjun; Yang, Zhongbao; Song, Lianhong; Yan, Xingzhuo; Fu, Minghan; Zhang, Jingyi; Li, Baiang; Zhu, Qi; Xu, Xiaogang; Guo, Dan; Guo, Chunle; Chen, Jiadi; Long, Huanhuan; Duanmu, Chunjiang; Lei, Xiaoyan; Liu, Jie; Jia, Weilin; Cao, Weifeng; Zhang, Wenlong; Mao, Yanyu; Guo, Ruilong; Zhang, Nihao; Pandey, Manoj; Chernozhukov, Maksym; Le, Giang; Cheng, Shuli; Wang, Hongyuan; Wei, Ziyan; Tang, Qingting; Wang, Liejun; Li, Yongming; Guo, Yanhui; Xu, Hao; Khatami-Rizi, Akram; Mahmoudi-Aznaveh, Ahmad; Hsu, Chih-Chung; Lee, Chia-Ming; Chou, Yi-Shiuan; Joshi, Amogh; Akalwadi, Nikhil; Malagi, Sampada; Yashaswini, Palani; Desai, Chaitra; Tabib, Ramesh Ashok; Patil, Ujwala; Mudenagudi, Uma
This paper provides a comprehensive review of the NTIRE 2024 challenge, focusing on efficient single-image super-resolution (ESR) solutions and their outcomes. The task of this challenge is to super-resolve an input image with a magnification factor…
External link:
http://arxiv.org/abs/2404.10343
Inspired by the activity-silent and persistent activity mechanisms in human visual perception biology, we design a Unified Static and Dynamic Network (UniSDNet), to learn the semantic association between the video and text/audio queries in a cross-mo…
External link:
http://arxiv.org/abs/2403.14174
This paper develops small vision language models to understand visual art, which, given an artwork, aim to identify its emotion category and explain this prediction in natural language. While small models are computationally efficient, their capa…
External link:
http://arxiv.org/abs/2403.11150
Video Motion Magnification (VMM) aims to reveal subtle and imperceptible motion information of objects in the macroscopic world. Prior methods directly model the motion field from the Eulerian perspective by Representation Learning that separates sha…
External link:
http://arxiv.org/abs/2403.07347
Micro-action is an imperceptible non-verbal behaviour characterised by low-intensity movement. It offers insights into the feelings and intentions of individuals and is important for human-oriented applications such as emotion recognition and psychol…
External link:
http://arxiv.org/abs/2403.05234
Author:
Liu, Jiong-Jiong; Liu, Zhan-Wei; Chen, Kan; Guo, Dan; Leinweber, Derek B.; Liu, Xiang; Thomas, Anthony W.
Published in:
Phys. Rev. D 109, 054025 (2024)
We examine the internal structure of the $\Lambda(1670)$ through an analysis of lattice QCD simulations and experimental data within Hamiltonian effective field theory. Two scenarios are presented. The first describes the $\Lambda(1670)$ as a bare th…
External link:
http://arxiv.org/abs/2312.13072
This paper focuses on the Audio-Visual Question Answering (AVQA) task that aims to answer questions derived from untrimmed audible videos. To generate accurate answers, an AVQA model is expected to find the most informative audio-visual clues relevan…
External link:
http://arxiv.org/abs/2312.12816