Showing 1 - 10 of 340 for search: '"Lin, Weiyao"'
Author:
Chen, Tieyuan, Liu, Huabin, He, Tianyao, Chen, Yihang, Gan, Chaofan, Ma, Xiao, Zhong, Cheng, Zhang, Yang, Wang, Yingxue, Lin, Hui, Lin, Weiyao
Video causal reasoning aims to achieve a high-level understanding of video content from a causal perspective. However, current video reasoning tasks are limited in scope, primarily executed in a question-answering paradigm and focusing on short video
External link:
http://arxiv.org/abs/2409.17647
With the recent burst of 2D and 3D data, cross-modal retrieval has attracted increasing attention. However, manual labeling by non-experts will inevitably introduce corrupted annotations given ambiguous 2D/3D content. Though previous works h
External link:
http://arxiv.org/abs/2407.17779
Swarm navigation in cluttered environments is a grand challenge in robotics. This work combines deep learning with first-principle physics through differentiable simulation to enable autonomous navigation of multiple aerial robots through complex env
External link:
http://arxiv.org/abs/2407.10648
Published in:
ECCV 2024
3D Gaussian Splatting (3DGS) has emerged as a promising framework for novel view synthesis, boasting rapid rendering speed with high fidelity. However, the substantial Gaussians and their associated attributes necessitate effective compression techni
External link:
http://arxiv.org/abs/2403.14530
Video Correlation Learning (VCL), which aims to analyze the relationships between videos, has been widely studied and applied in various general video tasks. However, applying VCL to instructional videos is still quite challenging due to their intrin
External link:
http://arxiv.org/abs/2312.11024
Active domain adaptation has emerged as a solution to balance the expensive annotation cost and the performance of trained models in semantic segmentation. However, existing works usually ignore the correlation between selected samples and their local
External link:
http://arxiv.org/abs/2312.09595
Bases have become an integral part of modern deep learning-based models for time series forecasting due to their ability to act as feature extractors or future references. To be effective, a basis must be tailored to the specific set of time series d
External link:
http://arxiv.org/abs/2310.20496
Current few-shot action recognition involves two primary sources of information for classification: (1) intra-video information, determined by frame content within a single video clip, and (2) inter-video information, measured by relationships (e.g.,
External link:
http://arxiv.org/abs/2305.06114
The scene graph is a new data structure describing objects and their pairwise relationships within image scenes. As the size of scene graphs in vision applications grows, how to losslessly and efficiently store such data on disks or transmit over the n
External link:
http://arxiv.org/abs/2304.13359
Multiple Object Tracking (MOT) focuses on modeling the relationship of detected objects among consecutive frames and merging them into different trajectories. MOT remains a challenging task as noisy and confusing detection results often hinder the fina
External link:
http://arxiv.org/abs/2302.02444