Search results

Report

MECD: Unlocking Multi-Event Causal Discovery in Video Reasoning

Authors: Chen, Tieyuan; Liu, Huabin; He, Tianyao; Chen, Yihang; Gan, Chaofan; Ma, Xiao; Zhong, Cheng; Zhang, Yang; Wang, Yingxue; Lin, Hui; Lin, Weiyao

Video causal reasoning aims to achieve a high-level understanding of video content from a causal perspective. However, current video reasoning tasks are limited in scope, primarily executed in a question-answering paradigm and focusing on short video

External link: http://arxiv.org/abs/2409.17647

Report

DAC: 2D-3D Retrieval with Noisy Labels via Divide-and-Conquer Alignment and Correction

Authors: Gan, Chaofan; Tu, Yuanpeng; Li, Yuxi; Lin, Weiyao

With the recent burst of 2D and 3D data, cross-modal retrieval has attracted increasing attention recently. However, manual labeling by non-experts will inevitably introduce corrupted annotations given ambiguous 2D/3D content. Though previous works h

External link: http://arxiv.org/abs/2407.17779

Report

Back to Newton's Laws: Learning Vision-based Agile Flight via Differentiable Physics

Authors: Zhang, Yuang; Hu, Yu; Song, Yunlong; Zou, Danping; Lin, Weiyao

Swarm navigation in cluttered environments is a grand challenge in robotics. This work combines deep learning with first-principle physics through differentiable simulation to enable autonomous navigation of multiple aerial robots through complex env

External link: http://arxiv.org/abs/2407.10648

Report

HAC: Hash-grid Assisted Context for 3D Gaussian Splatting Compression

Authors: Chen, Yihang; Wu, Qianyi; Lin, Weiyao; Harandi, Mehrtash; Cai, Jianfei

Published in: ECCV 2024

3D Gaussian Splatting (3DGS) has emerged as a promising framework for novel view synthesis, boasting rapid rendering speed with high fidelity. However, the substantial Gaussians and their associated attributes necessitate effective compression techni

External link: http://arxiv.org/abs/2403.14530

Report

Collaborative Weakly Supervised Video Correlation Learning for Procedure-Aware Instructional Video Analysis

Authors: He, Tianyao; Liu, Huabin; Li, Yuxi; Ma, Xiao; Zhong, Cheng; Zhang, Yang; Lin, Weiyao

Video Correlation Learning (VCL), which aims to analyze the relationships between videos, has been widely studied and applied in various general video tasks. However, applying VCL to instructional videos is still quite challenging due to their intrin

External link: http://arxiv.org/abs/2312.11024

Report

Density Matters: Improved Core-set for Active Domain Adaptive Segmentation

Authors: Liu, Shizhan; Jiang, Zhengkai; Li, Yuxi; Peng, Jinlong; Wang, Yabiao; Lin, Weiyao

Active domain adaptation has emerged as a solution to balance the expensive annotation cost and the performance of trained models in semantic segmentation. However, existing works usually ignore the correlation between selected samples and its local

External link: http://arxiv.org/abs/2312.09595

Report

BasisFormer: Attention-based Time Series Forecasting with Learnable and Interpretable Basis

Authors: Ni, Zelin; Yu, Hang; Liu, Shizhan; Li, Jianguo; Lin, Weiyao

Bases have become an integral part of modern deep learning-based models for time series forecasting due to their ability to act as feature extractors or future references. To be effective, a basis must be tailored to the specific set of time series d

External link: http://arxiv.org/abs/2310.20496

Report

Few-shot Action Recognition via Intra- and Inter-Video Information Maximization

Authors: Liu, Huabin; Lin, Weiyao; Chen, Tieyuan; Li, Yuxi; Li, Shuyuan; See, John

Current few-shot action recognition involves two primary sources of information for classification: (1) intra-video information, determined by frame content within a single video clip, and (2) inter-video information, measured by relationships (e.g.,

External link: http://arxiv.org/abs/2305.06114

Report

Scene Graph Lossless Compression with Adaptive Prediction for Objects and Relations

Authors: Zhang, Yufeng; Lin, Weiyao; Dai, Wenrui; Liu, Huabin; Xiong, Hongkai

The scene graph is a new data structure describing objects and their pairwise relationship within image scenes. As the size of scene graph in vision applications grows, how to losslessly and efficiently store such data on disks or transmit over the n

External link: http://arxiv.org/abs/2304.13359

Report

Spatio-Temporal Point Process for Multiple Object Tracking

Authors: Wang, Tao; Chen, Kean; Lin, Weiyao; See, John; Zhang, Zenghui; Xu, Qian; Jia, Xia

Multiple Object Tracking (MOT) focuses on modeling the relationship of detected objects among consecutive frames and merging them into different trajectories. MOT remains a challenging task as noisy and confusing detection results often hinder the fina

External link: http://arxiv.org/abs/2302.02444
