Showing 1 - 10 of 42 for search: '"Tang, Mingqian"'
Author:
Pei, Yixuan, Qing, Zhiwu, Cen, Jun, Wang, Xiang, Zhang, Shiwei, Wang, Yaxiong, Tang, Mingqian, Sang, Nong, Qian, Xueming
Recent incremental learning for action recognition usually stores representative videos to mitigate catastrophic forgetting. However, only a few bulky videos can be stored due to the limited memory. To address this problem, we propose FrameMaker, a m
External link:
http://arxiv.org/abs/2211.00833
Author:
Zhang, Xinwei, Jiang, Jianwen, Feng, Yutong, Wu, Zhi-Fan, Zhao, Xibin, Wan, Hai, Tang, Mingqian, Jin, Rong, Gao, Yue
Although a number of studies are devoted to novel category discovery, most of them assume a static setting where both labeled and unlabeled data are given at once for finding new categories. In this work, we focus on the application scenarios where u
External link:
http://arxiv.org/abs/2210.04174
Author:
Yuan, Hangjie, Jiang, Jianwen, Albanie, Samuel, Feng, Tao, Huang, Ziyuan, Ni, Dong, Tang, Mingqian
The task of Human-Object Interaction (HOI) detection targets fine-grained visual parsing of humans interacting with their environment, enabling a broad range of applications. Prior work has demonstrated the benefits of effective architecture design a
External link:
http://arxiv.org/abs/2209.01814
Author:
Cen, Jun, Yun, Peng, Zhang, Shiwei, Cai, Junhao, Luan, Di, Wang, Michael Yu, Liu, Ming, Tang, Mingqian
Current methods for LIDAR semantic segmentation are not robust enough for real-world applications, e.g., autonomous driving, since they are closed-set and static. The closed-set assumption makes the network only able to output labels of trained classes,
External link:
http://arxiv.org/abs/2207.01452
Author:
Wang, Xiang, Zhang, Shiwei, Qing, Zhiwu, Tang, Mingqian, Zuo, Zhengrong, Gao, Changxin, Jin, Rong, Sang, Nong
Current few-shot action recognition methods reach impressive performance by learning discriminative features for each video via episodic training and designing various temporal alignment strategies. Nevertheless, they are limited in that (a) learning
External link:
http://arxiv.org/abs/2204.13423
Author:
Qing, Zhiwu, Zhang, Shiwei, Huang, Ziyuan, Xu, Yi, Wang, Xiang, Tang, Mingqian, Gao, Changxin, Jin, Rong, Sang, Nong
Natural videos provide rich visual contents for self-supervised learning. Yet most existing approaches for learning spatio-temporal representations rely on manually trimmed videos, leading to limited diversity in visual patterns and limited performan
External link:
http://arxiv.org/abs/2204.03017
Author:
Huang, Ziyuan, Zhang, Shiwei, Pan, Liang, Qing, Zhiwu, Tang, Mingqian, Liu, Ziwei, Ang Jr, Marcelo H.
Spatial convolutions are widely used in numerous deep video models. They fundamentally assume spatio-temporal invariance, i.e., using shared weights for every location in different frames. This work presents Temporally-Adaptive Convolutions (TAdaConv)
External link:
http://arxiv.org/abs/2110.06178
The pretrain-finetune paradigm has shown outstanding performance on many applications of deep learning, where a model is pre-trained on an upstream large dataset (e.g. ImageNet) and is then fine-tuned to different downstream tasks. Though for most ca
External link:
http://arxiv.org/abs/2110.06014
Noisy data is prevalent in both the training and testing phases of machine learning systems, which inevitably leads to the degradation of model performance. There have been plenty of works concentrated on learning with in-distributio
External link:
http://arxiv.org/abs/2108.11035
Author:
Ding, Xinpeng, Wang, Nannan, Zhang, Shiwei, Cheng, De, Li, Xiaomeng, Huang, Ziyuan, Tang, Mingqian, Gao, Xinbo
Current approaches for video grounding propose various complex architectures to capture the video-text relations, and have achieved impressive improvements. However, it is hard to learn the complicated multi-modal relations by only architecture desi
External link:
http://arxiv.org/abs/2108.10576