Search results

Report

Unified and Dynamic Graph for Temporal Character Grouping in Long Videos

Authors: Shu, Xiujun; Wen, Wei; Xu, Liangsheng; Qiao, Ruizhi; Guo, Taian; Li, Hanjun; Gan, Bei; Wang, Xiao; Sun, Xing

Video temporal character grouping locates appearing moments of major characters within a video according to their identities. To this end, recent works have evolved from unsupervised clustering to graph-based supervised clustering. However, graph met

External link: http://arxiv.org/abs/2308.14105

Report

D3G: Exploring Gaussian Prior for Temporal Sentence Grounding with Glance Annotation

Authors: Li, Hanjun; Shu, Xiujun; He, Sunan; Qiao, Ruizhi; Wen, Wei; Guo, Taian; Gan, Bei; Sun, Xing

Temporal sentence grounding (TSG) aims to locate a specific moment from an untrimmed video with a given natural language query. Recently, weakly supervised methods still have a large performance gap compared to fully supervised ones, while the latter

External link: http://arxiv.org/abs/2308.04197

Report

Collaborative Noisy Label Cleaner: Learning Scene-aware Trailers for Multi-modal Highlight Detection in Movies

Authors: Gan, Bei; Shu, Xiujun; Qiao, Ruizhi; Wu, Haoqian; Chen, Keyu; Li, Hanjun; Ren, Bo

Movie highlights stand out of the screenplay for efficient browsing and play a crucial role on social media platforms. Based on existing efforts, this work has two observations: (1) For different annotators, labeling highlight has uncertainty, which

External link: http://arxiv.org/abs/2303.14768

Report

VLMAE: Vision-Language Masked Autoencoder

Authors: He, Sunan; Guo, Taian; Dai, Tao; Qiao, Ruizhi; Wu, Chen; Shu, Xiujun; Ren, Bo

Image and language modeling is of crucial importance for vision-language pre-training (VLP), which aims to learn multi-modal representations from large-scale paired image-text data. However, we observe that most existing VLP methods focus on modeling

External link: http://arxiv.org/abs/2208.09374

Report

See Finer, See More: Implicit Modality Alignment for Text-based Person Retrieval

Authors: Shu, Xiujun; Wen, Wei; Wu, Haoqian; Chen, Keyu; Song, Yiran; Qiao, Ruizhi; Ren, Bo; Wang, Xiao

Text-based person retrieval aims to find the query person based on a textual description. The key is to learn a common latent space mapping between visual-textual modalities. To achieve this goal, existing works employ segmentation to obtain explicit

External link: http://arxiv.org/abs/2208.08608

Report

Exploiting Feature Diversity for Make-up Temporal Video Grounding

Authors: Shu, Xiujun; Wen, Wei; Guo, Taian; He, Sunan; Wu, Chen; Qiao, Ruizhi

This technical report presents the 3rd winning solution for MTVG, a new task introduced in the 4-th Person in Context (PIC) Challenge at ACM MM 2022. MTVG aims at localizing the temporal boundary of the step in an untrimmed video based on a textual d

External link: http://arxiv.org/abs/2208.06179

Academic article

Precise occlusion-aware and feature-level reconstruction for occluded person re-identification

Authors: Shu, Xiujun; Li, Hanjun; Wen, Wei; Qiao, Ruizhi; Li, Nannan; Ruan, Weijian; Su, Hanjing; Wang, Bo; Chen, Shouzhi; Zhou, Jun

Published in: Neurocomputing, 1 February 2025, 616

Report

Head and Body: Unified Detector and Graph Network for Person Search in Media

Authors: Shu, Xiujun; Tao, Yusheng; Qiao, Ruizhi; Ke, Bo; Wen, Wei; Ren, Bo

Person search in media has seen increasing potential in Internet applications, such as video clipping and character collection. This task is common but overlooked by previous person search works which focus on surveillance scenes. The media scenarios

External link: http://arxiv.org/abs/2111.13888

Report

Learning to Disentangle Scenes for Person Re-identification

Authors: Zang, Xianghao; Li, Ge; Gao, Wei; Shu, Xiujun

Published in: Image and Vision Computing 2021

There are many challenging problems in the person re-identification (ReID) task, such as the occlusion and scale variation. Existing works usually tried to solve them by employing a one-branch network. This one-branch network needs to be robust to va

External link: http://arxiv.org/abs/2111.05476

Report

Exploiting Robust Unsupervised Video Person Re-identification

Authors: Zang, Xianghao; Li, Ge; Gao, Wei; Shu, Xiujun

Published in: IET Image Processing 2022

Unsupervised video person re-identification (reID) methods usually depend on global-level features. And many supervised reID methods employed local-level features and achieved significant performance improvements. However, applying local-level featur

External link: http://arxiv.org/abs/2111.05170
