Showing 1 - 10 of 44 for search: '"Guo, Taian"'
Conventional multi-label recognition methods often focus on label confidence, frequently overlooking the pivotal role of partial order relations consistent with human preference. To resolve these issues, we introduce a novel method for multimodal lab…
External link:
http://arxiv.org/abs/2407.13221
Author:
Huang, Jinsheng, Chen, Liang, Guo, Taian, Zeng, Fu, Zhao, Yusheng, Wu, Bohan, Yuan, Ye, Zhao, Haozhe, Guo, Zhihui, Zhang, Yichi, Yuan, Jingyang, Ju, Wei, Liu, Luchen, Liu, Tianyu, Chang, Baobao, Zhang, Ming
Large Multimodal Models (LMMs) exhibit impressive cross-modal understanding and reasoning abilities, often assessed through multiple-choice questions (MCQs) that include an image, a question, and several options. However, many benchmarks used for suc…
External link:
http://arxiv.org/abs/2407.00468
Author:
Shu, Xiujun, Wen, Wei, Xu, Liangsheng, Qiao, Ruizhi, Guo, Taian, Li, Hanjun, Gan, Bei, Wang, Xiao, Sun, Xing
Video temporal character grouping locates appearing moments of major characters within a video according to their identities. To this end, recent works have evolved from unsupervised clustering to graph-based supervised clustering. However, graph met…
External link:
http://arxiv.org/abs/2308.14105
Temporal sentence grounding (TSG) aims to locate a specific moment in an untrimmed video from a given natural language query. Recently, weakly supervised methods still show a large performance gap compared to fully supervised ones, while the latter…
External link:
http://arxiv.org/abs/2308.04197
Image and language modeling is of crucial importance for vision-language pre-training (VLP), which aims to learn multi-modal representations from large-scale paired image-text data. However, we observe that most existing VLP methods focus on modeling…
External link:
http://arxiv.org/abs/2208.09374
This technical report presents the 3rd winning solution for MTVG, a new task introduced in the 4th Person in Context (PIC) Challenge at ACM MM 2022. MTVG aims at localizing the temporal boundary of the step in an untrimmed video based on a textual d…
External link:
http://arxiv.org/abs/2208.06179
Real-world recognition systems often encounter the challenge of unseen labels. To identify such unseen labels, multi-label zero-shot learning (ML-ZSL) focuses on transferring knowledge via a pre-trained textual label embedding (e.g., GloVe). However,…
External link:
http://arxiv.org/abs/2207.01887
Video super-resolution (VSR) aims to utilize multiple low-resolution frames to generate a high-resolution prediction for each frame. In this process, inter- and intra-frames are the key sources for exploiting temporal and spatial information. However…
External link:
http://arxiv.org/abs/2007.11803
Academic article
This result cannot be displayed to non-logged-in users.
You must log in to view this result.
Author:
Zhou, Zhiqiang, Xu, Xinyu, Qu, Xilong, Li, Shun
Published in:
International Journal of Engineering Business Management. 7/15/2020, Vol. 12, p1-13. 13p.