Search results - "Zhang, Haoji"

Report

Authors: Wang, Yiqin, Zhang, Haoji, Tang, Yansong, Liu, Yong, Feng, Jiashi, Dai, Jifeng, Jin, Xiaojie

This paper describes our champion solution to the LOVEU Challenge @ CVPR'24, Track 1 (Long Video VQA). Processing long sequences of visual tokens is computationally expensive and memory-intensive, making long video question-answering a challenging ta

External link: http://arxiv.org/abs/2407.00603

Report

Flash-VStream: Memory-Based Real-Time Understanding for Long Video Streams

Authors: Zhang, Haoji, Wang, Yiqin, Tang, Yansong, Liu, Yong, Feng, Jiashi, Dai, Jifeng, Jin, Xiaojie

Benefiting from the advancements in large language models and cross-modal alignment, existing multi-modal video understanding methods have achieved prominent performance in offline scenarios. However, online video streams, as one of the most common me

External link: http://arxiv.org/abs/2406.08085

Report

PREIM3D: 3D Consistent Precise Image Attribute Editing from a Single Image

Authors: Li, Jianhui, Li, Jianmin, Zhang, Haoji, Liu, Shilong, Wang, Zhengyi, Xiao, Zihao, Zheng, Kaiwen, Zhu, Jun

We study the 3D-aware image attribute editing problem in this paper, which has wide applications in practice. Recent methods solved the problem by training a shared encoder to map images into a 3D generator's latent space or by per-image latent code

External link: http://arxiv.org/abs/2304.10263
