Search results

Report

IP-MOT: Instance Prompt Learning for Cross-Domain Multi-Object Tracking

Authors: Luo, Run; Song, Zikai; Chen, Longze; Li, Yunshui; Yang, Min; Yang, Wei

Multi-Object Tracking (MOT) aims to associate multiple objects across video frames and is a challenging vision task due to inherent complexities in the tracking environment. Most existing approaches train and track within a single domain, resulting i

External link: http://arxiv.org/abs/2410.23907

Report

Autogenic Language Embedding for Coherent Point Tracking

Authors: Song, Zikai; Tang, Ying; Luo, Run; Ma, Lintao; Yu, Junqing; Chen, Yi-Ping Phoebe; Yang, Wei

Point tracking is a challenging task in computer vision, aiming to establish point-wise correspondence across long video sequences. Recent advancements have primarily focused on temporal modeling techniques to improve local feature similarity, often

External link: http://arxiv.org/abs/2407.20730

Report

Coupled Mamba: Enhanced Multi-modal Fusion with Coupled State Space Model

Authors: Li, Wenbing; Zhou, Hang; Yu, Junqing; Song, Zikai; Yang, Wei

The essence of multi-modal fusion lies in exploiting the complementary information inherent in diverse modalities. However, prevalent fusion methods rely on traditional neural architectures and are inadequately equipped to capture the dynamics of int

External link: http://arxiv.org/abs/2405.18014

Report

DEEM: Diffusion Models Serve as the Eyes of Large Language Models for Image Perception

Authors: Luo, Run; Li, Yunshui; Chen, Longze; He, Wanwei; Lin, Ting-En; Liu, Ziqiang; Zhang, Lei; Song, Zikai; Xia, Xiaobo; Liu, Tongliang; Yang, Min; Hui, Binyuan

The development of large language models (LLMs) has significantly advanced the emergence of large multimodal models (LMMs). While LMMs have achieved tremendous success by promoting the synergy between multimodal comprehension and creation, they often

External link: http://arxiv.org/abs/2405.15232

Report

EfficientGS: Streamlining Gaussian Splatting for Large-Scale High-Resolution Scene Representation

Authors: Liu, Wenkai; Guan, Tao; Zhu, Bin; Ju, Lili; Song, Zikai; Li, Dan; Wang, Yuesong; Yang, Wei

In the domain of 3D scene representation, 3D Gaussian Splatting (3DGS) has emerged as a pivotal technology. However, its application to large-scale, high-resolution scenes (exceeding 4k$\times$4k pixels) is hindered by the excessive computational req

External link: http://arxiv.org/abs/2404.12777

Report

AMD: Anatomical Motion Diffusion with Interpretable Motion Decomposition and Fusion

Authors: Jing, Beibei; Zhang, Youjia; Song, Zikai; Yu, Junqing; Yang, Wei

Generating realistic human motion sequences from text descriptions is a challenging task that requires capturing the rich expressiveness of both natural language and human motion. Recent advances in diffusion models have enabled significant progress i

External link: http://arxiv.org/abs/2312.12763

Report

Optimized View and Geometry Distillation from Multi-view Diffuser

Authors: Zhang, Youjia; Song, Zikai; Yu, Junqing; Luo, Yawei; Yang, Wei

Generating multi-view images from a single input view using image-conditioned diffusion models is a recent advancement and has shown considerable potential. However, issues such as the lack of consistency in synthesized views and over-smoothing in ex

External link: http://arxiv.org/abs/2312.06198

Report

Progressive Text-to-Image Diffusion with Soft Latent Direction

Authors: Ye, YuTeng; Cai, Jiale; Zhou, Hang; Li, Guanwen; Zhang, Youjia; Song, Zikai; Gao, Chenxing; Yu, Junqing; Yang, Wei

In spite of the rapidly evolving landscape of text-to-image generation, the synthesis and manipulation of multiple entities while adhering to specific relational constraints pose enduring challenges. This paper introduces an innovative progressive sy

External link: http://arxiv.org/abs/2309.09466

Report

DiffusionTrack: Diffusion Model For Multi-Object Tracking

Authors: Luo, Run; Song, Zikai; Ma, Lintao; Wei, Jinlin; Yang, Wei; Yang, Min

Multi-object tracking (MOT) is a challenging vision task that aims to detect individual objects within a single frame and associate them across multiple frames. Recent MOT approaches can be categorized into two-stage tracking-by-detection (TBD) metho

External link: http://arxiv.org/abs/2308.09905

Report

Compact Transformer Tracker with Correlative Masked Modeling

Authors: Song, Zikai; Luo, Run; Yu, Junqing; Chen, Yi-Ping Phoebe; Yang, Wei

Transformer framework has been showing superior performances in visual object tracking for its great strength in information aggregation across the template and search image with the well-known attention mechanism. Most recent advances focus on explo

External link: http://arxiv.org/abs/2301.10938
