Search results - "Hui, Tianrui"

Report

Dynamic Prompting of Frozen Text-to-Image Diffusion Models for Panoptic Narrative Grounding

Author: Li, Hongyu; Hui, Tianrui; Ding, Zihan; Zhang, Jing; Ma, Bin; Wei, Xiaoming; Han, Jizhong; Liu, Si

Panoptic narrative grounding (PNG), whose core target is fine-grained image-text alignment, requires a panoptic segmentation of referred objects given a narrative caption. Previous discriminative methods achieve only weak or coarse-grained alignment

External link: http://arxiv.org/abs/2409.08251

Report

Unleashing the Temporal-Spatial Reasoning Capacity of GPT for Training-Free Audio and Language Referenced Video Object Segmentation

Author: Huang, Shaofei; Ling, Rui; Li, Hongyu; Hui, Tianrui; Tang, Zongheng; Wei, Xiaoming; Han, Jizhong; Liu, Si

In this paper, we propose an Audio-Language-Referenced SAM 2 (AL-Ref-SAM 2) pipeline to explore the training-free paradigm for audio and language-referenced video object segmentation, namely AVS and RVOS tasks. The intuitive solution leverages Ground

External link: http://arxiv.org/abs/2408.15876

Report

Customize your NeRF: Adaptive Source Driven 3D Scene Editing via Local-Global Iterative Training

Author: He, Runze; Huang, Shaofei; Nie, Xuecheng; Hui, Tianrui; Liu, Luoqi; Dai, Jiao; Han, Jizhong; Li, Guanbin; Liu, Si

In this paper, we target the adaptive source driven 3D scene editing task by proposing a CustomNeRF model that unifies a text description or a reference image as the editing prompt. However, obtaining desired editing results conformed with the editin

External link: http://arxiv.org/abs/2312.01663

Report

Enriching Phrases with Coupled Pixel and Object Contexts for Panoptic Narrative Grounding

Author: Hui, Tianrui; Ding, Zihan; Huang, Junshi; Wei, Xiaoming; Wei, Xiaolin; Dai, Jiao; Han, Jizhong; Liu, Si

Panoptic narrative grounding (PNG) aims to segment things and stuff objects in an image described by noun phrases of a narrative caption. As a multimodal task, an essential aspect of PNG is the visual-linguistic interaction between image and caption.

External link: http://arxiv.org/abs/2311.01091

Report

Cross-Modality Domain Adaptation for Freespace Detection: A Simple yet Effective Baseline

Author: Wang, Yuanbin; Zhu, Leyan; Huang, Shaofei; Hui, Tianrui; Li, Xiaojie; Wang, Fei; Liu, Si

As one of the fundamental functions of autonomous driving system, freespace detection aims at classifying each pixel of the image captured by the camera as drivable or non-drivable. Current works of freespace detection heavily rely on large amount of

External link: http://arxiv.org/abs/2210.02991

Report

PPMN: Pixel-Phrase Matching Network for One-Stage Panoptic Narrative Grounding

Author: Ding, Zihan; Ding, Zi-han; Hui, Tianrui; Huang, Junshi; Wei, Xiaoming; Wei, Xiaolin; Liu, Si

Panoptic Narrative Grounding (PNG) is an emerging task whose goal is to segment visual objects of things and stuff categories described by dense narrative captions of a still image. The previous two-stage approach first extracts segmentation region p

External link: http://arxiv.org/abs/2208.05647

Report

Language-Bridged Spatial-Temporal Interaction for Referring Video Object Segmentation

Author: Ding, Zihan; Hui, Tianrui; Huang, Junshi; Wei, Xiaoming; Han, Jizhong; Liu, Si

Referring video object segmentation aims to predict foreground labels for objects referred by natural language expressions in videos. Previous methods either depend on 3D ConvNets or incorporate additional 2D ConvNets as encoders to extract mixed spa

External link: http://arxiv.org/abs/2206.03789

Report

A Keypoint-based Global Association Network for Lane Detection

Author: Wang, Jinsheng; Ma, Yinchao; Huang, Shaofei; Hui, Tianrui; Wang, Fei; Qian, Chen; Zhang, Tianzhu

Lane detection is a challenging task that requires predicting complex topology shapes of lane lines and distinguishing different types of lanes simultaneously. Earlier works follow a top-down roadmap to regress predefined anchors into various shapes

External link: http://arxiv.org/abs/2204.07335

Academic article

Modality adaptation via feature difference learning for depth human parsing

Author: Huang, Shaofei; Hui, Tianrui; Gong, Yue; Peng, Fengguang; Fang, Yuqiang; Wang, Jingwei; Ma, Bin; Wei, Xiaoming; Han, Jizhong

Published in: Computer Vision and Image Understanding, October 2024, 247

Report

TransRefer3D: Entity-and-Relation Aware Transformer for Fine-Grained 3D Visual Grounding

Author: He, Dailan; Zhao, Yusheng; Luo, Junyu; Hui, Tianrui; Huang, Shaofei; Zhang, Aixi; Liu, Si

Recently proposed fine-grained 3D visual grounding is an essential and challenging task, whose goal is to identify the 3D object referred by a natural language sentence from other distractive objects of the same category. Existing works usually adopt

External link: http://arxiv.org/abs/2108.02388
