Search results - "Ding, Henghui"

Report

3D-GRES: Generalized 3D Referring Expression Segmentation

Authors: Wu, Changli; Liu, Yihang; Ji, Jiayi; Ma, Yiwei; Wang, Haowei; Luo, Gen; Ding, Henghui; Sun, Xiaoshuai; Ji, Rongrong

3D Referring Expression Segmentation (3D-RES) is dedicated to segmenting a specific instance within a 3D space based on a natural language description. However, current approaches are limited to segmenting a single target, restricting the versatility

External link: http://arxiv.org/abs/2407.20664

Report

RefMask3D: Language-Guided Transformer for 3D Referring Segmentation

Authors: He, Shuting; Ding, Henghui

3D referring segmentation is an emerging and challenging vision-language task that aims to segment the object described by a natural language expression in a point cloud scene. The key challenge behind this task is vision-language feature fusion and

External link: http://arxiv.org/abs/2407.18244

Report

SegPoint: Segment Any Point Cloud via Large Language Model

Authors: He, Shuting; Ding, Henghui; Jiang, Xudong; Wen, Bihan

Despite significant progress in 3D point cloud segmentation, existing methods primarily address specific tasks and depend on explicit instructions to identify targets, lacking the capability to infer and understand implicit user intentions in a unifi

External link: http://arxiv.org/abs/2407.13761

Report

PECTP: Parameter-Efficient Cross-Task Prompts for Incremental Vision Transformer

Authors: Feng, Qian; Zhao, Hanbin; Zhang, Chao; Dong, Jiahua; Ding, Henghui; Jiang, Yu-Gang; Qian, Hui

Incremental Learning (IL) aims to learn deep models on sequential tasks continually, where each new task includes a batch of new classes and deep models have no access to task-ID information at the inference time. Recent vast pre-trained models (PTMs

External link: http://arxiv.org/abs/2407.03813

Report

PVUW 2024 Challenge on Complex Video Understanding: Methods and Results

The Pixel-level Video Understanding in the Wild Challenge (PVUW) focuses on complex video understanding. In this CVPR 2024 workshop, we add two new tracks, Complex Video Object Segmentation Track based on MOSE dataset and Motion Expression guided Video Seg

External link: http://arxiv.org/abs/2406.17005

Report

A Survey of Multimodal-Guided Image Editing with Text-to-Image Diffusion Models

Authors: Shuai, Xincheng; Ding, Henghui; Ma, Xingjun; Tu, Rongcheng; Jiang, Yu-Gang; Tao, Dacheng

Image editing aims to edit the given synthetic or real image to meet the specific requirements from users. It has been widely studied in recent years as a promising and challenging field of Artificial Intelligence Generative Content (AIGC). Recent signific

External link: http://arxiv.org/abs/2406.14555

Report

SemFlow: Binding Semantic Segmentation and Image Synthesis via Rectified Flow

Authors: Wang, Chaoyang; Li, Xiangtai; Qi, Lu; Ding, Henghui; Tong, Yunhai; Yang, Ming-Hsuan

Semantic segmentation and semantic image synthesis are two representative tasks in visual perception and generation. While existing methods consider them as two distinct tasks, we propose a unified diffusion-based framework (SemFlow) and model them a

External link: http://arxiv.org/abs/2405.20282

Report

Cross-Domain Knowledge Distillation for Low-Resolution Human Pose Estimation

Authors: Gu, Zejun; Zhao, Zhong-Qiu; Ding, Henghui; Shen, Hao; Zhang, Zhao; Huang, De-Shuang

In practical applications of human pose estimation, low-resolution inputs frequently occur, and existing state-of-the-art models perform poorly with low-resolution images. This work focuses on boosting the performance of low-resolution models by dist

External link: http://arxiv.org/abs/2405.11448

Report

Mitigating the Curse of Dimensionality for Certified Robustness via Dual Randomized Smoothing

Authors: Xia, Song; Yu, Yi; Jiang, Xudong; Ding, Henghui

Randomized Smoothing (RS) has been proven a promising method for endowing an arbitrary image classifier with certified robustness. However, the substantial uncertainty inherent in the high-dimensional isotropic Gaussian noise imposes the curse of dim

External link: http://arxiv.org/abs/2404.09586
