Search results - "Li, Ruihuang"

Report

Dense Multimodal Alignment for Open-Vocabulary 3D Scene Understanding

Authors: Li, Ruihuang; Zhang, Zhengqiang; He, Chenhang; Ma, Zhiyuan; Patel, Vishal M.; Zhang, Lei

Recent vision-language pre-training models have exhibited remarkable generalization ability in zero-shot recognition tasks. Previous open-vocabulary 3D scene understanding methods mostly focus on training 3D models using either image or text supervis

External link: http://arxiv.org/abs/2407.09781

Report

SyncNoise: Geometrically Consistent Noise Prediction for Text-based 3D Scene Editing

Authors: Li, Ruihuang; Chen, Liyi; Zhang, Zhengqiang; Jampani, Varun; Patel, Vishal M.; Zhang, Lei

Text-based 2D diffusion models have demonstrated impressive capabilities in image generation and editing. Meanwhile, the 2D diffusion models also exhibit substantial potentials for 3D editing tasks. However, how to achieve consistent edits across mul

External link: http://arxiv.org/abs/2406.17396

Report

Source Prompt Disentangled Inversion for Boosting Image Editability with Diffusion Models

Authors: Li, Ruibin; Li, Ruihuang; Guo, Song; Zhang, Lei

Text-driven diffusion models have significantly advanced the image editing performance by using text prompts as inputs. One crucial step in text-driven image editing is to invert the original image into a latent noise code conditioned on the source p

External link: http://arxiv.org/abs/2403.11105

Report

ScatterFormer: Efficient Voxel Transformer with Scattered Linear Attention

Authors: He, Chenhang; Li, Ruihuang; Zhang, Guowen; Zhang, Lei

Window-based transformers excel in large-scale point cloud understanding by capturing context-aware representations with affordable attention computation in a more localized manner. However, the sparse nature of point clouds leads to a significant va

External link: http://arxiv.org/abs/2401.00912

Report

TMP: Temporal Motion Propagation for Online Video Super-Resolution

Authors: Zhang, Zhengqiang; Li, Ruihuang; Guo, Shi; Cao, Yang; Zhang, Lei

Online video super-resolution (online-VSR) highly relies on an effective alignment module to aggregate temporal information, while the strict latency requirement makes accurate and efficient alignment very challenging. Though much progress has been a

External link: http://arxiv.org/abs/2312.09909

Report

One-to-Few Label Assignment for End-to-End Dense Detection

Authors: Li, Shuai; Li, Minghan; Li, Ruihuang; He, Chenhang; Zhang, Lei

One-to-one (o2o) label assignment plays a key role for transformer based end-to-end detection, and it has been recently introduced in fully convolutional detectors for end-to-end dense detection. However, o2o can degrade the feature learning efficien

External link: http://arxiv.org/abs/2303.11567

Report

SIM: Semantic-aware Instance Mask Generation for Box-Supervised Instance Segmentation

Authors: Li, Ruihuang; He, Chenhang; Zhang, Yabin; Li, Shuai; Chen, Liyi; Zhang, Lei

Weakly supervised instance segmentation using only bounding box annotations has recently attracted much research attention. Most of the current efforts leverage low-level image features as extra supervision without explicitly exploiting the high-leve

External link: http://arxiv.org/abs/2303.08578

Report

MSF: Motion-guided Sequential Fusion for Efficient 3D Object Detection from Point Cloud Sequences

Authors: He, Chenhang; Li, Ruihuang; Zhang, Yabin; Li, Shuai; Zhang, Lei

Point cloud sequences are commonly used to accurately detect 3D objects in applications such as autonomous driving. Current top-performing multi-frame detectors mostly follow a Detect-and-Fuse framework, which extracts features from each frame of the

External link: http://arxiv.org/abs/2303.08316

Report

DynaMask: Dynamic Mask Selection for Instance Segmentation

Authors: Li, Ruihuang; He, Chenhang; Li, Shuai; Zhang, Yabin; Zhang, Lei

The representative instance segmentation methods mostly segment different object instances with a mask of the fixed resolution, e.g., 28*28 grid. However, a low-resolution mask loses rich details, while a high-resolution mask incurs quadratic computa

External link: http://arxiv.org/abs/2303.07868

Report

Adversarial Style Augmentation for Domain Generalization

Authors: Zhang, Yabin; Deng, Bin; Li, Ruihuang; Jia, Kui; Zhang, Lei

It is well-known that the performance of well-trained deep neural networks may degrade significantly when they are applied to data with even slightly shifted distributions. Recent studies have shown that introducing certain perturbation on feature st

External link: http://arxiv.org/abs/2301.12643
