Search results - "Shi, Hengcan"

Report

DrVideo: Document Retrieval Based Long Video Understanding

Authors: Ma, Ziyu; Gou, Chenhui; Shi, Hengcan; Sun, Bin; Li, Shutao; Rezatofighi, Hamid; Cai, Jianfei

Existing methods for long video understanding primarily focus on videos only lasting tens of seconds, with limited exploration of techniques for handling longer videos. The increased number of frames in longer videos presents two main challenges: dif

External link: http://arxiv.org/abs/2406.12846

Report

DifFUSER: Diffusion Model for Robust Multi-Sensor Fusion in 3D Object Detection and BEV Segmentation

Authors: Le, Duy-Tho; Shi, Hengcan; Cai, Jianfei; Rezatofighi, Hamid

Diffusion models have recently gained prominence as powerful deep generative models, demonstrating unmatched performance across various domains. However, their potential in multi-sensor fusion remains largely unexplored. In this work, we introduce Di

External link: http://arxiv.org/abs/2404.04629

Report

JRDB-PanoTrack: An Open-world Panoptic Segmentation and Tracking Robotic Dataset in Crowded Human Environments

Authors: Le, Duy-Tho; Gou, Chenhui; Datta, Stavya; Shi, Hengcan; Reid, Ian; Cai, Jianfei; Rezatofighi, Hamid

Autonomous robot systems have attracted increasing research attention in recent years, where environment understanding is a crucial step for robot navigation, human-robot interaction, and decision. Real-world robot systems usually collect visual data

External link: http://arxiv.org/abs/2404.01686

Report

Open-Vocabulary Scene Text Recognition via Pseudo-Image Labeling and Margin Loss

Authors: Ren, Xuhua; Shi, Hengcan; Li, Jin

Scene text recognition is an important and challenging task in computer vision. However, most prior works focus on recognizing pre-defined words, while there are various out-of-vocabulary (OOV) words in real-world applications. In this paper, we prop

External link: http://arxiv.org/abs/2403.07518

Report

Unified Open-Vocabulary Dense Visual Prediction

Authors: Shi, Hengcan; Hayat, Munawar; Cai, Jianfei

In recent years, open-vocabulary (OV) dense visual prediction (such as OV object detection, semantic, instance and panoptic segmentations) has attracted increasing research attention. However, most of existing approaches are task-specific and individ

External link: http://arxiv.org/abs/2307.08238

Report

CoactSeg: Learning from Heterogeneous Data for New Multiple Sclerosis Lesion Segmentation

Authors: Wu, Yicheng; Wu, Zhonghua; Shi, Hengcan; Picker, Bjoern; Chong, Winston; Cai, Jianfei

New lesion segmentation is essential to estimate the disease progression and therapeutic effects during multiple sclerosis (MS) clinical treatments. However, the expensive data acquisition and expert annotation restrict the feasibility of applying la

External link: http://arxiv.org/abs/2307.04513

Report

Open-Vocabulary Object Detection via Scene Graph Discovery

Authors: Shi, Hengcan; Hayat, Munawar; Cai, Jianfei

In recent years, open-vocabulary (OV) object detection has attracted increasing research attention. Unlike traditional detection, which only recognizes fixed-category objects, OV detection aims to detect objects in an open category set. Previous work

External link: http://arxiv.org/abs/2307.03339

Report

Class Enhancement Losses with Pseudo Labels for Zero-shot Semantic Segmentation

Authors: Dao, Son Duy; Shi, Hengcan; Phung, Dinh; Cai, Jianfei

Recent mask proposal models have significantly improved the performance of zero-shot semantic segmentation. However, the use of a 'background' embedding during training in these methods is problematic as the resulting model tends to over-learn and as

External link: http://arxiv.org/abs/2301.07336

Report

Transformer Scale Gate for Semantic Segmentation

Authors: Shi, Hengcan; Hayat, Munawar; Cai, Jianfei

Effectively encoding multi-scale contextual information is crucial for accurate semantic segmentation. Existing transformer-based segmentation models combine features across scales without any selection, where features on sub-optimal scales may degra

External link: http://arxiv.org/abs/2205.07056

Report

ProposalCLIP: Unsupervised Open-Category Object Proposal Generation via Exploiting CLIP Cues

Authors: Shi, Hengcan; Hayat, Munawar; Wu, Yicheng; Cai, Jianfei

Object proposal generation is an important and fundamental task in computer vision. In this paper, we propose ProposalCLIP, a method towards unsupervised open-category object proposal generation. Unlike previous works which require a large number of

External link: http://arxiv.org/abs/2201.06696
