Search results - "Zhao, Sanyuan"

Report

RepVF: A Unified Vector Fields Representation for Multi-task 3D Perception

Authors: Li, Chunliang, Han, Wencheng, Yin, Junbo, Zhao, Sanyuan, Shen, Jianbing

Concurrent processing of multiple autonomous driving 3D perception tasks within the same spatiotemporal scene poses a significant challenge, in particular due to the computational inefficiencies and feature competition between tasks when using tradit

External link: http://arxiv.org/abs/2407.10876

Report

InteractiveVideo: User-Centric Controllable Video Generation with Synergistic Multimodal Instructions

Authors: Zhang, Yiyuan, Kang, Yuhao, Zhang, Zhixin, Ding, Xiaohan, Zhao, Sanyuan, Yue, Xiangyu

We introduce $\textit{InteractiveVideo}$, a user-centric framework for video generation. Different from traditional generative approaches that operate based on user-provided images or text, our framework is designed for dynamic interaction, allowing

External link: http://arxiv.org/abs/2402.03040

Report

TextFormer: A Query-based End-to-End Text Spotter with Mixed Supervision

Authors: Zhai, Yukun, Zhang, Xiaoqiang, Qin, Xiameng, Zhao, Sanyuan, Dong, Xingping, Shen, Jianbing

End-to-end text spotting is a vital computer vision task that aims to integrate scene text detection and recognition into a unified framework. Typical methods heavily rely on Region-of-Interest (RoI) operations to extract local features and complex p

External link: http://arxiv.org/abs/2306.03377

Report

Generalized Few-Shot 3D Object Detection of LiDAR Point Cloud for Autonomous Driving

Authors: Liu, Jiawei, Dong, Xingping, Zhao, Sanyuan, Shen, Jianbing

Recent years have witnessed huge successes in 3D object detection to recognize common objects for autonomous driving (e.g., vehicles and pedestrians). However, most methods rely heavily on a large amount of well-labeled training data. This limits the

External link: http://arxiv.org/abs/2302.03914

Report

Bilateral Cross-Modality Graph Matching Attention for Feature Fusion in Visual Question Answering

Authors: Cao, JianJian, Qin, Xiameng, Zhao, Sanyuan, Shen, Jianbing

Answering semantically-complicated questions according to an image is challenging in Visual Question Answering (VQA) task. Although the image can be well represented by deep learning, the question is always simply embedded and cannot well indicate it

External link: http://arxiv.org/abs/2112.07270

Report

Self-Learning with Rectification Strategy for Human Parsing

Authors: Li, Tao, Liang, Zhiyuan, Zhao, Sanyuan, Gong, Jiahao, Shen, Jianbing

In this paper, we solve the sample shortage problem in the human parsing task. We begin with the self-learning strategy, which generates pseudo-labels for unlabeled data to retrain the model. However, directly using noisy pseudo-labels will cause err

External link: http://arxiv.org/abs/2004.08055

Academic article

Spectrum-irrelevant fine-grained representation for visible–infrared person re-identification

Authors: Gong, Jiahao, Zhao, Sanyuan, Lam, Kin-Man, Gao, Xin, Shen, Jianbing

Published in: Computer Vision and Image Understanding, July 2023, 232

Report

Improved Face Detection and Alignment using Cascade Deep Convolutional Network

Authors: Cong, Weilin, Zhao, Sanyuan, Tian, Hui, Shen, Jianbing

Real-world face detection and alignment demand an advanced discriminative model to address challenges by pose, lighting and expression. Illuminated by the deep learning algorithm, some convolutional neural networks based face detection and alignment

External link: http://arxiv.org/abs/1707.09364
