Showing 1 - 10 of 181 for search: '"Hou, Qibin"'
We present ClearSR, a new method that can better take advantage of latent low-resolution image (LR) embeddings for diffusion-based real-world image super-resolution (Real-ISR). Previous Real-ISR models mostly focus on how to activate more generative
External link:
http://arxiv.org/abs/2410.14279
Authors:
Wang, Jiabao, Liu, Zhaojiang, Meng, Qiang, Yan, Liujiang, Wang, Ke, Yang, Jie, Liu, Wei, Hou, Qibin, Cheng, Ming-Ming
Occupancy prediction, aiming at predicting the occupancy status within voxelized 3D environment, is quickly gaining momentum within the autonomous driving community. Mainstream occupancy prediction works first discretize the 3D environment into voxel
External link:
http://arxiv.org/abs/2409.09350
Authors:
Wang, Jiabao, Meng, Qiang, Liu, Guochao, Yan, Liujiang, Wang, Ke, Cheng, Ming-Ming, Hou, Qibin
In autonomous driving, the temporal stability of 3D object detection greatly impacts driving safety. However, detection stability cannot be assessed by existing metrics such as mAP and MOTA, and consequently is less explored by the community.
External link:
http://arxiv.org/abs/2407.04305
Pre-trained vision-language models, e.g., CLIP, have been successfully applied to zero-shot semantic segmentation. Existing CLIP-based approaches primarily utilize visual features from the last layer to align with text embeddings, while they neglect
External link:
http://arxiv.org/abs/2406.00670
For recent diffusion-based generative models, maintaining consistent content across a series of generated images, especially those containing subjects and complex details, presents a significant challenge. In this paper, we propose a new way of self-
External link:
http://arxiv.org/abs/2405.01434
Previous multi-task dense prediction methods based on the Mixture of Experts (MoE) have achieved strong performance, but they neglect the importance of explicitly modeling the global relations among all tasks. In this paper, we present a novel decoder-
External link:
http://arxiv.org/abs/2403.17749
Authors:
Li, Yuxuan, Li, Xiang, Dai, Yimian, Hou, Qibin, Liu, Li, Liu, Yongxiang, Cheng, Ming-Ming, Yang, Jian
Remote sensing images pose distinct challenges for downstream tasks due to their inherent complexity. While a considerable amount of research has been dedicated to remote sensing classification, object detection and semantic segmentation, most of the
External link:
http://arxiv.org/abs/2403.11735
Synthetic Aperture Radar (SAR) object detection has gained significant attention recently due to its irreplaceable all-weather imaging capabilities. However, this research field suffers from both limited public datasets (mostly comprising <2K images
External link:
http://arxiv.org/abs/2403.06534
The recently developed Sora model [1] has exhibited remarkable capabilities in video generation, sparking intense discussions regarding its ability to simulate real-world phenomena. Despite its growing popularity, there is a lack of established metri
External link:
http://arxiv.org/abs/2402.17403
Previous deep learning-based event denoising methods mostly suffer from poor interpretability and difficulty in real-time processing due to their complex architecture designs. In this paper, we propose window-based event denoising, which simultaneous
External link:
http://arxiv.org/abs/2402.09270