Showing 1 - 10 of 729 for search: '"Lu , Haonan"'
Author:
Yang, Fan, Zhao, Sicheng, Zhang, Yanhao, Chen, Haoxiang, Chen, Hui, Tang, Wenbo, Lu, Haonan, Xu, Pengfei, Yang, Zhenyu, Han, Jungong, Ding, Guiguang
Recent advancements in autonomous driving, augmented reality, robotics, and embodied intelligence have necessitated 3D perception algorithms. However, current 3D perception methods, particularly small models, struggle with processing logical reasonin
External link:
http://arxiv.org/abs/2408.07422
Posters play a crucial role in marketing and advertising by enhancing visual communication and brand visibility, making significant contributions to industrial design. With the latest advancements in controllable T2I diffusion models, increasing rese
External link:
http://arxiv.org/abs/2407.02252
Distilling latent diffusion models (LDMs) into ones that are fast to sample from is attracting growing research interest. However, the majority of existing methods face two critical challenges: (1) They hinge on long training using a huge volume of r
External link:
http://arxiv.org/abs/2406.05768
Large Language Models (LLMs) have shown their impressive capabilities, while also raising concerns about the data contamination problems due to privacy issues and leakage of benchmark datasets in the pre-training phase. Therefore, it is vital to dete
External link:
http://arxiv.org/abs/2406.01333
In the era of AIGC, the demand for low-budget or even on-device applications of diffusion models emerged. In terms of compressing the Stable Diffusion models (SDMs), several approaches have been proposed, and most of them leveraged the handcrafted la
External link:
http://arxiv.org/abs/2404.11098
Instruction tuning effectively optimizes Large Language Models (LLMs) for downstream tasks. Due to the changing environment in real-life applications, LLMs necessitate continual task-specific adaptation without catastrophic forgetting. Considering th
External link:
http://arxiv.org/abs/2403.11435
Author:
Liu, Hongjian, Xie, Qingsong, Deng, Zhijie, Chen, Chen, Tang, Shixiang, Fu, Fueyang, Zha, Zheng-jun, Lu, Haonan
The iterative sampling procedure employed by diffusion models (DMs) often leads to significant inference latency. To address this, we propose Stochastic Consistency Distillation (SCott) to enable accelerated text-to-image generation, where high-quali
External link:
http://arxiv.org/abs/2403.01505
Author:
Ai, Hao, Cao, Zidong, Lu, Haonan, Chen, Chen, Ma, Jian, Zhou, Pengyuan, Kim, Tae-Kyun, Hui, Pan, Wang, Lin
360° images, with a field-of-view (FoV) of 180°×360°, provide immersive and realistic environments for emerging virtual reality (VR) applications, such as virtual tourism, where users desire to create diverse panoramic scenes from a narrow FoV photo the
External link:
http://arxiv.org/abs/2401.10564
Text-to-image diffusion models are well-known for their ability to generate realistic images based on textual prompts. However, the existing works have predominantly focused on English, lacking support for non-English text-to-image models. The most c
External link:
http://arxiv.org/abs/2311.17086
Due to the success of large-scale visual-language pretraining (VLP) models and the widespread use of image-text retrieval in industry areas, it is now critically necessary to reduce the model size and streamline their mobile-device deployment. Single
External link:
http://arxiv.org/abs/2310.19654