Showing 1 - 9 of 9 for search: '"Hang, Tiankai"'
Diffusion models have emerged as the de facto choice for generating high-quality visual signals across various domains. However, training a single model to predict noise across various levels poses significant challenges, necessitating numerous itera
External link:
http://arxiv.org/abs/2407.03297
Authors:
Liang, Zhanhao, Yuan, Yuhui, Gu, Shuyang, Chen, Bohan, Hang, Tiankai, Cheng, Mingxi, Li, Ji, Zheng, Liang
Generating visually appealing images is fundamental to modern text-to-image generation models. A potential solution to better aesthetics is direct preference optimization (DPO), which has been applied to diffusion models to improve general image qual
External link:
http://arxiv.org/abs/2406.04314
This paper introduces a novel theoretical simplification of the Diffusion Schrödinger Bridge (DSB) that facilitates its unification with Score-based Generative Models (SGMs), addressing the limitations of DSB in complex data generation and enabling
External link:
http://arxiv.org/abs/2403.14623
This paper presents a novel generative model, Collaborative Competitive Agents (CCA), which leverages the capabilities of multiple Large Language Models (LLMs) based agents to execute complex tasks. Drawing inspiration from Generative Adversarial Net
External link:
http://arxiv.org/abs/2401.13011
Authors:
Geng, Zigang, Yang, Binxin, Hang, Tiankai, Li, Chen, Gu, Shuyang, Zhang, Ting, Bao, Jianmin, Zhang, Zheng, Hu, Han, Chen, Dong, Guo, Baining
We present InstructDiffusion, a unifying and generic framework for aligning computer vision tasks with human instructions. Unlike existing approaches that integrate prior knowledge and pre-define the output space (e.g., categories and coordinates) fo
External link:
http://arxiv.org/abs/2309.03895
Authors:
Hang, Tiankai, Gu, Shuyang, Li, Chen, Bao, Jianmin, Chen, Dong, Hu, Han, Geng, Xin, Guo, Baining
Denoising diffusion models have been a mainstream approach for image generation; however, training these models often suffers from slow convergence. In this paper, we discovered that the slow convergence is partly due to conflicting optimization dire
External link:
http://arxiv.org/abs/2303.09556
Recent works on language-guided image manipulation have shown great power of language in providing rich semantics, especially for face images. However, the other natural information, motions, in language is less explored. In this paper, we leverage t
External link:
http://arxiv.org/abs/2208.05617
Authors:
Xue, Hongwei, Hang, Tiankai, Zeng, Yanhong, Sun, Yuchong, Liu, Bei, Yang, Huan, Fu, Jianlong, Guo, Baining
Published in:
CVPR 2022
We study joint video and language (VL) pre-training to enable cross-modality learning and benefit plentiful downstream VL tasks. Existing works either extract low-quality video features or learn limited text embedding, while neglecting that high-reso
External link:
http://arxiv.org/abs/2111.10337