One-Step Diffusion Distillation through Score Implicit Matching

Author:	Luo, Weijian, Huang, Zemin, Geng, Zhengyang, Kolter, J. Zico, Qi, Guo-jun
Publication year:	2024
Subject:	Computer Science - Computer Vision and Pattern Recognition; Computer Science - Artificial Intelligence; Computer Science - Machine Learning
Source:	NeurIPS 2024
Document type:	Working Paper
Description:	Despite their strong performance on many generative tasks, diffusion models require a large number of sampling steps to generate realistic samples. This has motivated the community to develop effective methods to distill pre-trained diffusion models into more efficient models, but these methods still typically require few-step inference or perform substantially worse than the underlying model. In this paper, we present Score Implicit Matching (SIM), a new approach to distilling pre-trained diffusion models into single-step generator models that maintains almost the same sample generation ability as the original model and is data-free, requiring no training samples for distillation. The method rests upon the fact that, although the traditional score-based loss is intractable to minimize for generator models, under certain conditions we can efficiently compute the gradients for a wide class of score-based divergences between a diffusion model and a generator. SIM shows strong empirical performance for one-step generators: on the CIFAR10 dataset, it achieves an FID of 2.06 for unconditional generation and 1.96 for class-conditional generation. Moreover, by applying SIM to a leading transformer-based diffusion model, we distill a single-step generator for text-to-image (T2I) generation that attains an aesthetic score of 6.42 with no performance decline relative to the original multi-step counterpart, clearly outperforming other one-step generators, including SDXL-TURBO (5.33), SDXL-LIGHTNING (5.34), and HYPER-SDXL (5.85). We will release this industry-ready one-step transformer-based T2I generator along with this paper. Comment: Accepted by NeurIPS 2024
Database:	arXiv
External link:	http://arxiv.org/abs/2410.16794