SWAGAN: A Style-based Wavelet-driven Generative Model

Author:	Daniel Cohen-Or, Dana Cohen Hochberg, Rinon Gal, Amit Bermano
Language:	English
Year of publication:	2021
Subject:	FOS: Computer and information sciences; Discriminator; Artificial neural network; Computer science; Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV); Computer Science - Computer Vision and Pattern Recognition; Electrical Engineering and Systems Science - Image and Video Processing; Machine learning; Computer Graphics and Computer-Aided Design; Domain (software engineering); Generative model; Wavelet; Frequency domain; FOS: Electrical engineering, electronic engineering, information engineering; Artificial intelligence; Representation (mathematics); Generator (mathematics)
Description:	In recent years, considerable progress has been made in the visual quality of Generative Adversarial Networks (GANs). Even so, these networks still suffer from degradation in quality for high-frequency content, stemming from a spectrally biased architecture and similarly unfavorable loss functions. To address this issue, we present a novel general-purpose Style and WAvelet based GAN (SWAGAN) that implements progressive generation in the frequency domain. SWAGAN incorporates wavelets throughout its generator and discriminator architectures, enforcing a frequency-aware latent representation at every step of the way. This approach, designed to directly tackle the spectral bias of neural networks, yields an improvement in the ability to generate medium- and high-frequency content, including structures that other networks fail to learn. We demonstrate the advantage of our method by integrating it into the StyleGAN2 framework and verifying that content generation in the wavelet domain leads to more realistic high-frequency content, even when trained for fewer iterations. Furthermore, we verify that our model's latent space retains the qualities that allow StyleGAN to serve as a basis for a multitude of editing tasks, and show that our frequency-aware approach also induces improved high-frequency performance in downstream tasks.
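To make "content generation in the wavelet domain" concrete, the following is a minimal illustrative sketch of a single-level 2D wavelet decomposition and its exact inverse. It is not the authors' implementation: the abstract does not say which wavelet is used, so the Haar wavelet is assumed here, and the function names `haar_dwt2` and `haar_idwt2` are hypothetical. The point is only that an image splits into one low-frequency band plus three detail bands, and that the image is recovered exactly from those four bands.

```python
# Illustrative sketch only; the Haar wavelet and all names here are assumptions,
# not taken from the SWAGAN paper.

def haar_dwt2(img):
    """One level of an orthonormal 2D Haar transform.

    `img` is a list of rows with even height and width.
    Returns four half-resolution bands: (ll, d_cols, d_rows, d_diag).
    """
    ll, d_cols, d_rows, d_diag = [], [], [], []
    for i in range(0, len(img), 2):
        r_ll, r_c, r_r, r_d = [], [], [], []
        for j in range(0, len(img[0]), 2):
            a, b = img[i][j], img[i][j + 1]          # top-left, top-right
            c, d = img[i + 1][j], img[i + 1][j + 1]  # bottom-left, bottom-right
            r_ll.append((a + b + c + d) / 2)  # local average (low frequency)
            r_c.append((a - b + c - d) / 2)   # left column minus right column
            r_r.append((a + b - c - d) / 2)   # top row minus bottom row
            r_d.append((a - b - c + d) / 2)   # diagonal difference
        ll.append(r_ll)
        d_cols.append(r_c)
        d_rows.append(r_r)
        d_diag.append(r_d)
    return ll, d_cols, d_rows, d_diag


def haar_idwt2(ll, d_cols, d_rows, d_diag):
    """Inverse of haar_dwt2: rebuild the full-resolution image from four bands."""
    h, w = len(ll), len(ll[0])
    out = [[0.0] * (2 * w) for _ in range(2 * h)]
    for i in range(h):
        for j in range(w):
            s, x, y, z = ll[i][j], d_cols[i][j], d_rows[i][j], d_diag[i][j]
            out[2 * i][2 * j] = (s + x + y + z) / 2
            out[2 * i][2 * j + 1] = (s - x + y - z) / 2
            out[2 * i + 1][2 * j] = (s + x - y - z) / 2
            out[2 * i + 1][2 * j + 1] = (s - x - y + z) / 2
    return out


if __name__ == "__main__":
    image = [[1, 2, 3, 4],
             [5, 6, 7, 8],
             [9, 10, 11, 12],
             [13, 14, 15, 16]]
    bands = haar_dwt2(image)
    restored = haar_idwt2(*bands)
    print(restored == [[float(v) for v in row] for row in image])
```

In a sketch like this, a network that outputs the four bands, rather than pixels, has its high-frequency detail represented explicitly as separate channels; the inverse transform then turns those channels back into an image.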
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::666fa4e4c7aa5b7066e68b0260b027b3 http://arxiv.org/abs/2102.06108