SWAGAN: A Style-based Wavelet-driven Generative Model
Author: | Daniel Cohen-Or, Dana Cohen Hochberg, Rinon Gal, Amit Bermano |
---|---|
Language: | English |
Year of publication: | 2021 |
Subject: | FOS: Computer and information sciences; Discriminator; Artificial neural network; Computer science; Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV); Computer Science - Computer Vision and Pattern Recognition; Electrical Engineering and Systems Science - Image and Video Processing; Machine learning; Computer Graphics and Computer-Aided Design; Domain (software engineering); Generative model; Wavelet; Frequency domain; FOS: Electrical engineering, electronic engineering, information engineering; Artificial intelligence; Representation (mathematics); Generator (mathematics) |
Description: | In recent years, considerable progress has been made in the visual quality of Generative Adversarial Networks (GANs). Even so, these networks still suffer from degraded quality in high-frequency content, stemming from a spectrally biased architecture and similarly unfavorable loss functions. To address this issue, we present a novel general-purpose Style and WAvelet based GAN (SWAGAN) that implements progressive generation in the frequency domain. SWAGAN incorporates wavelets throughout its generator and discriminator architectures, enforcing a frequency-aware latent representation at every step. This approach, designed to directly tackle the spectral bias of neural networks, improves the ability to generate medium- and high-frequency content, including structures that other networks fail to learn. We demonstrate the advantage of our method by integrating it into the StyleGAN2 framework and verifying that content generation in the wavelet domain leads to more realistic high-frequency content, even when trained for fewer iterations. Furthermore, we verify that our model's latent space retains the qualities that allow StyleGAN to serve as a basis for a multitude of editing tasks, and show that our frequency-aware approach also induces improved high-frequency performance in downstream tasks. |
Database: | OpenAIRE |
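
The description says SWAGAN generates content in the wavelet domain, but this record does not state which wavelet is used or how the layers are wired. As a rough illustration of what a frequency-aware image representation can look like, the sketch below applies one level of a 2D Haar transform in pure Python. The choice of Haar, the function names `haar_dwt2` and `haar_idwt2`, and the sub-band keys are assumptions made for this example and are not taken from the record.

```python
# Illustrative sketch: one level of a 2D Haar wavelet transform in pure Python.
# Assumption: the Haar wavelet and every name below are chosen for illustration;
# the catalog record does not specify them.

def haar_dwt2(image):
    """Split a 2D list (even height and width) into four half-size sub-bands."""
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")
    bands = {"low": [], "col_diff": [], "row_diff": [], "diag": []}
    for r in range(0, rows, 2):
        out = {name: [] for name in bands}
        for c in range(0, cols, 2):
            a, b = image[r][c], image[r][c + 1]          # top-left, top-right
            d, e = image[r + 1][c], image[r + 1][c + 1]  # bottom-left, bottom-right
            out["low"].append((a + b + d + e) / 2.0)       # scaled local average
            out["col_diff"].append((a - b + d - e) / 2.0)  # left minus right
            out["row_diff"].append((a + b - d - e) / 2.0)  # top minus bottom
            out["diag"].append((a - b - d + e) / 2.0)      # diagonal difference
        for name in bands:
            bands[name].append(out[name])
    return bands


def haar_idwt2(bands):
    """Invert haar_dwt2, rebuilding the full-resolution 2D list."""
    rows, cols = len(bands["low"]), len(bands["low"][0])
    image = [[0.0] * (2 * cols) for _ in range(2 * rows)]
    for r in range(rows):
        for c in range(cols):
            lo = bands["low"][r][c]
            cd = bands["col_diff"][r][c]
            rd = bands["row_diff"][r][c]
            dg = bands["diag"][r][c]
            image[2 * r][2 * c] = (lo + cd + rd + dg) / 2.0
            image[2 * r][2 * c + 1] = (lo - cd + rd - dg) / 2.0
            image[2 * r + 1][2 * c] = (lo + cd - rd - dg) / 2.0
            image[2 * r + 1][2 * c + 1] = (lo - cd - rd + dg) / 2.0
    return image
```

Calling `haar_dwt2` on a flat image puts all of the signal in the `low` band and leaves the three detail bands at zero, while edges and fine texture show up in the detail bands. Because `haar_idwt2` inverts the transform, a model that outputs the four sub-bands can in principle still be mapped back to pixels, which is the general idea behind generating in the wavelet domain.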