URCA-GAN: UpSample Residual Channel-wise Attention Generative Adversarial Network for image-to-image translation
| Author | Manhua Qi, Yifei Wang, Edward K. Wong, Haoxuan Ding, Xuan Nie |
|---|---|
| Year of publication | 2021 |
| Subject | Industrial biotechnology; Channel (digital image); Computer science; Cognitive Neuroscience; Pattern recognition; Translation (geometry); Residual; Computer Science Applications; Visualization; Image (mathematics); Upsampling; Industrial engineering & automation; Artificial intelligence; Softmax function; Electrical engineering, electronic engineering, information engineering; Image translation; Artificial intelligence & image processing |
| Source | Neurocomputing 443:75–84 |
| ISSN | 0925-2312 |
| DOI | 10.1016/j.neucom.2021.02.054 |
| Abstract | Multimodal image-to-image translation is a challenging topic in computer vision. In image-to-image translation, an image is translated from a source domain to different target domains. For many translation tasks, the source and target images differ only in the foreground. In this paper, we propose a novel deep-learning-based method for image-to-image translation. Our method, named URCA-GAN, is based on a generative adversarial network and generates images of higher quality and diversity than existing methods. We introduce Upsample Residual Channel-wise Attention Blocks (URCABs), based on ResNet and softmax channel-wise attention, to extract features associated with the foreground. The URCABs are arranged in parallel to form the Upsample Residual Channel-wise Attention Module (URCAM), which merges their features; URCAM is embedded after the decoder in the generator to regulate image generation (an illustrative sketch of this structure follows the record below). Experimental results and quantitative evaluations show that our model outperforms current state-of-the-art methods in both quality and diversity. In particular, the LPIPS, PSNR, and SSIM of URCA-GAN on the CelebA dataset increase by 1.31%, 1.66%, and 4.74%, respectively, and the PSNR and SSIM on the RaFD dataset increase by 1.35% and 6.71%, respectively. In addition, visualization of the features from the URCABs demonstrates that our model puts emphasis on foreground features. |
| Database | OpenAIRE |
| External link | |
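
The abstract describes URCABs as ResNet-style blocks with softmax channel-wise attention and upsampling, arranged in parallel and merged by URCAM after the generator's decoder. The following is a minimal, hypothetical PyTorch-style sketch of that structure, not the authors' implementation: the layer widths, the exact attention computation, the number of parallel branches, and the averaging merge are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class URCAB(nn.Module):
    """Illustrative upsample residual block with softmax channel-wise attention."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.pool = nn.AdaptiveAvgPool2d(1)      # squeeze spatial dims to per-channel statistics
        self.fc = nn.Linear(channels, channels)  # map statistics to channel attention scores

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = F.relu(self.conv1(x))
        h = self.conv2(h)
        # Softmax channel-wise attention: weights across channels sum to 1.
        w = F.softmax(self.fc(self.pool(h).flatten(1)), dim=1)
        h = h * w[:, :, None, None]              # reweight feature channels
        out = x + h                              # ResNet-style residual connection
        return F.interpolate(out, scale_factor=2, mode="nearest")  # 2x upsampling (assumed)


class URCAM(nn.Module):
    """Illustrative parallel module: several URCABs whose outputs are merged."""

    def __init__(self, channels: int, num_branches: int = 3):
        super().__init__()
        self.branches = nn.ModuleList(URCAB(channels) for _ in range(num_branches))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Merge the parallel branch outputs; averaging is a guess at the merge rule.
        return torch.stack([b(x) for b in self.branches]).mean(dim=0)
```

In the paper's design a module of this kind sits after the decoder of the generator to regulate the generated image; how its output is combined with the decoder output is not specified in the record above, so this sketch leaves that step out.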