Popis: |
Human-face-to-anime-face translation has attracted the attention of many researchers in recent years, and various works have achieved high-quality style transfer on conventional tasks. However, existing methods often break down when training data for the target domain is severely insufficient, a scenario we refer to as the imbalanced setting. In this imbalanced (low-resource) setting, the training set is much smaller than in the conventional task, e.g., fewer than 100 training images. To address this problem, we propose a multi-modal translation model for a specific style. Building on a cycle-consistent adversarial network and class activation maps, we introduce a semantic modality to enrich the data and add attention modules that help the model focus on the discriminative regions between the source and target domains. Experimental results show that our method outperforms existing comparable work in low-resource settings.