Image Generation from Image Captioning -- Invertible Approach

Author:	Menon, Nandakishore S; Kamanchi, Chandramouli; Diddigi, Raghuram Bharadwaj
Year of publication:	2024
Subject:	Computer Science - Computer Vision and Pattern Recognition
Document type:	Working Paper
Description:	Our work aims to build a model that performs the dual tasks of image captioning and image generation while being trained on only one of them. The central idea is to train an invertible model that learns a one-to-one mapping between image and text embeddings. Once the invertible model is efficiently trained on one task (image captioning), the same model can generate new images for a given text through the inversion process, with no additional training. This paper proposes a simple invertible neural network architecture for this problem and presents our current findings. Comment: Accepted as a Tiny Paper at the ICVGIP 2024 conference.
Database:	arXiv
External link:	http://arxiv.org/abs/2410.20171
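
Note:	The abstract's central idea, a one-to-one mapping that is trained in one direction and then run backwards, can be illustrated with a minimal sketch. The record does not describe the authors' architecture, so the code below swaps in a generic affine coupling layer (in the style of RealNVP) purely as an assumption. The names `make_coupling`, `forward`, and `inverse`, the random weights, and the 8-dimensional toy "embedding" are all hypothetical; no real image or text encoder is involved.

```python
import math
import random


def make_coupling(half_dim, seed=0):
    """Create random weights for a hypothetical affine coupling layer."""
    rng = random.Random(seed)
    w_scale = [[rng.gauss(0, 0.1) for _ in range(half_dim)] for _ in range(half_dim)]
    w_shift = [[rng.gauss(0, 0.1) for _ in range(half_dim)] for _ in range(half_dim)]
    return w_scale, w_shift


def _linear(weights, vector):
    """Plain matrix-vector product on Python lists."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def forward(params, x):
    """Map an 'image' embedding to a 'text' embedding (captioning direction)."""
    w_scale, w_shift = params
    half = len(x) // 2
    x1, x2 = x[:half], x[half:]
    # Scale and shift depend only on the untouched half x1,
    # which is what makes the layer exactly invertible.
    s = [math.tanh(a) for a in _linear(w_scale, x1)]
    t = _linear(w_shift, x1)
    y2 = [b * math.exp(si) + ti for b, si, ti in zip(x2, s, t)]
    return x1 + y2


def inverse(params, y):
    """Map a 'text' embedding back to an 'image' embedding (generation direction)."""
    w_scale, w_shift = params
    half = len(y) // 2
    y1, y2 = y[:half], y[half:]
    # y1 equals x1, so the same scale and shift can be recomputed and undone.
    s = [math.tanh(a) for a in _linear(w_scale, y1)]
    t = _linear(w_shift, y1)
    x2 = [(b - ti) * math.exp(-si) for b, si, ti in zip(y2, s, t)]
    return y1 + x2


params = make_coupling(half_dim=4, seed=0)
data_rng = random.Random(1)
image_embedding = [data_rng.gauss(0, 1) for _ in range(8)]  # toy stand-in
text_embedding = forward(params, image_embedding)
recovered = inverse(params, text_embedding)
max_error = max(abs(a - b) for a, b in zip(recovered, image_embedding))
```

In this sketch `inverse` reuses exactly the parameters of `forward`, so a layer fitted only in the forward direction can be run backwards without further training, which is the property the abstract relies on. A practical model would presumably stack several such layers and operate on embeddings from real encoders.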