Automatic Image and Video Caption Generation With Deep Learning: A Concise Review and Algorithmic Overlap

Authors: Thiab R. Taha, Khaled Rasheed, Soheyla Amirian, Hamid R. Arabnia
Year of publication: 2020
Subject:
Source: IEEE Access, Vol 8, Pp 218386-218400 (2020)
ISSN: 2169-3536
Description: Methodologies that utilize Deep Learning offer great potential for applications that automatically generate captions or descriptions for images and video frames. Image and video captioning are considered intellectually challenging problems in imaging science. Application domains include automatic caption (or description) generation for images and videos for people with various degrees of visual impairment; automatic creation of metadata for images and videos (indexing) for use by search engines; general-purpose robot vision systems; and many others. Each of these application domains can positively and significantly impact many other task-specific applications. This article is not meant to be a comprehensive review of image captioning; rather, it is a concise review of both image captioning and video captioning methodologies based on deep learning. The study treats image and video captioning together, emphasizing the algorithmic overlap between the two.
Database: OpenAIRE