Leveraging auxiliary image descriptions for dense video captioning

Authors:	Emre Boran, Erkut Erdem, Aykut Erdem, Nazli Ikizler-Cinbis, Pranava Madhyastha, Lucia Specia
Year of publication:	2021
Subject:	Closed captioning; Computer science; InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; Context (language use); 02 engineering and technology; computer.software_genre; 01 natural sciences; Task (project management); Artificial Intelligence; 0103 physical sciences; 0202 electrical engineering, electronic engineering, information engineering; Leverage (statistics); 010306 general physics; BLEU; Event (computing); business.industry; Signal Processing; ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; 020201 artificial intelligence & image processing; Computer Vision and Pattern Recognition; Artificial intelligence; Paragraph; business; computer; Software; Natural language processing; Generator (mathematics)
Source:	Pattern Recognition Letters. 146:70-76
ISSN:	0167-8655
DOI:	10.1016/j.patrec.2021.02.009
Description:	Collecting textual descriptions is an especially costly task for dense video captioning, since each event in the video needs to be annotated separately and a long descriptive paragraph needs to be provided. In this paper, we investigate a way to mitigate this heavy burden and propose to leverage captions of visually similar images as auxiliary context. Our model successfully fetches visually relevant images and combines noun and verb phrases from their captions to generate coherent descriptions. To this end, we use a generator and discriminator design, together with an attention-based fusion technique, to incorporate image captions as context in the video caption generation process. Experiments on the challenging ActivityNet Captions dataset demonstrate that our proposed approach produces more accurate and more diverse video descriptions than a strong baseline, as measured by the METEOR, BLEU and CIDEr-D metrics and by qualitative evaluations.
Database:	OpenAIRE
External links:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::6cbb53b1f0ca3454aa71e554386cf18d https://doi.org/10.1016/j.patrec.2021.02.009