Showing 1 - 1 of 1 for search: '"Nagasawa, Haruki"'
This paper proposes a practical multimodal video summarization task setting and a dataset to train and evaluate the task. The target task involves summarizing a given video into a predefined number of keyframe-caption pairs and displaying them in a l
External link:
http://arxiv.org/abs/2312.01575