Sequence-to-Segments Networks for Detecting Segments in Videos

Authors: Radomir Mech, Jianming Zhang, Minh Hoai, Zijun Wei, Boyu Wang, Xiaohui Shen, Dimitris Samaras, Zhe Lin
Year of publication: 2021
Subject:
Source: IEEE Transactions on Pattern Analysis and Machine Intelligence. 43:1009-1021
ISSN: 1939-3539, 0162-8828
DOI: 10.1109/tpami.2019.2940225
Description: Detecting segments of interest in videos is a common problem in many applications. Yet it is challenging, as it often requires not only knowledge of individual target segments but also contextual understanding of the entire video and of the relationships between the target segments. To address this problem, we propose the Sequence-to-Segments Network (S2N), a novel and general end-to-end sequential encoder-decoder architecture. S2N first encodes the input video into a sequence of hidden states that capture information progressively, as it appears in the video. It then employs the Segment Detection Unit (SDU), a novel decoding architecture that sequentially detects segments. At each decoding step, the SDU integrates the decoder state and the encoder hidden states to detect a target segment. During training, we address the problem of finding the best assignment of predicted segments to ground truth using the Hungarian Matching Algorithm with Lexicographic Cost. Additionally, we propose using the squared Earth Mover's Distance to optimize the localization errors of the segments. We show state-of-the-art performance of S2N across numerous tasks, including video highlighting, video summarization, and human action proposal generation.
Database: OpenAIRE
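
Note: the description mentions the squared Earth Mover's Distance as a localization loss without giving its form. The sketch below is not taken from the paper. It shows one common closed form for two 1-D distributions over the same ordered positions, namely the sum of squared differences between their cumulative sums. The function name `squared_emd` and the example vectors are hypothetical, and both inputs are assumed to be normalized to sum to 1.

```python
from itertools import accumulate


def squared_emd(p, q):
    """Squared EMD between two 1-D distributions over the same ordered bins.

    Assumption: p and q are sequences of non-negative numbers that each
    sum to 1 and are indexed by the same ordered positions (e.g. frames).
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    # Compare the running totals (CDFs) position by position.
    cdf_p = accumulate(p)
    cdf_q = accumulate(q)
    return sum((a - b) ** 2 for a, b in zip(cdf_p, cdf_q))


# Hypothetical example: the true boundary sits at position 2 of 5.
target = [0.0, 0.0, 1.0, 0.0, 0.0]
near = [0.0, 0.0, 0.0, 1.0, 0.0]  # prediction off by one position
far = [1.0, 0.0, 0.0, 0.0, 0.0]   # prediction off by two positions

print(squared_emd(target, near))
print(squared_emd(target, far))
```

In this toy case the near miss receives a smaller penalty than the far miss, which is the property that makes a distance of this kind suited to boundary localization; a loss that ignores the ordering of positions would treat both misses alike.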