Author: |
Souza, Marcos Roberto; de Almeida Maia, Helena; Souza Santos, Anderson Carlos; Vieira, Marcelo Bernardes; Pedrini, Helio |
Subject: |
|
Source: |
Applied Artificial Intelligence; 2022, Vol. 36 Issue 1, p1-32, 32p |
Abstract: |
Localizing video captions plays an important role in information retrieval for multimedia applications. In this work, we present and evaluate a novel method for localizing video captions using visual rhythms, which enable a specific feature to be represented and analyzed over time. We build visual rhythms from the text location maps produced by general text localization methods, which are far more common in the literature than caption-oriented ones. We then process these maps to keep only the captions, generating caption localization masks. To meet the need for a large, standardized dataset, we constructed a new one in which captions in thirteen different scripts are added to the video frames, yielding a total of 221 videos with ground truth. Experiments demonstrate that our method achieves competitive results compared with other approaches in the literature. [ABSTRACT FROM AUTHOR] |
Database: |
Complementary Index |