Author: |
Urala Kota, Bhargava; Davila, Kenny; Stone, Alexander; Setlur, Srirangaraj; Govindaraju, Venu |
Source: |
International Journal on Document Analysis and Recognition; 20240101, Issue: Preprints p1-13, 13p |
Abstract: |
We propose a framework to extract and binarize handwritten content in lecture videos. The extracted content could be used to index video collections, enabling content-based search and navigation within lecture videos for students and educators worldwide. A deep learning pipeline detects handwritten text, formulae, and sketches and then binarizes the extracted content. We exploit the spatio-temporal structure of the binarized detections to compute associativity information for content across all video frames; this information is later used to segment the video. Experiments compare key components of the framework with existing methods, both in isolation and in terms of their impact on overall performance. We evaluate the framework on the publicly available AccessMath lecture video dataset, obtaining an F-measure of 94.32% for binary connected components. Code for the framework (including trained weights) and summarization will be released. |
Database: |
Supplemental Index |