Shot boundary detection using deep convolutional networks
Autor: | Zhao, Chenjia |
---|---|
Přispěvatelé: | Tarrés Ruiz, Francisco, Raventós Mayoral, Arnau |
Jazyk: | angličtina |
Rok vydání: | 2018 |
Předmět: |
Neural networks (Computer science)
Deep Learning Tensorflow Enginyeria de la telecomunicació::Telemàtica i xarxes d'ordinadors [Àrees temàtiques de la UPC] Image Processing ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION Xarxes neuronals (Informàtica) Shot detection Enginyeria de la telecomunicació [Àrees temàtiques de la UPC] Aprenentatge profund |
Zdroj: | UPCommons. Portal del coneixement obert de la UPC Universitat Politècnica de Catalunya (UPC) Recercat. Dipósit de la Recerca de Catalunya instname |
Popis: | Shot boundary detection (SBD) is the process of automatically detecting the boundaries between shots in the videos, which is an important pre - processing step for vi deo analysis, such as indexing, browsing, summarization and other content - based operations. Nowadays with the continuously growing video data, traditional technologies based on low - level features of the frames (such as color histogram) no longer fits the r equirements, not only from the speed point of view, but also the accuracy. Knowing that convolutional neural networks (CNN) become unprecedented popular in image processing these years, due to it’s powerfulness in feature analysis and classification, we implement a fully convolutional neural network based on the paper created by Michael Gygli, which is a 3 - dimensional neural network. Our purpose is to detect the middle shot boundary in every 10 frames, that is to say, to estimate if there is shot boundary between the 5th and the 6th frame out of 10. While implementing the neural network architecture several modifications have been tested and proposed. Thus, we created a proprietary dataset with thousands of frames and generated different transitions such as cuts and gradual transitions, to put them into our 3 - dimensional network for training. The advantage of fully convolutional networks is that allows to use a large temporal context without the need of repeatedly processing frames. By testing with our evaluation dataset, in the end of the project we got a satisfactory result with the accuracy approximately 95%. And we found the weakness of the network through analysis of each kind of shot transition. We hope that our efforts for the project will contrib ute to the investigation of shot detection strat e gies based on convolutional neural network |
Databáze: | OpenAIRE |
Externí odkaz: |