Scene Classification for Sports Video Summarization Using Transfer Learning
Author: | Gyu Sang Choi, Seong-Il Jin, Ghazala Rafiq, Muhammad Rafiq, Rockson Agyeman |
---|---|
Language: | English |
Publication year: | 2020 |
Subject: |
Computer science; 02 engineering and technology; Machine learning; computer.software_genre; lcsh:Chemical technology; Biochemistry; Convolutional neural network; Article; Analytical Chemistry; Task (project management); Component (UML); 0202 electrical engineering, electronic engineering, information engineering; lcsh:TP1-1185; Electrical and Electronic Engineering; Instrumentation; small dataset; business.industry; Deep learning; 020207 software engineering; Automatic summarization; Atomic and Molecular Physics and Optics; 020201 artificial intelligence & image processing; alexnet; cnn; Artificial intelligence; business; Transfer of learning; computer; Encoder; data augmentation |
Source: | Sensors, Vol 20, Iss 6, p 1702 (2020); Sensors, Volume 20, Issue 6; Sensors (Basel, Switzerland) |
ISSN: | 1424-8220 |
Description: | This paper proposes a novel method for sports video scene classification, aimed specifically at video summarization. A short version of a video is often more appealing to viewers than the full version because it offers instant entertainment. Producing such summaries manually is tedious, requiring significant labor hours and tying up machines unnecessarily. Due to the growing demand for video summarization in marketing, advertising agencies, awareness videos, documentaries, and other interest groups, researchers are continuously proposing automation frameworks and novel schemes. Since scene classification is a fundamental component of video summarization and video analysis, its quality is particularly important. This article focuses on practical implementation gaps in existing techniques and presents a method for achieving high-quality scene classification. We consider cricket as a case study and classify five scene categories: batting, bowling, boundary, crowd, and close-up. Our model uses a pre-trained AlexNet Convolutional Neural Network (CNN) for scene classification and adds new fully connected layers arranged in an encoder fashion. We employ data augmentation to achieve a high accuracy of 99.26% on a small dataset. We compare the method against baseline approaches and state-of-the-art models to demonstrate its superiority. We evaluate performance on cricket videos and compare various deep-learning models, i.e., Inception V3, Visual Geometry Group networks (VGGNet16, VGGNet19), Residual Network (ResNet50), and AlexNet. Our experiments demonstrate that our method with the AlexNet CNN produces better results than existing proposals. |
Database: | OpenAIRE |