Authors: |
Danish Maruf, Ramesh M. Kagalkar, Prasad Khot, Sanket Potdar, Rudraneel Bhaumik |
Publication year: |
2020 |
Subject: |
|
Source: |
Learning and Analytics in Intelligent Systems ISBN: 9783030469382 |
DOI: |
10.1007/978-3-030-46939-9_52 |
Description: |
Humans use language, either written or spoken, to describe the visual world around them. Interest in generating text descriptions of video is growing. This paper presents a system that produces English descriptions of complex video samples: a framework that outputs a description for any long video containing multiple objects. The system is divided into two modules, training and testing. The training module extracts the unique features of each video and stores them, together with the video's description, in a database. The testing module takes a video sample through frame extraction, preprocessing, segmentation, and feature extraction; the extracted features are then compared with the features computed in the training module to identify and classify the video action, and finally a language model generates the text description. Sentences are generated from objects. For this assessment, a dataset was collected from YouTube, comprising 250 samples from 50 domains. The system's performance was evaluated, giving an accuracy of 90% with minimum processing time for object 2. |
Database: |
OpenAIRE |
External link: |
|