Event modelling and recognition in video

Author: Gkalelis, Nikolaos
Year of publication: 2013
Subject:
Document type: Electronic Thesis or Dissertation
Description: The management of digital video has become a very challenging problem as the amount of video content continues to grow rapidly. This trend necessitates the development of advanced techniques for the efficient and effective manipulation of video information. However, the performance of current video processing tools has not yet reached the required level, mainly due to the gap between computer-generated semantic descriptions of video content and human interpretations of the same content, a discrepancy commonly referred to as the semantic gap. Inspired by recent studies in neuroscience suggesting that humans remember real life using past experience structured in events, in this thesis we investigate the use of appropriate models and machine learning approaches for representing and recognizing events in video. Specifically, a joint content-event model is proposed for describing video content (e.g., shots, scenes), as well as real-life events (e.g., demonstration, birthday party) and their key semantic entities (participants, location, etc.). At the core of this model is a referencing mechanism that utilizes a set of video analysis algorithms for the automatic generation of event model instances and their enrichment with semantic information extracted from the video content. In particular, a set of subclass discriminant analysis and support vector machine methods is proposed for handling data nonlinearities and addressing several limitations of current state-of-the-art approaches. These approaches are evaluated using several publicly available benchmarks particularly suited for testing the robustness and reliability of nonlinear classification methods, such as the facial image collection of the Four Face database, datasets from the UCI repository, and others.
Moreover, the most efficient of the proposed methods are additionally evaluated on the tasks of event detection and event recounting using a large-scale video collection consisting of the datasets provided in the TRECVID multimedia event detection (MED) track of 2010 and 2011, which are among the most challenging in this field. This experiment is designed so that it can be regarded as a fundamental evaluation of the proposed joint content-event model.
Database: Networked Digital Library of Theses & Dissertations