News Story Clustering with Bag of Word Model and Affinity Propagation

Author: Chin-Chao Huang, 黃朝琴
Year of publication: 2011
Document type: Thesis (學位論文)
Description: 99
24-hour news TV channels repeat the same news stories again and again. To spare viewers from browsing repeated stories, we propose a framework that clusters topic-related news stories together and thus facilitates efficient browsing and summarization. The proposed system continuously monitors news broadcasts and automatically segments news videos into shots, removes commercial breaks, and detects anchorpersons. Each news story is represented by a bag-of-visual-words (BoW) model and a bag-of-trajectories (BoT) model, which describe what objects are present in it and how they appear. We also use concept detectors to detect concepts in news stories and apply the detected concepts to construct semantic features. We measure the similarity between news stories with the earth mover’s distance and use the affinity propagation algorithm to cluster stories of the same topic together. Four news videos captured from Taiwanese news TV channels are studied in this thesis. We evaluate news story clustering within a single news TV channel and across different channels, and we also report the performance of automatic news story segmentation. We verify that news story clustering is a much harder problem than near-duplicate detection and video copy detection, because the video content of news stories on the same topic may vary. We conclude that the proposed methods can cluster a variety of news stories effectively.
Database: Networked Digital Library of Theses & Dissertations
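
The similarity and clustering steps named in the abstract (earth mover's distance followed by affinity propagation) can be illustrated with a minimal sketch. This is not the thesis implementation. The six toy histograms, the function names `emd_1d` and `affinity_propagation`, the damping factor, the iteration count, and the median preference are all assumptions made for illustration. In particular, the EMD here treats histogram bins as ordered positions on a line, which is a simplification; real visual-word histograms would need a ground distance between codewords.

```python
from itertools import accumulate
from statistics import median


def emd_1d(p, q):
    """EMD between two normalized histograms over the same ordered bins.

    Assumption: ground distance is the difference in bin index, so the EMD
    reduces to the summed absolute difference of the cumulative sums.
    """
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))


def affinity_propagation(S, damping=0.5, iters=200):
    """Plain affinity propagation on a similarity matrix S (list of lists).

    S[k][k] holds the preference of point k. Returns, for each point,
    the index of the exemplar it is assigned to.
    """
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iters):
        # r(i,k) = s(i,k) - max over k' != k of (a(i,k') + s(i,k'))
        for i in range(n):
            v = [A[i][k] + S[i][k] for k in range(n)]
            for k in range(n):
                best = max(v[j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - best)
        # a(i,k) = min(0, r(k,k) + sum of positive r(i',k), i' not in {i,k})
        # a(k,k) = sum of positive r(i',k), i' != k
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]
            for i in range(n):
                new = total if i == k else min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:  # fallback: no exemplar emerged, keep points separate
        return list(range(n))
    return [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
            for i in range(n)]


# Hypothetical 4-bin word histograms for six stories (two topics of three).
stories = [
    [0.70, 0.20, 0.10, 0.00],
    [0.60, 0.30, 0.10, 0.00],
    [0.65, 0.25, 0.10, 0.00],
    [0.00, 0.10, 0.20, 0.70],
    [0.00, 0.10, 0.30, 0.60],
    [0.00, 0.10, 0.25, 0.65],
]
n = len(stories)

# Similarity is the negative distance, as is usual for affinity propagation.
S = [[-emd_1d(stories[i], stories[j]) for j in range(n)] for i in range(n)]

# Assumed preference: the median off-diagonal similarity for every point.
pref = median(S[i][j] for i in range(n) for j in range(n) if i != j)
for i in range(n):
    S[i][i] = pref

labels = affinity_propagation(S)
print(labels)  # stories sharing a label belong to the same cluster
```

Affinity propagation does not need the number of clusters in advance, which suits a stream of news stories whose topic count is unknown. The preference value controls how many exemplars emerge, so it would have to be tuned on real data.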