On-topic Cover Stories from News Archives

Authors: Christian Schulte, Bilyana Taneva, Gerhard Weikum
Year of publication: 2015
Subject:
Source: Lecture Notes in Computer Science ISBN: 9783319163536
ECIR
DOI: 10.1007/978-3-319-16354-3_4
Description: While Web or newspaper archives store large numbers of articles, they also contain a lot of near-duplicate information. Examples include articles about the same event published by multiple news agencies, or articles about evolving events that copy paragraphs from earlier reports to provide background information. To support journalists who attempt to read all information on a given topic at once, we propose an approach that, given a topic and a text collection, extracts a set of articles with broad coverage of the topic and a minimal number of duplicates.
Database: OpenAIRE