Following the Common Thread Through Word Hierarchies

Author: Matthias J. Feiler
Year of publication: 2018
Subject:
Source: Machine Learning and Data Mining in Pattern Recognition ISBN: 9783319961354
MLDM (1)
DOI: 10.1007/978-3-319-96136-1_13
Description: In this paper, we develop a new algorithm for automatic taxonomy construction from a text corpus. In contrast to existing work, our objective is not to develop a general-purpose lexicon or ontology but to identify the structure in a time-ordered sequence of documents. The idea is to identify “lead” words that let us follow the common thread in the public discourse on a specific topic. Our taxonomy represents the backbone of the discourse (including names of protagonists and places) and may change over time. It is thus less rigid and universal than a lexicon and instead targets relationships that are valid in a given context. We present an example to illustrate the idea.
Database: OpenAIRE