Following the Common Thread Through Word Hierarchies
Author: | Matthias J. Feiler |
---|---|
Year of publication: | 2018 |
Subject: | 0301 basic medicine; Change over time; Text corpus; Computer science; business.industry; 02 engineering and technology; Thread (computing); computer.software_genre; Lexicon; 03 medical and health sciences; 030104 developmental biology; General purpose; Automatic taxonomy induction; Public discourse; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Artificial intelligence; business; computer; Natural language processing |
Source: | Machine Learning and Data Mining in Pattern Recognition, ISBN: 9783319961354, MLDM (1) |
DOI: | 10.1007/978-3-319-96136-1_13 |
Description: | In this paper, we develop a new algorithm for automatic taxonomy construction from a text corpus. In contrast to existing work, our objective is not to develop a general-purpose lexicon or ontology but to identify the structure in a time-ordered sequence of documents. The idea is to identify “lead” words by which we are able to follow the common thread in the public discourse on a specific topic. Our taxonomy represents the backbone of the discourse (including names of protagonists and places) and may change over time. It is thus less rigid and universal than a lexicon and instead targets relationships that are valid in a given context. We present an example to illustrate the idea. |
Database: | OpenAIRE |