Corpus-dependent association thesauri for information retrieval

Authors: Hiroyuki Kaji, Yasutsugu Morimoto, Toshiko Aizono, Noriyuki Yamasaki
Publication year: 2000
Subject:
Source: COLING
DOI: 10.3115/990820.990879
Description: This paper presents a method for automatically generating an association thesaurus from a text corpus, and demonstrates its application to information retrieval. The thesaurus generation method consists of extracting terms and co-occurrence data from a corpus and analyzing the correlation between terms statistically. A new method for disambiguating the structure of compound nouns, which is a key component for term extraction, is also proposed. The automatically generated thesaurus is effectively used as a tool for exploring information. A thesaurus navigator having novel functions such as term clustering, thesaurus overview, and zooming-in is proposed.
Database: OpenAIRE
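
Illustration: the description outlines a pipeline of term extraction, co-occurrence counting, and statistical correlation analysis, but this record does not name the correlation measure. The sketch below is only a rough illustration under stated assumptions: terms are taken as already extracted (one list per document), co-occurrence means appearing in the same document, and pointwise mutual information (PMI) stands in for the unspecified measure. The function name build_association_thesaurus and the min_cooccurrence parameter are hypothetical and do not come from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def build_association_thesaurus(documents, min_cooccurrence=1):
    """Map each term to a list of (associated term, PMI score), best first.

    Assumptions (not from the paper): `documents` is a list of term lists,
    co-occurrence is counted per document, and PMI is the correlation measure.
    """
    term_counts = Counter()   # number of documents containing each term
    pair_counts = Counter()   # number of documents containing both terms
    for terms in documents:
        unique_terms = sorted(set(terms))
        term_counts.update(unique_terms)
        pair_counts.update(combinations(unique_terms, 2))

    n_docs = len(documents)
    thesaurus = {}
    for (a, b), n_ab in pair_counts.items():
        if n_ab < min_cooccurrence:
            continue
        # PMI = log2( P(a, b) / (P(a) * P(b)) ), with document-level probabilities.
        pmi = math.log2(n_ab * n_docs / (term_counts[a] * term_counts[b]))
        if pmi > 0:  # keep only pairs that co-occur more often than chance
            thesaurus.setdefault(a, []).append((b, pmi))
            thesaurus.setdefault(b, []).append((a, pmi))

    for related in thesaurus.values():
        related.sort(key=lambda item: (-item[1], item[0]))
    return thesaurus


# Toy corpus of pre-extracted terms (made-up data for illustration only).
corpus = [
    ["thesaurus", "term", "corpus"],
    ["thesaurus", "term", "retrieval"],
    ["corpus", "retrieval"],
    ["parser", "grammar"],
]
thesaurus = build_association_thesaurus(corpus)
print(thesaurus["term"])
```

In this toy corpus, "term" and "thesaurus" appear together in two of four documents, so they are linked, while pairs that co-occur no more often than chance are dropped. A real system would also need the term extraction and compound-noun disambiguation steps that the description mentions, which this sketch leaves out.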