A Comparative Evaluation of Different Keyword Extraction Techniques

Author: Raj Kishor Bisht
Publication year: 2021
Subject:
Source: International Journal of Information Retrieval Research. 12:1-17
ISSN: 2155-6385
2155-6377
DOI: 10.4018/ijirr.289573
Description: Keyword retrieval from text has long attracted researchers because it underpins many natural language applications, such as information retrieval, text summarization, and document categorization. A text is a collection of words that naturally represent its theme, and capturing that naturalness under fixed rules is itself a challenging task. In the present paper, the authors evaluate different spatial-distribution-based keyword extraction methods from the literature on three standard scientific texts. To reduce complexity, the authors restrict the evaluation to the first few high-frequency words, since all the methods depend on frequency in some way. The authors find that the methods do not give good results, particularly for the first few retrieved words. They therefore propose a new measure based on frequency, inverse document frequency, variance, and Tsallis entropy. The methods are evaluated on the basis of precision, recall, and F-measure. The results show that the proposed method gives improved results.
Database: OpenAIRE
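
Note: The abstract names precision, recall, and F-measure as the evaluation criteria but does not spell them out. The following is a minimal sketch of the standard definitions applied to a list of retrieved keywords, not the authors' evaluation code. The function name evaluate_keywords and the sample word lists are hypothetical and chosen only for illustration.

```python
def evaluate_keywords(retrieved, relevant):
    """Return (precision, recall, f_measure) for retrieved vs. reference keywords."""
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    # Keywords that were retrieved and also appear in the reference list.
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    if precision + recall == 0:
        f_measure = 0.0
    else:
        # F-measure as the harmonic mean of precision and recall.
        f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical data for illustration only; not taken from the paper.
retrieved = ["entropy", "keyword", "the", "frequency"]
relevant = ["entropy", "keyword", "variance", "frequency", "retrieval"]
precision, recall, f_measure = evaluate_keywords(retrieved, relevant)
print(precision, recall, f_measure)
```

In this toy case three of the four retrieved words are in the reference list of five, so precision is 3/4 and recall is 3/5. Truncating the retrieved list to the first few words, as the abstract describes, would only change the input passed to the function.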