Semantic document retrieval system using fuzzy clustering and reformulated query

Authors: Dabbu Murali, Avula Damodaram
Year of publication: 2015
Subject:
Source: 2015 International Conference on Advances in Computer Engineering and Applications.
DOI: 10.1109/icacea.2015.7164788
Description: In this paper, we develop an algorithm for a document retrieval system based on clustering and query reformulation. First, pre-processing is applied to all documents to remove unnecessary words and phrases. The documents are then partitioned by clustering, using the proposed semantic similarity measure within the possibilistic fuzzy c-means (PFCM) clustering algorithm. For each cluster, an index is constructed that contains the important keywords common to the documents of that cluster. When the user enters keywords as input, the system processes them with the WordNet ontology to obtain neighbourhood keywords and related synset keywords. The set of keywords obtained from WordNet is refined, and the refined keywords are matched against the index keywords of the clusters to calculate a matching score. Finally, the documents in the clusters with the maximum matching score are returned first as the documents related to the query keyword. Experiments are carried out on different sets of documents, the performance of the proposed approach is evaluated by precision, and the results show that the proposed approach performs better in terms of precision.
Database: OpenAIRE
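
The query stage described in the abstract (expand the keywords, match them against each cluster index, return the best-scoring cluster first) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the matching score formula, the keyword refinement rule, or the index construction, so everything here is an assumption: `RELATED_TERMS` is a hypothetical hand-written stand-in for WordNet lookups, the index is taken to be the most frequent terms per cluster, the score is taken to be the fraction of index keywords covered by the expanded query, and the clusters are given directly rather than produced by PFCM.

```python
from collections import Counter

# Hypothetical stand-in for WordNet: maps a keyword to related terms.
# The paper uses the WordNet ontology; this small dictionary is an assumption.
RELATED_TERMS = {
    "car": {"automobile", "vehicle"},
    "doctor": {"physician", "medic"},
}


def expand_query(keywords, related=RELATED_TERMS):
    """Return the lowercased query keywords plus their related terms."""
    expanded = set()
    for word in keywords:
        word = word.lower()
        expanded.add(word)
        expanded |= related.get(word, set())
    return expanded


def build_index(cluster_docs, top_n=5):
    """Index a cluster by its top_n terms ranked by document frequency.

    Ties are broken alphabetically so the result is deterministic.
    """
    doc_freq = Counter()
    for tokens in cluster_docs:
        doc_freq.update(set(tokens))  # count each term once per document
    ranked = sorted(doc_freq, key=lambda term: (-doc_freq[term], term))
    return set(ranked[:top_n])


def matching_score(query_terms, index_terms):
    """Assumed score: fraction of index keywords present in the expanded query."""
    if not index_terms:
        return 0.0
    return len(query_terms & index_terms) / len(index_terms)


def rank_clusters(keywords, clusters):
    """Return (cluster_id, score) pairs, best-matching cluster first."""
    query_terms = expand_query(keywords)
    scores = {
        cluster_id: matching_score(query_terms, build_index(docs))
        for cluster_id, docs in clusters.items()
    }
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy pre-processed clusters (tokenised documents); purely illustrative data.
clusters = {
    "autos": [["automobile", "engine", "vehicle"], ["vehicle", "engine", "road"]],
    "health": [["physician", "hospital", "patient"], ["patient", "hospital", "nurse"]],
}

ranking = rank_clusters(["car"], clusters)
print(ranking)
```

In this toy run the query "car" never appears literally in any document, yet the "autos" cluster ranks first because the expanded query contains "automobile" and "vehicle". That is the effect the reformulated query is meant to achieve; a real system would replace the dictionary with actual WordNet synsets and the fixed clusters with the PFCM output.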