Semantic document retrieval system using fuzzy clustering and reformulated query
Author: | Dabbu Murali, Avula Damodaram |
---|---|
Publication year: | 2015 |
Subject: | Fuzzy clustering, Information retrieval, Computer science, Computer Science::Information Retrieval, InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL, Correlation clustering, WordNet, Document clustering, computer.software_genre, Fuzzy logic, Semantic similarity, ComputingMethodologies_DOCUMENTANDTEXTPROCESSING, Data mining, Document retrieval, Cluster analysis, computer |
Source: | 2015 International Conference on Advances in Computer Engineering and Applications. |
DOI: | 10.1109/icacea.2015.7164788 |
Description: | In this paper, we develop a document retrieval algorithm based on clustering and query reformulation. First, pre-processing is applied to all documents to remove unnecessary words and phrases. The documents are then partitioned with the possibilistic fuzzy c-means (PFCM) clustering algorithm, using the proposed semantic similarity measure. For each cluster, an index is constructed that contains the important keywords common to the documents in that cluster. When the user enters a keyword as input, the system processes it with the WordNet ontology to obtain neighbourhood keywords and related synset keywords. The set of keywords obtained from WordNet is refined, and the refined keywords are matched against the index keywords of each cluster to calculate a matching score. Finally, the documents in the clusters with the maximum matching score are returned first as the documents related to the query keyword. Experiments are carried out on different sets of documents, the performance of the proposed approach is evaluated by precision, and the results show that the proposed approach achieves better precision. |
Database: | OpenAIRE |
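
The query stage described in the abstract (expand the keyword, match it against each cluster index, return the best-scoring cluster's documents first) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `TOY_SYNSETS` is a hand-written stand-in for WordNet lookups, the matching score is a simple overlap ratio because the record does not give the paper's formula, and the cluster indexes are toy data rather than PFCM output.

```python
from typing import Dict, List, Set

# Hypothetical stand-in for WordNet synset lookups (not real WordNet data).
TOY_SYNSETS: Dict[str, Set[str]] = {
    "car": {"automobile", "vehicle"},
    "doctor": {"physician", "medic"},
}


def expand_query(keywords: List[str]) -> Set[str]:
    """Return the query keywords plus their related terms."""
    expanded: Set[str] = set()
    for kw in keywords:
        kw = kw.lower()
        expanded.add(kw)
        expanded |= TOY_SYNSETS.get(kw, set())
    return expanded


def matching_score(query_terms: Set[str], index_terms: Set[str]) -> float:
    """Assumed score: fraction of expanded query terms found in the index."""
    if not query_terms:
        return 0.0
    return len(query_terms & index_terms) / len(query_terms)


def retrieve(
    keywords: List[str],
    cluster_index: Dict[str, Set[str]],
    cluster_docs: Dict[str, List[str]],
) -> List[str]:
    """List documents cluster by cluster, highest matching score first."""
    terms = expand_query(keywords)
    scores = {c: matching_score(terms, idx) for c, idx in cluster_index.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    results: List[str] = []
    for cluster in ranked:
        if scores[cluster] > 0:  # skip clusters with no keyword overlap
            results.extend(cluster_docs[cluster])
    return results


# Toy cluster indexes and members (illustrative only).
CLUSTER_INDEX = {
    "transport": {"automobile", "engine", "road"},
    "health": {"physician", "hospital", "medic"},
}
CLUSTER_DOCS = {
    "transport": ["doc1", "doc4"],
    "health": ["doc2", "doc3"],
}

if __name__ == "__main__":
    print(retrieve(["car", "doctor"], CLUSTER_INDEX, CLUSTER_DOCS))
```

In this sketch a query for "car" never contains the index term "automobile" literally; the cluster is found only because the expansion step adds it, which is the role the abstract assigns to WordNet.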