An Efficient Semantic based Clustering Algorithm for Textual Documents

Author: Jegatha Deborah L, Karthika R
Publication year: 2018
Subject:
Source: 2018 International Conference on Circuits and Systems in Digital Enterprise Technology (ICCSDET).
Description: Documents belonging to many different categories flood the internet every day. These documents have many links or associations with other documents on the web. The terms in a document are open to multiple interpretations, which makes them vague and unclear. Hence there is a need to capture the semantic meaning of the terms. One major application of such a semantic measure is clustering related textual documents. However, traditional clustering algorithms may perform poorly because of irrelevant terms in the raw documents. The algorithm proposed in this paper therefore uses a feature selection algorithm with a booster technique to improve the performance of the clustering algorithm. In addition, a clustering algorithm based on a fuzzy linguistic variable measure that uses separation and dominance values is used for precise clustering. Experimental analysis shows that the three performance measures used to evaluate the clustering algorithm improve in comparison with other existing algorithms.
Database: OpenAIRE