An Efficient Semantic based Clustering Algorithm for Textual Documents
Author: | Jegatha Deborah L, Karthika R |
---|---|
Year of publication: | 2018 |
Subject: | Measure (data warehouse); Fuzzy clustering; Exploit; business.industry; Computer science; Value (computer science); Feature selection; 02 engineering and technology; computer.software_genre; Variable (computer science); 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; The Internet; Data mining; Cluster analysis; business; computer |
Source: | 2018 International Conference on Circuits and Systems in Digital Enterprise Technology (ICCSDET). |
Description: | Documents belonging to many different categories flood the internet every day, and each has many links or associations with other documents on the web. The terms in a document are open to multiple interpretations and are often vague and unclear, so there is a need to capture their semantic meaning. One major application of such a semantic measure is clustering related textual documents. However, traditional clustering algorithms may perform poorly because of irrelevant terms in the raw documents. The algorithm proposed in this paper therefore applies a feature selection algorithm with a booster technique to improve clustering performance. Clustering itself is based on a fuzzy linguistic variable measure that uses separation and dominance values for precise clustering. Experimental analysis shows that the three performance measures used to evaluate the clustering algorithm improve in comparison with other existing algorithms. |
Database: | OpenAIRE |