New algorithm for clustering unlabeled big data
Author: | Wafaa Al-Hameed, Marwan B. Mohammed |
---|---|
Language: | English |
Year of publication: | 2021 |
Subject: |
Sequence; Control and Optimization; Computer Networks and Communications; Computer science; business.industry; Big data; DBI; Process (computing); k-means clustering; Hierarchical clustering; Hardware and Architecture; Signal Processing; USE; Electrical and Electronic Engineering; Lexical chain sentence; business; Cluster analysis; Algorithm; Sentence; Word (computer architecture); Information Systems; K-mean clustering |
Description: | Cluster analysis techniques play an important role in data mining. Although several clustering techniques already exist, efforts continue to make the clustering process more efficient or to propose new techniques. Such techniques seek to allocate objects into clusters so that two objects in the same cluster are more similar than two objects in different clusters, taking care not to duplicate the same object in different groups while covering as much of the data as possible. This paper makes two contributions. The first is a new algorithm, named the MB algorithm, that collects unlabeled data and places them into appropriate groups. The second is the creation of a lexical sequence sentence (LCS) based on semantically similar sentences, which differs from the traditional lexical word chain (LCW) based on words. The results showed that the MB algorithm generally outperformed both the hierarchical clustering algorithm and the k-means algorithm. |
Database: | OpenAIRE |
External link: |