Incremental Interval Type-2 Fuzzy Clustering of Data Streams using Single Pass Method
Author: | Abdulmohsen Almalawi, Asif Irshad Khan, Izzatdin Abdul Aziz, Mohd Hilmi Hasan, Sana Qaiyum |
---|---|
Year of publication: | 2020 |
Subject: | Data stream; ant colony optimization; Fuzzy clustering; Computer science; Initialization; 02 engineering and technology; lcsh:Chemical technology; computer.software_genre; Biochemistry; Fuzzy logic; Article; interval type-2 fuzzy c-means; Analytical Chemistry; 0202 electrical engineering, electronic engineering, information engineering; lcsh:TP1-1185; Electrical and Electronic Engineering; Cluster analysis; Instrumentation; incremental learning; Data stream mining; Ant colony optimization algorithms; 020206 networking & telecommunications; Atomic and Molecular Physics and Optics; ComputingMethodologies_PATTERNRECOGNITION; 020201 artificial intelligence & image processing; Data mining |
Source: | Sensors (Basel, Switzerland), Volume 20, Issue 11, Article 3210 (2020) |
ISSN: | 1424-8220 |
DOI: | 10.3390/s20113210 |
Description: | Data streams create new challenges for fuzzy clustering algorithms, specifically Interval Type-2 Fuzzy C-Means (IT2FCM). One problem with IT2FCM is that it is sensitive to initialization conditions and can therefore fail to reach the global optimum. This problem has been addressed by optimizing IT2FCM with the Ant Colony Optimization (ACO) approach. However, IT2FCM-ACO obtains clusters for the whole dataset at once, which is not suitable for large streaming datasets that arrive continuously and evolve with time, so that the clusters they generate also evolve with time. Additionally, the incoming data may not fit in memory all at once because of its size. Therefore, to address the challenges of a large data stream environment, we propose extending IT2FCM-ACO to generate clusters incrementally. The proposed algorithm determines appropriate cluster centers on a certain percentage of the available data; the resulting cluster centroids are then combined with new incoming data points to generate another set of cluster centers. The process continues until all the data have been scanned. Previously processed data points are released from memory, which reduces time and space complexity. The proposed incremental method produces data partitions comparable to those of IT2FCM-ACO. Its performance is evaluated on large real-life datasets. Results from several fuzzy cluster validity indices show that the proposed method outperforms other clustering algorithms. The proposed algorithm also improves run time and produces excellent speed-ups for all datasets. |
Database: | OpenAIRE |
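
The single-pass scheme outlined in the description (cluster one portion of the data, keep only the centroids, merge them with the next portion, repeat) can be illustrated with a minimal sketch. It is not the authors' IT2FCM-ACO: it assumes ordinary (type-1) weighted fuzzy c-means in place of interval type-2 memberships, and a simple farthest-point initialization in place of ant colony optimization. All function names and parameters (`memberships`, `weighted_fcm`, `single_pass_clusters`, `m`, `n_iter`, `tol`) are hypothetical and chosen for this example only.

```python
import numpy as np

def memberships(X, centers, m):
    """Type-1 fuzzy memberships of each row of X to each center, shape (n, c)."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    inv = np.maximum(d2, 1e-12) ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def maximin_init(X, c, rng):
    """Assumed stand-in for ACO: one random point, then repeatedly the farthest point."""
    centers = [X[rng.integers(len(X))]]
    for _ in range(c - 1):
        d2 = np.min([((X - ctr) ** 2).sum(axis=1) for ctr in centers], axis=0)
        centers.append(X[np.argmax(d2)])
    return np.array(centers, dtype=float)

def weighted_fcm(X, w, init, m=2.0, n_iter=100, tol=1e-6):
    """Weighted fuzzy c-means: each point j counts w[j] times in the center update."""
    centers = np.array(init, dtype=float)
    for _ in range(n_iter):
        um = memberships(X, centers, m) ** m * w[:, None]
        new_centers = (um.T @ X) / um.sum(axis=0)[:, None]
        converged = np.abs(new_centers - centers).max() < tol
        centers = new_centers
        if converged:
            break
    return centers

def single_pass_clusters(chunks, c, m=2.0, seed=0):
    """Process chunks one at a time, carrying only c weighted centroids forward."""
    rng = np.random.default_rng(seed)
    centers, weights = None, None
    for chunk in chunks:
        chunk = np.asarray(chunk, dtype=float)
        if centers is None:
            X, w = chunk, np.ones(len(chunk))
            init = maximin_init(X, c, rng)
        else:
            # Previous centroids stand in for all data already released from memory.
            X = np.vstack([centers, chunk])
            w = np.concatenate([weights, np.ones(len(chunk))])
            init = centers
        centers = weighted_fcm(X, w, init, m)
        # Each centroid's weight is the fuzzy mass assigned to it so far.
        weights = (memberships(X, centers, m) * w[:, None]).sum(axis=0)
    return centers
```

A call such as `single_pass_clusters(np.array_split(data, 10), c=3)` would process `data` in ten portions while holding only one portion plus `c` centroids in memory at a time. Carrying a weight with each centroid is what lets the summary of earlier data influence later center updates in proportion to how many points it represents.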