Abstract: |
In the era of the fourth industrial revolution, people face an overwhelming volume of information every day. Data streams arrive from diverse sources, including IoT, social media, healthcare, business, cryptocurrencies, and cybersecurity. Such large datasets demand considerable storage capacity and make analysis, processing, and retrieval time-consuming and labor-intensive. Artificial intelligence, particularly machine learning and deep learning, offers a pragmatic solution. Clustering, an unsupervised learning technique, plays a pivotal role by organizing data into a specific number of coherent groups, and it is therefore relevant across numerous domains and applications that deal with vast datasets. This survey examines seven prominent clustering methods: k-means, G-means, DBSCAN, agglomerative hierarchical clustering, the Two-stage density (DBSCAN and k-means) algorithm, the Two-levels (DBSCAN and hierarchical) clustering algorithm, and the Two-stage MeanShift and k-means clustering algorithm. The methods are compared on a real dataset, the Blockchain dataset, which covers prominent cryptocurrencies such as Binance, Bitcoin, Doge, and Ethereum. The assessment uses various metrics, including the silhouette coefficient, the Calinski-Harabasz index, the Davies-Bouldin index, time complexity, and entropy. [ABSTRACT FROM AUTHOR] |