A Weighted K-Means Clustering Algorithm for Clustering Big Data Based on MapReduce

Authors: Lakshmi Srinivasulu Dandug, Koneru Suvarna Vani
Year of publication: 2022
Description: Big data has gained popularity as a means of storing, managing, and processing large amounts of data. In big data analysis, clustering datasets has become a difficult problem. MapReduce and its open-source implementation, Hadoop, have gained appeal in both academia and industry as the demand for big data analytics in scientific applications and internet services continues to grow. Hadoop is a practical platform for building big data analytics frameworks. On the other hand, Hadoop has shortcomings in several areas, such as data resource management, scheduling policies, and management, to mention a few. When MapReduce jobs run on Hadoop clusters, these issues frequently lead to excessive energy usage. This work introduces a weighted k-means clustering algorithm based on intelligent Aquila weighting. To widen the difference between clusters, the degree of connectivity among clusters is first expressed in the clustering model. Additionally, Aquila optimization is used to select the initial cluster centers, which removes the algorithm's sensitivity to the choice of initial centers. The algorithm achieves high-speed performance with an efficient MapReduce function; for this, we use HDFS with the Python platform. The proposed method allows any volume of data to be clustered, with no restriction on the amount of data. The presented algorithm performs well and has a high degree of precision. It provides better accuracy for the clustering process and reduces the complexity of clustering. The experimental outcomes show that the proposed method yields superior performance.
Database: OpenAIRE
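
Illustrative sketch: the record does not give the algorithm's details, so the following is only a minimal, single-process Python sketch of how a weighted k-means iteration might be split into map and reduce steps. It assumes "weighted" means per-feature weights in the squared distance, which the abstract does not confirm. The Aquila optimizer is not implemented: initial centers and weights are simply passed in. All function names (`weighted_sq_dist`, `map_phase`, `reduce_phase`, `weighted_kmeans`) are hypothetical, and no Hadoop or HDFS calls are used.

```python
from collections import defaultdict


def weighted_sq_dist(x, c, w):
    """Squared Euclidean distance with per-feature weights (an assumption)."""
    return sum(wj * (xj - cj) ** 2 for xj, cj, wj in zip(x, c, w))


def map_phase(points, centers, weights):
    """Map step: emit (nearest_center_index, (point, 1)) for every point."""
    for x in points:
        k = min(range(len(centers)),
                key=lambda i: weighted_sq_dist(x, centers[i], weights))
        yield k, (x, 1)


def reduce_phase(pairs, centers):
    """Reduce step: average the points assigned to each center."""
    sums = {}
    counts = defaultdict(int)
    for k, (x, n) in pairs:
        if k in sums:
            sums[k] = [s + xj for s, xj in zip(sums[k], x)]
        else:
            sums[k] = list(x)
        counts[k] += n
    new_centers = []
    for i, c in enumerate(centers):
        if counts[i]:
            new_centers.append(tuple(s / counts[i] for s in sums[i]))
        else:
            # An empty cluster keeps its previous center.
            new_centers.append(tuple(c))
    return new_centers


def weighted_kmeans(points, centers, weights, iterations=10):
    """Repeat map and reduce until the centers stop changing."""
    centers = [tuple(c) for c in centers]
    for _ in range(iterations):
        new_centers = reduce_phase(map_phase(points, centers, weights), centers)
        if new_centers == centers:
            break
        centers = new_centers
    return centers
```

In a real Hadoop job the map and reduce functions would run as separate distributed tasks over data stored in HDFS; here they are ordinary Python functions chained in one loop to show the data flow only.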