Develop extended visual methods for an effective clusters assessment of large datasets.

Authors: Prasad, K. Rajendra, Basha, M. Suleman, Kumar, M. Kalyan, Reddy, K. Shivaram, Prakash, G. Jaya, Lokesh, G.
Subject:
Source: AIP Conference Proceedings; 2023, Vol. 2821 Issue 1, p1-7, 7p
Abstract: Smartphones, IoT (Internet of Things) devices, and social media now produce massive amounts of data. To manage data at this scale, it has to be organized into groups called clusters, which are formed by applying algorithms to the raw data. For this, we surveyed algorithms for assessing the number of clusters (also called cluster tendency). Existing algorithms for this purpose include VAT (Visual Assessment of Tendency), iVAT (improved VAT), inc-VAT (Incremental VAT), inc-iVAT (Incremental improved VAT), dec-VAT (Decremental VAT), and dec-iVAT (Decremental improved VAT). These algorithms show the number of clusters visually as square-shaped dark blocks along the diagonal of the output image; the number of dark blocks indicates the cluster tendency. The above algorithms work well for small datasets, but for large datasets the clustering accuracy decreases and the processing time increases. To overcome this problem, we propose a system that extends the state-of-the-art visual methods. In the proposed work, the visual algorithms are designed with a cosine-based distance, rather than the usual Euclidean distance, for better assessment of cluster tendency on raw data. We also evaluate the clustering accuracy and processing time of the existing algorithms when they use cosine-based distance measures. The experiments are conducted on large benchmark datasets to demonstrate and compare the efficiency of the proposed and existing visual clustering methods. [ABSTRACT FROM AUTHOR]
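The idea described in the abstract, a VAT-style reordering driven by cosine dissimilarity instead of Euclidean distance, can be illustrated with a minimal sketch. This is not the authors' implementation: it follows the commonly described VAT ordering (start from an object involved in the largest dissimilarity, then repeatedly append the unvisited object closest to the visited set), and the function names `cosine_dissimilarity`, `vat_order`, and `vat_image` are hypothetical names chosen for this example.

```python
import numpy as np

def cosine_dissimilarity(X):
    """Pairwise cosine dissimilarity (1 - cosine similarity) between rows of X."""
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    # Assumption for this sketch: all-zero rows are left as zeros instead of raising.
    U = X / np.where(norms == 0, 1.0, norms)
    D = 1.0 - U @ U.T
    np.clip(D, 0.0, 2.0, out=D)   # remove tiny negative values from rounding
    np.fill_diagonal(D, 0.0)
    return D

def vat_order(D):
    """Return a VAT-style ordering of the objects in dissimilarity matrix D."""
    n = D.shape[0]
    # Start from the row index of the largest dissimilarity in D.
    first = int(np.unravel_index(np.argmax(D), D.shape)[0])
    order = [first]
    visited = np.zeros(n, dtype=bool)
    visited[first] = True
    # dist[j] = smallest dissimilarity between object j and any visited object.
    dist = D[first].copy()
    for _ in range(n - 1):
        candidates = np.where(visited, np.inf, dist)
        j = int(np.argmin(candidates))
        order.append(j)
        visited[j] = True
        dist = np.minimum(dist, D[j])
    return np.array(order)

def vat_image(X):
    """Reordered cosine dissimilarity matrix and the ordering that produced it."""
    D = cosine_dissimilarity(X)
    order = vat_order(D)
    return D[np.ix_(order, order)], order

# Toy data: two groups of vectors pointing in clearly different directions.
X = np.array([[1.0, 0.1], [1.0, 0.2], [0.1, 1.0],
              [0.2, 1.0], [1.0, 0.0], [0.0, 1.0]])
R, order = vat_image(X)
```

Displaying `R` as a grayscale image (low values dark) would show the dark diagonal blocks the abstract refers to. Because cosine dissimilarity depends only on direction, rescaling a row of `X` leaves `R` unchanged, which is one way this measure differs from Euclidean distance.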
Database: Complementary Index