Scrutinizing Big Data using Machine Learning Classifiers.

Authors: Jain, Ashu; Jain, Soumya; Dewan, Charul
Subject:
Source: International Journal of Recent Research Aspects; Mar 2018, Vol. 5 Issue 1, p123-126, 4p
Abstract: Big data is the massive body of unstructured and semi-structured heterogeneous data generated by devices, household appliances, and day-to-day activities: sensors gathering climate information, posts to social media sites, digital pictures and videos, purchase transaction records, and cell phone GPS signals, to name a few. Supervised and unsupervised machine learning techniques can process this volume of data with various classifiers to derive useful insight and to predict future patterns and trends. Supervised learning relies on training and test data: a classifier is first trained on one portion of the data, for which the expected result of each input is known, and the remaining portion is then used to test its prediction accuracy. J48, IBK, and Naïve Bayes are among the supervised learning classifiers compared in this analysis. The classifiers are applied to a selected multivariate big data set under different training and test data configurations, and the best performer is identified on the basis of correctly classified instances (one of several performance measures). In the experiments conducted, J48 produced the largest number of correctly classified instances and can therefore be considered the best classifier for data of this kind. [ABSTRACT FROM AUTHOR]
Database: Complementary Index
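
The train-and-test procedure described in the abstract, with classifiers ranked by correctly classified instances, can be sketched roughly as follows. This is a minimal illustration in plain Python under stated assumptions, not the authors' experiment: the data set is synthetic, the split fractions are arbitrary, and the two classifiers (a nearest-neighbour rule, loosely analogous to an instance-based learner such as IBK, and a majority-class baseline) stand in for J48, IBK, and Naïve Bayes, which the record does not specify in detail. All function names are hypothetical.

```python
import random
from collections import Counter


def percentage_split(rows, train_fraction, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def knn_predict(train, features, k=1):
    """Majority vote among the k training rows nearest to features."""
    nearest = sorted(
        train,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], features)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def majority_predict(train, features):
    """Baseline: always predict the most frequent training label."""
    return Counter(label for _, label in train).most_common(1)[0][0]


def correctly_classified(predict, train, test):
    """Count test rows whose predicted label equals the true label."""
    return sum(1 for features, label in test if predict(train, features) == label)


def make_toy_data(n_per_class=50, seed=1):
    """Synthetic two-class data (an assumption; not the paper's data set)."""
    rng = random.Random(seed)
    rows = []
    for label, centre in (("a", 0.0), ("b", 5.0)):
        for _ in range(n_per_class):
            point = (rng.gauss(centre, 1.0), rng.gauss(centre, 1.0))
            rows.append((point, label))
    return rows


def compare(rows, fractions=(0.5, 0.66, 0.8)):
    """Score each classifier under several train/test splits."""
    results = {}
    for fraction in fractions:
        train, test = percentage_split(rows, fraction)
        results[fraction] = {
            "test_size": len(test),
            "knn": correctly_classified(knn_predict, train, test),
            "majority": correctly_classified(majority_predict, train, test),
        }
    return results


if __name__ == "__main__":
    for fraction, scores in compare(make_toy_data()).items():
        print(fraction, scores)
```

Whichever classifier scores the highest count across the splits would be reported as the best performer on this criterion. Correctly classified instances is only one measure, as the abstract notes, so a fuller comparison would likely also look at per-class errors.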