Big Data Analytics in Telecommunication using State-of-the-art Big Data Framework in a Distributed Computing Environment: A Case Study

Authors: Mohit Ved, Rizwanahmed B
Year of publication: 2019
Subject:
Source: COMPSAC (1)
DOI: 10.1109/compsac.2019.00066
Description: Predictive analytics is of great interest when it comes to enhancing business intelligence. Businesses have already started to use Big Data analytics, particularly predictive and prescriptive analytics, to strengthen and increase their business yields. Analytics has not only resulted in business growth but has also provided a significant competitive edge over others. The voluminous data generated from various sources is highly unstructured, and adding structure to it would unlock the actual potential of the data. New techniques and frameworks should serve as human aids in automatically and intelligently analyzing large datasets in order to acquire useful information. In this paper, we attempt to perform Big Data analytics on data from one of the most important and fastest-growing sources, namely telecommunication. To keep pace with the growing telecommunication market and the ever-increasing demands of consumers for quality service, telecom service providers need to observe and estimate trends in customers' usage so that future upgrades and deployments are driven by real data. We use several data mining techniques to find hidden and interesting patterns in the telecom data generated by the Telecom Italia cellular network for the city of Milan, Italy. K-means clustering is used to categorize the usage statistics, while several machine learning algorithms (Decision Tree, Random Forest, Logistic Regression and SVM) are used to predict the usage of telecom services. Finally, a performance comparison matrix is generated to rate the performance of these algorithms on the given dataset. All experiments are performed in the Big Data environment set up on the supercomputing infrastructure of C-DAC. Given the comparison matrix, the results can be applied to similar datasets from other domains as well.
Database: OpenAIRE
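
Illustrative note: the description mentions K-means clustering of usage statistics. The following is a minimal sketch of that clustering step only, not the authors' implementation. It runs plain Lloyd's iterations in standard-library Python on a single machine rather than on a distributed framework. The feature layout (SMS, call and internet activity per record), the synthetic values and the farthest-first initialization are all assumptions made here for a small, deterministic example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_init(points, k):
    """Deterministic initialization (an assumption of this sketch):
    start from the first point, then repeatedly add the point that is
    farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iterations=50):
    """Lloyd's algorithm: alternate assignment and centroid update
    until the assignments stop changing."""
    centroids = farthest_first_init(points, k)
    assignments = None
    for _ in range(max_iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, new_assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
        if new_assignments == assignments:
            break
        assignments = new_assignments
    return centroids, assignments


# Synthetic, hypothetical usage records: (sms, calls, internet).
low_usage = [(1 + i % 3, 1 + i % 2, 5 + i % 5) for i in range(30)]
high_usage = [(20 + i % 3, 15 + i % 2, 80 + i % 5) for i in range(30)]

centroids, labels = kmeans(low_usage + high_usage, k=2)
for label, centroid in enumerate(centroids):
    print(f"cluster {label}: centroid={centroid}, size={labels.count(label)}")
```

On this synthetic input the two usage groups end up in separate clusters. On real data the features would normally be scaled first, and the number of clusters would have to be chosen, for example by comparing within-cluster error across several values of k.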