Implementation of a self-adaptive real time recommendation system using spark machine learning libraries

Authors: Anu Bonia Francis, P. S. Janardhanan, Reena Murali, Bobin K Sunny
Year of publication: 2017
Subject:
Source: 2017 IEEE International Conference on Signal Processing, Informatics, Communication and Energy Systems (SPICES).
DOI: 10.1109/spices.2017.8091310
Description: Real-time recommendation systems have become an essential component of e-commerce web applications. As the volume and velocity of the data handled by these applications grow (the big data problem), traditional recommendation systems that analyze data and update models at regular intervals cannot meet real-time requirements. With the evolution of technologies for processing big data in real time, implementing real-time recommendation systems has become considerably easier. Stream computing is a computing paradigm for handling the velocity attribute of big data, which makes it possible to develop real-time big data applications. This paper details the implementation of a real-time recommendation system using Apache Spark, a widely used platform for stream computing. The system recommends TV channels to viewers in real time, a challenging task because the set of available channels changes continuously and viewer preferences depend on context. The channel recommendation scenario is characterized by its dynamic nature, large data volume, and tight time constraints, so traditional approaches cannot be used. We have implemented a highly scalable TV channel recommendation system optimized for processing real-time data streams originating from set-top boxes. The proposed system takes a self-adaptive approach to model building. It uses the distributed processing power of Apache Spark to make recommendations in real time and scales to meet real-time constraints as load increases. The Spark machine learning library (Spark MLlib) provides several algorithms that were used to develop the proposed recommendation system. The large amount of data in the system is managed efficiently using the Lambda Architecture data processing method.
Database: OpenAIRE