Implementation of a self-adaptive real time recommendation system using Spark machine learning libraries
Author: | Anu Bonia Francis, P. S. Janardhanan, Reena Murali, Bobin K Sunny |
---|---|
Year of publication: | 2017 |
Subject: | Data stream mining; Computer science; Stream; Electrical & electronic engineering; Real-time computing; Big data; Networking & telecommunications; Context (language use); Engineering and technology; Recommender system; Machine learning; Data modeling; Spark (mathematics); Scalability; Electrical engineering, electronic engineering, information engineering; Artificial intelligence |
Source: | 2017 IEEE International Conference on Signal Processing, Informatics, Communication and Energy Systems (SPICES). |
DOI: | 10.1109/spices.2017.8091310 |
Description: | Real-time recommendation systems have become an essential component of e-commerce web applications. As the volume and velocity of the data handled by these applications increase (the big data problem), traditional recommendation systems that analyze data and update models at regular time intervals cannot meet the real-time requirement. With the evolution of technologies for processing big data in real time, it has become fairly easy to implement real-time recommendation systems. Stream computing is a new computing paradigm for handling the velocity attribute of big data, which makes it possible to develop real-time big data applications. This paper details the implementation of a real-time recommendation system using Apache Spark, a widely used platform for stream computing. The system recommends TV channels to viewers in real time, a challenging task because the set of available channels changes continuously and viewer preferences depend on context. In the channel recommendation scenario, characterized by its dynamic nature, volume of data, and tight time constraints, traditional approaches cannot be used. We have implemented a highly scalable TV channel recommendation system optimized for processing real-time data streams originating from set-top boxes. The proposed system takes a self-adaptive approach to model building. It uses the distributed processing power of Apache Spark to make recommendations in real time and scales to meet real-time constraints under increasing load. The Spark machine learning library (Spark MLlib) provides several algorithms that were used to develop the proposed recommendation system. The large amount of data in the system is managed efficiently using the Lambda Architecture data processing method. |
Database: | OpenAIRE |
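
The description mentions that the data is managed with the Lambda Architecture, in which a periodically recomputed batch view is combined with a real-time view of recent events. The record gives no implementation details, so the following is only a minimal plain-Python sketch of that idea, not the authors' Spark MLlib system. The class name, the `context` key, the channel names, and the count-based ranking are all illustrative assumptions.

```python
from collections import Counter, defaultdict


class LambdaChannelRecommender:
    """Toy Lambda-style recommender (hypothetical, not the paper's code)."""

    def __init__(self):
        # Append-only event log: the input of the batch layer.
        self.master_log = []
        # Batch view: context -> channel -> view count, rebuilt periodically.
        self.batch_view = defaultdict(Counter)
        # Speed view: counts for events seen since the last batch rebuild.
        self.speed_view = defaultdict(Counter)

    def ingest(self, context, channel):
        """Record one viewing event, e.g. from a set-top box."""
        self.master_log.append((context, channel))
        self.speed_view[context][channel] += 1

    def rebuild_batch_view(self):
        """Recompute the batch view from the full log and reset the speed view."""
        view = defaultdict(Counter)
        for context, channel in self.master_log:
            view[context][channel] += 1
        self.batch_view = view
        self.speed_view = defaultdict(Counter)

    def recommend(self, context, available, k=3):
        """Merge both views and return the top-k channels that are available now."""
        merged = self.batch_view[context] + self.speed_view[context]
        ranked = sorted(
            (ch for ch in merged if ch in available),
            key=lambda ch: (-merged[ch], ch),  # most viewed first, ties by name
        )
        return ranked[:k]


# Usage with made-up events: three old events, then a burst of recent ones.
rec = LambdaChannelRecommender()
for ch in ["news", "news", "sports"]:
    rec.ingest("evening", ch)
rec.rebuild_batch_view()
for _ in range(3):
    rec.ingest("evening", "movies")

print(rec.recommend("evening", available={"news", "movies", "sports"}, k=2))
# prints ['movies', 'news']
```

Filtering by `available` at query time reflects the changing channel set the description refers to: a channel that goes off air drops out of the results without waiting for the next batch rebuild. A production system would replace the counters with a learned model and the in-memory log with distributed storage.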