Integration of ETL in Cloud Using Spark for Streaming Data

Authors: Kartick Chandra Mondal, Neepa Biswas
Publication year: 2021
Subject:
Source: Advanced Techniques for IoT Applications ISBN: 9789811644344
DOI: 10.1007/978-981-16-4435-1_18
Description: Extract-Transform-Load (ETL) consists of a series of processes that collect raw transactional data and reshape it into clean information that Business Intelligence can later act on. At present, most organizations are considering moving their mission-critical applications to cloud-based implementations. This trend is also affecting the management of ETL processes in the data warehouse environment. This paper discusses the limitations of the traditional ETL process and the benefits of moving ETL into the cloud. It then identifies the challenges that cloud computing adoption poses for the ETL process. Features offered by some leading cloud-enabled ETL solutions are presented along with a brief analysis. The paper also covers general issues in cloud ETL from the perspectives of both cloud consumers and service providers. A novel framework is designed to process streaming data arriving from a real-time data feed. The solution facilitates the rapid development and deployment of real-time ETL applications.
Database: OpenAIRE