Integration of ETL in Cloud Using Spark for Streaming Data
Author: | Kartick Chandra Mondal, Neepa Biswas |
---|---|
Year of publication: | 2021 |
Subject: |
Database; business.industry; Process (engineering); Computer science; Big data; InformationSystems_DATABASEMANAGEMENT; Data feed; Cloud computing; Service provider; computer.software_genre; Data warehouse; TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES; Business intelligence; Spark (mathematics); business; computer |
Source: | Advanced Techniques for IoT Applications ISBN: 9789811644344 |
DOI: | 10.1007/978-981-16-4435-1_18 |
Description: | Extract-Transform-Load (ETL) is a series of processes that collect raw transactional data and reshape it into clean information that Business Intelligence can later act on. Most organizations are now considering a move to cloud-based implementations for their mission-critical applications. This trend also affects how ETL processes are managed in the data warehouse environment. This paper discusses the limitations of the traditional ETL process and the benefits of moving ETL into the cloud, and then identifies the challenges that cloud adoption poses for the ETL process. The features offered by some leading cloud-enabled ETL solutions are presented with a brief analysis. The paper also covers general issues in cloud ETL from the perspective of both cloud consumers and service providers. A novel framework is designed to process streaming data arriving from a real-time data feed. The solution facilitates the rapid development and deployment of real-time ETL applications. |
Database: | OpenAIRE |
External link: |