Author: |
Lina Dinesh, K. Gayathri Devi |
Language: |
English |
Publication year: |
2024 |
Subject: |
|
Source: |
Journal of Cloud Computing: Advances, Systems and Applications, Vol 13, Iss 1, Pp 1-12 (2024) |
Document type: |
article |
ISSN: |
2192-113X |
DOI: |
10.1186/s13677-023-00571-y |
Description: |
Abstract In big data analysis, data is collected from different sources in various formats, then cleansed, customized, and loaded into a data warehouse. Extracting data in differing formats and converting it to the required format requires transformation algorithms. The transformation stage suffers from redundancy, and the transformed data may be stored at any location in the data warehouse, which increases computation costs. The main issues in big data Extract, Transform, Load (ETL) are handling high-dimensional data and maintaining similar data for effective data warehouse usage. ETL therefore plays a vital role in extracting meaningful information from the data warehouse and in retaining users. This paper proposes a hybrid optimization that combines Swarm Intelligence with a tabu search algorithm for handling big data in an ETL process built on a cloud-based architecture. The proposed work overcomes many issues related to complex data storage and retrieval in the data warehouse. Swarm Intelligence algorithms can overcome problems such as high-dimensional data, dynamic changes in huge data volumes, and cost optimization in the transformation stage. As the swarm intelligence algorithm, this work implements a Grey-Wolf Optimizer (GWO) to reduce the high dimensionality of the data. Tabu Search (TS) is used to cluster the relevant data into a group. Here, clustering means accurately segregating the relevant data from the data warehouse. The proposed GWO-TS method can optimize the cluster size in the ETL process. As a result, the huge volume of data in the warehouse can be processed within the expected latency. |
Database: |
Directory of Open Access Journals |
External link: |
|