Data Lake: A Case of Study of a Big Data Analytics Architecture for Public Procurements

Authors: Julio Paciello, David Sosa
Year of publication: 2021
Subject:
Source: 2021 Eighth International Conference on eDemocracy & eGovernment (ICEDEG).
DOI: 10.1109/icedeg52154.2021.9530976
Description: Big Data technologies face problems of volume, velocity, variety and veracity of data, driven by the wide expansion of emerging technologies such as IoT and IoE. Cyberocracy proposes a government decision-making process based on the effective use of information. An important effort along these lines, focused on public procurement, has been carried out by the Open Contracting Partnership (OCP), which promotes the publication of growing volumes of public procurement data in non-relational, machine-processable formats every day. This work analyzes the underlying Big Data infrastructure for the analysis of public procurement data through a comparative case study between a technology proposed by the OCP called KingFisher and emerging technologies based on Data Lakes, with an emphasis on the storage requirements to support a high volume of payloads, while also considering velocity and RAM use. Preliminary results are encouraging, especially in terms of the storage volume required by a Data Lake: across different payload scenarios, it needs up to 10 times less storage than the relational database-based model.
Database: OpenAIRE