Exploiting Software Product Lines and Formal Concept Analysis for the Design of Data Lake Architectures

Authors: André Miralles, Thérèse Libourel Rouge, Anne Laurent, Cédrine Madera, Marianne Huchard
Contributors: Models And Reuse Engineering, Languages (MAREL), Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM), WEB-CUBE, UMR 228 Espace-Dev, Espace pour le développement, Université de Guyane (UG)-Université des Antilles (UA)-Institut de Recherche pour le Développement (IRD)-Université de Perpignan Via Domitia (UPVD)-Avignon Université (AU)-Université de La Réunion (UR)-Université de Montpellier (UM), IBM PSSC Montpellier - Innovation Lab., IBM PSSC Montpellier, Territoires, Environnement, Télédétection et Information Spatiale (UMR TETIS), Centre de Coopération Internationale en Recherche Agronomique pour le Développement (Cirad)-AgroParisTech-Centre National de la Recherche Scientifique (CNRS)-Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE), Anne Laurent, Dominique Laurent, Cédrine Madera
Language: English
Publication year: 2020
Subject:
Source: Data Lakes
Anne Laurent; Dominique Laurent; Cédrine Madera. Data Lakes, Wiley, pp.41-56, 2020, 978-1-119-72043-0. ⟨10.1002/9781119720430.ch3⟩
DOI: 10.1002/9781119720430.ch3
Description: This chapter investigates an approach to assisting users in the design of a data lake architecture. Software product line engineering formalizes a family of similar software products or systems that differ only in some of their optional components. After an overview of the basic notions and terminology of software product lines and formal concept analysis, the chapter introduces a formalization approach based on the product line model. The approach, intended to assist and accelerate the construction of a data lake, consists of high-level modeling that is independent of physical tools and relies on existing software product line concepts. The chapter also considers existing semi-automated processes for generating the feature model, which provides a preliminary formal model of the functionalities of the components of a data lake.
Database: OpenAIRE