SETL: A programmable semantic extract-transform-load framework for semantic data warehouses

Authors: Katja Hose, Oscar Romero, Torben Bach Pedersen, Rudra Pratap Deb Nath
Contributors: Universitat Politècnica de Catalunya. Departament d'Enginyeria de Serveis i Sistemes d'Informació, Universitat Politècnica de Catalunya. MPI - Modelització i Processament de la Informació
Language: English
Publication year: 2017
Subject:
Expert systems (Computer science)
Computer science
Semantic-aware
Informàtica::Sistemes d'informació [Àrees temàtiques de la UPC]
Semantic integration
02 engineering and technology
computer.software_genre
Semantic data model
Social Semantic Web
RDF
Knowledge base
Business analytics
Data warehouse
020204 information systems
Semantic computing
0202 electrical engineering, electronic engineering, information engineering
Semantic analytics
Semantic Web Stack
Semantic Web
Semantic compression
Semantic-aware Knowledge base
Information retrieval
Database
business.industry
Unstructured data
computer.file_format
Semantic interoperability
Gestor de dades
Data warehousing
ETL
data warehouses
Semantic grid
Hardware and Architecture
Semantic technology
020201 artificial intelligence & image processing
business
computer
Software
Information Systems
Sistemes experts (Informàtica)
Source: UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
Nath, R, Hose, K, Pedersen, T B & Romero, O 2017, 'SETL: A programmable semantic extract-transform-load framework for semantic data warehouses', Information Systems, vol. 68, pp. 17-43. https://doi.org/10.1016/j.is.2017.01.005
Recercat. Dipósit de la Recerca de Catalunya
DOI: 10.1016/j.is.2017.01.005
Description: In order to create better decisions for business analytics, organizations increasingly use external structured, semi-structured, and unstructured data in addition to the (mostly structured) internal data. Current Extract-Transform-Load (ETL) tools are not suitable for this “open world scenario” because they do not consider semantic issues in the integration processing. Current ETL tools neither support processing semantic data nor create a semantic Data Warehouse (DW), a repository of semantically integrated data. This paper describes our programmable Semantic ETL (SETL) framework. SETL builds on Semantic Web (SW) standards and tools and supports developers by offering a number of powerful modules, classes, and methods for (dimensional and semantic) DW constructs and tasks. Thus it supports semantic data sources in addition to traditional data sources, semantic integration, and creating or publishing a semantic (multidimensional) DW in terms of a knowledge base. A comprehensive experimental evaluation comparing SETL to a solution made with traditional tools (requiring much more hand-coding) on a concrete use case shows that SETL provides better programmer productivity, knowledge base quality, and performance.
Database: OpenAIRE