SETL: A programmable semantic extract-transform-load framework for semantic data warehouses
Author: | Katja Hose, Oscar Romero, Torben Bach Pedersen, Rudra Pratap Deb Nath |
---|---|
Contributors: | Universitat Politècnica de Catalunya. Departament d'Enginyeria de Serveis i Sistemes d'Informació, Universitat Politècnica de Catalunya. MPI - Modelització i Processament de la Informació |
Language: | English |
Publication year: | 2017 |
Subject: |
Expert systems (Computer science)
Computer science Semantic-aware Informàtica::Sistemes d'informació [Àrees temàtiques de la UPC] Semantic integration 02 engineering and technology computer.software_genre Semantic data model Social Semantic Web RDF Knowledge base Business analytics Data warehouse 020204 information systems Semantic computing 0202 electrical engineering electronic engineering information engineering Semantic analytics Semantic Web Stack Semantic Web Semantic compression Semantic-aware Knowledge base Information retrieval Database business.industry Unstructured data computer.file_format Semantic interoperability Gestor de dades Data warehousing ETL data warehouses Semantic grid Hardware and Architecture Semantic technology 020201 artificial intelligence & image processing business computer Software Information Systems Sistemes experts (Informàtica) |
Source: | UPCommons. Portal del coneixement obert de la UPC; Universitat Politècnica de Catalunya (UPC); Nath, R, Hose, K, Pedersen, T B & Romero, O 2017, 'SETL: A programmable semantic extract-transform-load framework for semantic data warehouses', Information Systems, vol. 68, pp. 17-43. https://doi.org/10.1016/j.is.2017.01.005; Recercat. Dipósit de la Recerca de Catalunya |
DOI: | 10.1016/j.is.2017.01.005 |
Description: | To make better decisions in business analytics, organizations increasingly use external structured, semi-structured, and unstructured data in addition to their (mostly structured) internal data. Current Extract-Transform-Load (ETL) tools are not suitable for this “open world scenario” because they do not consider semantic issues in the integration process. Current ETL tools neither support processing semantic data nor create a semantic Data Warehouse (DW), a repository of semantically integrated data. This paper describes our programmable Semantic ETL (SETL) framework. SETL builds on Semantic Web (SW) standards and tools and supports developers by offering a number of powerful modules, classes, and methods for (dimensional and semantic) DW constructs and tasks. Thus, it supports semantic data sources in addition to traditional data sources, semantic integration, and creating or publishing a semantic (multidimensional) DW in terms of a knowledge base. A comprehensive experimental evaluation comparing SETL to a solution made with traditional tools (requiring much more hand-coding) on a concrete use case shows that SETL provides better programmer productivity, knowledge base quality, and performance. |
Database: | OpenAIRE |
External link: |