DQ2S - A framework for data quality-aware information management
Authors: | Pedro Sampaio, Chao Dong, Sandra Sampaio |
---|---|
Language: | English |
Year of publication: | 2015 |
Subject: |
SQL; Computer science; View; Relational database; Big data; Data definition language; Query language; Query optimization; computer.software_genre; Database design; Data modeling; Search-oriented architecture; Query expansion; Relational database management system; Artificial Intelligence; Web query classification; Data control language; Information management; Data quality; Query language extensions; Data profiling; Decision support systems; Query by Example; computer.programming_language; Database model; Information retrieval; Web search query; business.industry; Data manipulation language; General Engineering; Computer Science Applications; Data model; Object Query Language; Relational model; Sargable; Data mining; business; computer; RDF query language |
Source: | Sampaio, S F M, Dong, C & Sampaio, P 2015, 'DQ2S - A framework for data quality-aware information management', Expert Systems with Applications, vol. 42, no. 21, pp. 8304-8326. https://doi.org/10.1016/j.eswa.2015.06.050 |
DOI: | 10.1016/j.eswa.2015.06.050 |
Description: | Design of a data quality-aware information management framework and system. Users measure data quality based on an extensible set of data profiling algorithms. Query language, system architecture and heuristic optimization approach developed. System design based on seamless extensions to SQL and relational database systems. Applied in e-Business scenarios and potential for big data profiling discussed. This paper describes the design and implementation of the Data Quality Query System (DQ2S), a query processing framework and tool that incorporates data quality profiling functionality into the processing of queries involving quality-aware query language extensions. DQ2S supports the combination of performance-oriented and quality-oriented query optimizations, and provides a query processing platform on which advanced data profiling queries can be formulated using well-established query language constructs commonly used to interact with relational database management systems. DQ2S encompasses a declarative query language and a data model that give users the capability to express constraints on the quality of query results and to query quality-related information; a set of algebraic operators for manipulating data quality-related information; and optimization heuristics. The proposed query language and algebra represent seamless extensions to SQL and relational database engines, respectively. The constructs of the proposed data model are implemented at the user's view level and are internally mapped into relational model constructs. The quality-aware extensions and features are useful when users need to assess the quality of relational data sets and define quality constraints for acceptable data before using candidate data sources in decision support systems or conducting big data analytical tasks. |
Database: | OpenAIRE |
External link: |