Comparing columnar, row and array DBMSs to process recursive queries on graphs
Autor: | Carlos Ordonez, Wellington Cabrera, Achyuth Gurram |
---|---|
Rok vydání: | 2017 |
Předmět: |
SQL
Theoretical computer science Computer science business.industry Big data InformationSystems_DATABASEMANAGEMENT Transitive closure 02 engineering and technology Relational operator Query optimization Hardware and Architecture Reachability 020204 information systems 0202 electrical engineering electronic engineering information engineering 020201 artificial intelligence & image processing Sargable Adjacency matrix business computer Software Information Systems computer.programming_language |
Zdroj: | Information Systems. 63:66-79 |
ISSN: | 0306-4379 |
DOI: | 10.1016/j.is.2016.04.006 |
Popis: | Analyzing graphs is a fundamental problem in big data analytics, for which DBMS technology does not seem competitive. On the other hand, SQL recursive queries are a fundamental mechanism to analyze graphs in a DBMS, whose processing and optimization are significantly harder than traditional SPJ queries. Columnar DBMSs are a new faster class of database system, with significantly different storage and query processing mechanisms compared to row DBMSs, still the dominating technology. With that motivation in mind, we study the optimization of recursive queries on a columnar DBMS focusing on two fundamental and complementary graph problems: transitive closure and adjacency matrix multiplication. From a query processing perspective we consider the three fundamental relational operators: selection, projection and join (SPJ), where projection subsumes SQL group-by aggregation. We present comprehensive experiments comparing recursive query processing on columnar, row and array DBMSs to analyze large graphs with different shape and density. We study the relative impact of query optimizations and we compare raw speed of DBMSs to evaluate recursive queries on graphs. Results confirm classical query optimizations that keep working well in a columnar DBMS, but their relative impact is different. Most importantly, a columnar DBMS with tuned query optimization is uniformly faster than row and array systems to analyze large graphs, regardless of their shape, density and connectivity. On the other hand, there is no clear winner between the row and array DBMSs. |
Databáze: | OpenAIRE |
Externí odkaz: |