Data partition optimisation for column-family NoSQL databases
Author: | Li Yung Ho, Meng Ju Hsieh, Pangfeng Liu, Jan Jan Wu |
---|---|
Year of publication: | 2017 |
Subject: | |
Source: | International Journal of Big Data Intelligence. 4:263 |
ISSN: | 2053-1397 2053-1389 |
DOI: | 10.1504/ijbdi.2017.10006848 |
Description: | Data conversion has become an emerging topic in the Big Data era. To cope with rapid data growth, legacy or existing relational databases need to be converted into NoSQL column-family databases in order to achieve better scalability. The conversion from SQL to NoSQL databases requires combining small, normalised SQL data tables into larger NoSQL data tables, a process called denormalisation. A challenging issue in data conversion is how to group the denormalised columns of a large data table into 'families' in order to ensure good query-processing performance. In this paper, we propose an efficient heuristic algorithm, the graph-based partition algorithm (GPA), to address this problem. We use the TPC-C and TPC-H benchmarks to demonstrate that the column families produced by GPA are very efficient for large-scale data processing. |
Database: | OpenAIRE |
External link: |