The Multi-Tree Cubing algorithm for computing iceberg cubes
Authors: | Howard J. Hamilton, Kamran Karimi, Xing Li, Liqiang Geng |
---|---|
Year of publication: | 2008 |
Subject: | |
Source: | Journal of Intelligent Information Systems. 33:179-208 |
ISSN: | 1573-7675 0925-9902 |
DOI: | 10.1007/s10844-008-0074-3 |
Description: | The computation of data cubes is one of the most expensive operations in on-line analytical processing (OLAP). To improve efficiency, an iceberg cube represents only the cells whose aggregate values are above a given threshold (minimum support). Top-down and bottom-up approaches are used to compute the iceberg cube for a data set, but both have performance limitations. In this paper, a new algorithm, called Multi-Tree Cubing (MTC), is proposed for computing an iceberg cube. The Multi-Tree Cubing algorithm is an integrated top-down and bottom-up approach. Overall control is handled in a top-down manner, so MTC features shared computation. By processing the orderings in the opposite order from the Top-Down Computation algorithm, the MTC algorithm is able to prune attributes. The Bottom Up Computation (BUC) algorithm and its variations also perform pruning by relying on the processing of intermediate partitions. The MTC algorithm, however, prunes without processing such partitions. The MTC algorithm is based on a specialized type of prefix tree data structure, called an Attribute-Partition tree (AP-tree), consisting of attribute and partition nodes. The AP-tree facilitates fast, in-memory sorting and APRIORI-like pruning. We report on five series of experiments, which confirm that MTC is consistently as fast as or faster than BUC, while finding the same iceberg cubes. |
Database: | OpenAIRE |
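
To make the terms in the description concrete, the following is a minimal sketch of an iceberg cube computed with a simple bottom-up, BUC-style recursion and APRIORI-like pruning. It is not the MTC algorithm and does not build an AP-tree; the function name `iceberg_cube`, the use of `COUNT` as the aggregate, the `'*'` marker for an aggregated dimension, and the sample rows are all assumptions made for illustration.

```python
from collections import defaultdict


def iceberg_cube(rows, num_dims, min_support):
    """Return {cell: count} for every cell whose count >= min_support.

    A cell is a tuple of length num_dims; '*' marks an aggregated dimension.
    Illustrative bottom-up sketch only (hypothetical helper, not MTC).
    """
    result = {}

    def recurse(partition, cell, start_dim):
        # The caller guarantees this partition meets min_support: emit it.
        result[tuple(cell)] = len(partition)
        for d in range(start_dim, num_dims):
            # Partition the current rows on dimension d.
            groups = defaultdict(list)
            for row in partition:
                groups[row[d]].append(row)
            for value, group in groups.items():
                # APRIORI-like pruning: a group below the threshold cannot
                # contain any more specific cell that reaches the threshold.
                if len(group) >= min_support:
                    cell[d] = value
                    recurse(group, cell, d + 1)
            cell[d] = '*'  # restore before moving to the next dimension

    if len(rows) >= min_support:
        recurse(rows, ['*'] * num_dims, 0)
    return result


# Hypothetical two-dimensional data set.
rows = [("a", "x"), ("a", "x"), ("a", "y"), ("b", "x")]
cube = iceberg_cube(rows, num_dims=2, min_support=2)
for cell, count in sorted(cube.items()):
    print(cell, count)
```

With a minimum support of 2, the cells `('b', '*')`, `('*', 'y')`, and `('a', 'y')` are dropped because only one row supports each of them. Note that this sketch prunes only after building each intermediate partition, which is the cost the description says MTC avoids.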