Abstrakt: |
To address the shortcomings of existing distributed top-k query algorithms, a novel top-k algorithm, named ECHT, is proposed for massive distributed data. To account for the data distribution, ECHT builds an error-limited histogram. This histogram avoids poor performance on unevenly distributed data, and it improves the accuracy of the threshold value, which further reduces network bandwidth consumption. In addition, ECHT performs early clipping: clipping before large amounts of data are transmitted improves performance by avoiding much useless data transmission. Experiments on real datasets demonstrate the viability and superior performance of the new algorithm. [ABSTRACT FROM AUTHOR] |