Authors: |
Uhlmann, Jeffrey; Zuniga, Miguel R. |
Publication year: |
2021 |
Subject: |
|
Document type: |
Working Paper |
Description: |
This paper presents the Cascaded Metric Tree (CMT) for efficiently answering metric search queries over a dataset of N objects. The CMT stores extra information that permits query algorithms to exploit all distance calculations performed along each path in the tree for pruning purposes. In addition to improving standard metric range (ball) query algorithms, we present a new algorithm that exploits the CMT's cascaded information to achieve near-optimal performance for k-nearest neighbor (kNN) queries. We demonstrate the performance advantage of CMT over classical metric search structures on synthetic datasets of up to 10 million objects and on the 564K Swiss-Prot protein sequence dataset containing over 200 million amino acids. As a supplement to the paper, we provide reference implementations of the empirically examined algorithms to encourage improvements and further applications of CMT to practical scientific and engineering problems. |
Database: |
arXiv |
External link: |
|