Online Nearest Neighbor Search in Binary Space

Autor: Hassan Ashtiani, Sepehr Eghbali, Ladan Tahvildari
Rok vydání: 2017
Předmět:
Zdroj: ICDM
Popis: We revisit the K Nearest Neighbors (KNN) problem in large binary datasets which is of major importance in several applied areas. The goal is to find the K nearest items in a dataset to a query point where both the query and the items lie in the Hamming cube. The problem is addressed in its online setting, that is, data items are inserted sequentially into the dataset. To accommodate efficient similarity search and fast insertion of new items, we propose a data structure that partitions the feature space based on the Hamming weights of the binary codes and their substrings. Empirical evaluations on a large-scale dataset show significant speedup over the linear scan baseline.
Databáze: OpenAIRE