Hash Bit Selection for Nearest Neighbor Search
Autor: | Junfeng He, Xianglong Liu, Shih-Fu Chang |
---|---|
Rok vydání: | 2017 |
Předmět: |
Theoretical computer science
Computer science Nearest neighbor search Hash function 02 engineering and technology 010501 environmental sciences Linear hashing Rolling hash 01 natural sciences K-independent hashing Locality-sensitive hashing Open addressing 0202 electrical engineering electronic engineering information engineering 0105 earth and related environmental sciences Universal hashing Dynamic perfect hashing 2-choice hashing Computer Graphics and Computer-Aided Design Hash table Hopscotch hashing Cuckoo hashing Locality preserving hashing 020201 artificial intelligence & image processing Feature hashing Perfect hash function Extendible hashing Software Double hashing |
Zdroj: | IEEE Transactions on Image Processing. 26:5367-5380 |
ISSN: | 1941-0042 1057-7149 |
DOI: | 10.1109/tip.2017.2695895 |
Popis: | To overcome the barrier of storage and computation when dealing with gigantic-scale data sets, compact hashing has been studied extensively to approximate the nearest neighbor search. Despite the recent advances, critical design issues remain open in how to select the right features, hashing algorithms, and/or parameter settings. In this paper, we address these by posing an optimal hash bit selection problem, in which an optimal subset of hash bits are selected from a pool of candidate bits generated by different features, algorithms, or parameters. Inspired by the optimization criteria used in existing hashing algorithms, we adopt the bit reliability and their complementarity as the selection criteria that can be carefully tailored for hashing performance in different tasks. Then, the bit selection solution is discovered by finding the best tradeoff between search accuracy and time using a modified dynamic programming method. To further reduce the computational complexity, we employ the pairwise relationship among hash bits to approximate the high-order independence property, and formulate it as an efficient quadratic programming method that is theoretically equivalent to the normalized dominant set problem in a vertex- and edge-weighted graph. Extensive large-scale experiments have been conducted under several important application scenarios of hash techniques, where our bit selection framework can achieve superior performance over both the naive selection methods and the state-of-the-art hashing algorithms, with significant accuracy gains ranging from 10% to 50%, relatively. |
Databáze: | OpenAIRE |
Externí odkaz: |