Quantization-based hashing: a general framework for scalable image and video retrieval

Author:	Lianli Gao, Nicu Sebe, Li Liu, Jingkuan Song, Xiaofeng Zhu
Publication year:	2018
Subject:	Computer science; Hash function; Hashing; Multimedia retrieval; Pseudo labels; Software; Signal Processing; Artificial Intelligence; 02 engineering and technology; Linear hashing; K-independent hashing; Locality-sensitive hashing; Open addressing; 0202 electrical engineering, electronic engineering, information engineering; Computer Science::Databases; Universal hashing; Dynamic perfect hashing; Vector quantization; 020207 software engineering; Pattern recognition; Hamming distance; 2-choice hashing; Hash table; Hopscotch hashing; Cuckoo hashing; Locality preserving hashing; 020201 artificial intelligence & image processing; Computer Vision and Pattern Recognition; Feature hashing; Perfect hash function; Extendible hashing; Double hashing
Source:	Pattern Recognition. 75:175-187
ISSN:	0031-3203
Description:	As far as we know, we are the first to propose a general framework that incorporates quantization-based methods into conventional similarity-preserving hashing in order to improve the effectiveness of hashing methods. In theory, any quantization method can be adopted to reduce the quantization error of any similarity-preserving hashing method and thereby improve its performance. This framework can be applied to both unsupervised and supervised hashing. We experimentally obtained the best performance compared to state-of-the-art supervised and unsupervised hashing methods on six popular datasets. We successfully show it to work on the huge SIFT1B dataset (1 billion data points) by utilizing graph approximation and out-of-sample extension.
Nowadays, due to the exponential growth of user-generated images and videos, there is an increasing interest in learning-based hashing methods. In computer vision, the hash functions are learned in such a way that the hash codes preserve essential properties of the original space (or label information), so that the Hamming distance between hash codes approximates the data similarity. Vector quantization methods, on the other hand, quantize the data into different clusters based on the criterion of minimal quantization error, and then perform the search using look-up tables. While hashing methods using Hamming distance achieve faster search speed, they are often less accurate than quantization methods with the same code length, which benefit from lower quantization error and more flexible distance lookups. To improve the effectiveness of hashing methods, in this work we propose Quantization-based Hashing (QBH), a general framework that incorporates the advantages of quantization error reduction methods into conventional property-preserving hashing methods. The learned hash codes simultaneously preserve the properties of the original space and reduce the quantization error, and thus can achieve better performance.
Furthermore, the hash functions and a quantizer can be jointly learned and iteratively updated in a unified framework, which can be readily used to generate hash codes or quantize new data points. Importantly, QBH is a generic framework that can be integrated with different property-preserving hashing methods and quantization strategies, and we apply QBH to both unsupervised and supervised hashing models as showcases in this paper. Experimental results on three large-scale unlabeled datasets (i.e., SIFT1M, GIST1M, and SIFT1B), three labeled datasets (i.e., ESPGAME, IAPRTC and MIRFLICKR) and one video dataset (UQ_VIDEO) demonstrate the superior performance of our QBH over existing unsupervised and supervised hashing methods.
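The tension the abstract describes between Hamming-distance search and quantization error can be illustrated with a minimal sketch. This is not the QBH algorithm from the paper. It assumes a plain sign-threshold binarization, and the function names and sample vectors are hypothetical, chosen only for illustration.

```python
def binarize(projection):
    """Sign-threshold a real-valued projection into a {-1, +1} code (assumed scheme)."""
    return [1 if v >= 0 else -1 for v in projection]


def hamming_distance(code_a, code_b):
    """Count the positions at which two equal-length codes differ."""
    return sum(1 for a, b in zip(code_a, code_b) if a != b)


def quantization_error(projection, code):
    """Squared error between a real-valued projection and its binary code."""
    return sum((v - b) ** 2 for v, b in zip(projection, code))


# Hypothetical projections of two nearby points; the third coordinate lies
# close to the threshold, so binarization separates them there.
x = [0.9, -1.1, 0.05, -0.8]
y = [1.0, -0.9, -0.05, -1.2]

bx, by = binarize(x), binarize(y)

print("codes:", bx, by)
print("Hamming distance:", hamming_distance(bx, by))
print("quantization error of x:", quantization_error(x, bx))
```

In this sketch nearly all of the quantization error of `x` comes from the coordinate near zero, which is also the bit on which the two codes disagree. Reducing that error is the kind of improvement the abstract attributes to combining quantization with similarity-preserving hashing.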
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::5e0116796698493b0bd3b48380d9d37c https://doi.org/10.1016/j.patcog.2017.03.021