A Multi-dimensional Index Structure Based on Improved VA-file and CAN in the Cloud

Authors: Chun-Ju Sun, Xiao-Long Xu, Dengyin Zhang, Chunling Cheng
Publication year: 2014
Subject:
Source: International Journal of Automation and Computing. 11:109-117
ISSN: 1751-8520, 1476-8186
DOI: 10.1007/s11633-014-0772-y
Description: Current cloud computing systems rely on simple key-value data processing, which cannot support similarity search effectively because it lacks efficient index structures. In addition, as dimensionality increases, existing tree-like index structures suffer from the "curse of dimensionality". This paper proposes a novel indexing scheme, VF-CAN, which integrates a routing protocol based on the content addressable network (CAN) with an improved vector approximation file (VA-file) index. The scheme has two index levels: a global index and a local index. The local index, the VAK-file, is built over the data in each storage node; it is the result of k-means clustering of the VA-file approximation vectors according to their degree of proximity. Each cluster forms a separate local index file that stores the approximation vectors belonging to that cluster, and the vector of each cluster center is stored in the cluster center information file of the corresponding storage node. In the global index, the storage nodes are organized into a CAN overlay network, and, to reduce computation cost, only the clustering information of the local index is published to the overlay network through the CAN interface. Experimental results show that VF-CAN reduces index storage space and improves query performance.
Database: OpenAIRE
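
The local index described in the abstract (VA-file approximation vectors grouped by k-means, one index file per cluster, plus a file of cluster centers) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the uniform grid over [0, 1), the 2-bit default, the plain Lloyd's iteration, and the in-memory lists standing in for index files are all assumptions made for illustration.

```python
import random


def va_approximation(vector, bits=2, lo=0.0, hi=1.0):
    """Quantize each coordinate into one of 2**bits uniform cells.

    Assumption: a uniform grid over [lo, hi); values outside are clamped.
    """
    cells = 1 << bits
    width = (hi - lo) / cells
    return tuple(min(cells - 1, max(0, int((x - lo) / width))) for x in vector)


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centers):
    """Index of the center closest to point."""
    return min(range(len(centers)), key=lambda c: squared_distance(point, centers[c]))


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means over approximation vectors (hypothetical helper)."""
    distinct = sorted(set(points))
    if k > len(distinct):
        raise ValueError("k exceeds the number of distinct approximation vectors")
    rng = random.Random(seed)
    # Seed the centers with distinct points so no two centers start identical.
    centers = [tuple(float(x) for x in p) for p in rng.sample(distinct, k)]
    for _ in range(iterations):
        labels = [nearest(p, centers) for p in points]
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, [nearest(p, centers) for p in points]


def build_local_index(vectors, k, bits=2):
    """Return (cluster centers, {cluster id: [(vector id, approximation), ...]}).

    Each list stands in for one per-cluster local index file; the centers
    list stands in for the cluster center information file.
    """
    approximations = [va_approximation(v, bits) for v in vectors]
    centers, labels = kmeans(approximations, k)
    cluster_files = {c: [] for c in range(k)}
    for vector_id, (approx, label) in enumerate(zip(approximations, labels)):
        cluster_files[label].append((vector_id, approx))
    return centers, cluster_files
```

In a setup like the one the abstract outlines, only the `centers` list would be published to the overlay network, while the per-cluster lists would stay on the storage node. How VF-CAN encodes and routes that information through CAN is not stated in this record.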