Research on Data Routing Strategy of Deduplication in Cloud Environment

Authors: Qinlu He, Fan Zhang, Genqing Bian, Weiqi Zhang, Dongli Duan, Zhen Li, Chen Chen
Language: English
Publication year: 2022
Subject:
Source: IEEE Access, Vol 10, Pp 9529-9542 (2022)
Document type: article
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2021.3139757
Description: The application of data deduplication technology reduces the demand for data storage and improves resource utilization. Compared with a single node, whose storage and computing capacity are limited, cluster data deduplication technology has great advantages. However, cluster data deduplication also introduces new problems: a reduced deduplication rate and load balancing among storage nodes. A data routing strategy can strike a good balance between deduplication rate and load balancing. Therefore, this paper proposes a data routing strategy based on a distributed Bloom filter. 1) The Superchunk is used as the basic unit of data routing to improve system throughput. According to Broder’s theorem, the k smallest fingerprints are selected as the Superchunk features and sent to the storage nodes. The optimal node is selected as the routing node by matching the features against the Bloom filter maintained in the memory of each storage node and by taking the storage capacity of the node into account. 2) A system prototype is designed and implemented. The specific parameters of the various routing strategies are obtained through experiments, and the routing strategy proposed in this paper is tested. The theoretical analysis and experimental results demonstrate the feasibility of the proposed strategy. Compared with the other routing strategies, our method improves the deduplication rate by 3%, reduces the communication query overhead by more than 36%, and improves the load balancing degree of the storage system.
Database: Directory of Open Access Journals
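
The routing idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no parameters or scoring rule, so everything named here (`superchunk_features`, `BloomFilter`, `StorageNode`, `route`, the filter size, the number of hash functions, and the tie-break by stored-superchunk count) is an assumption chosen for illustration. The sketch hashes each chunk, keeps the k smallest fingerprints as Superchunk features (the min-wise sampling that Broder's theorem motivates), checks them against each node's in-memory Bloom filter, and routes to the node with the most matches, preferring the less loaded node on a tie.

```python
import hashlib


def fingerprint(chunk: bytes) -> int:
    """SHA-1 fingerprint of a chunk, as an integer so fingerprints can be ordered."""
    return int.from_bytes(hashlib.sha1(chunk).digest(), "big")


def superchunk_features(chunks, k):
    """Return the k numerically smallest distinct chunk fingerprints."""
    return sorted({fingerprint(c) for c in chunks})[:k]


class BloomFilter:
    """Small Bloom filter; the sizes are illustrative assumptions."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: int):
        data = item.to_bytes(20, "big")  # SHA-1 fingerprints fit in 20 bytes
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: int) -> None:
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item: int) -> bool:
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


class StorageNode:
    """Hypothetical storage node that keeps its Bloom filter in memory."""

    def __init__(self, name):
        self.name = name
        self.bloom = BloomFilter()
        self.used = 0  # stored superchunks, a stand-in for used capacity

    def match_count(self, features):
        return sum(1 for f in features if f in self.bloom)

    def store(self, features):
        for f in features:
            self.bloom.add(f)
        self.used += 1


def route(nodes, chunks, k=4):
    """Send a superchunk to the node with the most feature matches.

    Ties are broken in favor of the node with less stored data
    (an assumed rule, not taken from the paper).
    """
    features = superchunk_features(chunks, k)
    best = max(nodes, key=lambda n: (n.match_count(features), -n.used))
    best.store(features)
    return best
```

A short usage example: routing the same superchunk twice should return the same node, because its features are already present in that node's Bloom filter.

```python
nodes = [StorageNode("node-a"), StorageNode("node-b")]
superchunk = [b"chunk-%d" % i for i in range(10)]
first = route(nodes, superchunk)
again = route(nodes, superchunk)
print(first.name == again.name)
```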