A Sampling-Based Method for Highly Efficient Privacy-Preserving Data Publication
Author: | Ling Tian, Guoming Lu, Xia Wang, Jingyuan Duan, Xu Zheng |
---|---|
Year of publication: | 2021 |
Subject: |
Technology
Data processing; Computer Networks and Communications; Computer science; Wireless network; Electrical & electronic engineering; Bandwidth (signal processing); Sampling (statistics); Networking & telecommunications; Engineering and technology; Histogram; Encoding (memory); Telecommunication (TK5101-6720); Electrical engineering, electronic engineering, information engineering; Differential privacy; Data mining; Electrical and Electronic Engineering; Mobile device; Information Systems |
Source: | Wireless Communications and Mobile Computing, Vol 2021 (2021) |
ISSN: | 1530-8677 1530-8669 |
DOI: | 10.1155/2021/6648775 |
Description: | Publishing data from multiple contributors has long been considered a fundamental task for data processing in various domains, and it is a prominent prerequisite for enabling AI techniques in wireless networks. With the emergence of diversified smart devices and applications, data held by individuals have become more pervasive and nontrivial to publish. First, the data are more private and sensitive, as they cover every aspect of daily life, from incoming data to fitness data. Second, publishing such data is bandwidth-consuming, as they are likely to be stored on mobile devices. Local differential privacy has been considered a novel paradigm for such distributed data publication. However, existing works mostly require contents to be encoded into a vector space for publication, which is still costly in network resources. Therefore, this work proposes a novel framework for highly efficient privacy-preserving data publication. Specifically, two sampling-based algorithms are proposed for publishing histograms, which are an important statistic for data analysis. The first algorithm applies a bit-level sampling strategy both to reduce the overall bandwidth and to balance the cost among contributors. The second algorithm allows consumers to adjust their focus on different intervals and allocates the sampling ratios accordingly to optimize the overall performance. Both the analysis and the validation on real-world data traces demonstrate the advantages of the proposed algorithms. |
Database: | OpenAIRE |
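
The description mentions bit-level sampling for histogram publication under local differential privacy but gives no algorithmic detail. The following is a minimal sketch of the general idea only, not the authors' algorithm: each contributor samples one position of a one-hot encoding of their bin and sends that single bit through binary randomized response, and the aggregator debiases the reports. The function names (`perturb_bit`, `make_report`, `estimate_histogram`), the uniform sampling of positions, and the use of randomized response are all assumptions made for illustration.

```python
import math
import random


def perturb_bit(bit, epsilon, rng):
    """Binary randomized response (assumed mechanism, not taken from the paper).

    The true bit is kept with probability e^eps / (1 + e^eps), else flipped.
    """
    keep = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < keep else 1 - bit


def make_report(bin_index, num_bins, epsilon, rng):
    """Contributor side: sample one position of the one-hot vector uniformly
    and send only that position plus one perturbed bit."""
    position = rng.randrange(num_bins)
    true_bit = 1 if position == bin_index else 0
    return position, perturb_bit(true_bit, epsilon, rng)


def estimate_histogram(reports, num_bins, epsilon):
    """Aggregator side: debias the noisy bits and scale to estimated counts.

    Requires epsilon > 0, otherwise the debiasing denominator is zero.
    """
    keep = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    total = len(reports)
    ones = [0] * num_bins  # noisy 1-bits received per position
    seen = [0] * num_bins  # reports received per position
    for position, bit in reports:
        seen[position] += 1
        ones[position] += bit

    estimates = []
    for j in range(num_bins):
        if seen[j] == 0:
            estimates.append(0.0)  # no report sampled this position
            continue
        observed = ones[j] / seen[j]
        # Invert E[observed] = f * keep + (1 - f) * (1 - keep) for f.
        fraction = (observed - (1.0 - keep)) / (2.0 * keep - 1.0)
        estimates.append(total * fraction)
    return estimates
```

In this sketch every contributor uploads one position index and one bit instead of a full vector of `num_bins` bits, which is where the bandwidth saving would come from; the price is higher variance, since each bin is estimated from only a fraction of the reports. Individual estimates can be negative or exceed the population size and may need post-processing.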