Delay-sensitive approaches for anonymizing numerical streaming data

Authors: Hessam Zakerzadeh, Sylvia L. Osborn
Year of publication: 2013
Subject:
Source: International Journal of Information Security. 12:423-437
ISSN: 1615-5270
1615-5262
Description: Streaming data are widely used in today's world. Data come from different sources in streams and must be processed online and with minimum delay. These data streams can contain confidential data, such as customers' purchase information, and need to be mined in order to reveal other useful information, such as customers' purchase patterns. Privacy preservation throughout these processes plays a crucial role. K-anonymity is a well-known technique for preserving privacy. The principal issues in k-anonymity are information loss and running time. Although some of the existing k-anonymity techniques are able to generate anonymized data with acceptable information loss, their main drawback is that they are very time-consuming and are not applicable in a streaming context, since streaming data are usually very sensitive to delay and need to be processed quickly. In [32], we proposed a cluster-based k-anonymity algorithm called fast anonymizing algorithm for numerical streaming data (FAANST), which can anonymize numerical streaming data quickly while providing an admissible information loss. The main drawback of FAANST is that some tuples may remain in the system for a long time and are output when they might be considered to have expired. In this paper, we propose two extensions for FAANST: a passive solution and a proactive solution. Both put a soft deadline, called delay, on the time each tuple can stay in the system; if a tuple passes this deadline, the algorithms force the tuple to be output. The proactive solution goes one step further and uses a simple heuristic function to predict when a tuple in the system may expire, outputting the tuple if it will expire in the next round of the algorithm's execution.
Database: OpenAIRE
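
The difference between the passive and proactive deadline checks described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the heuristic function or how time is measured, so the sketch assumes that time is counted in algorithm rounds and that the proactive heuristic is simply "age plus one round exceeds delay". The names StreamTuple, passive_expired, and proactive_expiring are hypothetical, and the clustering step of FAANST is omitted entirely.

```python
from dataclasses import dataclass


@dataclass
class StreamTuple:
    """Hypothetical buffered tuple: a numeric value plus its arrival round."""
    value: float
    arrival_round: int


def passive_expired(buffer, current_round, delay):
    """Passive check: tuples that have already passed the soft deadline.

    Assumption: a tuple's age is the number of rounds since it arrived.
    """
    return [t for t in buffer if current_round - t.arrival_round > delay]


def proactive_expiring(buffer, current_round, delay):
    """Proactive check: tuples that have passed the deadline or would pass
    it in the next round.

    Assumption: the 'simple heuristic' is stood in for by age + 1 > delay;
    the paper's actual heuristic function is not given in the abstract.
    """
    return [t for t in buffer if (current_round + 1) - t.arrival_round > delay]


# Three tuples that arrived in rounds 0, 2 and 4; we are now in round 5.
buffer = [StreamTuple(10.0, 0), StreamTuple(20.0, 2), StreamTuple(30.0, 4)]
late = passive_expired(buffer, current_round=5, delay=3)
soon = proactive_expiring(buffer, current_round=5, delay=3)
print([t.arrival_round for t in late])
print([t.arrival_round for t in soon])
```

With a delay of 3 rounds, the passive check flags only the tuple that has already overstayed, while the proactive check also flags the tuple that would overstay after one more round, so it can be output before it expires.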