Data Swapping for Private Information Sharing of Web Search Logs
Author: | Kato Mivule |
---|---|
Year of publication: | 2017 |
Subject: |
Government; Information privacy; business.industry; Computer science; 05 social sciences; Big data; 050301 education; 02 engineering and technology; Private sector; Computer security; computer.software_genre; Set (abstract data type); Data sharing; 0202 electrical engineering, electronic engineering, information engineering; General Earth and Planetary Sciences; 020201 artificial intelligence & image processing; Confidentiality; business; 0503 education; Private information retrieval; computer; General Environmental Science |
Source: | Procedia Computer Science. 114:149-158 |
ISSN: | 1877-0509 |
DOI: | 10.1016/j.procs.2017.09.017 |
Description: | With the increasing number of sophisticated cyber attacks on both government and private infrastructure, cybersecurity data sharing is critical for the advancement of collaborative research among entities in government, the private sector, and academia. Recently, the US Congress passed the Cyber Intelligence Sharing and Protection Act as a framework for data sharing between various entities. Nevertheless, this development raises the issue of trust between the collaborating parties, since shared data could be revealing. First, due to the sensitive and confidential nature of the data involved, entities would have to employ various anonymization techniques to meet legal requirements, in compliance with the confidentiality policies of both their own organizations and the federal government. Secondly, sharing the data without a privatization process could make the entities involved vulnerable to insider and inference attacks. For instance, an entity sharing data on cyber attacks might accidentally reveal a sensitive network topology to an untrusted collaborator. As a contribution, we propose a modest but effective data privacy enhancement heuristic: a targeted 2k basic data swapping of individual web search log records. In this heuristic, if an individual has a set of x records in their web search log set A, those records are first swapped within set A, then swapped again with another individual's y records in set B. Our preliminary results show that data swapping is effective for big data and that it would be difficult to trace the original issuer of the queries in a given large dataset of web search logs, thus providing some level of confidentiality. |
Database: | OpenAIRE |
External link: |
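
The two-stage swap described in the abstract (records swapped within one individual's set A, then exchanged with another individual's set B) can be illustrated with a minimal sketch. This is not the author's implementation: the abstract does not say how records are selected or what "2k" denotes precisely, so the function names (`swap_within`, `swap_between`, `swap_logs`), the parameter `k`, and the choice of uniformly random positions are all assumptions made for illustration.

```python
import random


def swap_within(records, rng):
    """Stage 1 (assumed): reorder one individual's records at random."""
    shuffled = list(records)
    rng.shuffle(shuffled)
    return shuffled


def swap_between(set_a, set_b, k, rng):
    """Stage 2 (assumed): exchange k randomly chosen records between two sets."""
    a, b = list(set_a), list(set_b)
    # Never try to swap more records than the smaller set holds.
    k = min(k, len(a), len(b))
    idx_a = rng.sample(range(len(a)), k)
    idx_b = rng.sample(range(len(b)), k)
    for i, j in zip(idx_a, idx_b):
        a[i], b[j] = b[j], a[i]
    return a, b


def swap_logs(set_a, set_b, k, seed=None):
    """Apply the within-set swap to both sets, then the between-set swap."""
    rng = random.Random(seed)
    return swap_between(swap_within(set_a, rng), swap_within(set_b, rng), k, rng)


if __name__ == "__main__":
    # Hypothetical query logs for two individuals.
    log_a = ["weather boston", "flu symptoms", "cheap flights"]
    log_b = ["python tutorial", "car insurance", "pizza near me"]
    new_a, new_b = swap_logs(log_a, log_b, k=2, seed=0)
    print(new_a)
    print(new_b)
```

In this sketch each set keeps its original size and the combined pool of queries is unchanged; only the attribution of `k` queries per individual moves, which is what makes the original issuer harder to trace.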