Learning without Looking: Similarity Preserving Hashing and Its Potential for Machine Learning in Privacy Critical Domains

Author: Eleks, Marian; Rebstadt, Jonas; Fukas, Philipp; Thomas, Oliver
Language: English
Year of publication: 2022
Subject:
DOI: 10.18420/inf2022_16
Description: Machine Learning is frequently ranked as one of the most promising technologies in several application domains, but it falls short when the data necessary for training is privacy-sensitive and thus cannot be used. We address this problem by extending the field of Privacy Aware Machine Learning: following a Design Science Research approach, we apply Similarity Preserving Hashing algorithms to the task of data anonymization. In this endeavor, novel anonymization algorithms that enable Machine Learning on anonymized data are designed, implemented, and evaluated. Throughout the Design Science Research process, we present a collection of issues and requirements for Privacy Aware Machine Learning algorithms, along with three Similarity Preserving Hashing-based algorithms to fulfil them. A metric-based comparison of established and novel algorithms, as well as newly arising opportunities for Machine Learning on sensitive data, are also added to the current knowledge base of Information Systems research.
Database: OpenAIRE