Privacy-preserving statistical and machine learning methods under fully homomorphic encryption

Author: Esperança, Pedro M.
Year of publication: 2016
Subject:
Document type: Electronic Thesis or Dissertation
Description: Advances in technology have now made it possible to monitor heart rate, body temperature and sleep patterns; continuously track movement; record brain activity; and sequence DNA in the jungle, all using devices that fit in the palm of a hand. These and other recent developments have sparked interest in privacy-preserving methods: computational approaches which are able to utilise the data without leaking subjects' personal information. Classical encryption techniques have been used very successfully to protect data in transit and in storage. However, the process of encrypting data also renders it unusable in computation. Recently developed fully homomorphic encryption (FHE) techniques improve on this substantially. Unlike classical methods, which require the data to be decrypted prior to computation, homomorphic methods allow data to be stored or transferred securely and, at the same time, used in computation. However, FHE imposes serious constraints on computation, both arithmetic (e.g., no divisions can be performed) and computational (e.g., multiplications become much slower), rendering traditional statistical algorithms inadequate. In this thesis we develop statistical and machine learning methods for outsourced, privacy-preserving analysis of sensitive information under FHE. Specifically, we tackle two problems: (i) classification, using a semiparametric approach based on the naive Bayes assumption and modeling the class decision boundary directly using an approximation to univariate logistic regression; (ii) regression, using two approaches: an accelerated method for least squares estimation based on gradient descent, and a cooperative framework for Bayesian regression based on recursive Bayesian updating in a multi-party setting. Taking into account the constraints imposed by FHE, we analyse the potential of different algorithmic approaches to provide tractable solutions to these problems and give details on several computational costs and performance trade-offs.
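To illustrate why the "no divisions" constraint matters for gradient-descent least squares, the following is a minimal sketch and not the thesis's own algorithm. It assumes integer-valued data and a step size of the form 1/nu, and it tracks the scaled iterate b_k = nu^k * beta_k so that every update uses only additions and multiplications, the operations FHE schemes support. Plain Python integers stand in for ciphertexts and no encryption is performed; the names `division_free_gd`, `decode` and `nu` are hypothetical.

```python
from fractions import Fraction


def division_free_gd(X, y, nu, steps):
    """Gradient descent for least squares using only + and * on integers.

    Assumption (illustrative): step size 1/nu. Instead of dividing by nu at
    each step, we track b_k = nu**k * beta_k, which satisfies
        b_{k+1} = nu * b_k + nu**k * X^T y - X^T X b_k.
    """
    n, p = len(X), len(X[0])
    # X^T X and X^T y are sums of products, so they are division-free too.
    xtx = [[sum(X[i][a] * X[i][c] for i in range(n)) for c in range(p)]
           for a in range(p)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(p)]

    b = [0] * p   # scaled iterate, starts at beta_0 = 0
    scale = 1     # nu**k, the factor to divide out after "decryption"
    for _ in range(steps):
        b = [nu * b[a] + scale * xty[a]
             - sum(xtx[a][c] * b[c] for c in range(p))
             for a in range(p)]
        scale *= nu
    return b, scale


def decode(b, scale):
    """The single division, done once by the data owner on decrypted values."""
    return [Fraction(v, scale) for v in b]


# Toy data lying exactly on the line y = 1 + 2x (intercept column first).
X = [[1, 0], [1, 1], [1, 2]]
y = [1, 3, 5]
b, scale = division_free_gd(X, y, nu=8, steps=200)
print([float(v) for v in decode(b, scale)])
```

In this sketch nu must exceed half the largest eigenvalue of X^T X for the iteration to converge, and the scaled values grow by roughly a factor of nu per step. That growth hints at the kind of cost trade-off the abstract refers to, although the thesis's actual encoding and acceleration scheme may differ.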
Database: Networked Digital Library of Theses & Dissertations