Popis: |
In the era of big data, the data-processing pipeline becomes increasingly distributed among multiple sites. To connect data consumers with remote producers, a public directory service is essential. This is evidenced by adoption in emerging applications such as electronic healthcare. This work systematically studies privacy preservation and security hardening for a public directory service. First, we address the privacy preservation of serving a directory over the Internet. Because Internet eavesdroppers can mount attacks using background knowledge, the directory service must be privacy preserving to comply with data-protection laws (e.g., HIPAA). We propose techniques that adaptively inject noise into the public directory in a way that is aware of the application-level data schema, effectively preserving privacy while achieving high search recall. Second, we tackle the problem of securely constructing the directory among distrusting data producers. For provable security, we model the directory construction problem as a secure multi-party computation (MPC). For efficiency, we propose a pre-computation framework that minimizes the private computation and conducts aggressive pre-computation on public data. In addition, we address systems-level efficiency by exploiting data-level parallelism on general-purpose graphics processing units (GPGPU). We apply the proposed scheme to real health-care scenarios for constructing patient-locator services in emerging Health Information Exchange (HIE) networks. For privacy evaluation, we conduct extensive analysis of our noise-injection techniques against various background-knowledge attacks. We conduct experiments on real-world datasets and demonstrate a low attack success rate, indicating effective protection. For performance evaluation, we implement our MPC optimization techniques on open-source MPC software.
In experiments in local and geo-distributed settings, the proposed pre-computation achieves a speedup of more than an order of magnitude without security loss. |