Efficient privacy-preserving variable-length substring match for genome sequence

Authors: Yoshiki Nakagawa, Satsuya Ohata, Kana Shimizu
Year of publication: 2021
Subject:
Source: Algorithms for molecular biology : AMB. 17(1)
ISSN: 1748-7188
Description: Finding a similar substring that appears in both the query and the database sequences is an essential task in genome data analysis. This study proposes a secure two-party variable-length string search protocol based on secret sharing. The unique feature of our protocol is that, once the query has been input, its time, communication, and round complexities do not depend on the database length N. This property brings dramatic improvements in search time, since N is usually very large in an actual genome database and the same database is used repeatedly for many queries. Our concept hinges on a technique that efficiently applies the compressed full-text index (FOCS 2000) to a secret-sharing scheme. We conducted an experiment using a human genomic sequence of length 10 million as the database and a query of length 100, and found that the query response of our protocol was at least three orders of magnitude faster than that of a well-designed baseline protocol in a realistic computation/network environment.
LIPIcs, Vol. 201, 21st International Workshop on Algorithms in Bioinformatics (WABI 2021), pages 2:1-2:23
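
The claim that search cost does not depend on N after the query is input can be illustrated with a plaintext backward search over a compressed full-text index. The following is only a minimal sketch under stated assumptions: it uses a naive Burrows-Wheeler-based index with no secret sharing at all, it is not the authors' protocol, and the names `build_index` and `count_occurrences` are hypothetical. It shows only that, once the index is built, the search loop runs one step per query symbol, with each step being a table lookup.

```python
def build_index(text):
    """Build a naive full-text index (C table and Occ table) over text.

    Illustrative only: construction here is slow and uses O(N * alphabet)
    memory, and no values are secret-shared.
    """
    text += "$"  # sentinel, assumed to sort before every other symbol
    # Suffix array by direct sorting of suffixes (fine for tiny inputs).
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # Burrows-Wheeler transform: the symbol preceding each sorted suffix.
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # C[c]: number of symbols in text that are strictly smaller than c.
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += text.count(c)
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (ch == c)
    return C, occ, len(bwt)


def count_occurrences(index, query):
    """Count occurrences of query using backward search.

    The loop performs len(query) iterations of table lookups, so its
    cost does not grow with the database length once the index exists.
    """
    C, occ, n = index
    lo, hi = 0, n  # half-open range of matching suffixes
    for ch in reversed(query):
        if ch not in C:
            return 0
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


index = build_index("ACGTACGTTACG")
print(count_occurrences(index, "ACG"))  # prints 3
print(count_occurrences(index, "CGT"))  # prints 2
```

In a secure two-party setting, the table lookups and range updates would have to be carried out on secret-shared values so that neither party learns the query or the database; how that is done is specific to the protocol described above and is not reproduced here.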
Database: OpenAIRE