A novel privacy-preserving federated genome-wide association study framework and its application in identifying potential risk variants in ankylosing spondylitis
Authors: | Jieren Deng, Zuochao Dou, Hao Zheng, Guanmin Gao, Xiang Chen, Kang Xie, Feng Chen, Yuhui Xiao, Xin Wu, Shengqian Xu, Huji Xu, Mengmeng Li, Zhen Wang, Shuang Wang |
---|---|
Year of publication: | 2020 |
Subject: |
0301 basic medicine; Information privacy; Genotype; Potential risk; Computer science; Single-nucleotide polymorphism; Genome-wide association study; Data science; Statistical power; Privacy preserving; 03 medical and health sciences; 030104 developmental biology; 0302 clinical medicine; Privacy; Sample size determination; Humans; Genetic Predisposition to Disease; Spondylitis, Ankylosing; 030212 general & internal medicine; Molecular Biology; Genome-Wide Association Study; Information Systems; Genetic association |
Source: | Briefings in Bioinformatics. 22 |
ISSN: | 1477-4054 1467-5463 |
Description: | Genome-wide association studies (GWAS) have been widely used to identify potential risk variants in various diseases. A statistically meaningful GWAS typically requires a large sample size to detect disease-associated single nucleotide polymorphisms (SNPs). However, a single institution usually possesses only a limited number of samples, so cross-institutional partnerships are required to increase sample size and statistical power. Such partnerships pose significant challenges, a major one being data privacy. For example, public privacy awareness, the impact of data privacy leakages and privacy-related risks are becoming increasingly important, while no de-identification standard is available to safeguard genomic data sharing. In this paper, we introduce a novel privacy-preserving federated GWAS framework (iPRIVATES). Equipped with privacy-preserving federated analysis, iPRIVATES enables multiple institutions to jointly perform GWAS analysis without leaking patient-level genotyping data. Only aggregated local statistics are exchanged within the study network. In addition, we evaluate the performance of iPRIVATES on both simulated data and a real-world application for identifying potential risk variants in ankylosing spondylitis (AS). The experimental results showed that the strongest signals of AS-associated SNPs reside mostly around the human leukocyte antigen (HLA) regions. The proposed iPRIVATES framework achieved results equivalent to those of a traditional centralized implementation, demonstrating its great potential in driving collaborative genomic research for different diseases while preserving data privacy. |
Database: | OpenAIRE |
External link: |
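
The description states that only aggregated local statistics are exchanged within the study network and that the federated results match a centralized analysis. The following is a minimal sketch of that general idea, not the iPRIVATES implementation: it assumes a simple allelic chi-square test on one SNP, with genotypes coded as minor-allele counts (0, 1, 2) and phenotypes as 1 (case) or 0 (control). All function names are hypothetical, and the record does not say which test statistic or which privacy protections the framework actually uses.

```python
import math
from typing import List, Tuple

# Hypothetical layout for one SNP at one site:
# (case_minor, case_major, control_minor, control_major) allele counts.
Counts = Tuple[int, int, int, int]


def local_allele_counts(genotypes: List[int], phenotypes: List[int]) -> Counts:
    """Runs inside one institution; only the four counts leave the site."""
    case_minor = case_major = ctrl_minor = ctrl_major = 0
    for g, y in zip(genotypes, phenotypes):
        if y == 1:
            case_minor += g
            case_major += 2 - g
        else:
            ctrl_minor += g
            ctrl_major += 2 - g
    return case_minor, case_major, ctrl_minor, ctrl_major


def aggregate(site_counts: List[Counts]) -> Counts:
    """Coordinator step: element-wise sum of the per-site counts."""
    return tuple(sum(column) for column in zip(*site_counts))


def allelic_chi_square(counts: Counts) -> Tuple[float, float]:
    """Pearson chi-square (1 degree of freedom) on the 2x2 allele table."""
    a, b, c, d = counts
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, 1.0  # degenerate table: no evidence of association
    stat = n * (a * d - b * c) ** 2 / denom
    # For 1 degree of freedom, the upper tail is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Toy data for two sites (assumed values, for illustration only).
site_a = ([2, 1, 1, 0], [1, 1, 0, 0])
site_b = ([2, 2, 0, 0, 1], [1, 1, 0, 0, 0])

federated = aggregate([local_allele_counts(*site_a), local_allele_counts(*site_b)])
centralized = local_allele_counts(site_a[0] + site_b[0], site_a[1] + site_b[1])
stat, p_value = allelic_chi_square(federated)
```

Because the counts are additive, `federated` and `centralized` are identical here, which is the sense in which a federated analysis of this kind can reproduce the centralized result. Sharing raw counts alone is not a privacy guarantee; a real deployment would likely add protections such as secure aggregation, which this sketch omits.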