A novel decentralized federated learning approach to train on globally distributed, poor quality, and protected private medical data.
Authors: | Nguyen TV (Presagen, Adelaide, SA, 5000, Australia; School of Computing and Information Technology, University of Wollongong, Wollongong, NSW, 2522, Australia; tuc@presagen.com); Dakka MA (Presagen, Adelaide, SA, 5000, Australia; School of Mathematical Sciences, The University of Adelaide, Adelaide, SA, 5005, Australia); Diakiw SM (Presagen, Adelaide, SA, 5000, Australia); VerMilyea MD (Ovation Fertility, Austin, TX, 78731, USA; Texas Fertility Center, Austin, TX, 78731, USA); Perugini M (Presagen, Adelaide, SA, 5000, Australia; Adelaide Medical School, The University of Adelaide, Adelaide, SA, 5000, Australia); Hall JMM (Presagen, Adelaide, SA, 5000, Australia; Australian Research Council Centre of Excellence for Nanoscale BioPhotonics, Adelaide, SA, 5005, Australia; School of Physical Sciences, The University of Adelaide, Adelaide, SA, 5005, Australia); Perugini D (Presagen, Adelaide, SA, 5000, Australia) |
---|---|
Language: | English |
Source: | Scientific Reports [Sci Rep] 2022 May 25; Vol. 12 (1), pp. 8888. Date of Electronic Publication: 2022 May 25. |
DOI: | 10.1038/s41598-022-12833-x |
Abstract: | Training on multiple diverse data sources is critical to ensure unbiased and generalizable AI. In healthcare, data privacy laws prohibit data from being moved outside the country of origin, preventing global medical datasets being centralized for AI training. Data-centric, cross-silo federated learning represents a pathway forward for training on distributed medical datasets. Existing approaches typically require updates to a training model to be transferred to a central server, potentially breaching data privacy laws unless the updates are sufficiently disguised or abstracted to prevent reconstruction of the dataset. Here we present a completely decentralized federated learning approach, using knowledge distillation, ensuring data privacy and protection. Each node operates independently without needing to access external data. AI accuracy using this approach is found to be comparable to centralized training, and when nodes comprise poor-quality data, which is common in healthcare, AI accuracy can exceed the performance of traditional centralized training. (© 2022. The Author(s).) |
Database: | MEDLINE |