Privacy Preserving RNA-Model Validation Across Laboratories

Authors: Eric Lefkofsky, Martin C. Stumpe, Ameen A. Salahudeen, Aly A. Khan, Talal Ahmed, Raphael Pelossof, Jonathan R. Dry, Mark Carty, Stephane Wenric
Publication year: 2021
Subject:
DOI: 10.1101/2021.04.01.437893
Description: Reproducibility of results obtained using RNA data across labs remains a major hurdle in cancer research. Often, molecular predictors trained on one dataset cannot be applied to another due to differences in RNA library preparation and quantification. While current RNA correction algorithms may overcome these differences, they require access to all patient-level data, so sharing a predictor also means sharing its training data. Here, we describe SpinAdapt, an unsupervised RNA correction algorithm that enables the transfer of molecular models without requiring access to patient-level data. It computes data corrections using only aggregate statistics of each dataset, thereby maintaining patient data privacy. Furthermore, SpinAdapt can correct new samples, thereby enabling evaluation of validation cohorts. Despite an inherent tradeoff between privacy and performance, SpinAdapt outperforms current correction methods that require patient-level data access. We expect this novel correction paradigm to enhance research reproducibility and patient privacy. Finally, SpinAdapt lays out a mathematical framework that can be extended to other -omics modalities.
Database: OpenAIRE
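
Illustrative sketch: the description says that corrections are computed from aggregate statistics of each dataset rather than from patient-level data, and that new samples can be corrected afterwards. The Python sketch below illustrates that general idea only. It is not SpinAdapt: it substitutes a simple per-gene mean and standard-deviation matching, and the function names `summarize` and `correct`, the lab names, and the toy values are all hypothetical.

```python
from statistics import fmean, pstdev

def summarize(samples):
    """Return per-gene (mean, std) aggregates for a list of samples (rows)."""
    genes = zip(*samples)  # transpose: one tuple of values per gene
    return [(fmean(g), pstdev(g)) for g in genes]

def correct(sample, source_stats, target_stats):
    """Map one source sample onto the target scale using aggregates only.

    Assumption: a per-gene location-scale shift stands in for the
    correction described in the abstract; it is not the SpinAdapt method.
    """
    out = []
    for x, (mu_s, sd_s), (mu_t, sd_t) in zip(sample, source_stats, target_stats):
        scale = sd_t / sd_s if sd_s > 0 else 1.0  # avoid dividing by zero
        out.append((x - mu_s) * scale + mu_t)
    return out

# Hypothetical toy data: rows are samples, columns are genes.
lab_a = [[1.0, 10.0], [3.0, 14.0], [5.0, 18.0]]        # target lab
lab_b = [[2.0, 100.0], [6.0, 120.0], [10.0, 140.0]]    # source lab

# Each lab computes its aggregates locally.
stats_a = summarize(lab_a)
stats_b = summarize(lab_b)

# Only stats_a needs to be shared with lab B; lab A's samples stay private.
corrected_b = [correct(s, stats_b, stats_a) for s in lab_b]

# A sample that arrives later is corrected with the same stored aggregates.
new_sample = correct([4.0, 110.0], stats_b, stats_a)
```

In this toy setup the two labs exchange only two numbers per gene, and the stored aggregates can be reused for samples that were not available when the statistics were computed.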