A system to build distributed multivariate models and manage disparate data sharing policies: implementation in the scalable national network for effectiveness research

Authors: Xiaoqian Jiang, Lavanya Nookala, Grace M. Kuo, Frederic S. Resnic, Hyeoneui Kim, Robert El-Kareh, Laura Pearlman, Michael E. Matheny, Lucila Ohno-Machado, Katherine K. Kim, Michele E. Day, Daniella Meeker, Michel D'Arcy, Claudiu Farcas, Aziz A. Boxwala, Carl Kesselman
Year of publication: 2015
Subjects:
Comparative Effectiveness Research
Biomedical Research
Computer science
Information Storage and Retrieval
Health Informatics
Research and Applications
computer.software_genre
Medical and Health Sciences
privacy-preserving network infrastructure
Computer Communication Networks
03 medical and health sciences
Engineering
0302 clinical medicine
Information and Computing Sciences
federated research network
030212 general & internal medicine
030304 developmental biology
Internet
0303 health sciences
Models, Statistical
Database
Information Dissemination
Data science
distributed analytics
Data sharing
Workflow
Databases as Topic
Disparate system
Asynchronous communication
Data exchange
Multivariate Analysis
Management system
Scalability
Web service
computer
Software
Medical Informatics
Source: Journal of the American Medical Informatics Association : JAMIA, vol 22, iss 6
Journal of the American Medical Informatics Association : JAMIA
Meeker, D; Jiang, X; Matheny, ME; Farcas, C; D'Arcy, M; Pearlman, L; et al. (2015). A system to build distributed multivariate models and manage disparate data sharing policies: Implementation in the scalable national network for effectiveness research. Journal of the American Medical Informatics Association, 22(6), 1187-1195. doi: 10.1093/jamia/ocv017. UC Davis: Retrieved from: http://www.escholarship.org/uc/item/82p1b28k
Meeker, Daniella; Jiang, Xiaoqian; Matheny, Michael E; Farcas, Claudiu; D'Arcy, Michel; Pearlman, Laura; et al. (2015). A system to build distributed multivariate models and manage disparate data sharing policies: implementation in the scalable national network for effectiveness research. JAMIA, 22, 1187-1195. UC Davis: Retrieved from: http://www.escholarship.org/uc/item/6kj7f634
ISSN: 1527-974X
1067-5027
DOI: 10.1093/jamia/ocv017
Description:
Background: Centralized and federated models for sharing data in research networks currently exist. To build multivariate data analysis for centralized networks, transfer of patient-level data to a central computation resource is necessary. The authors implemented distributed multivariate models for federated networks in which patient-level data is kept at each site and data exchange policies are managed in a study-centric manner.
Objective: The objective was to implement infrastructure that supports the functionality of some existing research networks (e.g., cohort discovery, workflow management, and estimation of multivariate analytic models on centralized data) while adding important new features, such as algorithms for distributed iterative multivariate models, a graphical interface for multivariate model specification, synchronous and asynchronous response to network queries, investigator-initiated studies, and study-based control of staff, protocols, and data sharing policies.
Materials and Methods: Based on the requirements gathered from statisticians, administrators, and investigators from multiple institutions, the authors developed infrastructure and tools to support multisite comparative effectiveness studies using web services for multivariate statistical estimation in the SCANNER federated network.
Results: The authors implemented massively parallel (map-reduce) computation methods and a new policy management system to enable each study initiated by network participants to define the ways in which data may be processed, managed, queried, and shared. The authors illustrated the use of these systems among institutions with highly different policies and operating under different state laws.
Discussion and Conclusion: Federated research networks need not limit distributed query functionality to count queries, cohort discovery, or independently estimated analytic models. Multivariate analyses can be efficiently and securely conducted without patient-level data transport, allowing institutions with strict local data storage requirements to participate in sophisticated analyses based on federated research networks.
Database: OpenAIRE