Linked Entity Attribute Pair (LEAP): A Harmonization Framework for Data Pooling.
Authors: | Thomas S; Memorial Sloan Kettering Cancer Center, New York, NY., Lichtenberg T; Center for Translational Data Science, University of Chicago, Chicago, IL., Dang K; Sage Bionetworks, Seattle, WA., Fitzsimons M; Center for Translational Data Science, University of Chicago, Chicago, IL.; University of Illinois at Chicago, Chicago, IL., Grossman RL; Center for Translational Data Science, University of Chicago, Chicago, IL., Kundra R; Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY., Lavery JA; Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY., Lenoue-Newton ML; Vanderbilt-Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN., Panageas KS; Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY., Sawyers C; Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center, New York, NY., Schultz ND; Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY., Sirintrapun SJ; Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY., Topaloglu U; Cancer Biology, Wake Forest University School of Medicine, Winston-Salem, NC., Welch A; Information Systems, Memorial Sloan Kettering Cancer Center, New York, NY., Yu T; Sage Bionetworks, Seattle, WA., Zehir A; Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY., Gardos S; Information Systems, Memorial Sloan Kettering Cancer Center, New York, NY. |
---|---|
Language: | English |
Source: | JCO Clinical Cancer Informatics [JCO Clin Cancer Inform] 2020 Aug; Vol. 4, pp. 691-699. |
DOI: | 10.1200/CCI.20.00037 |
Abstract: | Purpose: As data-sharing projects become increasingly frequent, so does the need to map data elements between multiple classification systems. A generic, robust, shareable architecture will result in increased efficiency and transparency of the mapping process, while upholding the integrity of the data. Materials and Methods: The American Association for Cancer Research's Genomics Evidence Neoplasia Information Exchange (GENIE) collects clinical and genomic data for precision cancer medicine. As part of its commitment to open science, GENIE has partnered with the National Cancer Institute's Genomic Data Commons (GDC) as a secondary repository. After initial efforts to submit data from GENIE to GDC failed, we realized the need for a solution to allow for the iterative mapping of data elements between dynamic classification systems. We developed the Linked Entity Attribute Pair (LEAP) database framework to store and manage the term mappings used to submit data from GENIE to GDC. Results: After creating and populating the LEAP framework, we identified 195 mappings from GENIE to GDC requiring remediation and observed a 28% reduction in effort to resolve these issues, as well as a reduction in inadvertent errors. These results led to a decrease in the time to map between OncoTree, the cancer type ontology used by GENIE, and International Classification of Diseases for Oncology, 3rd Edition, used by GDC, from several months to less than 1 week. Conclusion: The LEAP framework provides a streamlined mapping process among various classification systems and allows for reusability so that efforts to create or adjust mappings are straightforward. The ability of the framework to track changes over time streamlines the process to map data elements across various dynamic classification systems. |
Database: | MEDLINE |