Systematic data quality assessment of electronic health record data to evaluate study-specific fitness: Report from the PRESERVE research study.
Authors: | Razzaghi H; Applied Clinical Research Center, Departments of Pediatrics and Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America., Goodwin Davies A; Applied Clinical Research Center, Departments of Pediatrics and Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America., Boss S; Applied Clinical Research Center, Departments of Pediatrics and Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America., Bunnell HT; Biomedical Research Informatics Center, Nemours Children's Hospital, Wilmington, Delaware, United States of America., Chen Y; Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, United States of America., Chrischilles EA; Department of Epidemiology, College of Public Health, University of Iowa, Iowa City, Iowa, United States of America., Dickinson K; Applied Clinical Research Center, Departments of Pediatrics and Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America., Hanauer D; Department of Learning Health Sciences, University of Michigan Medical School, Ann Arbor, Michigan, United States of America., Huang Y; IT Research and Innovation, Nationwide Children's Hospital, Columbus, Ohio, United States of America., Ilunga KTS; Applied Clinical Research Center, Departments of Pediatrics and Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America., Katsoufis C; Division of Pediatric Nephrology, University of Miami Miller School of Medicine, Miami, Florida, United States of America., Lehmann H; Biomedical Informatics & Data Science Section, Johns Hopkins School of Medicine, Baltimore, Maryland, United States of 
America., Lemas DJ; Department of Health Outcomes & Biomedical Informatics, University of Florida, Gainesville, Florida, United States of America., Matthews K; Analytics Research Center, Children's Hospital of Colorado, Aurora, Colorado, United States of America., Mendonca EA; Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, United States of America., Morse K; Division of Pediatric Hospital Medicine, Stanford University School of Medicine, Stanford, California, United States of America., Ranade D; Biostatistics, Epidemiology, and Analytics in Research (BEAR), Seattle Children's Hospital, Seattle, Washington, United States of America., Rosenman M; Department of Pediatrics, Ann & Robert H. Lurie Children's Hospital, Chicago, Illinois, United States of America., Taylor B; Clinical and Translational Science Institute, Medical College of Wisconsin, Milwaukee, Wisconsin, United States of America., Walters K; Translational and Clinical Sciences Institute, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America., Denburg MR; Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.; Department of Pediatrics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.; Division of Nephrology, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America., Forrest CB; Applied Clinical Research Center, Departments of Pediatrics and Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America.; Department of Pediatrics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, United States of America., Bailey LC; Applied Clinical Research Center, Departments of Pediatrics and Biomedical 
and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America.; Department of Pediatrics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania, United States of America. |
---|---|
Language: | English |
Source: | PLOS digital health [PLOS Digit Health] 2024 Jun 27; Vol. 3 (6), pp. e0000527. Date of Electronic Publication: 2024 Jun 27 (Print Publication: 2024). |
DOI: | 10.1371/journal.pdig.0000527 |
Abstract: | Study-specific data quality testing is an essential part of minimizing analytic errors, particularly for studies making secondary use of clinical data. We applied a systematic and reproducible approach for study-specific data quality testing to the analysis plan for PRESERVE, a 15-site, EHR-based observational study of chronic kidney disease in children. This approach integrated widely adopted data quality concepts with healthcare-specific evaluation methods. We implemented two rounds of data quality assessment. The first produced a high-level evaluation using aggregate results from a distributed query, focused on cohort identification and main analytic requirements. The second focused on extended testing of row-level data centralized for analysis. We systematized reporting and cataloguing of data quality issues, providing institutional teams with prioritized issues for resolution. We tracked improvements and documented anomalous data for consideration during analyses. The checks we developed identified 115 and 157 data quality issues in the first and second rounds, respectively, involving completeness, data model conformance, cross-variable concordance, consistency, and plausibility, extending traditional data quality approaches to address more complex stratification and temporal patterns. Resolution efforts focused on higher-priority issues, given finite study resources. In many cases, institutional teams were able to correct data extraction errors or obtain additional data, avoiding exclusion of 2 institutions entirely and resolving 123 other gaps. Other results identified complexities in measures of kidney function, bearing on the study's outcome definition. Where limitations such as these are intrinsic to clinical data, the study team must account for them in conducting analyses. This study rigorously evaluated fitness of data for intended use. The framework is reusable and built on a strong theoretical underpinning. 
Significant data quality issues that would have otherwise delayed analyses or made data unusable were addressed. This study highlights the need for teams combining subject-matter and informatics expertise to address data quality when working with real-world data. Competing Interests: One author (MD) reports funding from Mallinckrodt, Inc. for development of the Glomerular Learning Network (GLEAN) for the study of kidney disease in children. (Copyright: © 2024 Razzaghi et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.) |
Database: | MEDLINE |
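
The abstract names completeness and plausibility among the check types applied to row-level data. A minimal sketch of what such per-site checks could look like follows. It is not the PRESERVE team's implementation: the function `check_site_quality`, the field name `serum_creatinine_mg_dl`, the sample rows, the missingness threshold, and the plausible range are all hypothetical and chosen only for illustration, not as clinical guidance.

```python
from collections import defaultdict

# Hypothetical row-level records; sites, fields, and values are illustrative only.
ROWS = [
    {"site": "A", "patient_id": 1, "serum_creatinine_mg_dl": 0.6},
    {"site": "A", "patient_id": 2, "serum_creatinine_mg_dl": None},
    {"site": "B", "patient_id": 3, "serum_creatinine_mg_dl": 45.0},
    {"site": "B", "patient_id": 4, "serum_creatinine_mg_dl": 1.1},
]


def check_site_quality(rows, field, low, high, max_missing=0.25):
    """Return per-site completeness and plausibility issues for one field.

    `low`, `high`, and `max_missing` are assumed thresholds, not study values.
    """
    by_site = defaultdict(list)
    for row in rows:
        by_site[row["site"]].append(row.get(field))

    issues = []
    for site, values in sorted(by_site.items()):
        # Completeness: fraction of rows where the field is missing.
        missing = sum(v is None for v in values) / len(values)
        if missing > max_missing:
            issues.append((site, "completeness", round(missing, 2)))
        # Plausibility: count of non-missing values outside the assumed range.
        implausible = sum(
            v is not None and not (low <= v <= high) for v in values
        )
        if implausible:
            issues.append((site, "plausibility", implausible))
    return issues


issues = check_site_quality(ROWS, "serum_creatinine_mg_dl", low=0.1, high=20.0)
for issue in issues:
    print(issue)
```

Returning issues as a flat list of `(site, check type, measure)` tuples is one way to support the kind of cataloguing and prioritization the abstract describes, since the list can be sorted or filtered per institution.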