Development and Validation of Algorithms to Identify COVID-19 Patients Using a US Electronic Health Records Database: A Retrospective Cohort Study

Authors: Brown CA, Londhe AA, He F, Cheng A, Ma J, Zhang J, Brooks CG, Sprafka JM, Roehl KA, Carlson KB, Page JH
Language: English
Publication year: 2022
Subject:
Source: Clinical Epidemiology, Vol 14, pp 699-709 (2022)
Document type: article
ISSN: 1179-1349
Description: Carolyn A Brown (1), Ajit A Londhe (1), Fang He (1), Alvan Cheng (1), Junjie Ma (1), Jie Zhang (1), Corinne G Brooks (1), J Michael Sprafka (1, 2), Kimberly A Roehl (1), Katherine B Carlson (1, 3), John H Page (1)
(1) Center for Observational Research, Amgen, Inc., Thousand Oaks, CA, USA; (2) Woodford Research Associates, Thousand Oaks, CA, USA; (3) Now with R&D Strategy, Moderna Inc., Cambridge, MA, USA
Correspondence: Carolyn A Brown; John H Page, Center for Observational Research, Amgen, Inc., 1 Amgen Center Drive, B38-4B, Thousand Oaks, CA, 91320, USA, Tel +1-818-482-9477; +1-805-490-5527, Email cbrown14@amgen.com; Jopage@amgen.com
Introduction: To identify and evaluate candidate algorithms for detecting COVID-19 cases in an electronic health record (EHR) database, this study compared the use of acute respiratory disease codes from February to August 2020 with the corresponding period in the 3 preceding years.
Methods: De-identified EHR data were used to identify codes of interest for candidate algorithms to identify COVID-19 patients. The number and proportion of patients who received a SARS-CoV-2 reverse transcriptase polymerase chain reaction (RT-PCR) test within ±10 days of the occurrence of the diagnosis code, and of patients who tested positive among those with a test result, were calculated, resulting in 11 candidate algorithms. Sensitivity, specificity, and likelihood ratios were used to assess the candidate algorithms by clinical setting and time period. We adjusted for potential verification bias by weighting by the reciprocal of the estimated probability of verification.
Results: From January to March 2020, the most commonly used diagnosis codes related to COVID-19 diagnosis were R06 (dyspnea) and R05 (cough). On or after April 1, 2020, the code with the highest sensitivity for COVID-19, U07.1, had near perfect adjusted sensitivity (1.00 [95% CI 1.00, 1.00]) but low adjusted specificity (0.32 [95% CI 0.31, 0.33]) in hospitalized patients.
Discussion: Algorithms based on the U07.1 code had high sensitivity among hospitalized patients, but low specificity, especially after April 2020. None of the combinations of ICD-10-CM codes assessed performed with a satisfactory combination of high sensitivity and high specificity when using the SARS-CoV-2 RT-PCR as the reference standard.
Keywords: COVID-19, SARS-CoV-2, epidemiology, verification bias, validation
Database: Directory of Open Access Journals
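
Illustrative note on the Methods: the abstract says verification bias was handled by weighting each patient by the reciprocal of the estimated probability of verification. The Python sketch below shows one way such weighted sensitivity, specificity, and likelihood ratios could be computed. It is not the authors' code. The class `VerifiedPatient`, the function `ipw_accuracy`, the field `p_verified`, and the toy cohort are all hypothetical, and the verification probabilities are assumed to have been estimated elsewhere.

```python
from dataclasses import dataclass


@dataclass
class VerifiedPatient:
    """One patient who has an RT-PCR result (hypothetical record layout)."""
    algorithm_positive: bool  # candidate algorithm (e.g. a diagnosis code) flagged the patient
    pcr_positive: bool        # reference standard: SARS-CoV-2 RT-PCR result
    p_verified: float         # assumed, pre-estimated probability of having been tested


def ipw_accuracy(patients):
    """Inverse-probability-weighted accuracy measures.

    Each verified patient counts 1 / p_verified times, so patients who were
    unlikely to be tested stand in for similar untested patients.
    Assumes at least one PCR-positive and one PCR-negative patient.
    """
    tp = fn = fp = tn = 0.0
    for p in patients:
        weight = 1.0 / p.p_verified
        if p.pcr_positive and p.algorithm_positive:
            tp += weight
        elif p.pcr_positive:
            fn += weight
        elif p.algorithm_positive:
            fp += weight
        else:
            tn += weight

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # Likelihood ratios are undefined at the extremes; report infinity there.
        "lr_positive": sensitivity / (1.0 - specificity) if specificity < 1.0 else float("inf"),
        "lr_negative": (1.0 - sensitivity) / specificity if specificity > 0.0 else float("inf"),
    }


# Toy cohort (invented numbers): algorithm-positive patients are assumed to be
# tested more often (0.5) than algorithm-negative patients (0.25).
toy_cohort = (
    [VerifiedPatient(True, True, 0.5)] * 4
    + [VerifiedPatient(False, True, 0.25)] * 1
    + [VerifiedPatient(True, False, 0.5)] * 2
    + [VerifiedPatient(False, False, 0.25)] * 3
)

result = ipw_accuracy(toy_cohort)
print(result)
```

In this toy cohort the unweighted sensitivity would be 4/5 and the unweighted specificity 3/5. The weighting shifts both, because algorithm-negative patients were less likely to be tested and are therefore up-weighted.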