A comparison of natural language processing to ICD-10 codes for identification and characterization of pulmonary embolism.
Author: | Johnson SA; Division of General Internal Medicine, Department of Internal Medicine, University of Utah School of Medicine, Salt Lake City, UT, United States of America; Thrombosis Service, University of Utah Health, Salt Lake City, UT, United States of America. Electronic address: stacy.a.johnson@hsc.utah.edu., Signor EA; Division of General Internal Medicine, Department of Internal Medicine, University of Utah School of Medicine, Salt Lake City, UT, United States of America; Thrombosis Service, University of Utah Health, Salt Lake City, UT, United States of America., Lappe KL; Division of General Internal Medicine, Department of Internal Medicine, University of Utah School of Medicine, Salt Lake City, UT, United States of America; Thrombosis Service, University of Utah Health, Salt Lake City, UT, United States of America., Shi J; Department of Biomedical Informatics, University of Utah School of Medicine, Salt Lake City, UT, United States of America., Jenkins SL; Division of General Internal Medicine, Department of Internal Medicine, University of Utah School of Medicine, Salt Lake City, UT, United States of America; Thrombosis Service, University of Utah Health, Salt Lake City, UT, United States of America., Wikstrom SW; Division of General Internal Medicine, Department of Internal Medicine, University of Utah School of Medicine, Salt Lake City, UT, United States of America., Kroencke RD; Division of General Internal Medicine, Department of Internal Medicine, University of Utah School of Medicine, Salt Lake City, UT, United States of America; Thrombosis Service, University of Utah Health, Salt Lake City, UT, United States of America., Hallowell D; Department of Internal Medicine, University of Utah School of Medicine, Salt Lake City, UT, United States of America., Jones AE; University of Utah, College of Pharmacy, Department of Pharmacotherapy, Salt Lake City, UT, United States of America; University of Utah, School of Medicine, Department of Population Health, Salt Lake City, UT, United States of America., Witt DM; Thrombosis Service, University of Utah Health, Salt Lake City, UT, United States of America; University of Utah, College of Pharmacy, Department of Pharmacotherapy, Salt Lake City, UT, United States of America. |
---|---|
Language: | English |
Source: | Thrombosis research [Thromb Res] 2021 Jul; Vol. 203, pp. 190-195. Date of Electronic Publication: 2021 May 06. |
DOI: | 10.1016/j.thromres.2021.04.020 |
Abstract: | Introduction: Codes from the 10th revision of the International Classification of Diseases (ICD-10) are frequently used to identify pulmonary embolism (PE) events, although the validity of ICD-10 coding has been questioned. Natural language processing (NLP) is a novel tool that may be useful for PE identification. Methods: We performed a retrospective comparative accuracy study of 1000 randomly selected healthcare encounters with a CT pulmonary angiogram ordered between January 1, 2019 and January 1, 2020 at a single academic medical center. Two independent observers reviewed each radiology report and abstracted key findings related to PE presence/absence, chronicity, and anatomic location. NLP interpretations of radiology reports and ICD-10 codes were queried electronically and compared to the reference standard, manual chart review. Results: A total of 970 encounters were included for analysis. The prevalence of PE was 13% by manual review. For PE identification, sensitivity was similar between NLP (96.0%) and ICD-10 (92.9%; p = 0.405), and specificity was significantly higher with NLP (97.7%) compared to ICD-10 (91.0%; p < 0.001). NLP demonstrated higher sensitivity (70.0% vs 16.5%, p < 0.001) and specificity (99.9% vs 99.4%, p = 0.014) for saddle/main PE recognition, and significantly higher sensitivity (86.7% vs 8.3%, p < 0.001) and specificity (99.8% vs 96.5%, p < 0.001) for subsegmental PE compared to ICD-10. Conclusions: NLP is highly sensitive for PE identification and more specific than ICD-10 coding. NLP outperformed ICD-10 coding for recognition of subsegmental, saddle, and chronic PE. Our results suggest NLP is an efficient and more reliable method than ICD-10 for PE identification and characterization. (Copyright © 2021. Published by Elsevier Ltd.) |
Database: | MEDLINE |
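
The abstract reports sensitivity and specificity for each method against manual chart review as the reference standard. The sketch below shows how those two measures are computed from paired per-encounter labels. It is illustrative only and is not the authors' code: the `Accuracy` class, the `score` function, and the eight-encounter toy labels are all hypothetical and are not data from the study.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Accuracy:
    """Confusion-matrix counts for one method versus the reference standard."""
    tp: int  # reference positive, method positive
    fp: int  # reference negative, method positive
    tn: int  # reference negative, method negative
    fn: int  # reference positive, method negative

    @property
    def sensitivity(self) -> float:
        # Share of reference-positive encounters that the method flagged.
        positives = self.tp + self.fn
        return self.tp / positives if positives else float("nan")

    @property
    def specificity(self) -> float:
        # Share of reference-negative encounters that the method cleared.
        negatives = self.tn + self.fp
        return self.tn / negatives if negatives else float("nan")


def score(reference: Sequence[bool], predicted: Sequence[bool]) -> Accuracy:
    """Tally counts for one method; both sequences are indexed by encounter."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    pairs = list(zip(reference, predicted))
    return Accuracy(
        tp=sum(1 for r, p in pairs if r and p),
        fp=sum(1 for r, p in pairs if not r and p),
        tn=sum(1 for r, p in pairs if not r and not p),
        fn=sum(1 for r, p in pairs if r and not p),
    )


# Hypothetical toy labels (True = PE present), NOT study data.
manual_review = [True, True, True, False, False, False, False, False]
nlp_labels = [True, True, False, False, False, False, False, True]
icd_labels = [True, True, True, True, True, False, False, False]

nlp_result = score(manual_review, nlp_labels)
icd_result = score(manual_review, icd_labels)

for name, result in (("NLP", nlp_result), ("ICD-10", icd_result)):
    print(f"{name}: sensitivity={result.sensitivity:.3f}, "
          f"specificity={result.specificity:.3f}")
```

Because both methods are scored on the same encounters, a significance comparison would need a paired test. The abstract does not name the test used, so none is sketched here.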