Learning to detect and understand drug discontinuation events from clinical narratives
Author: | Richeek Pradhan, Celena B. Peters, Adam J. Gordon, Brian C. Sauer, Emily Druhl, Elaine T. Freund, Weisong Liu, Fran Cunningham, Feifan Liu, Hong Yu |
---|---|
Year of publication: | 2018 |
Subject: | Support Vector Machine; Knowledge representation and reasoning; Computer science; Health Informatics; 030204 cardiovascular system & hematology; computer.software_genre; Research and Applications; Machine Learning; 03 medical and health sciences; Pharmacovigilance; 0302 clinical medicine; Chart; Drug Therapy; Feature (machine learning); Product Surveillance, Postmarketing; Electronic Health Records; Humans; Narrative; 030212 general & internal medicine; Natural Language Processing; Narration; business.industry; Discontinuation; Area Under Curve; Artificial intelligence; business; computer; Sentence; Natural language processing; Test data |
Source: | Journal of the American Medical Informatics Association : JAMIA. 26(10) |
ISSN: | 1527-974X |
Description: | Objective: Identifying drug discontinuation (DDC) events and understanding their reasons are important for medication management and drug safety surveillance. Structured data resources are often incomplete and lack reason information. In this article, we assessed the ability of natural language processing (NLP) systems to unlock DDC information from clinical narratives automatically. Materials and Methods: We collected 1867 de-identified providers’ notes from the University of Massachusetts Medical School hospital electronic health record system. Two human experts then chart-reviewed those clinical notes to annotate DDC events and their reasons. Using the annotated data, we developed and evaluated NLP systems to automatically identify drug discontinuations and reasons at the sentence level, using a novel semantic enrichment-based vector representation (SEVR) method for enhanced feature representation. Results: Our SEVR-based NLP system achieved the best performance, with an AUC-ROC of 0.785 for detecting discontinuation events and 0.745 for identifying reasons when tested on this highly imbalanced data, outperforming 2 state-of-the-art non–SEVR-based models. Compared with a rule-based baseline system for discontinuation detection, our system improved sensitivity significantly (57.75% vs 18.31%, absolute value) while retaining a high specificity of 99.25%, leading to a significant improvement in AUC-ROC of 32.83% (absolute value). Conclusion: Experiments have shown that a high-performance NLP system can be developed to automatically identify DDCs and their reasons from providers’ notes. The SEVR model effectively improved system performance, showing better generalization and robustness on unseen test data. Our work is an important step toward identifying reasons for drug discontinuation that will inform drug safety surveillance and pharmacovigilance. |
Database: | OpenAIRE |
External link: |
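
The description reports sensitivity, specificity, and AUC-ROC for sentence-level discontinuation detection. The authors' evaluation code is not part of this record, so the following is only a minimal sketch, assuming hard 0/1 predictions per sentence; the function names `sensitivity_specificity` and `single_point_auc` are hypothetical. Under that assumption, the ROC curve has a single operating point and its area reduces to the mean of sensitivity and specificity. Plugging in the reported 57.75% and 99.25% gives 0.785, which agrees with the reported AUC-ROC, although the record does not state that the figure was computed this way.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary 0/1 labels.

    Assumes both classes occur at least once in y_true.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    return tp / (tp + fn), tn / (tn + fp)


def single_point_auc(sensitivity: float, specificity: float) -> float:
    """Area under the ROC polyline (0,0) -> (1 - spec, sens) -> (1,1).

    For a classifier with hard decisions this equals balanced accuracy.
    """
    return 0.5 * (sensitivity + specificity)


# Toy, made-up labels: 4 discontinuation sentences among 12.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
toy_auc = single_point_auc(sens, spec)

# Figures quoted in the description (sensitivity 57.75%, specificity 99.25%).
reported_auc = single_point_auc(0.5775, 0.9925)
```

On highly imbalanced data such as this, specificity alone can look excellent even when most positive sentences are missed, which is why a measure that weighs both classes is reported alongside it.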