Learning to Infer Entities, Properties and their Relations from Clinical Conversations
Authors: | Linh Tran, Izhak Shafran, Nan Du, Mingqiu Wang, Gang Lee |
---|---|
Language: | English |
Year of publication: | 2019 |
Subject: |
FOS: Computer and information sciences
Computer Science - Computation and Language; Span (category theory); Computer science; business.industry; media_common.quotation_subject; Contrast (statistics); 02 engineering and technology; 010501 environmental sciences; Space (commercial competition); computer.software_genre; 01 natural sciences; Relationship extraction; Task (project management); 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Conversation; Artificial intelligence; Duration (project management); business; Computation and Language (cs.CL); computer; Natural language processing; 0105 earth and related environmental sciences; media_common |
Source: | EMNLP/IJCNLP (1) |
Description: | Recently we proposed the Span Attribute Tagging (SAT) Model (Du et al., 2019) to infer clinical entities (e.g., symptoms) and their properties (e.g., duration). It tackles the challenge of a large label space and limited training data with a hierarchical two-stage approach: a tagging step identifies the span of interest, and a classification step assigns labels to that span. We extend the SAT model to jointly infer not only entities and their properties but also the relations between them. Most relation extraction models restrict inference to relations between tokens within a few neighboring sentences, mainly to avoid high computational complexity. In contrast, our proposed Relation-SAT (R-SAT) model is computationally efficient and can infer relations over the entire conversation, which lasts 10 minutes on average. We evaluate our model on a corpus of clinical conversations. When the entities are given, the R-SAT model outperforms baselines in identifying relations between symptoms and their properties by about 32% (0.82 vs 0.62 F-score) and between medications and their properties by about 50% (0.60 vs 0.41 F-score). On the more difficult task of jointly inferring entities and relations, the R-SAT model achieves a performance of 0.34 and 0.45 for symptoms and medications respectively, which is significantly better than 0.18 and 0.35 for the baseline model. The contributions of different components of the model are quantified using ablation analysis. |
Database: | OpenAIRE |
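
The percentage gains quoted in the description appear to be relative improvements over the baseline score. That reading is an assumption, since the record does not say how the percentages were derived. A minimal sketch of the arithmetic under that assumption follows; the function name `relative_improvement` and the task labels are hypothetical, and only the score pairs come from the description.

```python
def relative_improvement(model: float, baseline: float) -> float:
    """Relative gain of `model` over `baseline`, in percent.

    Assumes the quoted percentages are relative gains (not stated in the record).
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (model - baseline) / baseline


# (model score, baseline score) pairs quoted in the description.
# The task labels are illustrative shorthand, not names used by the authors.
reported = {
    "symptom relations, entities given": (0.82, 0.62),
    "medication relations, entities given": (0.60, 0.41),
    "symptoms, joint inference": (0.34, 0.18),
    "medications, joint inference": (0.45, 0.35),
}

for task, (model, baseline) in reported.items():
    gain = relative_improvement(model, baseline)
    print(f"{task}: {model:.2f} vs {baseline:.2f} ({gain:.0f}% relative gain)")
```

Under this reading, the first pair works out to roughly 32%, matching the description, while the second works out to roughly 46%, which the description rounds to "about 50%".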