Novel Graph-Based Model With Biaffine Attention for Family History Extraction From Clinical Text: Modeling Study

Authors: Weihua Peng, Ying Xiong, Qingcai Chen, Buzhou Tang, Fu Huhao, Xiaolong Wang, Kecheng Zhan
Publication year: 2020
Subject:
Source: JMIR Medical Informatics
JMIR Medical Informatics, Vol 9, Iss 4, p e23587 (2021)
ISSN: 2291-9694
Description: Background: Family history information, including family members, the side of the family they belong to, their living status, and observations about them, plays an important role in disease diagnosis and treatment. Family member information extraction aims to extract family history information from semistructured and unstructured text in electronic health records (EHRs). It is a challenging task that involves named entity recognition (NER) and relation extraction (RE), where named entities are family members, living status, and observations, and relations are those between family members and living status and those between family members and observations. Objective: This study aimed to introduce the system we developed for the 2019 n2c2/OHNLP track on family history extraction, which jointly extracts entities and relations about family history information from clinical text. Methods: We proposed a novel graph-based model with biaffine attention for family history extraction from clinical text. In this model, we first designed a graph that represents family history information, so that NER and RE for family history are represented in a unified way, and then introduced a biaffine attention mechanism to extract family history information from clinical text. A convolutional neural network (CNN)-bidirectional long short-term memory network (BiLSTM) and Bidirectional Encoder Representations from Transformers (BERT) were used to encode the input sentence, and a biaffine classifier was used to extract family history information. In addition, we developed a postprocessing module to adjust the results. A system based on the proposed method was developed for the 2019 n2c2/OHNLP shared task track on family history information extraction. Results: Our system ranked first in the challenge, with F1 scores of 0.8745 on the NER subtask and 0.6810 on the RE subtask for our best submission. After the challenge, we further fine-tuned the parameters and improved the F1 scores on the two subtasks to 0.8823 and 0.7048, respectively. Conclusions: The experimental results showed that the system based on the proposed method can extract family history information from clinical text effectively.
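The biaffine classifier mentioned under Methods can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the common biaffine form score(h, d) = h^T U d + w^T [h; d] + b for a single label, and the names biaffine_score, score_all_pairs, U, w, and b are hypothetical. In a full model there would typically be one such scorer per edge label in the graph, and the token vectors would come from the encoder (CNN-BiLSTM or BERT) rather than being written by hand.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def biaffine_score(head: Vector, dep: Vector, U: Matrix, w: Vector, b: float) -> float:
    """Score one token pair: head^T U dep + w^T [head; dep] + b (assumed form)."""
    # Bilinear term: interaction between the two token representations.
    bilinear = sum(
        head[i] * U[i][j] * dep[j]
        for i in range(len(head))
        for j in range(len(dep))
    )
    # Linear term over the concatenation of both vectors.
    linear = sum(wk * xk for wk, xk in zip(w, head + dep))
    return bilinear + linear + b


def score_all_pairs(tokens: List[Vector], U: Matrix, w: Vector, b: float) -> Matrix:
    """Return an n x n matrix with one score per ordered token pair."""
    return [[biaffine_score(h, d, U, w, b) for d in tokens] for h in tokens]


# Toy encoder outputs for a two-token sentence (hypothetical values).
tokens = [[1.0, 0.0], [0.0, 1.0]]
U = [[1.0, 2.0], [3.0, 4.0]]
w = [0.5, 0.5, 0.5, 0.5]
b = 0.1
scores = score_all_pairs(tokens, U, w, b)
```

Each entry scores[i][j] can be read as the strength of a candidate edge from token i to token j; a joint NER and RE decoder would threshold or take an argmax over such scores per label.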
Database: OpenAIRE