Description: |
This paper addresses the mixture symptom mention problem that arises in the structuring of Traditional Chinese Medicine (TCM). We approach it by disassembling mixture symptom mentions through entity relation extraction. Over 2,200 clinical notes were annotated to construct the training set. An end-to-end joint learning model was then built to extract the entity relations: a multihead mechanism handles overlapping relations, and a pretrained transformer encoder captures context information. Compared with the entity extraction pipeline, the joint learning model achieved higher recall, precision, and F1 scores of 0.822, 0.825, and 0.818, respectively, 14% higher than the baseline model. The joint learning model extracts features automatically, without any extra natural language processing tools, which makes it efficient at disassembling mixture symptom mentions. Furthermore, its superior performance at identifying overlapping relations could benefit the downstream reassembling of separated symptom entities. |