Study on Chinese Named Entity Extraction Rules Based on Boundary Location and Correction

Authors: LIU Pan, GUO Yanming, LEI Jun, LAO Mingrui, LI Guohui
Language: Chinese
Year of publication: 2023
Subject:
Source: Jisuanji kexue, Vol 50, Iss 3, Pp 276-281 (2023)
Document type: article
ISSN: 1002-137X
DOI: 10.11896/jsjkx.220200020
Description: Unlike English text, which is naturally composed of words, Chinese text has no word delimiters, so Chinese characters combine more flexibly and entity boundaries are harder to determine in Chinese named entity recognition (NER). Current mainstream methods cast NER as a sequence labeling task. This paper studies the predicted label sequences under the BIOES tagging scheme and calculates entity boundary accuracy by considering the entity head label B or the tail label E separately; the results show that increasing boundary accuracy can further improve the accuracy of entity recognition. We expand the boundaries of entities with continuous labels, use the label type of the last character of the entity to correct the entity type, and use word segmentation information to fill in entities with incomplete labels. Finally, this paper proposes a BIO+ES labeling scheme that adds boundary labels to distinguish non-entity characters at entity boundaries, further improving the performance of Chinese NER.
Database: Directory of Open Access Journals
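
Note: The abstract mentions computing entity boundary accuracy separately for the head label B and the tail label E under the BIOES scheme, but does not give the exact definition. The sketch below is one plausible reading, not the authors' implementation: it assumes a gold boundary counts as correct when the predicted tag at the same position is also a boundary tag of the same kind, ignoring the entity type, and it treats the single-character label S as both a head and a tail. The function name `boundary_accuracy`, the prefix constants, and the toy tag sequences are all hypothetical.

```python
from typing import List, Tuple

# Assumption: S (single-character entity) counts as both a head and a tail.
HEAD_PREFIXES: Tuple[str, ...] = ("B-", "S-")
TAIL_PREFIXES: Tuple[str, ...] = ("E-", "S-")


def boundary_accuracy(gold: List[str], pred: List[str],
                      prefixes: Tuple[str, ...]) -> float:
    """Fraction of gold boundary positions where the prediction is also
    a boundary tag of the same kind (entity type is ignored)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    # Positions where the gold sequence has a boundary of the requested kind.
    positions = [i for i, tag in enumerate(gold) if tag.startswith(prefixes)]
    if not positions:
        return 0.0
    hits = sum(1 for i in positions if pred[i].startswith(prefixes))
    return hits / len(positions)


# Toy example: the prediction ends the PER entity one character too early.
gold = ["B-LOC", "E-LOC", "O", "B-PER", "I-PER", "E-PER", "O", "S-LOC"]
pred = ["B-LOC", "E-LOC", "O", "B-PER", "E-PER", "O", "O", "S-LOC"]

head_acc = boundary_accuracy(gold, pred, HEAD_PREFIXES)
tail_acc = boundary_accuracy(gold, pred, TAIL_PREFIXES)
print(f"head accuracy: {head_acc:.3f}, tail accuracy: {tail_acc:.3f}")
```

In this toy case all three gold heads are matched while one of the three gold tails is missed, which is the kind of head/tail discrepancy the boundary correction rules in the abstract appear to target.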