Chinese Named Entity Recognition for Hazard And Operability Analysis Text Based on Albert

Authors: Zhenhua Wang, Dong Gao, Beike Zhang
Publication year: 2020
Subject:
Source: 2020 Chinese Automation Congress (CAC).
DOI: 10.1109/cac51589.2020.9326618
Description: HAZOP (hazard and operability analysis) plays an important role in chemical safety. To address the low entity recognition accuracy and low training efficiency of earlier methods for Chinese named entity recognition on HAZOP text, this paper proposes the Albert-BiLSTM-CRF model. In the text pre-training stage, the Albert model is used instead of word2vec, Bert, CNN and other traditional structures to train word vectors, yielding context-dependent character-level embeddings that improve text feature extraction. The experiments show that, in terms of entity recognition accuracy, the model reaches an F1-score of 91.09% on the HAZOP test corpus, which is 2.27% and 2.36% higher than the BiLSTM-CRF and CNN-BiLSTM-CRF models respectively, and is comparable to the Bert-BiLSTM-CRF model, with F1-scores exceeding 90%. In terms of training speed, the average training time per epoch is only 162 seconds, which is faster than the other models under the same configuration environment. The results indicate that the Albert-BiLSTM-CRF model better addresses the shortcomings of the previous models, maintaining high entity recognition accuracy while improving training speed, which makes it more suitable for named entity recognition on HAZOP text.
Database: OpenAIRE
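
The description reports results as an F1-score but does not say how it was computed. As a minimal sketch, assuming the common exact-match, entity-level definition over BIO character tags, the metric could be computed as below. The labels EQU and CON and the function names are hypothetical, chosen only for illustration; they are not taken from the paper.

```python
def extract_entities(tags):
    """Collect (start, end, label) spans from a BIO tag sequence."""
    spans = set()
    start, label = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if not continues:
            if label is not None:
                spans.add((start, i, label))  # end index is exclusive
            if tag.startswith("B-"):
                start, label = i, tag[2:]
            else:
                start, label = None, None  # "O" or a stray "I-" tag
    return spans


def f1_score(gold_tags, pred_tags):
    """Entity-level F1: a prediction counts only on an exact span and label match."""
    gold = extract_entities(gold_tags)
    pred = extract_entities(pred_tags)
    correct = len(gold & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: EQU (equipment) and CON (consequence).
gold = ["B-EQU", "I-EQU", "O", "B-CON", "I-CON", "I-CON"]
pred = ["B-EQU", "I-EQU", "O", "B-CON", "I-CON", "O"]
print(f1_score(gold, pred))
```

In this example the predicted CON span stops one character early, so only one of the two entities matches exactly; precision and recall are both 0.5.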