Named Entity Recognition Method of Brazilian Legal Text based on pre-training model

Authors: Yufan Wu, Cheng Peng, Pengbin Lei, Zhili Wang
Year of publication: 2020
Subject:
Source: Journal of Physics: Conference Series. 1550:032149
ISSN: 1742-6596, 1742-6588
Description: Named entity recognition (NER) is a common task in natural language processing (NLP). We propose a novel approach based on a pre-trained model that performs sequence labeling by learning from large-scale real-world data drawn from Brazilian legal documents. Specifically, by combining iterated dilated convolution [1] (IDCNN) and Bi-LSTM, we develop a scalable sequence labeling model named Sequence Tagging Model (STM), and extensive experiments validate its effectiveness on NER tasks. Compared with the IDCNN-CRF model, STM performs better, reaching an F1 score of 93.23%, which provides an important basis for NER tasks.
Database: OpenAIRE
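
Note on the reported metric: the description gives an F1 score for NER but does not say how it was computed. A common convention for sequence labeling is entity-level F1 over BIO tags, where a predicted entity counts only if its label and both boundaries match the gold entity exactly. The sketch below illustrates that convention only; it is an assumption, not the authors' evaluation code, and the function names (`extract_spans`, `span_f1`) and the tag labels `PESSOA` and `LOCAL` are hypothetical.

```python
def extract_spans(tags):
    """Return a set of (label, start, end) spans from a BIO tag sequence.

    `end` is exclusive. A stray I- tag with no matching open span is
    ignored (an assumption; other conventions treat it as a new span).
    """
    spans = set()
    start, label = None, None
    # A trailing "O" sentinel flushes a span that runs to the last token.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if not continues:
            if label is not None:
                spans.add((label, start, i))
                start, label = None, None
            if tag.startswith("B-"):
                start, label = i, tag[2:]
    return spans


def span_f1(gold_tags, pred_tags):
    """Entity-level precision, recall and F1 with exact span matching."""
    gold = extract_spans(gold_tags)
    pred = extract_spans(pred_tags)
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: two gold entities, one of them predicted.
gold = ["B-PESSOA", "I-PESSOA", "O", "B-LOCAL", "O"]
pred = ["B-PESSOA", "I-PESSOA", "O", "O", "O"]
precision, recall, f1 = span_f1(gold, pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.4f}")
```

In this toy case the single predicted entity is correct, so precision is 1.0, while only one of the two gold entities is found, so recall is 0.5.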