Improving Named Entity Recognition in Vietnamese Texts by a Character-Level Deep Lifelong Learning Model

Authors: Cam-Van Nguyen Thi, Mai-Vu Tran, Tri-Thanh Nguyen, Ngoc-Vu Nguyen, Quang-Thuy Ha, Thi-Lan Nguyen
Publication year: 2019
Subject:
Source: Vietnam Journal of Computer Science, Vol 6, Iss 4, Pp 471-487 (2019)
ISSN: 2196-8896, 2196-8888
DOI: 10.1142/s219688881950026x
Description: Named entity recognition (NER) is a fundamental task that affects the performance of dependent tasks such as machine translation. Lifelong machine learning (LML) is a continuous learning process in which the knowledge base accumulated from previous tasks is used to improve future learning tasks that have few samples. Since there are few studies on LML based on deep neural networks for NER, especially for Vietnamese, we propose a lifelong learning model based on deep learning with a CRF layer, named DeepLML–NER, for NER in Vietnamese texts. DeepLML–NER includes an algorithm to extract knowledge of the “prefix-features” of named entities in previous domains. The model then uses the knowledge in the knowledge base to solve the current NER task. Preprocessing and model parameter tuning are also investigated to improve performance. The effectiveness of the model is demonstrated by in-domain and cross-domain experiments, which achieve promising results.
Database: OpenAIRE