Improving Named Entity Recognition in Vietnamese Texts by a Character-Level Deep Lifelong Learning Model
Author: | Ngoc-Vu Nguyen, Thi-Lan Nguyen, Cam-Van Nguyen Thi, Mai-Vu Tran, Tri-Thanh Nguyen, Quang-Thuy Ha |
---|---|
Language: | English |
Publication year: | 2019 |
Subject: | |
Source: | Vietnam Journal of Computer Science, Vol 6, Iss 4, Pp 471-487 (2019) |
Document type: | article |
ISSN: | 2196-8888, 2196-8896 |
DOI: | 10.1142/S219688881950026X |
Description: | Named entity recognition (NER) is a fundamental task that affects the performance of the tasks that depend on it, e.g. machine translation. Lifelong machine learning (LML) is a continuous learning process in which the knowledge base accumulated from previous tasks is used to improve future learning tasks that have few samples. Since there are few studies on LML based on deep neural networks for NER, especially in Vietnamese, we propose a lifelong learning model based on deep learning with a CRFs layer, named DeepLML–NER, for NER in Vietnamese texts. DeepLML–NER includes an algorithm that extracts knowledge of the “prefix-features” of named entities in previous domains. The model then uses the knowledge in the knowledge base to solve the current NER task. Preprocessing and model parameter tuning are also investigated to improve performance. The effectiveness of the model was demonstrated in in-domain and cross-domain experiments, with promising results. |
Database: | Directory of Open Access Journals |
External link: |