A New Lifelong Topic Modeling Method and Its Application to Vietnamese Text Multi-label Classification
Author: | Quang-Thuy Ha, Thi-Ngan Pham, Thi-Cham Nguyen, Minh-Tuoi Tran, Tri-Thanh Nguyen, Van-Quang Nguyen, Thi-Hong Vuong |
---|---|
Publication year: | 2018 |
Subject: |
Topic model; Multi-label classification; Computer science; Vietnamese; Closeness; Application framework; 02 engineering and technology; Focus (linguistics); Domain (software engineering); 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Artificial intelligence; Set (psychology); Natural language processing |
Source: | Intelligent Information and Database Systems ISBN: 9783319754161 ACIIDS (1) |
Description: | Lifelong machine learning has emerged in recent years thanks to its ability to apply knowledge from past tasks to the current problem. Lifelong topic modeling algorithms such as LTM and AMC have been proposed and have proven useful. However, these algorithms focus on learning bias at the topic level, not the domain level. This paper proposes a lifelong topic modeling method that learns bias at the domain level, based on a proposed domain closeness measure, together with an application framework for multi-label classification of Vietnamese texts. Experimental results on three previously solved Vietnamese texts and five different current Vietnamese text datasets, in combination with different topic set sizes, show that the proposed method outperforms the AMC method in all cases. |
Database: | OpenAIRE |