A Preliminary Performance Comparison of Machine Learning Algorithms for Web Author Identification of Vietnamese Online Messages
Author: | Alisa Vorobeva, Bui N. Khanh |
---|---|
Language: | English |
Year of publication: | 2020 |
Subject: | |
Source: | Proceedings of the XXth Conference of Open Innovations Association FRUCT, Vol 26, Iss 1, Pp 166-173 (2020) |
Document type: | article |
ISSN: | 2305-7254 2343-0737 |
DOI: | 10.23919/FRUCT48808.2020.9087531 |
Description: | With the rapid development of the Internet and accompanying technologies, communication between people has become easier than ever. Email, news sites, and social networking applications have become indispensable tools for staying connected. However, the Internet is also a favorable environment for cybercriminals and their malicious activities. Therefore, it is necessary to develop a method to determine which user is the author of an online message. Much research has been done on different corpora and in various languages. In this article, we propose an approach to identifying the authors of online messages in Vietnamese based on machine learning algorithms. The algorithms used are Naive Bayes, SVM, Random Forest, and Logistic Regression. Random Forest yielded the best results. |
Database: | Directory of Open Access Journals |
External link: |