A performance comparison of machine learning classifiers for Covid-19 Arabic Quarantine tweets sentiment analysis

Authors: Asma M. Altabeeb, Abdulqader M. Mohsen, Yousef Ali, Belal Al-Fuhaidi, Wedad Al-Sorori, Naseebah Maqtary
Publication year: 2021
Subject:
Source: 2021 1st International Conference on Emerging Smart Technologies and Applications (eSmarTA).
DOI: 10.1109/esmarta52612.2021.9515749
Description: The COVID-19 pandemic has spread across the world and become an international public health emergency. The outbreak has focused attention on quarantine and social distancing as the primary defense strategies against community infection. Arabic is one of the most widely spoken languages in the world and the fastest-growing language on the Internet, which motivates us to provide reliable automated tools that perform sentiment analysis to reveal users' opinions. This paper uses machine learning (ML) methods for Arabic sentiment analysis to understand the positive and negative opinions related to quarantine and social distancing during the COVID-19 outbreak. We built models with several basic (single) and ensemble ML classifiers and compared their effectiveness in classifying the collected imbalanced dataset. We also evaluated several variants of SMOTE (Synthetic Minority Over-sampling Technique) on this imbalanced dataset. The results demonstrated that SMOTE with Edited Nearest Neighbor (SMOTEENN) outperformed the other SMOTE techniques. The results also showed that ensemble classifiers are more robust than single classifiers on imbalanced datasets. On the other hand, when SMOTEENN is applied, the overall average F1 score of single classifiers is more robust than that of ensemble classifiers.
Database: OpenAIRE