A performance comparison of machine learning classifiers for Covid-19 Arabic Quarantine tweets sentiment analysis
Authors: | Asma M. Altabeeb, Abdulqader M. Mohsen, Yousef Ali, Belal Al-Fuhaidi, Wedad Al-Sorori, Naseebah Maqtary |
---|---|
Year of publication: | 2021 |
Subject: |
Coronavirus disease 2019 (COVID-19)
Arabic, Computer science, Sentiment analysis, Machine learning, Variety (linguistics), Support vector machine, Performance comparison, The Internet, Artificial intelligence, F1 score |
Source: | 2021 1st International Conference on Emerging Smart Technologies and Applications (eSmarTA). |
DOI: | 10.1109/esmarta52612.2021.9515749 |
Description: | The COVID-19 pandemic has spread across the world and has become an international public health emergency. The outbreak has focused attention on quarantine and social distancing as the primary defense strategies against community infection. Arabic is one of the most spoken languages in the world and the fastest-growing language on the Internet, which motivates the development of reliable automated tools that perform sentiment analysis to reveal users' opinions. This paper applies machine learning (ML) methods to Arabic sentiment analysis in order to understand the positive and negative opinions about quarantine and social distancing during the COVID-19 outbreak. We built models from a range of single and ensemble ML classifiers and compared their effectiveness in classifying the collected imbalanced dataset. We also evaluated several variants of SMOTE (Synthetic Minority Over-sampling Technique) on this imbalanced dataset. The results demonstrated that SMOTE with Edited Nearest Neighbor (SMOTEENN) outperformed the other SMOTE techniques. The results also showed that ensemble classifiers are more robust than single classifiers on imbalanced datasets. On the other hand, when SMOTEENN is applied, the overall average F1 score of the single classifiers is more robust than that of the ensemble classifiers. |
Database: | OpenAIRE |
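The resampling idea named in the description, SMOTE followed by Edited Nearest Neighbor cleaning, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the toy 2-D points stand in for tweet feature vectors, the labels `"pos"` and `"neg"`, the helper names, and the choice of `k=3` are all assumptions made for illustration, and the cleaning step uses a simple majority-vote rule.

```python
import math
import random
from collections import Counter


def k_nearest(idx, points, k):
    """Indices of the k points closest to points[idx], excluding idx itself."""
    others = [j for j in range(len(points)) if j != idx]
    others.sort(key=lambda j: math.dist(points[idx], points[j]))
    return others[:k]


def smote(X, y, minority, k=3, rng=None):
    """Add synthetic minority samples until the minority matches the largest class."""
    rng = rng or random.Random(0)
    minority_pts = [x for x, label in zip(X, y) if label == minority]
    target = max(Counter(y).values())
    X_new, y_new = list(X), list(y)
    for _ in range(target - len(minority_pts)):
        i = rng.randrange(len(minority_pts))
        j = rng.choice(k_nearest(i, minority_pts, k))
        gap = rng.random()  # position along the segment between the two points
        a, b = minority_pts[i], minority_pts[j]
        X_new.append(tuple(ai + gap * (bi - ai) for ai, bi in zip(a, b)))
        y_new.append(minority)
    return X_new, y_new


def edited_nearest_neighbours(X, y, k=3):
    """Keep only samples whose label matches the majority label of their k neighbours."""
    keep = []
    for i in range(len(X)):
        votes = Counter(y[j] for j in k_nearest(i, X, k))
        if votes.most_common(1)[0][0] == y[i]:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


# Hypothetical toy data: 9 "neg" points (one of them sits inside the "pos"
# cluster as label noise) and 4 "pos" points.
neg = [(0.0, 0.0), (0.5, 0.2), (0.1, 0.6), (0.7, 0.7), (0.3, 0.9),
       (0.9, 0.1), (0.6, 0.4), (0.2, 0.3), (5.1, 5.0)]
pos = [(5.0, 5.0), (5.5, 5.2), (5.2, 5.6), (4.8, 5.3)]
X = neg + pos
y = ["neg"] * len(neg) + ["pos"] * len(pos)

# Step 1: oversample the minority class by interpolation.
X_sm, y_sm = smote(X, y, minority="pos")
# Step 2: remove samples that disagree with their neighbourhood.
X_clean, y_clean = edited_nearest_neighbours(X_sm, y_sm)

print("original:", Counter(y))
print("after SMOTE:", Counter(y_sm))
print("after SMOTE + ENN:", Counter(y_clean))
```

In this toy setup the oversampling step balances the two classes, and the cleaning step then drops the mislabelled-looking `"neg"` point that lies inside the `"pos"` cluster. For real experiments, a maintained library implementation would be a more sensible choice than this sketch.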