Twitter Sentiment Analysis using an Ensemble Weighted Majority Vote Classifier

Authors: Roza Hikmat Hama Aziz, Nazife Dimililer
Year of publication: 2020
Subject:
Source: 2020 International Conference on Advanced Science and Engineering (ICOASE).
DOI: 10.1109/icoase51841.2020.9436590
Description: Sentiment analysis extracts the emotions expressed in text and has been employed in many fields, including politics, elections, movies, retail businesses and, in recent years, microblogs, to understand, track and control human sentiments or reactions toward products, events or ideas. Nevertheless, challenges such as different styles of writing, use of negation and sarcasm, spelling mistakes and newly invented words are obstacles to the correct classification of sentiments. This paper provides an ensemble-of-classifiers framework for sentiment analysis. The proposed weighted majority voting ensemble method combines six models, namely Naive Bayes, Logistic Regression, Stochastic Gradient Descent, Random Forest, Decision Tree and Support Vector Machine, to form a single classifier. Weights of the individual classifiers in the ensemble are chosen as their accuracy or F1-score by optimizing their performance. This weighted approach is contrasted with an ensemble that combines the same models by simple majority voting. Additionally, the six individual classifiers are compared with one another to evaluate their performance. The proposed ensemble model is tested on existing sentiment datasets, including SemEval 2017 Task 4A, 4B and 4C. The results demonstrate that Logistic Regression performs best among the individual classifiers. Furthermore, the proposed weighted majority voting ensemble of the six classifiers outperforms both simple majority voting and every individual classifier.
Database: OpenAIRE
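
The difference between simple and weighted majority voting described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `weighted_majority_vote`, the per-classifier weights and the example predictions are all hypothetical, chosen only to show a case where the two voting schemes disagree. Ties are broken in favor of the label seen first, which is also an assumption.

```python
from collections import defaultdict
from typing import Hashable, Sequence


def weighted_majority_vote(
    predictions: Sequence[Hashable], weights: Sequence[float]
) -> Hashable:
    """Return the label whose voters have the largest total weight.

    Ties go to the label that appears first in `predictions`
    (an assumption of this sketch, not taken from the paper).
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    if len(predictions) != len(weights):
        raise ValueError("predictions and weights must have the same length")

    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    # dicts keep insertion order, so max() returns the first label on a tie
    return max(totals, key=totals.get)


# Classifier order follows the abstract:
# Naive Bayes, Logistic Regression, SGD, Random Forest, Decision Tree, SVM.
# Predictions for one tweet and the weights are made-up illustrative values
# (in the paper's setting a weight would be an accuracy or F1-score).
predictions = ["negative", "positive", "positive", "negative", "negative", "neutral"]
weights = [0.40, 0.70, 0.65, 0.45, 0.40, 0.68]

weighted_label = weighted_majority_vote(predictions, weights)
# Simple majority voting is the special case where every weight equals 1.
simple_label = weighted_majority_vote(predictions, [1.0] * len(predictions))

print("weighted:", weighted_label)
print("simple:  ", simple_label)
```

With these made-up numbers, three weaker classifiers outvote two stronger ones under simple majority voting ("negative"), while the weighted vote follows the two higher-weighted classifiers ("positive").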