Impact of Text Pre-processing and Ensemble Learning on Arabic Sentiment Analysis

Authors: Samir Belfkih, Ayoub Ait Lahcen, Ahmed Oussous
Publication year: 2019
Subject:
Source: NISS
DOI: 10.1145/3320326.3320399
Description: With the rapid growth and spread of web platforms such as social networks, online review websites and blogs, people can openly express and share their opinions. They can rate products or comment on various subjects. Thus, a new field called web-based Sentiment Analysis (SA), or Opinion Mining, has emerged. In general, SA is the process of classifying opinions and sentiments as positive, negative or neutral. Many studies have been performed on SA for languages such as English, Spanish and French. However, research on SA of Arabic text is very limited. The first goal of this paper is to measure the impact of the preprocessing phase on Arabic Sentiment Analysis in terms of measures such as accuracy, precision and recall. We conducted experiments using different stemmers (Khoja, ISRI, Tashaphyne, Light10, and MOTAZ), n-grams, and stop words. The second goal is to study the impact of combining multiple classifiers on Arabic sentiment analysis. To this end, the vote algorithm was used in conjunction with three classifiers, namely Naive Bayes, Support Vector Machine (SVM), and Maximum Entropy, and evaluated using k-fold cross-validation.
Database: OpenAIRE
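
The description mentions combining three classifiers with a vote algorithm and evaluating with k-fold cross-validation. The record does not say which voting scheme or fold count was used, so the following is only a minimal sketch assuming hard (majority) voting and contiguous folds. The names `majority_vote`, `k_fold_indices`, and the three keyword-based toy classifiers are hypothetical stand-ins for illustration; they are not the paper's Naive Bayes, SVM, or Maximum Entropy models.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

# Hypothetical type alias: a classifier maps a text to a sentiment label.
Classifier = Callable[[str], str]


def majority_vote(classifiers: Sequence[Classifier], text: str) -> str:
    """Hard voting: return the label predicted by the most classifiers.

    Ties go to the label that appears first in classifier order.
    """
    votes = [clf(text) for clf in classifiers]
    # most_common keeps first-seen order among labels with equal counts
    return Counter(votes).most_common(1)[0][0]


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds.

    Returns one (train_indices, test_indices) pair per fold.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    # Spread the remainder over the first folds so sizes differ by at most 1
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    splits = []
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        splits.append((train, test))
        start = stop
    return splits


# Toy keyword rules standing in for trained models (assumption, English
# placeholder tokens instead of Arabic text).
def clf_a(text: str) -> str:
    return "positive" if "good" in text else "negative"


def clf_b(text: str) -> str:
    return "positive" if "good" in text or "great" in text else "negative"


def clf_c(text: str) -> str:
    return "negative" if "bad" in text else "positive"


ensemble = [clf_a, clf_b, clf_c]
label = majority_vote(ensemble, "good but bad")
folds = k_fold_indices(5, 2)
```

In a real experiment each fold's training indices would be used to fit the three models before voting on the held-out indices; the toy rules here need no training, so only the voting and splitting logic is shown.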