Sentiment analysis of Arabic social media texts: A machine learning approach to deciphering customer perceptions

Author:	Ohud Alsemaree, Atm S. Alam, Sukhpal Singh Gill, Steve Uhlig
Language:	English
Year of publication:	2024
Subject:	Arabic text; Feature extraction; Machine learning; Sentiment analysis; Social media; Science (General); Q1-390; Social sciences (General); H1-99
Source:	Heliyon, Vol 10, Iss 9, Pp e27863- (2024)
Document type:	article
ISSN:	2405-8440
DOI:	10.1016/j.heliyon.2024.e27863
Description:	Sentiment analysis (SA) is a subfield of artificial intelligence that draws on natural language processing. It has become increasingly significant because it discerns the emotional tone of reviews, categorising them as positive, neutral, or negative. In the highly competitive coffee industry, understanding customer sentiment and perception is paramount for businesses seeking to optimise their product offerings. Traditional methods of market analysis often fall short of capturing the nuanced views of consumers, necessitating a more sophisticated approach. This research is motivated by the need for a nuanced understanding of customer sentiments across various coffee products, enabling companies to make informed decisions regarding product promotion, improvement, and discontinuation. However, sentiment analysis of Arabic text is challenging because of the language's extraordinarily complex inflectional and derivational morphology. To address this challenge, we have developed a new method designed to improve the precision and effectiveness of Arabic sentiment analysis, specifically focusing on understanding customer opinions about various coffee products on social media platforms such as Twitter. We gathered 10,646 Twitter reviews of various coffee products and applied feature extraction techniques using term frequency-inverse document frequency (TF-IDF) and minimum redundancy maximum relevance (MRMR). Subsequently, we performed sentiment analysis using four supervised learning algorithms: k-nearest neighbor, support vector machine, decision tree, and random forest. The predictions of all four classifiers were aggregated via ensemble learning to produce the final results. Our results demonstrated an increase in prediction accuracy, with our method achieving over 95.95% accuracy with hard voting and 94.51% with soft voting.
Database:	Directory of Open Access Journals
External link:	https://doaj.org/article/36be43bbe5af4d4b8537f699fb612b17
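
Note: the abstract reports separate accuracies for hard and soft voting over four classifiers. The sketch below only illustrates how the two aggregation rules can differ; it is not the authors' implementation. The function names, the tie-breaking rule, and all probability values are hypothetical and chosen purely for illustration.

```python
from collections import Counter

LABELS = ("negative", "neutral", "positive")


def hard_vote(predicted_labels):
    """Return the label predicted by the most classifiers (majority vote)."""
    counts = Counter(predicted_labels)
    # Assumption: ties are broken by the order of LABELS (first one wins).
    return max(LABELS, key=lambda label: counts[label])


def soft_vote(predicted_probabilities):
    """Average the per-class probabilities and return the class with the largest mean."""
    n = len(predicted_probabilities)
    mean = {
        label: sum(p[label] for p in predicted_probabilities) / n
        for label in LABELS
    }
    return max(LABELS, key=lambda label: mean[label])


# Made-up class probabilities from four classifiers for a single tweet.
probabilities = [
    {"negative": 0.05, "neutral": 0.50, "positive": 0.45},  # k-nearest neighbor
    {"negative": 0.05, "neutral": 0.50, "positive": 0.45},  # support vector machine
    {"negative": 0.10, "neutral": 0.50, "positive": 0.40},  # decision tree
    {"negative": 0.00, "neutral": 0.05, "positive": 0.95},  # random forest
]

# Each classifier's hard prediction is its most probable class.
labels = [max(LABELS, key=lambda label: p[label]) for p in probabilities]

print(hard_vote(labels))         # three of four classifiers say "neutral"
print(soft_vote(probabilities))  # the averaged probabilities favour "positive"
```

In this made-up case the two rules disagree: three classifiers lean narrowly towards "neutral", while one confident "positive" prediction dominates the averaged probabilities. This is one reason hard and soft voting can yield different accuracies on the same data.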