Sentiment analysis in Nepali: Exploring machine learning and lexicon-based approaches

Authors: Bhawna Piryani, David Pinto, Vivek Singh, Rajesh Piryani
Publication year: 2020
Subject:
Source: Journal of Intelligent & Fuzzy Systems. 39:2201-2212
ISSN: 1875-8967, 1064-1246
DOI: 10.3233/jifs-179884
Description: Sentiment analysis research has recently gained tremendous momentum for English text, but very little of it has addressed Nepali text. This work focuses on Nepali textual data. We explore machine learning approaches and propose a lexicon-based approach that uses linguistic features and lexical resources to perform sentiment analysis of tweets written in Nepali. The lexicon-based approach first pre-processes the tweet, then locates the opinion-oriented features, and finally computes the sentiment polarity of the tweet. We investigate both conventional machine learning models (Multinomial Naïve Bayes (NB), Decision Tree, Support Vector Machine (SVM), and logistic regression) and deep learning models (Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and CNN-LSTM) for sentiment analysis of Nepali text. The machine learning models and the lexicon-based approach are evaluated on a dataset of tweets related to the 2015 Nepal earthquake and the 2015 Nepal blockade. The lexicon-based approach outperformed the conventional machine learning models, and the deep learning models outperformed both the conventional machine learning models and the lexicon-based approach. As a by-product, we also created Nepali SentiWordNet and Nepali SenticNet sentiment lexicons from existing English-language resources.
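The three stages named in the description (pre-process the tweet, locate opinion-oriented features, compute polarity) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy lexicon, its scores, the function names, and the simple score-summation rule are all assumptions made for illustration, and the paper's linguistic features and its Nepali SentiWordNet and Nepali SenticNet resources are not reproduced here.

```python
import re

# Hypothetical toy lexicon (token -> polarity score); NOT the paper's resources.
TOY_LEXICON = {
    "राम्रो": 0.8,     # "good"
    "खुसी": 0.7,      # "happy"
    "नराम्रो": -0.8,   # "bad"
}


def preprocess(tweet):
    """Strip URLs, @mentions, '#' markers and punctuation; split on whitespace."""
    text = re.sub(r"https?://\S+", " ", tweet)   # remove URLs
    text = re.sub(r"@\S+", " ", text)            # remove user mentions
    text = text.replace("#", " ")                # keep hashtag word, drop '#'
    text = re.sub(r"[।!?,.;:\"'()]", " ", text)  # drop danda and common punctuation
    return text.split()


def opinion_features(tokens, lexicon):
    """Return (token, score) pairs for tokens found in the lexicon."""
    return [(tok, lexicon[tok]) for tok in tokens if tok in lexicon]


def classify(tweet, lexicon=TOY_LEXICON):
    """Assumed scoring rule: sum the lexicon scores and take the sign."""
    total = sum(score for _, score in opinion_features(preprocess(tweet), lexicon))
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"
```

Calling `classify` on a tweet returns one of `"positive"`, `"negative"`, or `"neutral"`. A realistic system would need a far larger lexicon and would have to handle negation and morphological variation, both of which this sketch ignores.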
Database: OpenAIRE