Description: |
Deep learning models need a large amount of training data to achieve accurate results. Although deep learning has been applied successfully to sentiment analysis of texts, the rise of pre-trained language models has opened a new era in natural language processing. Pre-trained models carry over knowledge learned from large corpora, which reduces the amount of task-specific training data required. Emotion analysis of Persian text is challenging because Persian is a low-resource language and needs more accurate text-processing methods. In this study, we created a dataset of emotional sentences drawn from Persian literary texts and compared two methods for analyzing the emotion of Persian sentences. The first method uses FastText embeddings with a Bi-LSTM model for emotion classification. The second uses XLM-R, a deep bidirectional transformer, to extract features from the text and applies the CatBoost algorithm to guide the model to focus on the most relevant class. The proposed model can be applied to other languages, particularly low-resource ones, and to imbalanced datasets. According to the experimental results, the proposed model achieved better binary classification results, and combining XLM-R with the CatBoost algorithm yielded a marked improvement in multi-class classification accuracy. |