Rule Based Fuzzy Computing Approach on Self-Supervised Sentiment Polarity Classification with Word Sense Disambiguation in Machine Translation for Hindi Language

Authors: Shweta Chauhan, Jayashree Premkumar Shet, Shehab Mohamed Beram, Vishal Jagota, Mohammed Dighriri, Mohd Wazih Ahmad, Md Shamim Hossain, Ali Rizwan
Publication year: 2023
Source: ACM Transactions on Asian and Low-Resource Language Information Processing. 22:1-21
ISSN: 2375-4702, 2375-4699
DOI: 10.1145/3574130
Description: With increasing globalization, people of diverse cultural backgrounds communicate with one another to a very large extent, and language diversity across the world can hinder that communication. The use of social media and user-generated content has grown at an exponential rate, yet existing supervised sentiment polarity classification techniques require a labelled training dataset. This study addresses two problems: sentiment analysis of a Twitter dataset and word sense disambiguation for the morphologically rich Hindi language. First, a rule-based fuzzy logic system was used to perform self-supervised (completely unsupervised) sentiment classification of a social-media dataset using three types of lexicons. Combining fuzzy logic with three different types of lexicons opens a new path for sentiment analysis. The unsupervised fuzzy rules integrate the fuzziness of both the negative and the positive scores, and fuzzy logic-based systems can cope with ambiguity and vagueness. The fuzzy system uses an unsupervised/self-supervised fuzzy rule-based technique to classify text using natural language processing (NLP) and word senses. We compared fuzzy rule-based self-supervised sentiment classification, using three types of lexicons on five different datasets, with unsupervised as well as supervised sentiment classification techniques. Second, using cross-lingual sense embeddings rather than cross-lingual word embeddings resolves the ambiguity issue. Word sense embeddings are produced for the source languages to learn the multiple senses of each word. Different evaluation metrics show improved performance for the English-Hindi language pair.
Database: OpenAIRE
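
Illustrative note: the description mentions fuzzy rules that combine positive and negative lexicon scores without labelled data. The sketch below shows one way such a rule base could look. It is not the authors' system: the toy lexicon, the membership functions, the nine rules, and every function name are assumptions made for illustration, and the centroid defuzzification that a full Mamdani-style system would use is replaced here by simply picking the class with the strongest rule activation.

```python
# Hypothetical toy lexicon: word -> (positive score, negative score), both in [0, 1].
TOY_LEXICON = {
    "good": (0.8, 0.0),
    "great": (0.9, 0.0),
    "bad": (0.0, 0.8),
    "awful": (0.0, 0.9),
}


def lexicon_scores(text, lexicon=TOY_LEXICON):
    """Average positive and negative scores over the tokens found in the lexicon."""
    hits = [lexicon[tok] for tok in text.lower().split() if tok in lexicon]
    if not hits:
        return 0.0, 0.0
    pos = sum(p for p, _ in hits) / len(hits)
    neg = sum(n for _, n in hits) / len(hits)
    return pos, neg


def memberships(x):
    """Fuzzify a score in [0, 1] into low / medium / high membership degrees."""
    x = min(max(x, 0.0), 1.0)
    return {
        "low": max(0.0, 1.0 - 2.0 * x),
        "medium": max(0.0, 1.0 - abs(2.0 * x - 1.0)),
        "high": max(0.0, 2.0 * x - 1.0),
    }


# Assumed rule base: (positive level, negative level) -> sentiment class.
RULES = {
    ("low", "low"): "neutral",
    ("medium", "medium"): "neutral",
    ("high", "high"): "neutral",
    ("medium", "low"): "positive",
    ("high", "low"): "positive",
    ("high", "medium"): "positive",
    ("low", "medium"): "negative",
    ("low", "high"): "negative",
    ("medium", "high"): "negative",
}


def classify(text):
    """Fire every rule (AND = min), aggregate per class (max), return the top class."""
    pos, neg = lexicon_scores(text)
    mu_pos, mu_neg = memberships(pos), memberships(neg)
    strength = {"neutral": 0.0, "positive": 0.0, "negative": 0.0}
    for (p_level, n_level), label in RULES.items():
        fired = min(mu_pos[p_level], mu_neg[n_level])
        strength[label] = max(strength[label], fired)
    return max(strength, key=strength.get)


if __name__ == "__main__":
    for sample in ("good great movie", "bad awful service", "the train left"):
        print(sample, "->", classify(sample))
```

No labelled training data is involved: the class comes entirely from lexicon scores and hand-written rules, which is the sense in which such a scheme is unsupervised. Text with no lexicon matches falls through to the low/low rule and is treated as neutral.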