A Semantic Conceptualization Using Tagged Bag-of-Concepts for Sentiment Analysis
Author: | Massudi Mahmuddin, Yassin S. Mehanna |
---|---|
Year of publication: | 2021 |
Subject: |
semantic sentiment; General Computer Science; Concept extraction; Computer science; Context (language use); Ontology (information science); computer.software_genre; Lexicon; Semantics; Knowledge-based systems; sentiment lexicon; General Materials Science; natural language processing; Conceptualization; business.industry; Sentiment analysis; General Engineering; TK1-9971; sentiment analysis; Task analysis; text processing; Electrical engineering. Electronics. Nuclear engineering; Artificial intelligence; business; computer; Natural language processing |
Source: | IEEE Access, Vol 9, Pp 118736-118756 (2021) |
ISSN: | 2169-3536 |
DOI: | 10.1109/access.2021.3107237 |
Description: | Sentiment can be expressed implicitly or explicitly in text. Identifying hidden sentiments is therefore the main challenge for current sentiment analysis (SA) approaches. Other common challenges include false classification of opinion words, ignoring context information, and poor handling of short text; these arise from misinterpretation of the text and a lack of sufficient data for the analysis tasks. This study proposes a semantic conceptualization method using tagged bag-of-concepts for SA. The method aims to detect the correct sentiment towards the actual target entity by considering all affective and conceptual information conveyed in the text, with a special focus on short text. Tagged bag-of-concepts (TBoC) is a novel approach that analyzes and decomposes text to uncover latent sentiments while preserving all relations and vital information, in order to boost the accuracy of SA. The study addresses three questions: Does the information provided via TBoC enhance sentiment classification results at different analysis levels? Does building a structure of concepts increase the accuracy of the overall sentiment towards a specific opinion target? Does the TBoC approach enhance SA results for short text messages? The proposed solution was applied to two datasets from the restaurant domain, and sentiment analysis was performed using the TBoC structure at multiple levels, including the document, aspect, aspect-category, and topic levels. The TBoC method with a domain-specific sentiment lexicon showed exceptional performance and outperformed other state-of-the-art NB, SVM, and NN methods, especially for aspect-level SA. Used within a semantic conceptualization model that leverages NLP tasks, ontology, and semantic methods, TBoC proved highly capable of concept extraction while preserving information about context, interrelations, and latent feelings. |
Database: | OpenAIRE |
External link: |