CIDER: Context-sensitive polarity measurement for short-form text.

Authors: James C Young, Rudy Arthur, Hywel T P Williams
Language: English
Year of publication: 2024
Subject:
Source: PLoS ONE, Vol 19, Iss 4, p e0299490 (2024)
Document type: article
ISSN: 1932-6203
DOI: 10.1371/journal.pone.0299490
Description: Researchers commonly perform sentiment analysis on large collections of short texts, such as tweets, Reddit posts or newspaper headlines, that all focus on a specific topic, theme or event. Usually, general-purpose sentiment analysis methods are used. These perform well on average but miss the variation in meaning that occurs across contexts. For example, the word "active" has a very different intention and valence in the phrase "active lifestyle" than in "active volcano". This work presents a new approach, CIDER (Context Informed Dictionary and sEmantic Reasoner), which performs context-sensitive linguistic analysis: the valence of sentiment-laden terms is inferred from the whole corpus before being used to score the individual texts. In this paper, we detail the CIDER algorithm and demonstrate that it outperforms state-of-the-art generalist unsupervised sentiment analysis techniques on a large collection of tweets about the weather. CIDER is also applicable to alternative (non-sentiment) linguistic scales. A case study on gender in the UK is presented, identifying highly gendered and sentiment-laden days. We have made our implementation of CIDER available as a Python package: https://pypi.org/project/ciderpolarity/.
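The idea of inferring word valence from the whole corpus before scoring individual texts can be illustrated with a minimal sketch. This is not the CIDER algorithm and does not use the ciderpolarity package API; it assumes a simple stand-in technique (propagating a few seed polarities over a word co-occurrence graph), and the function names `tokenize`, `infer_polarities` and `score_text`, the seed words and the toy corpus are all hypothetical.

```python
import string
from collections import Counter, defaultdict
from itertools import combinations


def tokenize(text):
    """Lowercase, split on whitespace, strip surrounding punctuation."""
    tokens = (w.strip(string.punctuation).lower() for w in text.split())
    return [t for t in tokens if t]


def infer_polarities(corpus, seeds, iterations=10):
    """Assumed stand-in for corpus-level inference: spread seed polarities
    to other words through weighted co-occurrence within each text."""
    cooc = defaultdict(Counter)
    for text in corpus:
        for a, b in combinations(sorted(set(tokenize(text))), 2):
            cooc[a][b] += 1
            cooc[b][a] += 1

    polarity = {word: 0.0 for word in cooc}
    polarity.update(seeds)
    for _ in range(iterations):
        updated = dict(seeds)  # seed words keep their fixed polarity
        for word, neighbours in cooc.items():
            if word in seeds:
                continue
            total = sum(neighbours.values())
            # Weighted mean of the neighbours' current polarities.
            updated[word] = sum(
                polarity[n] * count for n, count in neighbours.items()
            ) / total
        polarity = updated
    return polarity


def score_text(text, polarity):
    """Score one text as the mean polarity of its words (unknown words = 0)."""
    values = [polarity.get(w, 0.0) for w in tokenize(text)]
    return sum(values) / len(values) if values else 0.0


# Hypothetical toy corpus and seed lexicon, for illustration only.
corpus = [
    "active volcano disaster",
    "active volcano terrible eruption",
    "lovely sunny day",
    "sunny lovely walk",
]
seeds = {"terrible": -1.0, "disaster": -1.0, "lovely": 1.0}
polarity = infer_polarities(corpus, seeds)
```

In this toy corpus "active" only co-occurs with volcano-related words, so it picks up a negative polarity; in a corpus about fitness the same word would drift positive, which is the context sensitivity the description refers to.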
Database: Directory of Open Access Journals