Automatic generation of lexica for sentiment polarity shifters
Authors: | Josef Ruppenhofer, Michael Wiegand, Marc Schulder |
---|---|
Year of publication: | 2020 |
Subject: |
Linguistics and Language
Lexical semantics; Phrase; Computer science; Polarity (physics); General mathematics; Sentiment analysis; Bootstrapping (linguistics); Engineering and technology; Lexicon; Natural sciences; Language and Linguistics; Negation; Artificial intelligence; Electrical engineering, electronic engineering, information engineering; Artificial intelligence & image processing; English verbs; Mathematics; Software; Natural language processing |
Source: | Natural Language Engineering. 27:153-179 |
ISSN: | 1469-8110 1351-3249 |
DOI: | 10.1017/s135132492000039x |
Description: | Alleviating pain is good and abandoning hope is bad. We instinctively understand how words like alleviate and abandon affect the polarity of a phrase, inverting or weakening it. When these words are content words, such as verbs, nouns, and adjectives, we refer to them as polarity shifters. Shifters are a frequent occurrence in human language and an important part of successfully modeling negation in sentiment analysis; yet research on negation modeling has focused almost exclusively on a small handful of closed-class negation words, such as not, no, and without. A major reason for this is that shifters are far more lexically diverse than negation words, but no resources exist to help identify them. We seek to remedy this lack of shifter resources by introducing a large lexicon of polarity shifters that covers English verbs, nouns, and adjectives. Creating the lexicon entirely by hand would be prohibitively expensive. Instead, we develop a bootstrapping approach that combines automatic classification with human verification to ensure the high quality of our lexicon while reducing annotation costs by over 70%. Our approach leverages a number of linguistic insights; while some features are based on textual patterns, others use semantic resources or syntactic relatedness. The created lexicon is evaluated both on a polarity shifter gold standard and on a polarity classification task. |
Database: | OpenAIRE |
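Illustrative note: the description mentions a bootstrapping approach in which an automatic classifier proposes shifter candidates and a human verifies them. The Python sketch below shows only that general "classifier proposes, human verifies" pattern. It is not the authors' implementation: the function name `bootstrap_lexicon`, the toy scores, the threshold, and the simulated annotator are all hypothetical, and the paper's actual features (textual patterns, semantic resources, syntactic relatedness) are not reproduced here.

```python
from typing import Callable, Dict, Iterable


def bootstrap_lexicon(
    candidates: Iterable[str],
    score: Callable[[str], float],
    verify: Callable[[str], bool],
    threshold: float = 0.5,
) -> Dict[str, object]:
    """Hypothetical sketch: a classifier proposes, a human verifies.

    Only words scored at or above `threshold` are sent to the annotator,
    so the share of candidates never shown to a human is the saved effort.
    """
    words = list(candidates)
    # Step 1: automatic classification selects likely shifters.
    proposed = [w for w in words if score(w) >= threshold]
    # Step 2: human verification filters out classifier false positives.
    lexicon = sorted(w for w in proposed if verify(w))
    # Fraction of candidates that needed no manual annotation.
    saved = 1 - len(proposed) / len(words) if words else 0.0
    return {"lexicon": lexicon, "verified": len(proposed), "saved": saved}


# Toy data, invented for illustration only (not taken from the paper).
TOY_SCORES = {"alleviate": 0.9, "abandon": 0.8, "praise": 0.7, "enjoy": 0.1}
TOY_GOLD = {"alleviate", "abandon"}  # stands in for the human annotator

result = bootstrap_lexicon(
    TOY_SCORES,                    # iterating a dict yields its keys
    score=TOY_SCORES.__getitem__,  # stand-in for a trained classifier
    verify=TOY_GOLD.__contains__,  # stand-in for manual verification
)
print(result)
```

In this toy run the annotator sees three of the four candidates and rejects "praise", so the resulting lexicon contains only "abandon" and "alleviate". The numbers say nothing about the cost reduction reported in the description.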