A Random Walk–Based Model for Identifying Semantic Orientation
Author: | Wanchen Lu, Dragomir R. Radev, Ahmed Hassan, Amjad Abu-Jbara |
---|---|
Publication year: | 2014 |
Subject: |
Linguistics and Language; Training set; Computer science; Pattern recognition; Lexicon; Random walk; Language and Linguistics; Computer Science Applications; Markov random walk; Product reviews; Artificial Intelligence; Text filtering; Graph (abstract data type); Word relatedness |
Source: | Computational Linguistics. 40:539-562 |
ISSN: | 1530-9312 0891-2017 |
DOI: | 10.1162/coli_a_00192 |
Description: | Automatically identifying the sentiment polarity of words is a very important task that has been used as the essential building block of many natural language processing systems such as text classification, text filtering, product review analysis, survey response analysis, and on-line discussion mining. We propose a method for identifying the sentiment polarity of words that applies a Markov random walk model to a large word relatedness graph, and produces a polarity estimate for any given word. The model can accurately and quickly assign a polarity sign and magnitude to any word. It can be used both in a semi-supervised setting where a training set of labeled words is used, and in a weakly supervised setting where only a handful of seed words is used to define the two polarity classes. The method is experimentally tested using a gold standard set of positive and negative words from the General Inquirer lexicon. We also show how our method can be used for three-way classification which identifies neutral words in addition to positive and negative words. Our experiments show that the proposed method outperforms the state-of-the-art methods in the semi-supervised setting and is comparable to the best reported values in the weakly supervised setting. In addition, the proposed method is faster and does not need a large corpus. We also present extensions of our methods for identifying the polarity of foreign words and out-of-vocabulary words. |
Database: | OpenAIRE |
External link: |
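
The description above mentions a Markov random walk over a word relatedness graph that yields a polarity sign and magnitude from a handful of seed words. The following is a minimal sketch of one way such a walk could be scored, not the authors' implementation. It assumes that polarity is read off by comparing the mean hitting time of a walk started at the word to the positive seeds against its mean hitting time to the negative seeds; the abstract does not spell out this detail. The toy graph, the edge weights, the seed sets, and the names `build_transitions`, `mean_hitting_times`, and `polarity` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy relatedness graph: a chain of six words with unit weights.
EDGES = {
    ("good", "nice"): 1.0,
    ("nice", "pleasant"): 1.0,
    ("pleasant", "mediocre"): 1.0,
    ("mediocre", "poor"): 1.0,
    ("poor", "bad"): 1.0,
}
POS_SEEDS = {"good"}  # assumed positive seed word
NEG_SEEDS = {"bad"}   # assumed negative seed word


def build_transitions(edges):
    """Row-normalize symmetric edge weights into transition probabilities."""
    weights = defaultdict(dict)
    for (u, v), w in edges.items():
        weights[u][v] = w
        weights[v][u] = w
    return {
        u: {v: w / sum(nbrs.values()) for v, w in nbrs.items()}
        for u, nbrs in weights.items()
    }


def mean_hitting_times(transitions, targets, tol=1e-10, max_iter=100_000):
    """Expected number of steps for a walk from each node to first reach `targets`.

    Iterates h(u) = 1 + sum_v P(u, v) * h(v) with h(u) = 0 for u in targets
    until the largest update falls below `tol`.
    """
    h = {u: 0.0 for u in transitions}
    for _ in range(max_iter):
        delta = 0.0
        for u, nbrs in transitions.items():
            if u in targets:
                continue  # target nodes are absorbing: hitting time stays 0
            new = 1.0 + sum(p * h[v] for v, p in nbrs.items())
            delta = max(delta, abs(new - h[u]))
            h[u] = new
        if delta < tol:
            break
    return h


def polarity(word, transitions, pos_seeds, neg_seeds):
    """Signed score: positive when the walk reaches positive seeds sooner."""
    h_pos = mean_hitting_times(transitions, pos_seeds)[word]
    h_neg = mean_hitting_times(transitions, neg_seeds)[word]
    return h_neg - h_pos


TRANSITIONS = build_transitions(EDGES)
SCORES = {w: polarity(w, TRANSITIONS, POS_SEEDS, NEG_SEEDS) for w in TRANSITIONS}
```

On this toy chain, `SCORES["pleasant"]` comes out positive and `SCORES["mediocre"]` negative, because each word sits closer to the corresponding seed. Solving the hitting-time equations by iteration only suits tiny graphs; on a large relatedness graph one would more plausibly estimate the hitting times by sampling walks, which is again an assumption rather than something the record states.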