A data processing method based on sequence labeling and syntactic analysis for extracting new sentiment words from product reviews

Authors: Xiang Chen, Kuang Ching Li, Guang li Zhu, Shunxiang Zhang, Han qing Xu
Year of publication: 2021
Subject:
Source: Soft Computing. 26:853-866
ISSN: 1433-7479, 1432-7643
DOI: 10.1007/s00500-021-06228-9
Description: New sentiment words in product reviews are a valuable resource because they come directly from users. Extracting them can improve information services for users and provide theoretical support for related research on edge computing. Traditional methods for extracting new sentiment words generally ignore context and syntactic information, which leads to low accuracy and recall. To address this issue, we propose a data processing method based on sequence labeling and syntactic analysis for extracting new sentiment words from product reviews. First, the probability that a new word is a sentiment word is calculated using location rules derived from the sequence labeling result, and a candidate set of new sentiment words is obtained according to this probability. Then, the candidate set is supplemented by matching appositive words based on edit distance. Finally, the final set of new sentiment words is obtained through fine-grained filtering, which includes calculating pointwise mutual information (PMI) and the difference coefficient of positive and negative corpus (DC-PNC). The experimental results demonstrate the effectiveness of the new sentiment words extracted by the proposed method, which clearly improve the accuracy and recall of sentiment analysis.
Database: OpenAIRE
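
Two of the building blocks named in the description, edit distance and PMI, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the document-level co-occurrence counting, and the representation of each review as a set of tokens are assumptions made here for illustration. The abstract does not define DC-PNC or the location rules, so they are left out.

```python
import math


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def pmi(word: str, seed: str, documents: list[set[str]]) -> float:
    """Pointwise mutual information of two words, in bits.

    Assumption: probabilities are estimated from document-level
    co-occurrence, with each document given as a set of tokens.
    """
    n = len(documents)
    n_word = sum(1 for d in documents if word in d)
    n_seed = sum(1 for d in documents if seed in d)
    n_both = sum(1 for d in documents if word in d and seed in d)
    if n_both == 0:
        return float("-inf")  # the words never co-occur
    # log2( P(word, seed) / (P(word) * P(seed)) )
    return math.log2(n_both * n / (n_word * n_seed))
```

For example, `edit_distance("kitten", "sitting")` is 3. A candidate word whose PMI with a known sentiment seed word is high co-occurs with that seed more often than chance would predict, which is the kind of evidence a fine-grained filtering step could use.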