A hybrid model for opinion mining based on domain sentiment dictionary
Author: | Dongping Huang, Kai Yang, Xue Lei, Haoran Xie, Zikai Zhou, Tak-Lam Wong, Yi Cai |
---|---|
Year of publication: | 2017 |
Subject: |
020203 distributed computing; Computer science; Sentiment analysis; Computational intelligence; 02 engineering and technology; Field (computer science); Domain (software engineering); Task (project management); Support vector machine; Artificial Intelligence; Pattern recognition (psychology); 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Computer Vision and Pattern Recognition; Artificial intelligence; Layer (object-oriented design); Software; Natural language processing |
Source: | International Journal of Machine Learning and Cybernetics. 10:2131-2142 |
ISSN: | 1868-808X 1868-8071 |
DOI: | 10.1007/s13042-017-0757-6 |
Description: | Sentiment classification is an application of sentiment analysis, a popular research field in NLP, that assigns documents to categories according to the sentiment they express. A sentiment classification task has two steps: first extract sentiment features from the documents, then classify the documents with a classifier. In the first step, the traditional way to extract sentiment features is to apply sentiment dictionaries. However, a sentiment word may have different sentiment tendencies in different contexts, and traditional sentiment dictionaries do not account for this, so the wrong sentiment tendency may be selected for a word. In our research, we find that sentiment words do not have diverse meanings once they are associated with the nearby aspects and entities in a document. We therefore propose a three-layer sentiment dictionary that associates sentiment words with their corresponding entities and aspects to reduce this ambiguity. In the second step, many classification models, such as SVM and GBDT, can be used to classify documents according to the extracted sentiment words. However, different classifiers have different weaknesses. We apply a Stacking-based hybrid model that combines SVM and GBDT to overcome their individual weaknesses and reach higher performance. The hybrid model contains two layers, with the output of the first layer serving as the input of the second. The first layer generates one classification result per classifier, while the second layer automatically learns how to select the most probable one as the final result. The experimental results show that our hybrid model outperforms the baseline single models. |
Database: | OpenAIRE |
External link: |
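
The three-layer sentiment dictionary mentioned in the description can be pictured as a nested lookup from entity to aspect to sentiment word. The following is only a minimal sketch under stated assumptions: the record does not give the dictionary's actual structure or contents, so the names `THREE_LAYER`, `FLAT`, and `polarity`, the fallback to a flat dictionary, and every entry below are hypothetical illustrations rather than the authors' implementation.

```python
from typing import Dict, Optional

# Hypothetical three-layer dictionary: entity -> aspect -> sentiment word -> polarity.
# Every entry is a made-up illustration, not data from the paper.
THREE_LAYER: Dict[str, Dict[str, Dict[str, int]]] = {
    "phone": {
        "battery": {"long": 1, "short": -1},
        "boot time": {"long": -1, "short": 1},
    },
}

# Hypothetical flat, context-free dictionary used as a fallback (an assumption).
FLAT: Dict[str, int] = {"good": 1, "bad": -1}


def polarity(word: str, entity: Optional[str] = None, aspect: Optional[str] = None) -> int:
    """Return +1 (positive), -1 (negative), or 0 (unknown) for a word in context."""
    # Walk the entity layer, then the aspect layer; missing keys yield empty dicts.
    by_aspect = THREE_LAYER.get(entity or "", {})
    by_word = by_aspect.get(aspect or "", {})
    if word in by_word:
        return by_word[word]
    # No context-specific entry: fall back to the flat dictionary.
    return FLAT.get(word, 0)
```

In this sketch the word "long" resolves to opposite polarities for the `battery` and `boot time` aspects of the same entity, which is the kind of context dependence the description says a flat dictionary cannot capture.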