Weakly-Supervised Deep Embedding for Product Review Sentiment Analysis
Author: | Beidou Wang, Ziyu Guan, Xiaofei He, Long Chen, Wei Zhao, Quan Wang, Deng Cai |
---|---|
Year of publication: | 2018 |
Subject: | Training set; Artificial neural network; Computer science; Deep learning; Feature extraction; Sentiment analysis; Machine learning; Computer Science Applications; Computational Theory and Mathematics; Feature (machine learning); Embedding; Artificial intelligence; Representation (mathematics); Sentence; Information Systems; 02 engineering and technology; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; 020204 information systems |
Source: | IEEE Transactions on Knowledge and Data Engineering. 30:185-197 |
ISSN: | 1041-4347 |
DOI: | 10.1109/tkde.2017.2756658 |
Description: | Product reviews help prospective buyers make decisions. To this end, different opinion mining techniques have been proposed, where judging a review sentence's orientation (e.g., positive or negative) is one of the key challenges. Recently, deep learning has emerged as an effective means of solving sentiment classification problems. A neural network learns a useful representation automatically, without human effort. However, the success of deep learning relies heavily on the availability of large-scale training data. We propose a novel deep learning framework for product review sentiment classification which employs widely available ratings as weak supervision signals. The framework consists of two steps: (1) learning a high-level representation (an embedding space) which captures the general sentiment distribution of sentences through rating information; and (2) adding a classification layer on top of the embedding layer and using labeled sentences for supervised fine-tuning. We explore two kinds of low-level network structure for modeling review sentences, namely, convolutional feature extractors and long short-term memory. To evaluate the proposed framework, we construct a dataset containing 1.1M weakly labeled review sentences and 11,754 labeled review sentences from Amazon. Experimental results show the efficacy of the proposed framework and its superiority over baselines. |
Database: | OpenAIRE |
External link: |
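
The description mentions using review ratings as weak supervision signals for sentences. The following is a minimal sketch of how such weak sentence labels might be derived, not the authors' implementation. The function names, the rating thresholds (4 stars and above as positive, 2 stars and below as negative, 3 stars skipped), and the regex-based sentence splitter are all assumptions made for illustration.

```python
import re
from typing import List, Optional, Tuple


def rating_to_weak_label(rating: int) -> Optional[int]:
    """Map a star rating to a weak sentiment label.

    Thresholds are an assumption of this sketch, not taken from the paper.
    """
    if rating >= 4:
        return 1      # weakly positive
    if rating <= 2:
        return 0      # weakly negative
    return None       # neutral ratings are skipped in this sketch


def weakly_label_sentences(
    reviews: List[Tuple[str, int]]
) -> List[Tuple[str, int]]:
    """Give every sentence the weak label implied by its review's rating."""
    labeled = []
    for text, rating in reviews:
        label = rating_to_weak_label(rating)
        if label is None:
            continue
        # Naive splitter: break after '.', '!' or '?' followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
            if sentence:
                labeled.append((sentence, label))
    return labeled


# Hypothetical example reviews as (text, star rating) pairs.
reviews = [
    ("Great battery life. The screen scratches easily.", 5),
    ("It is okay.", 3),
    ("Stopped working after a week!", 1),
]
weak_data = weakly_label_sentences(reviews)
for sentence, label in weak_data:
    print(label, sentence)
```

In this toy input, "The screen scratches easily." inherits a positive label from a 5-star review even though the sentence itself is negative. Noise of this kind is why the weak labels serve only to shape the embedding in step (1), while step (2) relies on manually labeled sentences for fine-tuning.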