Cross-Domain Sentiment Classification With Bidirectional Contextualized Transformer Language Models
Author: | Shigetomo Kimura, Jie Li, Batsergelen Myagmar |
Year of publication: | 2019 |
Subject: | Cross-domain sentiment classification; Pre-trained language model; Transfer learning; Transferability; Transformer (machine learning model); Language model; Natural language processing; Artificial intelligence; Text corpus |
Source: | IEEE Access, Vol. 7, pp. 163219-163230 (2019) |
ISSN: | 2169-3536 |
DOI: | 10.1109/access.2019.2952360 |
Description: | Cross-domain sentiment classification is an important Natural Language Processing (NLP) task that aims to leverage knowledge obtained from a source domain to train a high-performance learner for sentiment classification on a target domain. Existing transfer learning methods applied to cross-domain sentiment classification mostly focus on inducing a low-dimensional feature representation shared across domains based on pivots and non-pivots, which is still a low-level representation of sequence data. Recently, there has been great progress in the NLP literature in developing high-level representation language models based on the Transformer architecture, which are pre-trained on large text corpora and fine-tuned for specific tasks with an additional layer on top. Among such language models, the bidirectional contextualized Transformer language models BERT and XLNet have greatly impacted the NLP research field. In this paper, we fine-tune BERT and XLNet for cross-domain sentiment classification. We then explore their transferability in this setting through an in-depth analysis of the two models' performances, and we improve on the state of the art by a significant margin. Our results show that such bidirectional contextualized language models outperform previous state-of-the-art methods for cross-domain sentiment classification while using up to 120 times less data. (A minimal fine-tuning sketch follows this record.) |
Database: | OpenAIRE |
External link: |
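
The setup the abstract describes, fine-tuning a pre-trained bidirectional Transformer with an additional classification layer on labeled source-domain reviews and then evaluating it on an unseen target domain, can be sketched as follows. This is a minimal illustration using the Hugging Face `transformers` library and `bert-base-uncased`; the library choice, hyperparameters, and the toy books/kitchen review data are assumptions for illustration, not the authors' actual implementation or benchmark.

```python
# Hedged sketch of cross-domain sentiment transfer: fine-tune BERT on
# source-domain labels, evaluate zero-shot on the target domain.
# Library, model checkpoint, hyperparameters, and data are illustrative.
import torch
from torch.optim import AdamW
from transformers import BertTokenizer, BertForSequenceClassification

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # binary sentiment: 0=neg, 1=pos
).to(device)

# Hypothetical source-domain (books) and target-domain (kitchen) reviews.
source_texts = ["A gripping, well-written novel.", "Dull plot, flat characters."]
source_labels = torch.tensor([1, 0])
target_texts = ["This blender works perfectly.", "Broke after one use."]
target_labels = torch.tensor([1, 0])

# Fine-tune on the source domain only; the target domain stays unseen.
optimizer = AdamW(model.parameters(), lr=2e-5)
batch = tokenizer(source_texts, padding=True, truncation=True,
                  max_length=128, return_tensors="pt").to(device)
model.train()
for _ in range(3):  # a few passes over the tiny toy batch
    out = model(**batch, labels=source_labels.to(device))
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Evaluate on the target domain to measure cross-domain transfer.
model.eval()
with torch.no_grad():
    tb = tokenizer(target_texts, padding=True, truncation=True,
                   max_length=128, return_tensors="pt").to(device)
    preds = model(**tb).logits.argmax(dim=-1).cpu()
accuracy = (preds == target_labels).float().mean().item()
print(f"target-domain accuracy: {accuracy:.2f}")
```

The same pattern applies to the paper's second model by swapping in `XLNetTokenizer` and `XLNetForSequenceClassification` with an `xlnet-base-cased` checkpoint; only the classification head on top is newly initialized, which is what makes fine-tuning data-efficient relative to training a representation from scratch.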