A Stance Detection Approach Based on Generalized Autoregressive pretrained Language Model in Chinese Microblogs
Author: | Rong Cao, Huifeng Tang, Yaoyi Xi, Hangyu Pan, Zhizhong Su |
---|---|
Year of publication: | 2021 |
Subject: | Grammar; Computer science; Microblogging; Machine learning; Semantics; Identification (information); Autoregressive model; Softmax function; Social media; Artificial intelligence; Language model; Data pre-processing |
Source: | ICMLC |
DOI: | 10.1145/3457682.3457717 |
Description: | Timely identification of the stance and leanings of Chinese microblog users is of great significance for social managers seeking to understand trends in online public opinion. Traditional stance detection methods underutilize target information, which degrades detection performance. This paper proposes a Chinese microblog stance detection method that integrates target information into a generalized autoregressive pretrained language model, using the model's ability to extract deep semantics to weaken the impact that the high randomness and lack of grammatical norms in self-media microblog text have on text modeling. First, the microblog data are preprocessed to reduce the influence of noisy data on detection; then, the target information and the text sequence under test are concatenated and fed into the XLNet network for fine-tuning; finally, the fine-tuned XLNet network is combined with a Softmax regression model for stance classification. Experimental results show that the proposed method reaches a value of 0.75 on the NLPCC2016 Chinese microblog stance detection evaluation task, a significant improvement over existing public models. |
Database: | OpenAIRE |
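
The three stages named in the description (preprocessing, joining the target with the text, and Softmax classification) can be illustrated with a minimal Python sketch. This is not the authors' code: the cleaning rules, the function names `clean_microblog`, `build_pair_input`, and `predict_stance`, and the label order `FAVOR`/`AGAINST`/`NONE` are all assumptions made for illustration. The string layout `target <sep> text <sep> <cls>` mirrors how XLNet arranges a sentence pair, but a real pipeline would rely on the model's tokenizer rather than string formatting, and the logits would come from the fine-tuned network.

```python
import math
import re

# Hypothetical cleaning rules; the record does not list the paper's actual ones.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@[\w\-]+")  # crude: also eats text glued to the handle
SPACE_RE = re.compile(r"\s+")


def clean_microblog(text):
    """Strip URLs, @mentions and hashtag markers, then collapse whitespace."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", " ")  # keep the topic words, drop the markers
    return SPACE_RE.sub(" ", text).strip()


def build_pair_input(target, text):
    """Join target and text in XLNet's pair layout (string illustration only)."""
    return f"{target} <sep> {text} <sep> <cls>"


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_stance(logits, labels=("FAVOR", "AGAINST", "NONE")):
    """Map classifier logits to the most probable label (label order assumed)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs


if __name__ == "__main__":
    cleaned = clean_microblog("#春节放鞭炮# 支持! http://t.cn/abc @someone")
    print(build_pair_input("春节放鞭炮", cleaned))
    # The logits below are placeholders, not output from a trained model.
    print(predict_stance([2.0, 1.0, 0.1]))
```

Placing the target first and the microblog text second lets the model attend across both segments, which is the general idea behind feeding target information into the network alongside the text.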