Train No Evil: Selective Masking for Task-Guided Pre-Training

Authors:	Yuxian Gu, Xiaozhi Wang, Zhengyan Zhang, Zhiyuan Liu, Maosong Sun
Year of publication:	2020
Subject:	FOS: Computer and information sciences; Masking (art); Computer Science - Computation and Language; Source code; Computer science; Speech recognition; media_common.quotation_subject; Sentiment analysis; 02 engineering and technology; 010501 environmental sciences; Security token; 01 natural sciences; Masking (Electronic Health Record); 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Language model; Computation and Language (cs.CL); 0105 earth and related environmental sciences; media_common
Source:	EMNLP (1); Scopus-Elsevier
Description:	Recently, pre-trained language models mostly follow the pre-train-then-fine-tune paradigm and have achieved great performance on various downstream tasks. However, since the pre-training stage is typically task-agnostic and the fine-tuning stage usually suffers from insufficient supervised data, the models cannot always capture the domain-specific and task-specific patterns well. In this paper, we propose a three-stage framework that adds a task-guided pre-training stage with selective masking between general pre-training and fine-tuning. In this stage, the model is trained by masked language modeling on in-domain unsupervised data to learn domain-specific patterns, and we propose a novel selective masking strategy to learn task-specific patterns. Specifically, we design a method to measure the importance of each token in a sequence and selectively mask the important tokens. Experimental results on two sentiment analysis tasks show that our method can achieve comparable or even better performance with less than 50% of the computation cost, which indicates that our method is both effective and efficient. The source code of this paper can be obtained from https://github.com/thunlp/SelectiveMasking. Accepted by EMNLP 2020.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_dedup___::ec19b0fc52fbdc882ed243ab22d1abae https://doi.org/10.18653/v1/2020.emnlp-main.566
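
Illustrative note: the description above mentions scoring the importance of each token and masking the important ones. The record does not say how importance is measured, so the following is only a minimal sketch of the general idea, not the authors' implementation. The names `selective_mask`, `toy_score`, and `TOY_WEIGHTS`, the lexicon-based scoring, and the `ratio` parameter are all assumptions introduced for illustration.

```python
from typing import Callable, List

MASK = "[MASK]"  # placeholder mask symbol (assumed, BERT-style)


def selective_mask(
    tokens: List[str],
    score: Callable[[str], float],
    ratio: float = 0.15,
) -> List[str]:
    """Replace the highest-scoring tokens with MASK.

    `score` is a hypothetical importance function; the real method's
    scoring is not described in this record.
    """
    if not tokens:
        return []
    # Mask at least one token, roughly `ratio` of the sequence.
    budget = max(1, round(len(tokens) * ratio))
    # Rank positions by score, highest first; sorted() is stable,
    # so ties keep their left-to-right order.
    ranked = sorted(range(len(tokens)), key=lambda i: -score(tokens[i]))
    chosen = set(ranked[:budget])
    return [MASK if i in chosen else tok for i, tok in enumerate(tokens)]


# Toy stand-in for an importance measure: a tiny sentiment lexicon.
TOY_WEIGHTS = {"great": 0.9, "terrible": 0.9, "boring": 0.7}


def toy_score(token: str) -> float:
    return TOY_WEIGHTS.get(token.lower(), 0.0)


tokens = "the plot was boring but the acting was great".split()
masked = selective_mask(tokens, toy_score, ratio=0.25)
print(" ".join(masked))
```

With a uniform random choice of positions instead of the ranked one, this would reduce to ordinary masked language modeling input preparation; the ranking step is the only part that makes the masking "selective" in this sketch.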