DE-CO: A Two-Step Spelling Correction Model for Combating Adversarial Typos

Author:	Zhiyi Yin, Zhengxiao Liu, Fali Wang, Zheng Lin, Lei Wang
Year of publication:	2020
Subject:	Artificial neural network; business.industry; Computer science; Treebank; Variety (linguistics); computer.software_genre; Spelling; Data modeling; ComputingMethodologies_PATTERNRECOGNITION; Robustness (computer science); Classifier (linguistics); Artificial intelligence; business; computer; Word (computer architecture); Natural language processing
Source:	ISPA/BDCloud/SocialCom/SustainCom
DOI:	10.1109/ispa-bdcloud-socialcom-sustaincom51426.2020.00095
Description:	The robustness of text classifiers based on deep neural networks can be improved by correcting adversarial spelling mistakes. The main challenge posed by these mistakes is that they prevent the text classifier from recognizing the words in the text correctly. Existing approaches to these mistakes fall into two types: optimizing the training data and reconstructing the text classification model. However, neither type suits a text classification model that has already been deployed, because both involve retraining or reconstruction. To address this problem, we propose a two-step spelling correction model, referred to as DE-CO, which consists of a misspelled word detector and a misspelled word corrector. Specifically, the detector identifies misspelled words in the text, the corrector corrects them, and the corrected text is then fed into the downstream classifier. In this way, the text classifier can recognize the words normally without reconstruction or retraining. We evaluate DE-CO on adversarial examples generated from the Stanford Sentiment Treebank (SST). The classification accuracy of the downstream classifier improves in a variety of attack scenarios, which demonstrates that DE-CO improves the robustness of the text classifier.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::5e574f560c69760715ef92652b665396 https://doi.org/10.1109/ispa-bdcloud-socialcom-sustaincom51426.2020.00095
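
The detect-then-correct flow summarized in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record does not say how the detector or corrector is built, so this sketch assumes a plain vocabulary lookup as the detector and string similarity from the standard-library difflib module as the corrector. The vocabulary, the toy classifier, and all function names are hypothetical.

```python
from difflib import get_close_matches
from typing import Callable, List

# Hypothetical toy vocabulary (assumption); a real system would use a far larger one.
VOCAB = {"the", "movie", "was", "really", "wonderful", "terrible", "a", "film"}


def detect(tokens: List[str]) -> List[int]:
    """Step 1 (assumed detector): flag tokens that are not in the vocabulary."""
    return [i for i, tok in enumerate(tokens) if tok.lower() not in VOCAB]


def correct(token: str) -> str:
    """Step 2 (assumed corrector): pick the most similar vocabulary word.

    Returns the token unchanged when no vocabulary word is similar enough.
    """
    matches = get_close_matches(token.lower(), sorted(VOCAB), n=1, cutoff=0.6)
    return matches[0] if matches else token


def de_co(text: str) -> str:
    """Run detection, then correct only the flagged tokens."""
    tokens = text.split()
    for i in detect(tokens):
        tokens[i] = correct(tokens[i])
    return " ".join(tokens)


def classify(text: str) -> str:
    """Toy stand-in for a deployed classifier; it is never modified or retrained."""
    words = text.lower().split()
    if "wonderful" in words:
        return "positive"
    if "terrible" in words:
        return "negative"
    return "unknown"


def robust_classify(text: str, classifier: Callable[[str], str] = classify) -> str:
    """Feed the corrected text into the unchanged downstream classifier."""
    return classifier(de_co(text))


if __name__ == "__main__":
    attacked = "the movie was wondreful"
    print(classify(attacked))         # classifier alone on the adversarial typo
    print(robust_classify(attacked))  # classifier behind the correction step
```

The correction step sits entirely in front of the classifier, which is the property the description emphasizes: the downstream model is called as-is, with no retraining or reconstruction.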