Models in the Wild: On Corruption Robustness of Neural NLP Systems
Author: | Przemyslaw Biecek, Dominika Basaj, Barbara Rychalska, Alicja Gosiewska |
---|---|
Year of publication: | 2019 |
Subject: |
Computer science, Natural language processing, Deep learning, Artificial intelligence, Robustness (computer science), Robustness testing, Sentiment analysis, Transferability, Embedding |
Source: | Neural Information Processing, ISBN: 9783030367176, ICONIP (3) |
DOI: | 10.1007/978-3-030-36718-3_20 |
Description: | Natural Language Processing models lack a unified approach to robustness testing. In this paper, we introduce WildNLP, a framework for testing model stability in a natural setting where text corruptions such as keyboard errors or misspellings occur. We compare the robustness of deep learning models on four popular NLP tasks: Q&A, NLI, NER, and Sentiment Analysis, by testing their performance on the corruption aspects introduced in the framework. In particular, we focus on a comparison between recent state-of-the-art text representations and non-contextualized word embeddings. To improve robustness, we perform adversarial training on selected aspects and check whether the gains transfer to other corruption types. We find that high model performance does not ensure sufficient robustness, although modern embedding techniques help to improve it. We release the code of the WildNLP framework for the community. |
Database: | OpenAIRE |
External link: |
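The description mentions corruption aspects such as keyboard errors. Below is a minimal sketch of what such an aspect might look like, assuming a simple per-character QWERTY-neighbor substitution; the function name, neighbor map, and `rate` parameter are illustrative assumptions, not WildNLP's actual API.

```python
import random

# Illustrative keyboard-error corruption, in the spirit of a WildNLP
# "keyboard" aspect. The neighbor map and API shape are assumptions.
QWERTY_NEIGHBORS = {
    "a": "qwsz", "b": "vghn", "c": "xdfv", "d": "serfcx", "e": "wsdr",
    "f": "drtgvc", "g": "ftyhbv", "h": "gyujnb", "i": "ujko", "j": "huikmn",
    "k": "jiolm", "l": "kop", "m": "njk", "n": "bhjm", "o": "iklp",
    "p": "ol", "q": "wa", "r": "edft", "s": "awedxz", "t": "rfgy",
    "u": "yhji", "v": "cfgb", "w": "qase", "x": "zsdc", "y": "tghu",
    "z": "asx",
}

def corrupt_keyboard(text, rate=0.1, seed=0):
    """Replace each letter with an adjacent QWERTY key with
    probability `rate`, simulating typing errors. Non-letters
    and characters outside the map are left untouched."""
    rng = random.Random(seed)  # seeded for reproducible corruptions
    out = []
    for ch in text:
        low = ch.lower()
        if low in QWERTY_NEIGHBORS and rng.random() < rate:
            repl = rng.choice(QWERTY_NEIGHBORS[low])
            out.append(repl.upper() if ch.isupper() else repl)
        else:
            out.append(ch)
    return "".join(out)
```

A robustness test in this style would run a trained model on both the clean and the corrupted text (e.g. `corrupt_keyboard(sentence, rate=0.1)`) and report the drop in task metrics.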