Using error decay prediction to overcome practical issues of deep active learning for named entity recognition
Author: | Shankar Vembu, Haw-Shiuan Chang, Andrew McCallum, Sunil Mohan, Rheeya Uppaal |
---|---|
Year of publication: | 2020 |
Subject: |
FOS: Computer and information sciences
Computer Science - Machine Learning; Computer Science - Computation and Language; business.industry; Computer science; Sampling efficiency; Machine Learning (stat.ML); 02 engineering and technology; Machine learning; computer.software_genre; Decay curve; Machine Learning (cs.LG); Named-entity recognition; Statistics - Machine Learning; Artificial Intelligence; Robustness (computer science); 020204 information systems; Sampling process; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Artificial intelligence; business; Computation and Language (cs.CL); computer; Software |
Source: | Machine Learning. 109:1749-1778 |
ISSN: | 1573-0565 0885-6125 |
DOI: | 10.1007/s10994-020-05897-1 |
Description: | Existing deep active learning algorithms achieve impressive sampling efficiency on natural language processing tasks. However, they exhibit several weaknesses in practice, including (a) inability to use uncertainty sampling with black-box models, (b) lack of robustness to labeling noise, and (c) lack of transparency. In response, we propose a transparent batch active sampling framework by estimating the error decay curves of multiple feature-defined subsets of the data. Experiments on four named entity recognition (NER) tasks demonstrate that the proposed methods significantly outperform diversification-based methods for black-box NER taggers, and can make the sampling process more robust to labeling noise when combined with uncertainty-based methods. Furthermore, the analysis of experimental results sheds light on the weaknesses of different active sampling strategies, and when traditional uncertainty-based or diversification-based methods can be expected to work well. This is a pre-print of an article published in Springer Machine Learning journal. The final authenticated version is available online at: https://doi.org/10.1007/s10994-020-05897-1 |
Database: | OpenAIRE |
External link: |