MuLaN: Multilingual Label propagatioN for Word Sense Disambiguation
Authors: | Tommaso Pasini, Luigi Procopio, Niccolò Campolungo, Roberto Navigli, Edoardo Barba |
---|---|
Language: | English |
Subject: |
natural language processing; natural language semantics; resources and evaluation; word-sense disambiguation; label propagation; computer science; artificial intelligence; engineering and technology; information systems; electrical engineering, electronic engineering, information engineering; artificial intelligence & image processing |
Source: | Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence; Scopus-Elsevier; IJCAI |
DOI: | 10.24963/ijcai.2020/531 |
Description: | The knowledge acquisition bottleneck strongly affects the creation of multilingual sense-annotated data, hence limiting the power of supervised systems when applied to multilingual Word Sense Disambiguation. In this paper, we propose a semi-supervised approach based upon a novel label propagation scheme, which, by jointly leveraging contextualized word embeddings and the multilingual information enclosed in a knowledge base, projects sense labels from a high-resource language, i.e., English, to lower-resourced ones. Backed by several experiments, we provide empirical evidence that our automatically created datasets are of a higher quality than those generated by other competitors and lead a supervised model to achieve state-of-the-art performance in all multilingual Word Sense Disambiguation tasks. We make our datasets available for research purposes at https://github.com/SapienzaNLP/mulan. |
Database: | OpenAIRE |