MultiMirror: Neural Cross-lingual Word Alignment for Multilingual Word Sense Disambiguation
Author: | Luigi Procopio, Roberto Navigli, Federico Martelli, Edoardo Barba |
---|---|
Year of publication: | 2021 |
Subject: | Cross-lingual; Word-sense disambiguation; Word alignment; Cross-lingual label projection; Computer science; Artificial intelligence; Natural language processing; Word (computer architecture); strategies tools standards for lexicographic resources (objective 3) WP3 |
Source: | IJCAI; Scopus-Elsevier; Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI-21) |
DOI: | 10.24963/ijcai.2021/539 |
Description: | Word Sense Disambiguation (WSD), i.e., the task of assigning senses to words in context, has seen a surge of interest with the advent of neural models and a considerable increase in performance, up to 80% F1 in English. However, when considering other languages, the availability of training data is limited, which hampers scaling WSD to many languages. To address this issue, we put forward MULTIMIRROR, a sense projection approach for multilingual WSD based on a novel neural discriminative model for word alignment: given as input a pair of parallel sentences, our model, trained with a low number of instances, is capable of jointly aligning all source and target tokens with each other, surpassing its competitors across several language combinations. We demonstrate that projecting senses from English by leveraging the alignments produced by our model leads a simple mBERT-powered classifier to achieve a new state of the art on established WSD datasets in French, German, Italian, Spanish and Japanese. We release our software and all our datasets at https://github.com/SapienzaNLP/multimirror. |
Database: | OpenAIRE |
External link: |
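
The description mentions projecting senses from English through word alignments. The following is a minimal sketch of that general idea only, not the authors' implementation: it assumes the alignments are already available as (source index, target index) pairs, whereas in the paper they come from the neural aligner. The function name `project_senses`, the example sentences, the sense labels, and the alignment pairs are all hypothetical and chosen purely for illustration.

```python
from typing import Dict, List, Tuple


def project_senses(
    source_senses: Dict[int, str],
    alignments: List[Tuple[int, int]],
) -> Dict[int, str]:
    """Copy sense labels from annotated source tokens to aligned target tokens.

    source_senses maps a source token index to its sense label.
    alignments is a list of (source_index, target_index) pairs.
    If several source tokens align to one target token, the first
    labelled one wins (an arbitrary tie-breaking assumption).
    """
    projected: Dict[int, str] = {}
    for src_idx, tgt_idx in alignments:
        sense = source_senses.get(src_idx)
        if sense is not None and tgt_idx not in projected:
            projected[tgt_idx] = sense
    return projected


# Hypothetical English-Italian sentence pair.
source_tokens = ["The", "bank", "raised", "interest", "rates"]
target_tokens = ["La", "banca", "ha", "alzato", "i", "tassi", "di", "interesse"]

# Hypothetical sense labels on the English side (not real inventory keys).
source_senses = {1: "bank.financial_institution", 3: "interest.fee_on_loan"}

# Hypothetical word alignments as (source index, target index) pairs.
alignments = [(0, 0), (1, 1), (2, 3), (3, 7), (4, 5)]

projected = project_senses(source_senses, alignments)
for idx, sense in sorted(projected.items()):
    print(target_tokens[idx], sense)
```

Target tokens that receive a label this way could then serve as training instances for a classifier in the target language; unlabelled or unaligned tokens are simply skipped in this sketch.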