MultiMirror: Neural Cross-lingual Word Alignment for Multilingual Word Sense Disambiguation

Authors: Luigi Procopio, Roberto Navigli, Federico Martelli, Edoardo Barba
Year of publication: 2021
Subject:
Source: IJCAI
Scopus-Elsevier
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI-21)
DOI: 10.24963/ijcai.2021/539
Description: Word Sense Disambiguation (WSD), i.e., the task of assigning senses to words in context, has seen a surge of interest with the advent of neural models, along with a considerable increase in performance, which now reaches up to 80% F1 in English. For other languages, however, training data is scarce, which hampers scaling WSD to many languages. To address this issue, we put forward MULTIMIRROR, a sense projection approach for multilingual WSD based on a novel neural discriminative model for word alignment: given a pair of parallel sentences as input, our model, trained on a small number of instances, jointly aligns all source and target tokens with each other, surpassing its competitors across several language combinations. We demonstrate that projecting senses from English using the alignments produced by our model enables a simple mBERT-powered classifier to achieve a new state of the art on established WSD datasets in French, German, Italian, Spanish and Japanese. We release our software and all our datasets at https://github.com/SapienzaNLP/multimirror.
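
The sense projection step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `project_senses`, the sense labels, the example sentence pair, and the rule of keeping the first label when several source tokens align to one target token are all assumptions made for illustration. The alignments are supplied by hand here, whereas in the paper they come from the neural alignment model.

```python
from typing import List, Optional, Tuple


def project_senses(
    source_senses: List[Optional[str]],
    alignments: List[Tuple[int, int]],
    target_length: int,
) -> List[Optional[str]]:
    """Copy each source token's sense label onto the target tokens aligned to it.

    source_senses: one label per source token, None for untagged tokens.
    alignments: (source_index, target_index) pairs, assumed to be given.
    target_length: number of tokens in the target sentence.
    """
    projected: List[Optional[str]] = [None] * target_length
    for src_idx, tgt_idx in alignments:
        sense = source_senses[src_idx]
        # Skip untagged source tokens; on conflicts keep the first label
        # (a simplifying assumption, not a rule taken from the paper).
        if sense is not None and projected[tgt_idx] is None:
            projected[tgt_idx] = sense
    return projected


# Hypothetical example: "the bank closed" -> "la banca ha chiuso"
source_senses = [None, "bank%financial", "close%shut"]
alignments = [(0, 0), (1, 1), (2, 2), (2, 3)]
target_senses = project_senses(source_senses, alignments, target_length=4)
print(target_senses)
```

Target tokens that receive a label in this way could then serve as automatically annotated training instances for a WSD classifier in the target language.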
Database: OpenAIRE