Improved Cross-Lingual Transfer Learning For Automatic Speech Translation

Author:	Khurana, Sameer; Dawalatabad, Nauman; Laurent, Antoine; Vicente, Luis; Gimeno, Pablo; Mingote, Victoria; Glass, James
Year of publication:	2023
Subject:	Computer Science - Computation and Language; Computer Science - Artificial Intelligence; Electrical Engineering and Systems Science - Audio and Speech Processing; Electrical Engineering and Systems Science - Signal Processing
Document type:	Working Paper
Description:	Research in multilingual speech-to-text translation is topical. Having a single model that supports multiple translation tasks is desirable. The goal of this work is to improve cross-lingual transfer learning in multilingual speech-to-text translation via semantic knowledge distillation. We show that by initializing the encoder of the encoder-decoder sequence-to-sequence translation model with SAMU-XLS-R, a multilingual speech transformer encoder trained using multi-modal (speech-text) semantic knowledge distillation, we achieve significantly better cross-lingual task knowledge transfer than the baseline XLS-R, a multilingual speech transformer encoder trained via self-supervised learning. We demonstrate the effectiveness of our approach on two popular datasets, namely CoVoST-2 and Europarl. On the 21 translation tasks of the CoVoST-2 benchmark, we achieve an average improvement of 12.8 BLEU points over the baselines. In the zero-shot translation scenario, we achieve average gains of 18.8 and 11.9 BLEU points on unseen medium- and low-resource languages, respectively. We make similar observations on the Europarl speech translation benchmark.
Database:	arXiv
External link:	http://arxiv.org/abs/2306.00789