Autor: |
Zou, Aixiao, Wu, Xuanxuan, Li, Xinjie, Zhang, Ting, Cui, Fuwei, Xu, Jinan |
Předmět: |
|
Zdroj: |
Applied Intelligence; Sep2024, Vol. 54 Issue 17/18, p7958-7968, 11p |
Abstrakt: |
Stylized neural machine translation (NMT) aims to translate sentences of one style into sentences of another style, it is essential for the application of machine translation in a real-world scenario. Most existing methods employ an encoder-decoder structure to understand, translate, and transform style simultaneously, which increases the learning difficulty of the model and leads to poor generalization ability. To address these issues, we propose a curriculum pre-training framework to improve stylized NMT. Specifically, we design four pre-training tasks of increasing difficulty to assist the model to extract more features essential for stylized translation. Then, we further propose a stylized-token aligned data augmentation method to expand the scale of pre-training corpus for alleviating the data-scarcity problem. Experiments show that our method achieves competitive results on MTFC and Modern-Classical translation dataset. [ABSTRACT FROM AUTHOR] |
Databáze: |
Complementary Index |
Externí odkaz: |
|