Chinese-Indonesian Low-Resource Neural Machine Translation
Author: | Muhammad Fhadli, 穆何曼 |
---|---|
Year of publication: | 2019 |
Document type: | Thesis (學位論文) |
Description: | 107 Building a good machine translation system from low-resource data remains a current challenge. Transfer learning has emerged as one solution to this problem. Another substantial problem is building a good machine translation system when no parallel corpus is available at all. One solution is to use the teacher-student framework to produce a machine-generated source-target parallel corpus. In this thesis, we tackle the problem of building a low-resource machine translation system by utilizing a machine-generated corpus. In essence, we combine the solutions for low-resource and zero-resource machine translation to address the low-resource problem itself. The idea is to produce a large amount of machine-generated data for initial training and then use transfer learning to continue training on the low-resource data. Our contribution is to show that machine-generated data can help when training on low-resource human-generated data. As far as we know, no previous paper discusses Chinese-Indonesian machine translation, because of the data scarcity problem. Therefore, we perform Chinese-Indonesian machine translation with English as a pivot and utilize some of the Malaysian corpora. |
Database: | Networked Digital Library of Theses & Dissertations |