End-to-end Pinyin to Character Language Model using Self-Attention Mechanism

Author: Yeh, Han-Yun, 葉瀚允
Year of publication: 2019
Document type: degree thesis (學位論文)
Description: 107
Deep neural networks combined with the conventional automatic speech recognition structure have achieved large improvements. Similarly, end-to-end speech recognition structures have reached comparable performance in the past two years, but only with huge amounts of data and computing resources. This study focuses on the end-to-end language model: it trains an end-to-end language model with a sequence labeling method and a self-attention seq2seq model (Transformer), both common methods in NLP tasks, on syllable sequences converted from a 440-million-word Chinese corpus through a proposed G2P system. The syllable-to-character Transformer model achieved a lower character error rate than the baseline trigram model on our outside test set.
Database: Networked Digital Library of Theses & Dissertations
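
The abstract describes syllable-to-character conversion as sequence labeling with self-attention: each input syllable position is mapped to one output character. The following is a minimal NumPy sketch of that idea, not the thesis's actual model; the vocabulary sizes, embedding dimension, and random weights are illustrative assumptions, and a real system would use a trained multi-layer Transformer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not taken from the thesis)
syllable_vocab, char_vocab, d = 8, 10, 16

E = rng.normal(size=(syllable_vocab, d))       # syllable embedding table
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
Wo = rng.normal(size=(d, char_vocab))          # per-position character classifier

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def predict_chars(syllable_ids):
    """Sequence labeling: predict one character id per input syllable."""
    X = E[syllable_ids]                        # (T, d) embedded syllables
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    A = softmax(Q @ K.T / np.sqrt(d))          # (T, T) self-attention weights
    H = A @ V                                  # contextualized representations
    logits = H @ Wo                            # (T, char_vocab)
    return logits.argmax(axis=-1)

chars = predict_chars(np.array([3, 1, 4, 1, 5]))
```

Because the output length equals the input length, homophone disambiguation comes entirely from the attention context rather than from an autoregressive decoder, which is what makes the sequence-labeling formulation attractive for this task.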