Chinese Multiple Word Sense Labeling and Its Application to Language Modeling

Author: Chen, Wei-Yuan, 陳威遠
Year of publication: 2017
Document type: Thesis
Description: 105
This thesis has two parts: improving the language model, and Chinese word embeddings and their application. For the language model, we use a weighted finite-state transducer for speech recognition. We replace the acoustic model with the correct phoneme sequence, so that the recognition result depends only on the language model. Improving the post-processing of word segmentation and the pronunciation dictionary enhances the accuracy of speech recognition. For Chinese word embeddings, we study the effect of polysemy on Chinese word vectors. To address polysemy, we use unsupervised learning to label polysemous words with multiple word sense vectors learned from context and part of speech. We propose several qualitative analyses to measure the improvement. Finally, we construct a language model that contains semantic information, built from a word sense corpus in which polysemous words are labeled using the multiple word sense vectors.
Database: Networked Digital Library of Theses & Dissertations
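
The sense-labeling step mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's method: the abstract describes sense vectors learned from context and part of speech, whereas the sketch below substitutes a simple online clustering over bag-of-words context counts. The function names (`context_vector`, `cosine`, `label_senses`), the `word#k` label format, the window size, the similarity threshold, and the toy sentences are all assumptions made for illustration.

```python
from collections import Counter
import math


def context_vector(tokens, i, window=2):
    """Count the (word, POS) pairs within `window` positions of index i."""
    lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
    return Counter(tokens[j] for j in range(lo, hi) if j != i)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def label_senses(sentences, target, threshold=0.2):
    """Relabel each occurrence of `target` as 'target#k' (hypothetical format).

    Each occurrence joins the most similar existing sense centroid, or
    starts a new sense when no centroid reaches `threshold` (an assumed
    value, not taken from the thesis).
    """
    centroids = []  # one accumulated context Counter per induced sense
    labeled = []
    for sent in sentences:
        out = []
        for i, (word, _pos) in enumerate(sent):
            if word != target:
                out.append(word)
                continue
            vec = context_vector(sent, i)
            sims = [cosine(vec, c) for c in centroids]
            if sims and max(sims) >= threshold:
                k = sims.index(max(sims))
                centroids[k].update(vec)  # fold this context into the sense
            else:
                centroids.append(Counter(vec))
                k = len(centroids) - 1
            out.append(f"{target}#{k}")
        labeled.append(out)
    return labeled


# Toy corpus of (word, POS) pairs; 打 appears in "make a phone call"
# and "play basketball" contexts.
corpus = [
    [("我", "PN"), ("打", "V"), ("電話", "N")],
    [("他", "PN"), ("打", "V"), ("籃球", "N")],
    [("她", "PN"), ("打", "V"), ("電話", "N")],
    [("我們", "PN"), ("打", "V"), ("籃球", "N")],
]
for sentence in label_senses(corpus, "打"):
    print(" ".join(sentence))
```

In this toy corpus the phone-call occurrences share one sense id and the basketball occurrences share another. A corpus relabeled this way could then be used to train an n-gram language model over sense-tagged tokens, which is the general idea the abstract's final sentence describes.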