HMM-based Chinese Spelling Check

Author: Chi-Wei Lee, 李啟維
Year of publication: 2017
Document type: thesis
Description: 105
First, we extracted typos from the UDN edit log and analyzed them. From these data, we created the first benchmark for evaluating Chinese spelling check systems aimed at professional editors, such as journalists and writers. Second, we built a new confusion set that reduces search time. By extracting features from all pairs of Chinese characters, we trained an SVM classifier to discover potential confusion-set entries based on a table of known typos. Last, we compared the results of HMM and beam search. Using a language model and a noisy channel model, we tuned the parameters to find the best accuracy on our benchmark. We found that beam search works much better than the HMM method.
Database: Networked Digital Library of Theses & Dissertations
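
The abstract's combination of a confusion set, a language model, a noisy channel model, and beam search can be illustrated with a minimal sketch. It is not the thesis's implementation: the confusion set, the bigram probabilities, the keep probability `P_KEEP`, and the `channel_weight` parameter are all hypothetical toy values chosen for illustration, and a character bigram model is assumed only because the abstract does not specify the language model.

```python
import math

# Hypothetical toy confusion set: observed character -> plausible intended characters.
CONFUSION = {"汽": ["氣"], "再": ["在"]}

# Hypothetical bigram probabilities (previous char, current char), stored as logs.
BIGRAM_LOGP = {
    ("<s>", "天"): math.log(0.5),
    ("天", "氣"): math.log(0.6),
    ("天", "汽"): math.log(0.01),
    ("氣", "好"): math.log(0.4),
    ("汽", "好"): math.log(0.01),
}
UNSEEN_LOGP = math.log(1e-4)  # floor score for bigrams missing from the toy table

P_KEEP = 0.9  # assumed probability that a typed character is the intended one


def lm_logp(prev, cur):
    """Language model score: log P(cur | prev) under the toy bigram table."""
    return BIGRAM_LOGP.get((prev, cur), UNSEEN_LOGP)


def channel_logp(observed, intended):
    """Noisy channel score: log P(observed | intended) under a uniform error model."""
    if observed == intended:
        return math.log(P_KEEP)
    # The remaining probability mass is split evenly over the confusion candidates.
    n = len(CONFUSION[observed])
    return math.log((1 - P_KEEP) / n)


def beam_search_correct(sentence, beam_width=3, channel_weight=1.0):
    """Return the highest-scoring correction found with a beam of the given width."""
    beams = [(0.0, "<s>", "")]  # (cumulative log score, last character, text so far)
    for observed in sentence:
        # Candidates are the character itself plus its confusion-set entries.
        candidates = [observed] + CONFUSION.get(observed, [])
        expanded = []
        for score, prev, text in beams:
            for cand in candidates:
                new_score = (score + lm_logp(prev, cand)
                             + channel_weight * channel_logp(observed, cand))
                expanded.append((new_score, cand, text + cand))
        # Keep only the best beam_width partial hypotheses.
        expanded.sort(key=lambda b: b[0], reverse=True)
        beams = expanded[:beam_width]
    return beams[0][2]


print(beam_search_correct("天汽好"))
```

Restricting candidates to the confusion set keeps the number of expansions per character small, which is the search-time reduction the abstract refers to. The `channel_weight` argument stands in for the kind of parameter that would be tuned against a benchmark.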