HMM-based Chinese Spelling Check
Author: | Chi-Wei Lee, 李啟維 |
---|---|
Year of publication: | 2017 |
Document type: | 學位論文 (degree thesis) ; thesis |
Description: | 105 First, we extracted typos from the UDN edit log and analyzed them. From this data, we created the first benchmark for evaluating Chinese spelling check systems aimed at professional writers and editors, such as journalists and authors. Second, we built a new confusion set that reduces search time. By extracting features from all pairs of Chinese characters, we trained an SVM classifier to discover potential confusion-set entries based on a table of known typos. Last, we compared the results of HMM and beam search. Using a language model and a noisy channel model, we tuned the parameters to find the best accuracy on our benchmark. We found that beam search works much better than the HMM method. |
Database: | Networked Digital Library of Theses & Dissertations |
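
The description mentions combining a language model with a noisy channel model and decoding with beam search. The following is a minimal sketch of that general idea, not the thesis's implementation. The bigram table, the confusion set, the constants `UNSEEN` and `P_KEEP`, and the function names are all hypothetical toy values chosen for illustration.

```python
import math

# Toy bigram probabilities P(current | previous). Hypothetical values.
BIGRAM = {
    ("<s>", "我"): 0.5,
    ("我", "們"): 0.4,
    ("們", "去"): 0.3,
    ("我", "門"): 0.001,
    ("門", "去"): 0.01,
}
# Toy confusion set: observed character -> characters it may be a typo for.
CONFUSION = {"門": ["們"]}
UNSEEN = 1e-4   # assumed probability for bigrams missing from the table
P_KEEP = 0.9    # assumed probability that a character was typed correctly


def lm_logprob(prev, cur):
    """Language model score: log P(cur | prev) from the toy bigram table."""
    return math.log(BIGRAM.get((prev, cur), UNSEEN))


def channel_logprob(observed, intended):
    """Noisy channel score: log P(observed | intended) under a toy model."""
    if observed == intended:
        return math.log(P_KEEP)
    # Spread the remaining mass evenly over the confusion candidates.
    return math.log((1 - P_KEEP) / len(CONFUSION[observed]))


def beam_search_correct(sentence, beam_width=3):
    """Return the highest-scoring correction found with a fixed-width beam."""
    beams = [("", 0.0)]  # (corrected prefix, accumulated log score)
    for observed in sentence:
        candidates = [observed] + CONFUSION.get(observed, [])
        expanded = []
        for prefix, score in beams:
            prev = prefix[-1] if prefix else "<s>"
            for cand in candidates:
                step = lm_logprob(prev, cand) + channel_logprob(observed, cand)
                expanded.append((prefix + cand, score + step))
        # Keep only the best `beam_width` hypotheses for the next position.
        expanded.sort(key=lambda item: item[1], reverse=True)
        beams = expanded[:beam_width]
    return beams[0][0]


corrected = beam_search_correct("我門去")
unchanged = beam_search_correct("我們去")
```

In this toy setup the typo 門 in 我門去 is replaced by 們, because the language model gain outweighs the channel penalty for changing a character. A smaller confusion set means fewer candidates per position, which is how a compact confusion set can reduce search time.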