Incomplete Maximum Matching Segmentation Based on Semantics

Authors: Haixiong Lv, Xin Ding, Hang Su, Yong Liu, Chunlei Zhang, Hanqing Zhou
Year of publication: 2021
Subject:
Source: 2021 IEEE 4th International Conference on Computer and Communication Engineering Technology (CCET).
Description: This paper combines the advantages of regular segmentation and statistical segmentation and proposes an incomplete maximum matching segmentation method based on semantics. While keeping time consumption acceptable, the new method addresses the word adhesion defect of the maximum matching algorithm. The contributions are twofold. In the preprocessing stage, a forward semantic similarity dictionary is constructed to support the recognition of follow-up words. In the word segmentation stage, a three-feature weight formula is proposed to redefine the segmentation principle. Experimental results show that the new method improves accuracy and recall to some degree, making it suitable for text processing.
Database: OpenAIRE
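
For orientation, the baseline that the abstract refers to, the maximum matching algorithm, can be sketched as below. This is a minimal illustration of plain forward maximum matching only. It is not the authors' incomplete, semantics-based method, whose similarity dictionary and weight formula are not given in this record. The function name `forward_max_match`, the toy dictionary, and the Chinese sample sentence are all assumptions made for the example.

```python
def forward_max_match(text, dictionary, max_len=None):
    """Greedy forward maximum matching (baseline sketch, not the paper's method).

    At each position, take the longest dictionary word that starts there;
    if none matches, emit a single character and move on.
    """
    if max_len is None:
        # Longest dictionary entry bounds the window size (1 if dictionary is empty).
        max_len = max((len(word) for word in dictionary), default=1)

    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical toy dictionary and sentence, chosen for illustration only.
toy_dictionary = {"研究", "研究生", "生命", "起源"}
tokens = forward_max_match("研究生命起源", toy_dictionary)
print(tokens)  # ['研究生', '命', '起源']
```

The greedy longest-first choice takes "研究生" and leaves "命" stranded, where "研究 / 生命 / 起源" would be the intended reading. Over-long matches of this kind are presumably what the abstract's "word adhesion" defect refers to, although the record does not define the term.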