Hierarchical classification of Chinese documents based on N-grams

Authors: Zhou Shui-geng, Guan Jihong, He Yanxiang
Year of publication: 2001
Subject:
Source: Wuhan University Journal of Natural Sciences. 6:416-422
ISSN: 1993-4998, 1007-1202
DOI: 10.1007/bf03160278
Description: We explore techniques for using N-gram information to categorize Chinese text documents hierarchically, so that the classifier is freed from the burden of large dictionaries and complex segmentation processing and is therefore domain and time independent. A hierarchical Chinese text classifier is implemented. Experimental results show that hierarchically classifying Chinese text documents based on N-grams achieves satisfactory performance and outperforms other traditional Chinese text classifiers.
Database: OpenAIRE
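
The idea summarized in the description can be illustrated with a minimal sketch: overlapping character N-grams stand in for segmented words, and a document is routed top-down through a category tree. The abstract does not state the paper's feature selection or classifier, so everything here is an assumption made for illustration: the function names (char_ngrams, train, classify), the choice of bigrams, the cosine-similarity centroid matching, and the toy category tree and texts in TOY_TREE are all hypothetical and are not taken from the paper.

```python
from collections import Counter
import math


def char_ngrams(text, n=2):
    """Overlapping character n-grams; no dictionary or word segmentation is used."""
    chars = [c for c in text if not c.isspace()]
    return ["".join(chars[i:i + n]) for i in range(len(chars) - n + 1)]


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors (Counters)."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def train(tree, n=2):
    """Build an n-gram profile per node from {label: subtree or list of texts}."""
    model = {}
    for label, sub in tree.items():
        prof = Counter()
        if isinstance(sub, dict):
            # Internal node: its profile is the sum of its children's profiles.
            children = train(sub, n)
            for child in children.values():
                prof.update(child["profile"])
        else:
            # Leaf node: its profile is counted directly from its training texts.
            children = {}
            for text in sub:
                prof.update(char_ngrams(text, n))
        model[label] = {"profile": prof, "children": children}
    return model


def classify(model, text, n=2):
    """Descend the tree, picking the most similar child at each level."""
    query = Counter(char_ngrams(text, n))
    path = []
    level = model
    while level:
        best = max(level, key=lambda lab: cosine(query, level[lab]["profile"]))
        path.append(best)
        level = level[best]["children"]
    return path


# Hypothetical toy data, invented for this sketch only.
TOY_TREE = {
    "体育": {
        "足球": ["足球比赛球队进球", "足球联赛射门进球"],
        "篮球": ["篮球比赛球队投篮", "篮球联赛扣篮得分"],
    },
    "科技": {
        "计算机": ["计算机程序软件算法", "计算机系统软件编程"],
        "手机": ["手机屏幕电池信号", "手机系统电池拍照"],
    },
}

model = train(TOY_TREE)
print(classify(model, "足球队进球了"))
print(classify(model, "手机电池不耐用"))
```

Deciding one level at a time means each decision only compares sibling categories, which is one plausible reading of "hierarchically" in the description; a flat classifier would instead compare the document against every leaf category at once.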