Hierarchical classification of Chinese documents based on N-grams
Authors: | Zhou Shui-geng, Guan Jihong, He Yanxiang |
---|---|
Publication year: | 2001 |
Subject: |
Multidisciplinary; business.industry; Computer science; Pattern recognition; Feature selection; Shake; Domain (software engineering); ComputingMethodologies_PATTERNRECOGNITION; Categorization; Classifier (linguistics); ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; Segmentation; Artificial intelligence; business |
Source: | Wuhan University Journal of Natural Sciences. 6:416-422 |
ISSN: | 1993-4998 1007-1202 |
DOI: | 10.1007/bf03160278 |
Description: | We explore techniques for using N-gram information to categorize Chinese text documents hierarchically, so that the classifier can shake off the burden of large dictionaries and complex segmentation processing and thus be domain and time independent. A hierarchical Chinese text classifier is implemented. Experimental results show that hierarchically classifying Chinese text documents based on N-grams can achieve satisfactory performance and outperform other traditional Chinese text classifiers. |
Database: | OpenAIRE |
External link: |
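
The description in this record refers to using N-gram information instead of dictionary-based word segmentation. As a rough illustration only, the sketch below counts overlapping character N-grams in a Chinese string. The function name `char_ngrams`, the whitespace handling, and the sample text are assumptions made for this example and are not taken from the paper; the authors' actual feature extraction and hierarchical classifier are not described in this record.

```python
from collections import Counter


def char_ngrams(text: str, n: int = 2) -> Counter:
    """Count overlapping character N-grams (hypothetical helper).

    Whitespace is dropped first; no dictionary or word segmenter is used.
    """
    chars = [c for c in text if not c.isspace()]
    # Slide a window of length n over the characters.
    return Counter("".join(chars[i:i + n]) for i in range(len(chars) - n + 1))


# Illustrative input only: "Chinese text classification".
features = char_ngrams("中文文本分类", n=2)
```

Counts like these could serve as document features for a classifier at each level of a category hierarchy; how the features are selected and weighted is a design choice this record does not specify.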