News classification algorithm based on second order Hidden Markov Model
Author: | Sun Xuan, Li Luqun, Jiang Longquan |
---|---|
Language: | English |
Publication year: | 2018 |
Subject: |
ComputingMethodologies_PATTERNRECOGNITION
Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing); news classification; second order Hidden Markov Model (HMM); term frequency-inverse document frequency; χ2 test; feature word; lcsh:Science (General); lcsh:Q1-390 |
Source: | Journal of Shanghai Normal University (Natural Sciences), Vol 47, Iss 4, Pp 488-493 (2018) |
ISSN: | 1000-5137 |
Description: | A novel algorithm based on a second-order Hidden Markov Model (HMM) is proposed for classifying news documents. Categorical feature words are extracted from the news content to form a feature set, which serves as the observation sequence for a set of second-order HMM classifiers. The hidden states of these classifiers reflect the differences between the words in the relevant documents, and each state represents the correlation of words occurring in the corpus. Experiments showed that the proposed second-order HMM classifier had a clear advantage over the k-Nearest Neighbor (kNN), Naive Bayes, and Support Vector Machine (SVM) algorithms. |
Database: | OpenAIRE |
External link: |
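
The abstract describes scoring a sequence of feature words against several second-order HMM classifiers. The record gives no implementation details, so the following is only a minimal sketch of that general idea, not the authors' method. It assumes one small model per category, a forward recursion over state pairs, and assignment to the category with the highest likelihood. The category labels, the two-word vocabulary, and all probabilities are hypothetical values chosen for illustration.

```python
def second_order_forward(obs, pi, a1, a2, b):
    """Likelihood of an observation sequence under a second-order HMM.

    pi[i]        : probability of starting in state i
    a1[i][j]     : first transition, P(s2 = j | s1 = i)
    a2[i][j][k]  : second-order transition, P(s_t = k | s_{t-2} = i, s_{t-1} = j)
    b[k][o]      : probability that state k emits feature-word index o
    """
    n = len(pi)
    if not obs:
        return 1.0
    if len(obs) == 1:
        return sum(pi[i] * b[i][obs[0]] for i in range(n))
    # alpha[i][j] = P(o_1..o_t, s_{t-1} = i, s_t = j), initialised for t = 2
    alpha = [[pi[i] * b[i][obs[0]] * a1[i][j] * b[j][obs[1]]
              for j in range(n)] for i in range(n)]
    for o in obs[2:]:
        # Sum out the oldest state i, then move to the new pair (j, k)
        alpha = [[sum(alpha[i][j] * a2[i][j][k] for i in range(n)) * b[k][o]
                  for k in range(n)] for j in range(n)]
    return sum(sum(row) for row in alpha)


def classify(obs, models):
    """Return the label whose model gives the sequence the highest likelihood."""
    return max(models, key=lambda label: second_order_forward(obs, *models[label]))


# Hypothetical toy models: 2 hidden states, 2 feature words (indices 0 and 1).
uniform_a1 = [[0.5, 0.5], [0.5, 0.5]]
a2 = [[[0.7, 0.3], [0.4, 0.6]],
      [[0.2, 0.8], [0.5, 0.5]]]
models = {
    # (pi, a1, a2, b); the "sports" model mostly emits word 0
    "sports": ([0.6, 0.4], uniform_a1, a2, [[0.9, 0.1], [0.7, 0.3]]),
    # the "finance" model mostly emits word 1
    "finance": ([0.5, 0.5], uniform_a1, a2, [[0.2, 0.8], [0.4, 0.6]]),
}

print(classify([0, 0, 1, 0], models))
print(classify([1, 1, 0, 1], models))
```

In practice the feature-word indices would come from a selection step such as the TF-IDF weighting and χ2 test named in the subject keywords, and the model parameters would be estimated from labelled training documents rather than written by hand. For long sequences, log-probabilities or scaling would be needed to avoid numerical underflow.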