Probability Model Based on Cluster Analysis to Classify Sequences of Observations for Small Training Sets
Author: | Sergey S. Yulin, Irina N. Palamar |
---|---|
Year of publication: | 2020 |
Subject: | Statistics and Probability; Conditional random field; Self-organizing map; Dynamic time warping; Control and Optimization; Markov chain; Computer science; k-means clustering; Pattern recognition; Artificial Intelligence; Signal Processing; Computer Vision and Pattern Recognition; Noise (video); Statistics, Probability and Uncertainty; Cluster analysis; Hidden Markov model; Information Systems |
Source: | Statistics, Optimization & Information Computing. 8:296-303 |
ISSN: | 2310-5070 2311-004X |
DOI: | 10.19139/soic-2310-5070-690 |
Description: | Pattern recognition with little training data is a particularly relevant problem that arises when collecting training data is expensive or essentially impossible. This work proposes a new probability model, MC&CL (Markov Chain and Clusters), which combines a Markov chain with a clustering algorithm (Kohonen's self-organizing map, k-means method) to classify sequences of observations when the training set is small. An original experimental comparison is made between the proposed model (MC&CL) and several other popular sequence classification models: HMM (Hidden Markov Model), HCRF (Hidden Conditional Random Fields), LSTM (Long Short-Term Memory), and kNN+DTW (k-Nearest Neighbors algorithm + Dynamic Time Warping algorithm). The comparison uses synthetic random sequences generated from a hidden Markov model, with noise added to the training samples. The proposed model achieves the best classification accuracy among the models compared when the amount of training data is small. |
Database: | OpenAIRE |
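
The description combines a clustering step with a Markov chain. The sketch below illustrates that general idea only and is not the authors' MC&CL model, whose details this record does not give. It assumes plain k-means as the clustering step, one first-order Markov chain per class, additive smoothing, and classification by highest log-likelihood. All function names and parameters (`kmeans`, `fit_chain`, `alpha`, and so on) are hypothetical.

```python
import math
import random

def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda j: math.dist(point, centroids[j]))

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on a list of equal-length tuples; returns k centroids."""
    centroids = random.Random(seed).sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # New centroid = mean of its members; an empty cluster keeps its old centroid.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids

def fit_chain(symbol_seqs, k, alpha=1.0):
    """First-order Markov chain with additive smoothing `alpha` (an assumption);
    returns initial and transition log-probabilities."""
    init = [alpha] * k
    trans = [[alpha] * k for _ in range(k)]
    for s in symbol_seqs:
        init[s[0]] += 1
        for a, b in zip(s, s[1:]):
            trans[a][b] += 1
    log_init = [math.log(c / sum(init)) for c in init]
    log_trans = [[math.log(c / sum(row)) for c in row] for row in trans]
    return log_init, log_trans

def log_likelihood(symbols, log_init, log_trans):
    """Log-probability of a cluster-index sequence under one chain."""
    return log_init[symbols[0]] + sum(
        log_trans[a][b] for a, b in zip(symbols, symbols[1:])
    )

def fit(train, k=4):
    """train: dict mapping class label -> list of sequences (lists of tuples).
    Clusters all observations jointly, then fits one chain per class."""
    points = [p for seqs in train.values() for seq in seqs for p in seq]
    centroids = kmeans(points, k)
    chains = {
        label: fit_chain([[nearest(p, centroids) for p in seq] for seq in seqs], k)
        for label, seqs in train.items()
    }
    return centroids, chains

def classify(seq, centroids, chains):
    """Return the label whose chain gives the quantized sequence the highest likelihood."""
    symbols = [nearest(p, centroids) for p in seq]
    return max(chains, key=lambda label: log_likelihood(symbols, *chains[label]))
```

Calling `fit` with a handful of labeled sequences per class and then `classify` on a new sequence returns the most likely label. With very small training sets, the smoothing constant `alpha` keeps unseen transitions from receiving zero probability.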