Probability Model Based on Cluster Analysis to Classify Sequences of Observations for Small Training Sets

Authors:	Sergey S. Yulin, Irina N. Palamar
Year of publication:	2020
Subject:	Statistics and Probability; Conditional random field; Self-organizing map; Dynamic time warping; Control and Optimization; Markov chain; business.industry; Computer science; k-means clustering; Pattern recognition; Artificial Intelligence; Signal Processing; Computer Vision and Pattern Recognition; Noise (video); Artificial intelligence; Statistics, Probability and Uncertainty; business; Cluster analysis; Hidden Markov model; Information Systems
Source:	Statistics, Optimization & Information Computing. 8:296-303
ISSN:	2310-5070 2311-004X
DOI:	10.19139/soic-2310-5070-690
Description:	Pattern recognition with few training examples is a particularly relevant problem that arises when collecting training data is expensive or essentially impossible. This work proposes a new probability model, MC&CL (Markov Chain and Clusters), which combines a Markov chain with a clustering algorithm (Kohonen self-organizing map, k-means) to classify sequences of observations when the training set is small. The model is compared experimentally with several popular sequence classifiers: HMM (Hidden Markov Model), HCRF (Hidden Conditional Random Fields), LSTM (Long Short-Term Memory), and kNN+DTW (k-Nearest Neighbors with Dynamic Time Warping). The comparison uses synthetic random sequences generated from a hidden Markov model, with noise added to the training samples. The proposed model achieves the best classification accuracy among the models compared when the amount of training data is small.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::b02eaec13b574ab2661c65109f995e5b https://doi.org/10.19139/soic-2310-5070-690
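
The description above names the ingredients of MC&CL (a clustering step plus a Markov chain) but this record does not give the algorithm itself. The following is only a rough sketch of how such a combination could work, not the authors' implementation. It assumes scalar observations, uses plain k-means in place of a self-organizing map, fits one Markov chain per class over the cluster indices with add-one smoothing, and assigns a sequence to the class whose chain gives the highest log-likelihood. All function names and parameters (`kmeans_1d`, `fit_chain`, `train`, `classify`, `alpha`) are hypothetical.

```python
import math
import random


def nearest(centroids, v):
    """Index of the centroid closest to observation v."""
    return min(range(len(centroids)), key=lambda i: abs(centroids[i] - v))


def kmeans_1d(values, k, iters=50, seed=0):
    """Plain Lloyd's k-means on scalar observations; returns sorted centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(values, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            groups[nearest(centroids, v)].append(v)
        # Keep the old centroid when a cluster ends up empty.
        centroids = [sum(g) / len(g) if g else c
                     for g, c in zip(groups, centroids)]
    return sorted(centroids)


def fit_chain(sequences, centroids, alpha=1.0):
    """Initial and transition log-probabilities with add-alpha smoothing (assumed)."""
    k = len(centroids)
    init = [alpha] * k
    trans = [[alpha] * k for _ in range(k)]
    for seq in sequences:
        states = [nearest(centroids, v) for v in seq]
        init[states[0]] += 1
        for a, b in zip(states, states[1:]):
            trans[a][b] += 1
    log_init = [math.log(c / sum(init)) for c in init]
    log_trans = [[math.log(c / sum(row)) for c in row] for row in trans]
    return log_init, log_trans


def log_likelihood(seq, centroids, chain):
    """Log-probability of the cluster-index sequence under one class's chain."""
    log_init, log_trans = chain
    states = [nearest(centroids, v) for v in seq]
    total = log_init[states[0]]
    for a, b in zip(states, states[1:]):
        total += log_trans[a][b]
    return total


def train(data, k=2):
    """data maps a class label to a list of training sequences."""
    all_values = [v for seqs in data.values() for seq in seqs for v in seq]
    centroids = kmeans_1d(all_values, k)  # clusters shared by all classes
    chains = {label: fit_chain(seqs, centroids) for label, seqs in data.items()}
    return centroids, chains


def classify(seq, centroids, chains):
    """Pick the class whose Markov chain best explains the sequence."""
    return max(chains,
               key=lambda label: log_likelihood(seq, centroids, chains[label]))


if __name__ == "__main__":
    # One training sequence per class, to mimic a very small training set.
    data = {
        "alternating": [[0.1, 0.9, 0.0, 1.0, 0.1, 0.95]],
        "sticky": [[0.0, 0.1, 0.05, 0.9, 1.0, 0.95]],
    }
    centroids, chains = train(data, k=2)
    print(classify([0.05, 0.98, 0.02, 0.9], centroids, chains))
```

Sharing one set of clusters across classes keeps the number of estimated parameters small, which is one plausible reason such a model could cope with few training sequences; whether the published model makes the same choice is not stated in this record.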