A spark-based parallel distributed posterior decoding algorithm for big data hidden Markov models decoding problem

Authors: Abdelkrim Bekkhoucha, Samir Anter, Imad Sassi
Publication year: 2021
Subject:
Description: Hidden Markov models (HMMs) are among the machine learning models that have been widely used and have proven efficient in many conventional applications. This paper proposes a modified posterior decoding algorithm for the HMM decoding problem, based on the MapReduce paradigm and Spark's resilient distributed datasets (RDDs), for large-scale data processing. The objective of this work is to improve the performance of HMMs when facing big data challenges. The proposed algorithm substantially reduces time complexity and gives good results in terms of running time, speedup, and parallelization efficiency on large amounts of data, i.e., large numbers of states and large numbers of sequences.
Database: OpenAIRE
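
Note: the description refers to posterior decoding applied to many sequences in a map-style fashion. The record does not give the authors' algorithm, so the following is only a minimal sketch of textbook posterior decoding (scaled forward-backward, then a per-step argmax), written in plain Python. The names `posterior_decode` and `decode_all`, and the parameters `pi`, `A`, `B`, are assumptions made for illustration. The built-in `map` stands in for the map step; in PySpark the same per-sequence function could presumably be passed to `SparkContext.parallelize(sequences).map(...)`, but that wiring is not taken from the paper.

```python
from typing import List, Sequence


def _normalize(row: List[float]) -> List[float]:
    """Scale a row to sum to 1 (guards against numerical underflow)."""
    total = sum(row)
    return [x / total for x in row] if total > 0 else row


def posterior_decode(obs: Sequence[int],
                     pi: Sequence[float],
                     A: Sequence[Sequence[float]],
                     B: Sequence[Sequence[float]]) -> List[int]:
    """Return, for each time step, the state with the highest posterior.

    pi[i]   : initial probability of state i
    A[i][j] : transition probability from state i to state j
    B[i][k] : probability that state i emits symbol k
    """
    n, T = len(pi), len(obs)
    if T == 0:
        return []

    # Forward pass: alpha[t][j] is proportional to P(obs[0..t], state_t = j).
    alpha = [_normalize([pi[i] * B[i][obs[0]] for i in range(n)])]
    for t in range(1, T):
        row = [B[j][obs[t]] * sum(alpha[t - 1][i] * A[i][j] for i in range(n))
               for j in range(n)]
        alpha.append(_normalize(row))

    # Backward pass: beta[t][i] is proportional to P(obs[t+1..] | state_t = i).
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        row = [sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(n))
               for i in range(n)]
        beta[t] = _normalize(row)

    # The posterior at step t is proportional to alpha[t][i] * beta[t][i];
    # per-step scaling does not change which state attains the maximum.
    return [max(range(n), key=lambda i: alpha[t][i] * beta[t][i])
            for t in range(T)]


def decode_all(sequences, pi, A, B) -> List[List[int]]:
    """Map step: decode each sequence independently of the others."""
    return list(map(lambda seq: posterior_decode(seq, pi, A, B), sequences))
```

Because each sequence is decoded independently, the map step has no shared state between sequences, which is what makes this kind of workload straightforward to distribute.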