A spark-based parallel distributed posterior decoding algorithm for big data hidden Markov models decoding problem
Author: | Abdelkrim Bekkhoucha, Samir Anter, Imad Sassi |
---|---|
Year of publication: | 2021 |
Subject: |
Spark; Data processing; Information Systems and Management; Speedup; business.industry; Computer science; Big data; Posterior decoding; Cloud computing; Parallel distributed approach; Apache; Artificial Intelligence; Control and Systems Engineering; Spark (mathematics); Hidden Markov models; Electrical and Electronic Engineering; business; Hidden Markov model; Time complexity; Algorithm; Decoding methods |
Description: | Hidden Markov models (HMMs) are among the machine learning methods that have been widely used and have proven efficient in many conventional applications. This paper proposes a modified posterior decoding algorithm that solves the hidden Markov model decoding problem using the MapReduce paradigm and Spark's resilient distributed datasets (RDDs) for large-scale data processing. The objective of this work is to improve the performance of HMMs in the face of big data challenges. The proposed algorithm substantially reduces time complexity and gives good results in terms of running time, speedup, and parallelization efficiency on large amounts of data, i.e., a large number of states and a large number of sequences. |
Database: | OpenAIRE |
External link: |
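
For readers unfamiliar with the decoding problem named in the description, the following is a minimal sketch of standard posterior decoding (forward-backward followed by a per-position argmax), applied independently to many sequences. It is not the authors' modified algorithm, which this record does not spell out. The function names `posterior_decode` and `decode_all` are hypothetical, and the local thread pool only stands in for a map over a Spark RDD of sequences. The sketch assumes non-empty observation sequences that have nonzero probability under the model.

```python
from functools import partial
from multiprocessing.dummy import Pool  # thread pool; a local stand-in for a cluster


def posterior_decode(obs, pi, A, B):
    """Return the most probable hidden state at each position of `obs`.

    pi[i]   : initial probability of state i
    A[i][j] : transition probability from state i to state j
    B[i][k] : probability that state i emits symbol k
    """
    n, T = len(pi), len(obs)

    # Forward pass, rescaled at every step to avoid numerical underflow.
    cur = [pi[i] * B[i][obs[0]] for i in range(n)]
    s = sum(cur)
    alpha = [[c / s for c in cur]]
    for t in range(1, T):
        cur = [B[j][obs[t]] * sum(alpha[t - 1][i] * A[i][j] for i in range(n))
               for j in range(n)]
        s = sum(cur)
        alpha.append([c / s for c in cur])

    # Backward pass, also rescaled; the scaling does not change the argmax.
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        cur = [sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(n))
               for i in range(n)]
        s = sum(cur)
        beta[t] = [c / s for c in cur]

    # The posterior of state i at time t is proportional to alpha[t][i] * beta[t][i].
    path = []
    for t in range(T):
        gamma = [alpha[t][i] * beta[t][i] for i in range(n)]
        path.append(max(range(n), key=gamma.__getitem__))
    return path


def decode_all(sequences, pi, A, B, workers=4):
    """Decode each sequence independently (the 'map' step of a MapReduce job)."""
    with Pool(workers) as pool:
        return pool.map(partial(posterior_decode, pi=pi, A=A, B=B), sequences)
```

Because each sequence is decoded independently, the map step parallelizes trivially over a large number of sequences. Handling a large number of states would additionally require distributing the inner sums over states, which this sketch does not attempt.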