A graph-based big data optimization approach using hidden Markov model and constraint satisfaction problem
Author: | Abdelkrim Bekkhoucha, Samir Anter, Imad Sassi |
---|---|
Language: | English |
Year of publication: | 2021 |
Subject: |
Optimization; Graphical modeling; Computer engineering. Computer hardware; Information Systems and Management; Computer Networks and Communications; Computer science; Big data; Metaheuristics; Information technology; Machine learning; Big data analytics; TK7885-7895; Hidden Markov model; Metaheuristic; Constraint satisfaction problem; QA75.5-76.95; Solver; T58.5-58.64; Mean absolute percentage error; Hardware and Architecture; Electronic computers. Computer science; Time series forecasting; Benchmark (computing); Graph (abstract data type); Artificial intelligence; Information Systems |
Source: | Journal of Big Data, Vol 8, Iss 1, Pp 1-29 (2021) |
ISSN: | 2196-1115 |
Description: | To address the challenges of big data analytics, several works have focused on big data optimization using metaheuristics. The constraint satisfaction problem (CSP) is a fundamental concept of metaheuristics that has shown great efficiency in several fields. Hidden Markov models (HMMs) are powerful machine learning models that are widely applied in time series analysis. However, one issue in forecasting time series using HMMs is how to reduce the search space (the state and observation spaces). To address this issue, we propose a graph-based big data optimization approach that uses a CSP to enhance the results of the learning and prediction tasks of HMMs. This approach takes full advantage of both HMMs, with the richness of their algorithms, and CSPs, with their many powerful and efficient solver algorithms. To verify the validity of the model, the proposed approach is evaluated on real-world data using the mean absolute percentage error (MAPE) and other metrics as measures of prediction accuracy. The experiments show that the proposed model outperforms the conventional model. It reduces the MAPE by 0.71% and offers a particularly good trade-off between computational cost and the quality of results for large datasets. It is also competitive with benchmark models in terms of running time and prediction accuracy. Further comparisons substantiate these experimental findings. |
Database: | OpenAIRE |
External link: |
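
The description above reports prediction accuracy as mean absolute percentage error (MAPE). For readers unfamiliar with the metric, the following is a minimal sketch of the standard textbook definition, not code from the article; the function name `mape` and the sample values are illustrative assumptions, and the record does not state which exact variant the authors used.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Return the mean absolute percentage error, expressed in percent.

    Standard definition: 100/n * sum(|(a_i - p_i) / a_i|).
    Hypothetical helper for illustration; not taken from the article.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one observation is required")
    if any(a == 0 for a in actual):
        # MAPE is undefined when an actual value is zero.
        raise ValueError("actual values must be non-zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


if __name__ == "__main__":
    # Made-up sample values, used only to show the call.
    observed = [100.0, 200.0, 400.0]
    forecast = [110.0, 180.0, 400.0]
    print(f"MAPE: {mape(observed, forecast):.2f}%")
```

A lower MAPE indicates forecasts that are closer to the observed series in relative terms. Because each error is divided by the observed value, the metric cannot be computed when an observation is zero, which is why the sketch rejects such input.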