MPEC: Distributed Matrix Multiplication Performance Modeling on a Scale-Out Cloud Environment for Data Mining Jobs
Author: | Jeongchul Kim, Myungjun Son, Kyungyong Lee |
---|---|
Year of publication: | 2022 |
Subject: |
Computer Networks and Communications
Computer science; Computation; Estimator; Cloud computing; 02 engineering and technology; Matrix multiplication; Computer Science Applications; Task (computing); Mean absolute percentage error; Kernel (image processing); Hardware and Architecture; 020204 information systems; Scalability; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Data mining; Software; Information Systems |
Source: | IEEE Transactions on Cloud Computing. 10:521-538 |
ISSN: | 2372-0018 |
DOI: | 10.1109/tcc.2019.2950400 |
Description: | Many data mining workloads are analyzed in large-scale distributed cloud computing environments, which provide nearly infinite resources with diverse hardware configurations. To maintain cost-efficiency in such environments, it is essential to understand the characteristics and estimate the overheads of distributed matrix multiplication, a core computation kernel in many machine learning algorithms. This study proposes the Matrix Multiplication Performance Estimator on Cloud (MPEC) algorithm. The proposed algorithm predicts the latency incurred when executing distributed matrix multiplication tasks of various input sizes and shapes with diverse instance types and different numbers of worker nodes in cloud computing environments. To achieve this goal, we first analyze the characteristics of distributed matrix multiplication tasks. With the characteristics generated from this qualitative analysis, we propose applying an ensemble of non-linear regression algorithms to predict the execution time of arbitrary matrix multiplication tasks. Thorough experimental results reveal that the proposed algorithm is more accurate than a state-of-the-art machine learning task performance estimation engine, Ernest, cutting the Mean Absolute Percentage Error (MAPE) in half. |
Database: | OpenAIRE |
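
The description reports accuracy as Mean Absolute Percentage Error (MAPE). As a reading aid, the standard definition of that metric can be sketched in a few lines of Python. The function name `mape` and all latency values below are hypothetical illustrations, not figures or code from the paper; the two prediction lists merely show what "half the MAPE" looks like numerically.

```python
def mape(actual, predicted):
    """Return the Mean Absolute Percentage Error, in percent.

    Standard definition: the mean of |actual - predicted| / |actual|,
    multiplied by 100. Actual values must be non-zero.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("actual values must be non-zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical measured latencies (seconds) for three multiplication tasks.
measured = [100.0, 200.0, 400.0]

# Hypothetical predictions from two estimators (made-up numbers).
estimator_a = [120.0, 160.0, 480.0]  # each prediction is off by 20%
estimator_b = [110.0, 180.0, 440.0]  # each prediction is off by 10%

mape_a = mape(measured, estimator_a)
mape_b = mape(measured, estimator_b)
print(f"estimator A: {mape_a:.1f}%  estimator B: {mape_b:.1f}%")
```

Because every error is divided by the measured latency, MAPE weights short and long tasks equally, which suits a comparison across matrix sizes and cluster configurations whose run times differ by orders of magnitude.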