Towards Automatic Generation of Performance Models for Dynamic Tuning using Machine Learning

Author:	Alcaraz, Jordi
Contributors:	Sikora, Anna; César Galobardes, Eduardo; Barbara Sikora, Anna
Publication year:	2021
Subject:	Eines de rendiment; Ciències Experimentals; Herramientas de rendimiento; Performance tools; Aprenentatge automàtic; Machine learning; Computació d'altes prestacions; High performance computing; Aprendizaje automático; Computación de altas prestaciones
Source:	TDX (Tesis Doctorals en Xarxa); TDR. Tesis Doctorales en Red; Dipòsit Digital de Documents de la UAB; Universitat Autònoma de Barcelona
Description:	New approaches are necessary to generate performance models for current systems due to the heterogeneity found in them. An alternative to traditional analytical models could be the use of machine learning algorithms, which may help to automatically create performance models that predict the correct configuration for one or more of an application's parameters. To build performance models, metrics are used as inputs to calculate or select the proper values for one or more parameters that can impact performance. Selecting the correct metrics is important, as information can be redundant or insufficient. In addition, multiple scenarios, such as different problem sizes, should be taken into consideration when generating models in order to capture the behaviour under different conditions; this makes it possible to generalize the relationships between metrics and avoid relationships tailored to only one scenario. In this thesis we tackle these two problems for multi-threaded applications using OpenMP by developing two methodologies. First, a methodology to find the proper set of metrics for characterizing the behaviour of a parallel code region is developed. This methodology reduces the number of metrics necessary to correctly characterize an application or a code region, decreasing the measurement overhead. We have decided to use hardware performance counters as metrics to characterize the execution of OpenMP parallel regions. Using this methodology, the number of hardware performance counters was reduced to less than half of the available general-purpose counters while avoiding loss of information. The second methodology is developed to build a representative and balanced dataset of patterns found in parallel applications.
Given a set of candidate parallel regions to be included in a dataset for performance tuning, each candidate is compared against the patterns already included in the dataset to determine whether it covers a different region of the search space. This comparison is based on a correlation analysis of the metrics measured for the candidate. For example, in one of the tested systems, a dataset was generated with only 8 patterns from 33 parallel kernels extracted from the STREAM and PolyBench benchmarks. The generated dataset becomes imbalanced when used for performance tuning because, in a given system, some parameter values generally provide better performance than others. Consequently, machine learning algorithms may underperform due to underrepresented cases, and techniques to counter this natural imbalance are necessary. An initial study is provided to find which machine learning algorithms provide better accuracy for tuning the number of threads. This study includes: a) data methods to balance the dataset for the target parameter; b) algorithmic methods to modify how the error is calculated; and c) ensemble methods, which combine multiple models into a larger one, providing a general hypothesis from the individual models. Universitat Autònoma de Barcelona. Programa de Doctorat en Informàtica
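The correlation-based comparison of candidates against patterns already in the dataset can be illustrated with a minimal Python sketch. This is not the thesis implementation: the function names `pearson` and `select_patterns`, the 0.9 threshold, the greedy first-come ordering, and the toy metric vectors are all assumptions made for illustration, and the numbers are not measurements from the thesis.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length metric vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant vector carries no correlation information
    return cov / sqrt(var_x * var_y)


def select_patterns(candidates, threshold=0.9):
    """Greedily keep candidates that are not strongly correlated
    with any pattern already kept.

    candidates: dict mapping a region name to its list of metric values
                (same metrics, same order, for every region).
    threshold:  assumed cut-off; a correlation at or above it means the
                candidate is treated as already covered.
    """
    selected = {}
    for name, metrics in candidates.items():
        covered = any(
            pearson(metrics, kept) >= threshold for kept in selected.values()
        )
        if not covered:
            selected[name] = metrics
    return selected


# Toy, made-up metric vectors (hypothetical counter values per region).
candidates = {
    "copy": [1.0, 2.0, 3.0, 4.0],
    "scale": [2.0, 4.1, 6.0, 8.2],  # nearly proportional to "copy"
    "gemm": [4.0, 1.0, 3.5, 0.5],   # different profile
}

dataset = select_patterns(candidates)
print(list(dataset))
```

In this toy input, "scale" is almost perfectly correlated with "copy" and is therefore skipped, while "gemm" is kept as a distinct pattern. The result depends on the order in which candidates are visited, which is a simplification of this sketch.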
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=dedup_wf_001::35c463e02a6f858ee3ea16160cec4755 http://hdl.handle.net/10803/675104