Description: |
Automated algorithm engineering has become an important asset for academia and industry. irace, for instance, is an algorithm configurator (AC) that has successfully designed effective algorithms for optimization problems. The major advantage of irace is that it combines learning and parallelization, yet no fully functional automated machine learning (AutoML) system powered by irace has been proposed so far. This is rather striking, as some of the most relevant existing AutoML tools are powered by ACs, of which irace is one of the most effective examples. In this work, we propose iSklearn, an irace-powered AutoML system. Our proposal improves on existing work that applies an AC to engineer a machine learning (ML) pipeline. First, our configuration space represents a minimalist pipeline template, demonstrating that simpler pipelines can be competitive with elaborate approaches (e.g., ensembles). Second, our configuration setup improves the application of AC-based AutoML to time series (TS) problems and adapts more flexibly to other applications. We evaluate iSklearn on three major ML domains, namely computer vision (CV), natural language processing (NLP), and TS. The results are competitive with AUTOSKLEARN, a state-of-the-art AutoML system also built on scikit-learn. Furthermore, the composition of the devised pipelines varies with the problem domain and dataset considered, providing further evidence of the need for AutoML tools. We conclude our investigation with an ablation of the proposed configuration space and setup to understand their impact on the performance of iSklearn. |