Evolving Fully Automated Machine Learning via Life-Long Knowledge Anchors.

Authors: Zheng, Xiawu; Zhang, Yang; Hong, Sirui; Li, Huixia; Tang, Lang; Xiong, Youcheng; Zhou, Jin; Wang, Yan; Sun, Xiaoshuai; Zhu, Pengfei; Wu, Chenglin; Ji, Rongrong
Subject:
Source: IEEE Transactions on Pattern Analysis & Machine Intelligence; Sep 2021, Vol. 43, Issue 9, pp. 3091-3107 (17 pages)
Abstract: Automated machine learning (AutoML) has achieved remarkable progress on various tasks, largely because it minimizes manual feature and model design. However, most existing AutoML pipelines cover only parts of the full machine learning pipeline, e.g., neural architecture search or optimizer selection. This leaves potentially important components such as data cleaning and model ensembling out of the optimization, and still results in considerable human involvement and suboptimal performance. The main challenges lie in the huge search space that combines all possibilities over all components, as well as in generalizing across different task types such as image, text, and tabular data. In this paper, we present a first-of-its-kind fully AutoML pipeline that comprehensively automates data preprocessing, feature engineering, model generation/selection/training, and ensembling for an arbitrary dataset and evaluation metric. Our innovation lies in the comprehensive scope of the learning pipeline, with a novel “life-long” knowledge anchor design to fundamentally accelerate the search over the full search space. Such knowledge anchors record detailed information about pipelines and are integrated with an evolutionary algorithm for joint optimization across components. Experiments demonstrate that the resulting pipeline achieves state-of-the-art performance on multiple datasets and modalities. Specifically, the proposed framework was extensively evaluated in the NeurIPS 2019 AutoDL challenge, where it was the sole champion, with a significant margin over other approaches, on all of the image, video, speech, text, and tabular tracks. [ABSTRACT FROM AUTHOR]
Database: Complementary Index