More Machine Learning for Less: Comparing Data Generation Strategies in Mechanical Engineering and Manufacturing

Authors: Philipp Noodt, Alexia Fenollar Solvay, Tobias Meisen, Johannes Lipp, Vladimir Samsonov
Year of publication: 2019
Subject:
Source: SSCI
DOI: 10.1109/ssci44817.2019.9002663
Description: Supervised Machine Learning (ML) models require extensive training data to properly approximate the behavior of complex mechanical processes and systems. Real-world experiments or adequate simulations are expensive, time-consuming or incident-related, which makes the efficient acquisition of sample data a compelling necessity. In mechanical engineering and manufacturing, data is usually collected via established Design of Experiments (DOE) methods. At the same time, Active Learning (AL) is gaining importance in the research community and promises to reduce the amount of data required, but it is rarely used in industry. In this paper, we compare the most common data sampling methods with AL to achieve better predictive results with fewer samples on regression tasks. We propose a novel evaluation framework that allows various sampling methods to be compared in a controlled and unbiased manner, regardless of their different requirements. Using three exemplary use cases (UCs), we evaluate when one should use AL or DOE methods for the task of data generation by looking at the sample efficiency, stability and predictive accuracy of the resulting ML models. This paper provides practical guidance to both engineers and data scientists who require highly efficient data collection for later use of ML.
Database: OpenAIRE