A study of overfitting in optimization of a manufacturing quality control procedure

Authors: Valentin Koblar, Bernard Ženko, Bogdan Filipič, Klemen Gantar, Tea Tušar
Language: English
Year of publication: 2017
Subject:
Source: Applied Soft Computing, vol. 59, pp. 77-87, 2017.
Applied Soft Computing
ISSN: 1568-4946
Description: Quality control of the commutator manufacturing process can be automated by means of a machine learning model that can predict the quality of commutators as they are being manufactured. Such a model can be constructed by combining machine vision, machine learning and evolutionary optimization techniques. In this procedure, optimization is used to minimize the model error, which is estimated using single cross-validation. This work exposes the overfitting that emerges in such optimization. Overfitting is shown for three machine learning methods with different sensitivity to it (trees, additionally pruned trees and random forests) and assessed in two ways (repeated cross-validation and validation on a set of unseen instances). Results on two distinct quality control problems show that optimization amplifies overfitting, i.e., the single cross-validation error estimate for the optimized models is overly optimistic. Nevertheless, minimization of the error estimate by single cross-validation in general results in minimization of the other error estimates as well, showing that optimization is indeed beneficial in this context.
Database: OpenAIRE
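
Illustration: the evaluation scheme named in the description (optimize against a single cross-validation estimate, then re-assess the selected model with repeated cross-validation and with unseen instances) can be sketched roughly as below. This is not the authors' code. Everything in it is an assumption made for illustration: the synthetic data, the k-nearest-neighbour model (standing in for the trees and random forests of the paper), plain random search (standing in for evolutionary optimization), and all names such as `run_study`.

```python
import random

def make_data(n, rng):
    """Synthetic two-class data (assumed): the label depends on feature 0, with 20% label noise."""
    data = []
    for _ in range(n):
        x = [rng.gauss(0, 1) for _ in range(3)]
        y = 1 if x[0] > 0 else 0
        if rng.random() < 0.2:
            y = 1 - y
        data.append((x, y))
    return data

def knn_predict(train, x, k, weights):
    """Majority vote among the k nearest training points under a weighted squared distance."""
    def dist(a):
        return sum(w * (p - q) ** 2 for w, p, q in zip(weights, a, x))
    nearest = sorted(train, key=lambda item: dist(item[0]))[:k]
    votes = sum(y for _, y in nearest)
    return 1 if 2 * votes > k else 0

def error_rate(train, test, k, weights):
    """Fraction of test instances misclassified by a model built on train."""
    wrong = sum(knn_predict(train, x, k, weights) != y for x, y in test)
    return wrong / len(test)

def cv_error(data, k, weights, folds, seed):
    """One cross-validation run; the seed fixes the assignment of instances to folds."""
    idx = list(range(len(data)))
    random.Random(seed).shuffle(idx)
    errs = []
    for f in range(folds):
        test_idx = set(idx[f::folds])
        train = [data[i] for i in idx if i not in test_idx]
        test = [data[i] for i in idx if i in test_idx]
        errs.append(error_rate(train, test, k, weights))
    return sum(errs) / folds

def run_study(seed=0, n_candidates=40):
    rng = random.Random(seed)
    data = make_data(120, rng)      # instances available to the optimizer
    unseen = make_data(400, rng)    # instances the optimizer never sees
    # Random search over (k, feature weights); a stand-in for evolutionary optimization.
    candidates = [(rng.choice([1, 3, 5, 7, 9]), [rng.random() for _ in range(3)])
                  for _ in range(n_candidates)]
    # Objective: single cross-validation, always with the same fold split (seed 0).
    scored = [(cv_error(data, k, w, 5, seed=0), k, w) for k, w in candidates]
    single, k, w = min(scored, key=lambda s: s[0])
    # Assessment 1: repeated cross-validation with ten different fold splits.
    repeated = sum(cv_error(data, k, w, 5, seed=s) for s in range(1, 11)) / 10
    # Assessment 2: error on the unseen instances.
    holdout = error_rate(data, unseen, k, w)
    return {"single_cv": single, "repeated_cv": repeated, "unseen": holdout,
            "all_single_cv": [s[0] for s in scored]}

if __name__ == "__main__":
    result = run_study()
    for name in ("single_cv", "repeated_cv", "unseen"):
        print(name, round(result[name], 3))
```

Comparing `single_cv` with `repeated_cv` and `unseen` for the selected candidate gives a rough view of how optimistic the optimized estimate is. The size of the gap depends on the data, the model and the number of candidates searched, so no particular outcome is claimed here.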