Data acquisition in modeling using neural networks and decision trees

Autor: R. Sika, Z. Ignaszak
Jazyk: angličtina
Rok vydání: 2011
Předmět:
Zdroj: Archives of Foundry Engineering, Vol 11, Iss 2, Pp 113-122 (2011)
ISSN: 1897-3310
Popis: The paper presents a comparison of selected models from area of artificial neural networks and decision trees in relation with actualconditions of foundry processes. The work contains short descriptions of used algorithms, their destination and method of data preparation,which is a domain of work of Data Mining systems. First part concerns data acquisition realized in selected iron foundry, indicating problems to solve in aspect of casting process modeling. Second part is a comparison of selected algorithms: a decision tree and artificial neural network, that is CART (Classification And Regression Trees) and BP (Backpropagation) in MLP (Multilayer Perceptron) networks algorithms.Aim of the paper is to show an aspect of selecting data for modeling, cleaning it and reducing, for example due to too strong correlationbetween some of recorded process parameters. Also, it has been shown what results can be obtained using two different approaches:first when modeling using available commercial software, for example Statistica, second when modeling step by step using Excel spreadsheetbasing on the same algorithm, like BP-MLP. Discrepancy of results obtained from these two approaches originates from a priorimade assumptions. Mentioned earlier Statistica universal software package, when used without awareness of relations of technologicalparameters, i.e. without user having experience in foundry and without scheduling ranks of particular parameters basing on acquisition, can not give credible basis to predict the quality of the castings. Also, a decisive influence of data acquisition method has been clearly indicated, the acquisition should be conducted according to repetitive measurement and control procedures. This paper is based on about 250 records of actual data, for one assortment for 6 month period, where only 12 data sets were complete (including two that were used for validation of neural network) and useful for creating a model. It is definitely too small portion in case of artificial neural networks, but it shows a scale of danger of unprofessional data acquisition.
Databáze: OpenAIRE