The Impact of Missing Data on Data Mining

Authors:	John F. Kros, Marvin L. Brown
Year of publication:	2003
Subject:	Artificial neural network; Association rule learning; Computer science; Data quality; Outlier; Decision tree; Imputation (statistics); Data pre-processing; Data mining; Missing data
DOI:	10.4018/978-1-59140-051-6.ch007
Description:	Data mining is based upon searching the concatenation of multiple databases that usually contain some amount of missing data along with a variable percentage of inaccurate data, pollution, outliers, and noise. The actual data-mining process deals significantly with prediction, estimation, classification, pattern recognition, and the development of association rules. Therefore, the significance of the analysis depends heavily on the accuracy of the database and on the chosen sample data to be used for model training and testing. The issue of missing data must be addressed, since ignoring this problem can introduce bias into the models being evaluated and lead to inaccurate data-mining conclusions.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::627dc14e8889e7e0601a33286e35c2fc https://doi.org/10.4018/978-1-59140-051-6.ch007