A Novel Algorithm for Imputing the Missing Values in Incomplete Datasets

Authors: Hutashan Vishal Bhagat, Manminder Singh
Year of publication: 2022
DOI: 10.21203/rs.3.rs-1729251/v1
Description: Today we rely almost entirely on digital devices to collect data, and a failure in such a device can cause substantial information loss, making data mining more tedious for a data analyst. A high degree of missingness in a dataset leads to inappropriate results and incomplete data analysis. There is therefore a need for an algorithm that can predict missing values efficiently and accurately. This research paper proposes a novel splitting-based IMV-RE (Imputing the Missing Values in Real-Time Environment) algorithm to impute different missing values within a dataset. In the proposed IMV-RE algorithm, an upper limit is set for every class containing missing values, which helps the algorithm predict the missing values more accurately. The experiments are performed on ten benchmark datasets covering both purely numerical and mixed data. Comparative experimental analysis indicates that the proposed IMV-RE algorithm outperforms existing techniques in terms of accuracy, Root Mean Square Error (RMSE), and coefficient of determination (R²).
Database: OpenAIRE
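
The description names RMSE and the coefficient of determination (R²) as evaluation measures but does not say how they were computed. The sketch below shows one common way to score imputed values against held-out true values using the standard textbook definitions. It is not the IMV-RE algorithm, and the function names and sample values are hypothetical, chosen only for illustration.

```python
import math


def rmse(actual, imputed):
    """Root Mean Square Error between true and imputed values."""
    if not actual or len(actual) != len(imputed):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, imputed)]
    return math.sqrt(sum(squared_errors) / len(actual))


def r_squared(actual, imputed):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if not actual or len(actual) != len(imputed):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, imputed))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical example: values that were masked, and what an imputer returned.
true_values = [2.0, 4.0, 6.0, 8.0]
imputed_values = [2.5, 3.5, 6.5, 7.5]

print(rmse(true_values, imputed_values))
print(r_squared(true_values, imputed_values))
```

A lower RMSE and an R² closer to 1 indicate imputed values that track the true values more closely.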