K-nearest neighbor imputation for missing value in hepatitis data.

Author: Alianso, Arifin Surya, Syafaah, Lailis, Faruq, Amrul
Subject:
Source: AIP Conference Proceedings; 7/25/2022, Vol. 2453 Issue 1, p1-5, 5p
Abstract: Errors in datasets are increasingly common. One such error is incomplete data on an attribute, commonly known as a missing value, which affects the results of analyses conducted by researchers. One way to address this issue is imputation, a method that fills in a missing value by replacing it with a plausible value based on information in the dataset. This study deals with missing values in the albumin attribute of the hepatitis data using K-Nearest Neighbor (KNN) imputation, which calculates a weighted mean estimate from the K closest observations, where K is chosen in advance. This study uses K=3, K=5, K=7, K=9, and K=15. The accuracy of the imputation is evaluated using the Mean Square Error (MSE). Based on the results obtained in this study, the best accuracy of the program's calculations is obtained when K=7, and the best MSE is achieved when K=15. [ABSTRACT FROM AUTHOR]
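The procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' program: the abstract does not state the distance metric or the weighting scheme, so the Euclidean distance and inverse-distance weights used here are assumptions, as are the function names `knn_impute` and `mse` and the small synthetic table (columns: age, bilirubin, albumin). Features are not rescaled, which a real analysis would likely need.

```python
import math


def knn_impute(rows, target, k):
    """Fill None entries in column `target` with a weighted mean of that
    column over the k nearest rows whose target value is known."""
    donors = [r for r in rows if r[target] is not None]
    filled = []
    for row in rows:
        if row[target] is not None:
            filled.append(list(row))
            continue
        scored = []
        for d in donors:
            # Assumption: Euclidean distance over all columns except the target.
            dist = math.sqrt(sum((a - b) ** 2
                                 for i, (a, b) in enumerate(zip(row, d))
                                 if i != target))
            scored.append((dist, d[target]))
        scored.sort(key=lambda pair: pair[0])
        nearest = scored[:k]
        # Assumption: inverse-distance weights; the small constant avoids
        # division by zero when a donor matches the row exactly.
        weights = [1.0 / (dist + 1e-9) for dist, _ in nearest]
        estimate = sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)
        new_row = list(row)
        new_row[target] = estimate
        filled.append(new_row)
    return filled


def mse(true_values, estimates):
    """Mean square error between known values and their imputed estimates."""
    return sum((t - e) ** 2 for t, e in zip(true_values, estimates)) / len(true_values)


# Synthetic example rows (not taken from the study): [age, bilirubin, albumin].
complete = [
    [30.0, 0.7, 4.0],
    [34.0, 0.9, 4.2],
    [50.0, 1.2, 3.5],
    [52.0, 2.3, 3.1],
    [45.0, 1.0, 3.9],
    [61.0, 4.6, 2.8],
]

# Hide one known albumin value, impute it, and score the estimate.
hidden_index, albumin = 2, 2
masked = [list(r) for r in complete]
masked[hidden_index][albumin] = None

for k in (1, 3, 5):
    imputed = knn_impute(masked, albumin, k)
    error = mse([complete[hidden_index][albumin]], [imputed[hidden_index][albumin]])
    print(f"K={k}: imputed albumin={imputed[hidden_index][albumin]:.3f}, MSE={error:.4f}")
```

Masking values that are actually known, as in the loop above, is one common way to obtain ground truth for the MSE; whether the study evaluated its imputations this way is not stated in the abstract.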
Database: Complementary Index