Missing value imputation on gene expression data using bee-based algorithm to improve classification performance.

Author:	Kritanat Chungnoy, Tanatorn Tanantong, Pokpong Songmuang
Language:	English
Year of publication:	2024
Subject:	Medicine Science
Source:	PLoS ONE, Vol 19, Iss 8, p e0305492 (2024)
Document type:	article
ISSN:	1932-6203
DOI:	10.1371/journal.pone.0305492
Description:	Existing missing value imputation methods focus on reproducing the actual values so that the completed dataset can serve as input for machine learning tasks. This work instead proposes imputing missing values so as to improve classification accuracy. The proposed method is based on the bee algorithm and uses k-nearest neighbors with linear regression to guide the search toward an appropriate solution and limit randomness. Within the process, the Gini importance score is used to select values for imputation. The imputed values are therefore chosen to improve discriminative power in classification tasks rather than to replicate the actual values of the original dataset. We evaluated the proposed method against frequently used imputation methods such as k-nearest neighbors, principal component analysis, and nonlinear principal component analysis, comparing root mean square error and the accuracy obtained when the imputed datasets are used in a classification task. The experimental results indicated that the proposed method obtained the best accuracy on all datasets compared with the other methods. Compared with the original dataset, the classification models built from the imputed datasets yielded 15-25% higher accuracy in class prediction. Analysis showed that the feature ranking used in the classification process was affected, leading to a noticeable change in informativeness, as the data imputed by the proposed method boosted discriminative power.
Database:	Directory of Open Access Journals
External link:	https://doaj.org/article/db1017468afd4aa6bedb145d801525df
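
Illustrative note: the description compares imputation methods by root mean square error (RMSE) against the true values. The sketch below shows only that baseline idea, using a simple k-nearest-neighbors imputer written from scratch. It is not the authors' bee-based method. The function names (`knn_impute`, `rmse`), the toy matrix, and the choice of k are assumptions made for illustration.

```python
import math

def knn_impute(rows, k=1):
    """Fill None entries with the mean of that column over the k nearest rows.

    Hypothetical helper for illustration; distance is the root mean squared
    difference over the columns observed in both rows.
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate neighbors: other rows in which column j is observed.
            candidates = [
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            ]
            if not candidates:
                raise ValueError(f"column {j} has no observed values to borrow from")
            candidates.sort(key=lambda c: c[0])
            nearest = [v for _, v in candidates[:k]]
            imputed[i][j] = sum(nearest) / len(nearest)
    return imputed

def rmse(truth, masked, imputed):
    """RMSE computed only over the cells that were missing in `masked`."""
    errors = [
        (t - p) ** 2
        for t_row, m_row, p_row in zip(truth, masked, imputed)
        for t, m, p in zip(t_row, m_row, p_row)
        if m is None
    ]
    return math.sqrt(sum(errors) / len(errors))

# Toy expression matrix (rows = samples, columns = genes); values are made up.
truth = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 3.1],
    [5.0, 6.0, 7.0],
    [5.2, 6.2, 7.2],
]
# Hide two known values so the imputation error can be measured.
masked = [list(r) for r in truth]
masked[0][2] = None
masked[3][0] = None

imputed = knn_impute(masked, k=1)
print("imputed cells:", imputed[0][2], imputed[3][0])
print("RMSE on masked cells:", rmse(truth, masked, imputed))
```

A low RMSE here only measures how closely the filled-in values match the hidden ones. The description's point is that the proposed method optimizes a different target, namely downstream classification accuracy, so the two measures need not agree.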