Comparison of the Performance of Random Forest and K-Nearest Neighbor in Classifying Leukemia Using Principal Component Analysis

Author: Sriani Sriani, Muhammad Ikhsan, Lailan Sofinah Harahap
Language: English
Year of publication: 2024
Subject:
Source: Jurnal Sisfokom, Vol 13, Iss 2, Pp 288-292 (2024)
Document type: article
ISSN: 2301-7988
2581-0588
DOI: 10.32736/sisfokom.v13i2.2165
Description: Leukemia is the most common blood cancer in Asia, including Indonesia. It can affect blood cells, bone marrow, lymph nodes, and other parts of the lymphatic system. One way to detect leukemia is to analyze gene expression data obtained with microarray technology. Microarray data contain a very large number of genes, so the number of genes must be reduced to eliminate irrelevant features and improve classification accuracy. In this study, feature (gene) reduction was carried out using Principal Component Analysis (PCA), and classification was carried out using Random Forest (RF) and K-Nearest Neighbor (KNN). The RF classifier with n_estimators = 100 achieved an accuracy of 78.57%. KNN achieved 78.57% with K=1, 85.71% with K=3 and K=5, and 71.42% with K=7 and K=9. The best accuracy was therefore obtained by KNN with K=3 and K=5.
Database: Directory of Open Access Journals
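
The workflow summarized in the description (PCA for gene reduction, then RF with 100 estimators and KNN with K = 1, 3, 5, 7, 9) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' code: it assumes scikit-learn, uses a synthetic dataset in place of the leukemia microarray data, and the sample count, gene count, number of principal components, train/test split, and the `evaluate` helper are all hypothetical choices that the record does not specify. Its accuracies will not match the reported figures.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a microarray matrix: few samples, many "genes".
# All sizes here are arbitrary placeholders, not taken from the article.
X, y = make_classification(
    n_samples=70, n_features=2000, n_informative=20,
    n_redundant=0, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)


def evaluate(classifier, n_components=10):
    """Fit scaler -> PCA -> classifier on the training split; return test accuracy."""
    model = make_pipeline(
        StandardScaler(),
        PCA(n_components=n_components, random_state=0),  # assumed component count
        classifier,
    )
    model.fit(X_train, y_train)
    return model.score(X_test, y_test)


# Random Forest with 100 trees, as stated in the description.
results = {
    "RF (n_estimators=100)": evaluate(
        RandomForestClassifier(n_estimators=100, random_state=0)
    )
}
# KNN for the K values listed in the description.
for k in (1, 3, 5, 7, 9):
    results[f"KNN (K={k})"] = evaluate(KNeighborsClassifier(n_neighbors=k))

for name, accuracy in results.items():
    print(f"{name}: {accuracy:.2%}")
```

Placing the scaler and PCA inside the pipeline means both are fitted on the training split only, so no information from the test samples leaks into the reduced features.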