Author: |
Amr A. Abd El-Mageed, Ahmed E. Elkhouli, Amr A. Abohany, Mona Gafar |
Language: |
English |
Publication year: |
2024 |
Source: |
Journal of Big Data, Vol 11, Iss 1, Pp 1-75 (2024) |
Document type: |
article |
ISSN: |
2196-1115 |
DOI: |
10.1186/s40537-024-00902-z |
Description: |
RNA sequencing (RNA-Seq) is considered a revolutionary technique for gene profiling and quantification. It offers a comprehensive view of the transcriptome, making it more expansive than microarrays. Genes that discriminate between malignant and normal samples can be deduced from quantitative gene expression. However, this data forms a high-dimensional dense matrix in which each sample has more than 20,000 genes, which makes it challenging to process. This paper proposes RBNRO-DE (Relief Binary NRO based on Differential Evolution) as a gene selection strategy for the (rnaseqv2 illuminahiseq rnaseqv2 un edu Level 3 RSEM genes normalized) data, picking the most informative of its more than 20,000 genes and assessing them on 22 cancer datasets. The k-nearest neighbor (k-NN) and support vector machine (SVM) classifiers are applied to assess the quality of the selected genes. Binary versions of the most common meta-heuristic algorithms are compared with the proposed RBNRO-DE algorithm. On most of the 22 cancer datasets, RBNRO-DE combined with the k-NN and SVM classifiers achieved optimal convergence and a classification accuracy of up to 100% together with a feature reduction of up to 98%, a clear advantage over its counterparts according to Wilcoxon's rank-sum test (5% significance level). |
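To make the wrapper idea in the description concrete, the following is a minimal sketch of how a binary gene mask might be scored with a k-NN classifier. It is not the RBNRO-DE algorithm and not the authors' fitness function: the weighted form `alpha * error + (1 - alpha) * kept`, the default `alpha=0.99`, the leave-one-out evaluation, and the names `loo_knn_accuracy` and `fitness` are all assumptions chosen for illustration.

```python
import math


def loo_knn_accuracy(X, y, mask, k=1):
    """Leave-one-out k-NN accuracy using only the genes where mask is 1.

    X is a list of samples (each a list of expression values), y the class
    labels, and mask a list of 0/1 flags, one per gene (all hypothetical names).
    """
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # assumption: an empty selection counts as a useless solution
    correct = 0
    for i, xi in enumerate(X):
        # Euclidean distances from sample i to every other sample,
        # restricted to the selected genes.
        dists = []
        for r, xr in enumerate(X):
            if r == i:
                continue
            d = math.sqrt(sum((xi[j] - xr[j]) ** 2 for j in cols))
            dists.append((d, y[r]))
        dists.sort(key=lambda t: t[0])
        votes = [label for _, label in dists[:k]]
        predicted = max(set(votes), key=votes.count)  # majority vote
        correct += predicted == y[i]
    return correct / len(X)


def fitness(X, y, mask, alpha=0.99, k=1):
    """Lower is better: weighted classification error plus fraction of genes kept.

    The weighting scheme and alpha value are illustrative assumptions.
    """
    error = 1.0 - loo_knn_accuracy(X, y, mask, k)
    kept = sum(mask) / len(mask)
    return alpha * error + (1.0 - alpha) * kept
```

A binary meta-heuristic would call a function like `fitness` on each candidate mask and prefer masks with lower values, so that high accuracy and a small gene subset are rewarded together.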
Database: |
Directory of Open Access Journals |