High Performance Classification of Cancer Types with Gene Microarray Datasets: Hybrid Approach

Author:	Yılmaz ATAY, Muhterem Oğuzhan YILDIRIM, Cuma Umur DOĞAN
Language:	English, Turkish
Publication year:	2021
Subject:	ensemble method; genetic algorithm; cancer; microarray; naive bayes; feature selection; classification; Engineering (General). Civil engineering (General); TA1-2040; Science; Science (General); Q1-390
Source:	Gazi Üniversitesi Fen Bilimleri Dergisi, Vol 9, Iss 4, Pp 811-827 (2021)
Document type:	article
ISSN:	2147-9526
DOI:	10.29109/gujsc.1000926
Description:	Currently, the detection of biologically meaningful information from gene microarray datasets is used effectively in many areas, such as disease diagnosis and the differentiation of cancer types. However, because microarray technology measures gene expression profiles collectively, the number of features in a dataset can be very high. The small number of samples, the large number of features, and the noise in the data significantly complicate the preparation of these datasets. For machine learning models to classify successfully, the number of features, which determines the dimensionality of the dataset, should be reduced. In the proposed method, gene microarray data is taken as input, and the Information Gain, Fisher Correlation Scoring, ReliefF, and Chi-Square methods are applied separately for feature selection. After this stage, a sub-dataset containing the selected genes is obtained, and a gene pool for the Genetic Algorithm is created from this sub-dataset. A Bayes classifier is then trained on the sub-dataset formed from the genes of the most successful chromosome. Thus, the classification of the cancer data is successfully completed. The proposed model was applied to datasets that are frequently used in the literature, and high success rates were obtained in classification. As a result, the acceptable feature selection methods and the hybrid method based on the Genetic Algorithm generally provided the most appropriate results on all test data.
Database:	Directory of Open Access Journals
External link:	https://doaj.org/article/0a1d58aad02142138629fc05ed9958bf
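
The pipeline summarized in the description (filter-based gene ranking, a gene pool for a Genetic Algorithm, and a Bayes classifier trained on the genes of the best chromosome) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not give the fitness function, the GA operators, or the pool size, so everything below is an assumption. A simple Fisher score stands in for the four filters, the fitness is hold-out accuracy of a Gaussian naive Bayes model, the data is synthetic, and all names (fisher_scores, ga_select, make_data, and the parameter values) are hypothetical.

```python
import math
import random

def fisher_scores(X, y):
    """Score each gene by between-class over within-class variance."""
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        mean_all = sum(col) / len(col)
        num = den = 0.0
        for c in set(y):
            vals = [v for v, label in zip(col, y) if label == c]
            m = sum(vals) / len(vals)
            num += len(vals) * (m - mean_all) ** 2
            den += sum((v - m) ** 2 for v in vals)
        scores.append(num / (den + 1e-12))
    return scores

def gnb_fit(X, y, genes):
    """Fit Gaussian naive Bayes on the chosen gene columns only."""
    model = {}
    for c in set(y):
        rows = [r for r, label in zip(X, y) if label == c]
        stats = []
        for g in genes:
            vals = [r[g] for r in rows]
            m = sum(vals) / len(vals)
            var = sum((v - m) ** 2 for v in vals) / len(vals) + 1e-6
            stats.append((m, var))
        model[c] = (math.log(len(rows) / len(y)), stats)
    return model

def gnb_predict(model, row, genes):
    """Return the class with the highest log posterior."""
    best, best_lp = None, -math.inf
    for c, (log_prior, stats) in model.items():
        lp = log_prior
        for g, (m, var) in zip(genes, stats):
            lp += -0.5 * math.log(2 * math.pi * var) - (row[g] - m) ** 2 / (2 * var)
        if lp > best_lp:
            best, best_lp = c, lp
    return best

def accuracy(X_tr, y_tr, X_va, y_va, genes):
    model = gnb_fit(X_tr, y_tr, genes)
    hits = sum(gnb_predict(model, r, genes) == t for r, t in zip(X_va, y_va))
    return hits / len(y_va)

def ga_select(X_tr, y_tr, X_va, y_va, pool, size=3, pop=12, gens=15, seed=0):
    """Evolve fixed-size gene subsets (chromosomes) drawn from the pool."""
    rng = random.Random(seed)
    fitness = lambda ch: accuracy(X_tr, y_tr, X_va, y_va, ch)
    population = [rng.sample(pool, size) for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop // 2]          # keep the fitter half
        children = []
        while len(parents) + len(children) < pop:
            a, b = rng.sample(parents, 2)
            # Crossover: draw the child's genes from the union of both parents.
            child = rng.sample(sorted(set(a) | set(b)), size)
            if rng.random() < 0.2:                # mutation: swap in a new gene
                candidates = [g for g in pool if g not in child]
                child[rng.randrange(size)] = rng.choice(candidates)
            children.append(child)
        population = parents + children
    return max(population, key=fitness)

def make_data(n=60, n_genes=50, informative=(3, 7, 11), seed=1):
    """Synthetic two-class data; only the 'informative' genes carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(n_genes)]
        for g in informative:
            row[g] += 3.0 * label
        X.append(row)
        y.append(label)
    return X, y

X, y = make_data()
X_tr, y_tr, X_va, y_va = X[:40], y[:40], X[40:], y[40:]
scores = fisher_scores(X_tr, y_tr)
pool = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:10]
best = ga_select(X_tr, y_tr, X_va, y_va, pool)
best_acc = accuracy(X_tr, y_tr, X_va, y_va, best)
print("selected genes:", sorted(best), "hold-out accuracy:", best_acc)
```

Because the same hold-out split drives the GA fitness, the printed accuracy is optimistic; an unbiased estimate would need a further untouched test set or nested cross-validation.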