A Comparative Analysis of Feature Selection and Feature Extraction Models for Classifying Microarray Dataset.

Author: Olaolu, Arowolo M., Abdulsalam, Sulaiman O., Mope, Isiaka R., Kazeem, Gbolagade A.
Subject:
Source: Computing & Information Systems; May 2018, Vol. 22 Issue 2, p29-38, 10p
Abstract: Purpose: This research applies dimensionality reduction methods to identify the smallest set of genes that contributes to efficient classification performance on microarray data. Design/Methodology/Approach: Using a colon cancer microarray dataset, one-way analysis of variance (ANOVA) is used as a feature selection technique because of its robustness and efficiency in selecting relevant information from high-dimensional data. Principal Component Analysis (PCA) and Partial Least Squares (PLS) are used as feature extraction techniques, projecting the reduced high-dimensional data into a low-dimensional space. Classification of the colon cancer dataset is carried out with a Support Vector Machine (SVM). The analysis was performed in MATLAB 2015. Findings: The study obtained high accuracies and compared the performance of the dimension reduction techniques. The PLS-based approach attained 95% accuracy, giving it an edge over the other dimension reduction methods (one-way ANOVA and PCA). Practical Implications: The major implication of this research concerns obtaining a local dataset in the study environment, which led to the use of an open-resource dataset. Originality: This study gives insight into the implications of high-dimensional data in microarray gene analysis. Applying dimensionality reduction helps remove irrelevant information that hinders the performance of microarray data technology. [ABSTRACT FROM AUTHOR]
Database: Complementary Index
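
Illustrative note: the abstract names one-way ANOVA as the feature selection step but gives no implementation details, and the study itself used MATLAB. The following is only a minimal Python sketch of how genes could be ranked by a one-way ANOVA F statistic. The function names (anova_f_score, select_top_genes), the toy expression values, and the "normal"/"tumor" labels are hypothetical and do not come from the paper; the PCA, PLS, and SVM stages are not shown.

```python
from statistics import mean


def anova_f_score(values, labels):
    """One-way ANOVA F statistic for one feature (gene) across class groups."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    n, k = len(values), len(groups)
    grand_mean = mean(values)
    # Variation of group means around the grand mean (between groups).
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    # Variation of samples around their own group mean (within groups).
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_top_genes(X, labels, top_k):
    """Return (indices of the top_k highest-scoring columns of X, all scores)."""
    n_features = len(X[0])
    scores = [anova_f_score([row[j] for row in X], labels) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:top_k], scores


# Hypothetical toy data: 6 samples (rows) x 3 genes (columns), not real measurements.
X = [
    [1.0, 5.0, 2.0],
    [1.2, 4.0, 2.5],
    [0.8, 6.0, 1.5],
    [3.0, 5.5, 2.8],
    [3.2, 4.5, 3.0],
    [2.8, 5.0, 2.2],
]
labels = ["normal"] * 3 + ["tumor"] * 3

selected, scores = select_top_genes(X, labels, top_k=2)
print(selected, scores)
```

In this toy setup the gene whose class means differ most relative to its within-class spread ranks first, and a gene with identical class means scores zero. In a pipeline like the one the abstract describes, the selected columns would then be passed to an extraction step such as PCA or PLS before classification.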