A cooperative feature gene extraction algorithm that combines classification and clustering

Author: Winston Patrick Kuo, Keith C. C. Chan, Chi Kin Chow, Mark W. Lingen, Hailong Zhu, Jessica Lacy
Year of publication: 2009
Subject:
Source: 2009 IEEE International Conference on Bioinformatics and Biomedicine Workshop.
Description: In feature gene selection, filtering models focus on classification accuracy while ignoring gene redundancy. Gene clustering, on the other hand, finds correlated genes without considering their predictive ability. Each approach can therefore compensate for the other's weakness. We report a new feature gene extraction algorithm, Double-thresholding Extraction of Feature Gene (DEFG), that combines gene filtering and gene clustering. It first pre-selects a feature gene set from the original dataset. A modified gene clustering is then applied to refine this set. In the gene clustering, specific designs are employed to balance the predictive ability and the redundancy of the extracted feature genes. We tested DEFG on a microarray dataset and compared its performance with that of two benchmark algorithms. The experimental results show that DEFG outperforms both in terms of internal and external validation accuracy. DEFG can also generalize the pattern structure from a small number of training samples.
Database: OpenAIRE
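
The description outlines a two-stage idea: filter genes by predictive ability, then refine the surviving set to reduce redundancy. The record does not give DEFG's actual scoring function, thresholds, or clustering procedure, so the following is only a minimal sketch of that general idea and not the authors' algorithm. The function name `filter_then_prune`, the signal-to-noise-style score, both threshold parameters, and the greedy correlation pruning (used here as a stand-in for the "modified gene clustering") are all assumptions made for illustration.

```python
import numpy as np


def filter_then_prune(X, y, score_threshold, corr_threshold):
    """Illustrative two-threshold gene selection (NOT the published DEFG).

    X: array of shape (n_samples, n_genes) with expression values.
    y: binary class labels (0 or 1), one per sample.
    Returns the indices of the selected genes, best-scoring first.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    class0, class1 = X[y == 0], X[y == 1]

    # Stage 1 (assumed filter): signal-to-noise-style relevance score per gene.
    eps = 1e-12  # avoids division by zero for genes with no within-class spread
    spread = class0.std(axis=0) + class1.std(axis=0) + eps
    score = np.abs(class0.mean(axis=0) - class1.mean(axis=0)) / spread

    # First threshold: keep only genes that are predictive enough.
    candidates = np.flatnonzero(score >= score_threshold)
    candidates = candidates[np.argsort(-score[candidates], kind="stable")]

    # Stage 2 (assumed stand-in for clustering): walk the candidates from
    # most to least predictive and drop any gene that is highly correlated
    # with a gene that has already been kept.
    selected = []
    for gene in candidates:
        redundant = any(
            abs(np.corrcoef(X[:, gene], X[:, kept])[0, 1]) >= corr_threshold
            for kept in selected
        )
        if not redundant:
            selected.append(int(gene))
    return selected
```

The first threshold controls how predictive a gene must be to survive filtering, and the second controls how much correlation between kept genes is tolerated; tightening either one shrinks the extracted set. A real implementation would need the scoring and clustering details from the paper itself.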