A cooperative feature gene extraction algorithm that combines classification and clustering
Author: | Winston Patrick Kuo, Keith C. C. Chan, Chi Kin Chow, Mark W. Lingen, Hailong Zhu, Jessica Lacy |
---|---|
Publication year: | 2009 |
Subject: | business.industry; Computer science; Feature extraction; Gene redundancy; Pattern recognition; computer.software_genre; Support vector machine; Set (abstract data type); Statistical classification; ComputingMethodologies_PATTERNRECOGNITION; Feature (computer vision); Benchmark (computing); ComputingMethodologies_GENERAL; Data mining; Artificial intelligence; business; Cluster analysis; computer |
Source: | 2009 IEEE International Conference on Bioinformatics and Biomedicine Workshop. |
Description: | In feature gene selection, the filtering model addresses classification accuracy but ignores the gene redundancy problem. Gene clustering, on the other hand, finds correlated genes without considering their predictive abilities. It is therefore valuable to improve the performance of each with the help of the other. We report a new feature gene extraction algorithm, Double-thresholding Extraction of Feature Gene (DEFG), that combines gene filtering and gene clustering. It first pre-selects a feature gene set from the original dataset. A modified gene clustering is then applied to refine this set. In the gene clustering, specific designs are employed to balance the predictive abilities and the redundancies of the extracted feature genes. We tested DEFG on a microarray dataset and compared its performance with that of two benchmark algorithms. The experimental results show that DEFG outperforms both in terms of internal and external validation accuracy. DEFG can also generalize the pattern structure from a small number of training samples. |
Database: | OpenAIRE |
External link: |