A Novel Statistical Feature Selection Measure for Decision Tree Models on Microarray Cancer Detection
Author: | Janardhan Reddy Ummadi, B. Eswara Reddy, B. Venkata Ramana Reddy |
---|---|
Year of publication: | 2017 |
Subject: | Measure (data warehouse); medical informatics; Computer science; Supervised learning; Decision tree; Feature selection; engineering and technology; Machine learning; Ensemble learning; Random forest; medical and health sciences; Tree (data structure); ComputingMethodologies_PATTERNRECOGNITION; clinical medicine; oncology & carcinogenesis; Random tree; electrical engineering, electronic engineering, information engineering; Artificial intelligence |
Source: | Proceedings of International Conference on Computational Intelligence and Data Engineering ISBN: 9789811063183 |
DOI: | 10.1007/978-981-10-6319-0_20 |
Description: | Recently, machine learning techniques have become popular and widely accepted for cancer detection and classification. Cancer prediction focuses on three main objectives: susceptibility prediction, recurrence prediction, and survivability prediction. Most conventional classification techniques deal with limited attributes and small datasets. The random forest classifier is an ensemble learning model capable of handling datasets with a large number of attributes. The machine learning algorithms used for cancer prediction are supervised learning methods with a high prediction rate. In this paper, a novel statistical attribute selection measure is implemented for cancer prediction. Different decision tree models (random tree, random forest, and Hoeffding tree) are used to evaluate the performance of cancer prediction with the proposed attribute selection measure. Experimental results are evaluated on different types of microarray cancer datasets, including lung cancer, ovarian, and DLBCL-Stanford. The performance of each model is compared in order to find the most efficient and optimized algorithm. Experimental results show that the proposed model has high computational efficiency in terms of accuracy and true positive rate. |
Database: | OpenAIRE |