A Novel Statistical Feature Selection Measure for Decision Tree Models on Microarray Cancer Detection

Authors:	Janardhan Reddy Ummadi, B. Eswara Reddy, B. Venkata Ramana Reddy
Publication year:	2017
Subject:	Measure (data warehouse); Medical informatics; Computer science; Supervised learning; Decision tree; Feature selection; Engineering and technology; Machine learning; Ensemble learning; Random forest; Medical and health sciences; Tree (data structure); Pattern recognition; Clinical medicine; Oncology & carcinogenesis; Random tree; Electrical engineering, electronic engineering, information engineering; Artificial intelligence
Source:	Proceedings of International Conference on Computational Intelligence and Data Engineering ISBN: 9789811063183
DOI:	10.1007/978-981-10-6319-0_20
Description:	Recently, machine learning techniques have become popular and widely accepted for cancer detection and classification. Cancer prediction focuses on three main objectives: susceptibility prediction, recurrence prediction, and survivability prediction. Most conventional classification techniques deal with a limited number of attributes and small datasets. The random forest classifier is an ensemble learning model that is capable of handling datasets with a large number of attributes. The machine learning algorithms used for cancer prediction are supervised learning methods with a high prediction rate. In this paper, a novel statistical attribute selection measure is implemented for cancer prediction. Different decision tree models (random tree, random forest, and Hoeffding tree) are used to evaluate the performance of cancer prediction with the proposed attribute selection measure. Experiments are carried out on different types of microarray cancer datasets, including lung cancer, ovarian, lung cancer, and DLBCL-Stanford. The performance of each model is compared in order to find the most efficient and optimized algorithm. Experimental results show that the proposed model achieves high accuracy and true positive rate.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::90d64dbde1bf1d25e184ca6d220d6d6b https://doi.org/10.1007/978-981-10-6319-0_20
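
The description above mentions a statistical attribute selection measure and an evaluation by accuracy and true positive rate, but this record does not define the measure itself. As a rough illustration only, the sketch below ranks attributes with an absolute Welch t-statistic, which is a generic stand-in and not the paper's proposed measure, and computes the two evaluation metrics from predicted labels. The function names (`welch_t`, `rank_features`, `accuracy_and_tpr`) and the toy expression values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Absolute Welch t-statistic between two samples (each needs >= 2 values).

    Used here as a generic stand-in for a statistical attribute score;
    it is NOT the measure proposed in the paper.
    """
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return abs(mean(a) - mean(b)) / se if se > 0 else 0.0


def rank_features(X, y):
    """Return attribute indices sorted from most to least discriminative."""
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(welch_t(pos, neg))
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


def accuracy_and_tpr(y_true, y_pred):
    """Accuracy = (TP + TN) / N; true positive rate = TP / (TP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    acc = (tp + tn) / len(pairs)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    return acc, tpr


# Hypothetical toy data: 6 samples, 3 attributes; only attribute 0
# separates the two classes (1 = tumor, 0 = normal).
X = [
    [5.1, 0.9, 2.0],
    [4.9, 1.1, 2.1],
    [5.3, 1.0, 1.9],
    [1.0, 1.0, 2.0],
    [1.2, 0.8, 2.2],
    [0.9, 1.2, 1.8],
]
y = [1, 1, 1, 0, 0, 0]

ranking = rank_features(X, y)
acc, tpr = accuracy_and_tpr(y, [1, 1, 0, 0, 0, 1])
```

In a pipeline of the kind the description outlines, the top-ranked attributes would be passed on to a decision tree learner such as a random forest; that step is omitted here because the record gives no details of the models' configuration.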