ConfDTree: A Statistical Method for Improving Decision Trees
Authors: | Asaf Shabtai, Gilad Katz, Nir Ofek, Lior Rokach |
---|---|
Year of publication: | 2014 |
Subject: | Incremental decision tree; Training set; Computer science; business.industry; Decision tree learning; ID3 algorithm; Decision tree; Machine learning; computer.software_genre; Multiple-criteria decision analysis; Confidence interval; Computer Science Applications; Theoretical Computer Science; Computational Theory and Mathematics; Hardware and Architecture; Outlier; Theory of computation; Alternating decision tree; Data mining; Artificial intelligence; business; computer; Software |
Source: | Journal of Computer Science and Technology. 29:392-407 |
ISSN: | 1860-4749 1000-9000 |
Description: | Decision trees have three main disadvantages: reduced performance when the training set is small, rigid decision criteria, and the fact that a single “uncharacteristic” attribute might “derail” the classification process. In this paper we present ConfDTree (Confidence-Based Decision Tree), a post-processing method that enables decision trees to better classify outlier instances. This method, which can be applied to any decision tree algorithm, uses easy-to-implement statistical methods (confidence intervals and two-proportion tests) to identify hard-to-classify instances and to propose alternative routes. The experimental study indicates that the proposed post-processing method consistently and significantly improves the predictive performance of decision trees, particularly for small, imbalanced, or multi-class datasets, for which an average improvement of 5% to 9% in AUC is reported. |
Database: | OpenAIRE |
External link: |
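
The description names two statistical tools, confidence intervals and two-proportion tests, without giving details of how ConfDTree applies them. As a rough illustration of those two tools only, the following Python sketch computes a normal-approximation (Wald) confidence interval for a proportion and a pooled two-proportion z-test. The function names, the choice of the Wald interval, and the example counts are assumptions made for illustration; this is not the authors' algorithm.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def proportion_confidence_interval(successes: int, n: int, z: float = 1.96):
    """Wald (normal-approximation) interval for a proportion.

    z = 1.96 corresponds to roughly 95% confidence. The Wald form is an
    illustrative assumption; the record does not say which interval is used.
    """
    p = successes / n
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int):
    """Two-sided z-test of H0: p1 == p2, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    if se == 0.0:
        # Both samples are all-success or all-failure: no detectable difference.
        return 0.0, 1.0
    z = (p1 - p2) / se
    p_value = 2.0 * (1.0 - normal_cdf(abs(z)))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts: class-A instances among those reaching two sibling nodes.
    low, high = proportion_confidence_interval(50, 100)
    z, p_value = two_proportion_z_test(90, 100, 50, 100)
    print(f"95% CI: ({low:.3f}, {high:.3f})")
    print(f"z = {z:.2f}, p-value = {p_value:.2g}")
```

In a decision-tree setting, one plausible use of such a test is to check whether the class proportions in two branches differ significantly before trusting a split for a borderline instance; whether ConfDTree does exactly this is not stated in the record.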