Popis: |
In many machine learning applications, there are many scenarios when performance is not satisfactory by single classifiers. In this case, an ensemble classification is constructed using several weak base learners to achieve satisfactory performance. Unluckily, the construction of the ensemble classification is empirical, i.e., to try an ensemble classification and if performance is not satisfactory then discard it. In this paper, a challenging analytical problem of the estimation of ensemble classification using the prediction performance of the base learners is considered. The proposed formulation is aimed at estimating the performance of ensemble classification without physically developing it, and it is derived from the perspective of probability theory by manipulating the decision probabilities of the base learners. For this purpose, the output of a base learner (which is either true positive, true negative, false positive, or false negative) is considered as a random variable. Then, the effects of logical disjunction-based and majority voting-based decision combination strategies are analyzed from the perspective of conditional joint probability. To evaluate the forecasted performance of ensemble classifier by the proposed methodology, publicly available standard datasets have been employed. The results show the effectiveness of the derived formulations to estimate the performance of ensemble classification. In addition to this, the theoretical and experimental results show that the logical disjunction-based decision outperforms majority voting in imbalanced datasets and cost-sensitive scenarios. |