A two-level sampling strategy for pruning methods applied to credit scoring
Author: | Luiz Vieira e Silva Filho, George D. C. Cavalcanti |
---|---|
Year of publication: | 2020 |
Subject: |
Computer science; Sampling (statistics); Context (language use); Machine learning; Ensemble learning; Task (computing); Complementarity (molecular biology); Pruning (decision trees); Artificial intelligence; Subspace topology; networking & telecommunications; engineering and technology; electrical engineering, electronic engineering, information engineering; artificial intelligence & image processing |
Source: | SMC |
DOI: | 10.1109/smc42975.2020.9283116 |
Description: | Multiple Classifier Systems (MCS) are based on the idea that combining the opinions of several experts can yield better results than relying on a single expert. Several MCS techniques have been developed; each has its strengths and weaknesses depending on the context in which it is applied. This work presents a two-level sampling strategy for pruning methods applied to the credit scoring task. The first step of the proposal generates a pool of classifiers using two well-known sampling methods, bagging and random subspace, which complement each other to produce a diverse pool. A pruning method then reduces the generated pool, keeping only the most competent classifiers. The proposal thus improves the MCS in terms of both accuracy and computational effort, since only a small percentage of the original pool is stored. The proposed architecture is evaluated in a credit scoring application, and the results show that it obtains better accuracy rates than the single-best approach and methods from the literature. These results were obtained with ensembles whose sizes were around 20% of the original pools generated in the training phase. |
Database: | OpenAIRE |
External link: |
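
The pipeline outlined in the description (pool generation by bagging plus random subspace, then pruning, then combination) can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the synthetic data standing in for a credit dataset, the nearest-centroid base learner, the choice to apply both sampling methods to each classifier, the accuracy-ranking pruning rule, and all names (`make_data`, `CentroidClassifier`, `generate_pool`, `prune`, `vote`). The record does not say which pruning method or base classifier the paper uses. Only the 20% retention figure comes from the description.

```python
import random
from collections import Counter


def make_data(n, n_features, rng):
    """Synthetic two-class data (hypothetical stand-in for a credit dataset)."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Class 1 is shifted by +1.0 on every feature.
        X.append([rng.gauss(label * 1.0, 1.0) for _ in range(n_features)])
        y.append(label)
    return X, y


class CentroidClassifier:
    """Nearest-centroid base learner restricted to a feature subset (assumed)."""

    def __init__(self, features):
        self.features = features
        self.centroids = {}

    def fit(self, X, y):
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [
                sum(r[f] for r in rows) / len(rows) for f in self.features
            ]
        return self

    def predict_one(self, x):
        def dist(centroid):
            return sum((x[f] - v) ** 2 for f, v in zip(self.features, centroid))

        return min(self.centroids, key=lambda label: dist(self.centroids[label]))


def generate_pool(X, y, pool_size, subspace_size, rng):
    """Train each classifier on a bootstrap sample of rows and a random feature subset."""
    n, d = len(X), len(X[0])
    pool = []
    for _ in range(pool_size):
        rows = [rng.randrange(n) for _ in range(n)]           # bagging: sample rows with replacement
        feats = sorted(rng.sample(range(d), subspace_size))   # random subspace: sample features
        clf = CentroidClassifier(feats).fit([X[i] for i in rows], [y[i] for i in rows])
        pool.append(clf)
    return pool


def accuracy(predict, X, y):
    return sum(predict(x) == t for x, t in zip(X, y)) / len(y)


def prune(pool, X_val, y_val, keep_fraction):
    """Assumed pruning rule: keep the classifiers with the best validation accuracy."""
    ranked = sorted(pool, key=lambda c: accuracy(c.predict_one, X_val, y_val), reverse=True)
    k = max(1, round(keep_fraction * len(pool)))
    return ranked[:k]


def vote(ensemble, x):
    """Combine the pruned ensemble by majority vote."""
    return Counter(c.predict_one(x) for c in ensemble).most_common(1)[0][0]


rng = random.Random(0)
X_train, y_train = make_data(200, 8, rng)
X_val, y_val = make_data(100, 8, rng)

pool = generate_pool(X_train, y_train, pool_size=50, subspace_size=4, rng=rng)
ensemble = prune(pool, X_val, y_val, keep_fraction=0.2)  # about 20% of the pool, as in the description
ensemble_accuracy = accuracy(lambda x: vote(ensemble, x), X_val, y_val)
```

Ranking by individual accuracy is only one of many possible pruning criteria. A faithful evaluation would also score the pruned ensemble on a separate test set, because the validation data used for ranking gives an optimistic estimate.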