An Improved C4.5 Algorithm using Principle of Equivalent of Infinitesimal and Arithmetic Mean Best Selection Attribute for Large Dataset
Author: | B. Z. Yahaya, Jamila Musa Amshi, Abdulkadir Ahmad, L. J. Muhammad, Muhammed Besiru Jibrin, I.A. Mohammed Besiru Jibrin |
---|---|
Year of publication: | 2020 |
Subject: | 020205 medical informatics; Computer science; Infinitesimal; Decision tree; 02 engineering and technology; computer.software_genre; 03 medical and health sciences; Statistical classification; ComputingMethodologies_PATTERNRECOGNITION; 0302 clinical medicine; C4.5 algorithm; 0202 electrical engineering, electronic engineering, information engineering; 030212 general & internal medicine; Data mining; computer; Time complexity; Scaling; Selection (genetic algorithm); Arithmetic mean |
Source: | 2020 10th International Conference on Computer and Knowledge Engineering (ICCKE). |
DOI: | 10.1109/iccke50421.2020.9303622 |
Description: | Scaling up data-mining classification algorithms to very large datasets has attracted growing interest in recent years. Many techniques have been employed to improve these algorithms, but efficient data-mining classification algorithms that show only a minimal decrease in accuracy and little increase in time complexity remain very important. The C4.5 algorithm is one of the data-mining classification algorithms used for uncovering hidden patterns and gleaning useful and novel knowledge in such large datasets. This work proposes a new C4.5 data-mining algorithm with lower time complexity on large datasets than the traditional C4.5 algorithm; on smaller datasets, however, the traditional C4.5 algorithm has the lower time complexity. The new algorithm was improved using the Principle of Equivalent of Infinitesimal and Arithmetic Mean Best Selection Attribute. |
Database: | OpenAIRE |
External link: |
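
As background to the description above, the split criterion of the traditional C4.5 algorithm (gain ratio) can be sketched as follows. This is a minimal sketch of the standard textbook criterion for a categorical attribute, not the improved method proposed in the paper, whose details this record does not give. The function names `entropy` and `gain_ratio` are illustrative assumptions.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of the value distribution in `items`."""
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of one categorical attribute (hypothetical helper).

    `values` holds the attribute value of each record and `labels`
    holds the class label of each record, in the same order.
    """
    n = len(labels)

    # Partition the class labels by attribute value.
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)

    # Information gain: class entropy minus the weighted entropy after the split.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - conditional

    # Split information: entropy of the attribute's own value distribution.
    split_info = entropy(values)

    # An attribute with a single value cannot split the data.
    return gain / split_info if split_info > 0 else 0.0
```

In this sketch, an attribute that separates the classes perfectly scores highest, while an attribute that is independent of the class scores zero. The logarithm evaluations inside `entropy` are repeated for every candidate attribute at every node, which is where the cost accumulates on large datasets.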