Description: |
Extracting the latent core value from large volumes of data is central to big data. In this study, we propose a data discretization algorithm, the improved class-attribute contingency coefficient (ICACC) algorithm, based on rough set theory and the class-attribute contingency coefficient criterion. ICACC effectively reduces the data distortion rate by selecting appropriate discretization points and by adding a constraint on the data inconsistency rate. In tests with the C4.5 algorithm, ICACC achieves an improvement of nearly 10% in recognition rate and calculation accuracy over traditional algorithms. |