Description: |
This paper proposes a new scheme for reducing non-technical loss, a component of line loss in the distribution network. Using raw data from the power system, the differences between normal users and fraudulent users, especially in power consumption, are explored first. An ensemble decision tree model is then developed on the Spark platform; it can run on a distributed system for big data analysis and efficiently identify suspected customers. Experimental results show that, compared with an individual decision model, the proposed ensemble model has strong parallel computing ability, is suitable for analyzing massive data, and is valuable in practical applications. |