A PCA-based variable ranking and selection approach for electric energy load forecasting

Authors: Francisco Elânio Bezerra, Flavio Grassi, Cleber Gustavo Dias, Fabio Henrique Pereira
Publication year: 2022
Subject:
Source: International Journal of Energy Sector Management. 16:1172-1191
ISSN: 1750-6220
DOI: 10.1108/ijesm-12-2019-0009
Description: Purpose: This paper proposes an approach based on principal component analysis (PCA) that defines a contribution rate for each variable and then selects the main variables as inputs to a neural network for energy load forecasting in southeastern Brazil. Design/methodology/approach: The proposed approach defines the contribution rate of each variable as a weighted sum of the inner products between the variable and each principal component. The contribution rate is then used to select the most important features from 27 variables and 6,815 electricity data records for a multilayer perceptron prediction model trained with backpropagation. Several tests are carried out to predict energy load (GWh), starting with the most significant variable as the only input and adding the next most significant variable at each step. The Kaiser–Meyer–Olkin and Bartlett sphericity tests were used to verify the overall consistency of the data for factor analysis. Findings: Although energy load forecasting is an area for which databases with tens or hundreds of variables are available, the approach selected only six variables that contribute more than 85% to the model. While the plant variables, with energy exchange added, account for only 14.14% of the contribution, the stored energy variable has a contribution rate of 26.31% and is fundamental to prediction accuracy. Originality/value: Besides improving forecasting accuracy and providing a faster predictor, the proposed PCA-based approach for calculating the contribution rate of input variables provides a better understanding of the underlying process that generated the data, which is fundamental to the Brazilian reality because of its accentuated climatic and economic variations.
Database: OpenAIRE
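
The contribution rate described in the abstract (a weighted sum of the inner products between a variable and each principal component) can be sketched roughly as follows. This is only an illustration, not the authors' implementation: the record does not state the weights, so the explained-variance ratio of each component is assumed here, and taking absolute values, normalizing the rates to sum to 1, and the 0.85 cumulative cutoff are likewise assumptions. The function name `contribution_rates` and the synthetic data are hypothetical.

```python
import numpy as np


def contribution_rates(X):
    """Hypothetical helper: PCA-based contribution rate for each column of X."""
    X = np.asarray(X, dtype=float)
    # Standardize each variable to zero mean and unit variance (assumption).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # PCA through the eigendecomposition of the correlation matrix.
    corr = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]
    scores = Z @ eigvecs                       # principal component scores
    # Assumed weights: share of variance explained by each component.
    weights = eigvals / eigvals.sum()
    # Inner product between every variable and every component (absolute value).
    inner = np.abs(Z.T @ scores)               # shape (n_vars, n_components)
    raw = inner @ weights                      # weighted sum over components
    return raw / raw.sum()                     # normalized so rates sum to 1


# Synthetic stand-in data; the paper's 27-variable dataset is not reproduced here.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 2))
X = np.column_stack([
    base[:, 0],
    base[:, 0] + 0.1 * rng.normal(size=200),
    base[:, 1],
    rng.normal(size=200),
    rng.normal(size=200),
])

rates = contribution_rates(X)
ranking = np.argsort(rates)[::-1]              # most significant variable first
cumulative = np.cumsum(rates[ranking])
# Smallest number of top-ranked variables whose rates reach the assumed cutoff.
n_selected = int(np.searchsorted(cumulative, 0.85)) + 1
print("ranking:", ranking, "selected:", ranking[:n_selected])
```

The ranked variables could then be added one at a time as inputs to a forecasting model, mirroring the incremental tests mentioned in the abstract.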