Analysis of feature matrix in machine learning algorithms to predict energy consumption of public buildings
Authors: | Yong Ding, Lingxiao Fan, Xue Liu |
---|---|
Year of publication: | 2021 |
Subject: | Data collection; Computer science; 020209 Energy; Mechanical Engineering; Big data; Supervised learning; 0211 Other engineering and technologies; 02 Engineering and technology; Building and Construction; Energy consumption; Variance (accounting); Machine learning; Kernel (statistics); 021105 Building & construction; 0202 Electrical engineering, electronic engineering, information engineering; Feature (machine learning); Artificial intelligence; Electrical and Electronic Engineering; Algorithm; Decision tree model; Civil and Structural Engineering |
Source: | Energy and Buildings. 249:111208 |
ISSN: | 0378-7788 |
DOI: | 10.1016/j.enbuild.2021.111208 |
Description: | With the growing availability of building information and energy consumption data, machine learning methods are increasingly used to predict and analyze building energy consumption. In this study, based on the actual energy consumption data of 2370 public buildings in Chongqing, we used six machine learning algorithms and recursive feature elimination to analyze the importance of each feature in the dataset. Analyzing feature importance first requires establishing optimal prediction models, and XGBoost demonstrated superior accuracy and efficiency. Regardless of the algorithm, the cumulative contribution rate of the top ten features exceeds 80%, and the marginal utility clearly diminishes as further features are added. Learning algorithms with similar kernels judge feature importance similarly. Tree model-based algorithms can achieve satisfactory performance with fewer features than linear kernel-based algorithms. Furthermore, the dataset plays a crucial role in model performance. To achieve professional supervised learning, two conditions need to be considered simultaneously in data collection: the importance of the features in physical processes and whether the samples have adequate variance on these features. Thus, this study can provide a reference for database establishment and big data analysis of urban building energy consumption. |
Database: | OpenAIRE |
External link: |
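
The cumulative contribution rate mentioned in the description can be illustrated with a minimal sketch. It assumes that each feature already has a non-negative importance score from some fitted model. The feature names, the scores, and the helper functions `cumulative_contribution` and `features_needed` are hypothetical, invented for illustration, and not taken from the study.

```python
from itertools import accumulate


def cumulative_contribution(importances):
    """Rank features by descending importance and return (names, cumulative shares)."""
    total = sum(importances.values())
    if total <= 0:
        raise ValueError("importances must sum to a positive value")
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    names = [name for name, _ in ranked]
    shares = [value / total for _, value in ranked]
    return names, list(accumulate(shares))


def features_needed(importances, threshold=0.80):
    """Smallest number of top-ranked features whose cumulative share reaches the threshold."""
    _, cumulative = cumulative_contribution(importances)
    for count, share in enumerate(cumulative, start=1):
        if share >= threshold:
            return count
    return len(cumulative)


# Hypothetical importance scores, invented for illustration only.
example = {
    "floor_area": 6.0,
    "building_type": 3.0,
    "year_built": 0.5,
    "storeys": 0.5,
}

names, cumulative = cumulative_contribution(example)
k = features_needed(example, threshold=0.80)
print(names, cumulative, k)
```

Plotting the cumulative shares against the number of features would show the flattening curve that the description calls diminishing marginal utility.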