Description: |
As power grids expand, the computational cost and time of the unit commitment problem (UCP) grow to the point where time-efficient scheduling becomes a serious challenge. Existing optimization methods, namely mathematical programming methods and meta-heuristic algorithms, are too time-consuming to handle computationally expensive UCPs (CEUCPs) effectively. This motivates the use of reinforcement learning methods, which offer strong inference capability and short solution times, for tackling CEUCPs. In this paper, a novel actor–critic (AC) reinforcement learning methodology driven by expert knowledge and data is proposed for solving CEUCPs. Specifically, the proposed methodology integrates expert knowledge, a data-driven surrogate model, and an improved meta-heuristic algorithm to enhance performance further. First, a novel action selection mechanism, based on expert knowledge of thermal unit characteristics, is integrated into AC to improve the efficiency of network training. Second, an improved extreme learning machine (ELM) data-driven surrogate model is proposed to build the reward function in AC: the original cost function in the reward is replaced by a lightweight ELM model, and shape distance is integrated into the ELM to enhance its accuracy. Finally, the original marine predators algorithm (MPA) is improved to obtain the optimal dispatching decisions and rewards of the AC method quickly and accurately; its original search pattern is replaced by a quantum-based representation to accelerate convergence. The performance of the proposed AC framework is verified by simulations on 10-unit, 100-unit, and 100-unit-with-wind-energy test systems. |