Improved Random Forest Algorithm Based on Decision Paths for Fault Diagnosis of Chemical Process with Incomplete Data
Author: | Lei Luo, Xu Ji, Yiyang Dai, Yuequn Zhang |
---|---|
Language: | English |
Publication year: | 2021 |
Subject: | Big Data; Majority rule; Incomplete data; Chemical Phenomena; Computer science; Decision tree; Sample (statistics); TP1-1185; Biochemistry; Fault detection and isolation; Article; Analytical Chemistry; Reliability scores; Electrical and Electronic Engineering; Instrumentation; Reliability (statistics); Chemical technology; Reproducibility of Results; Fault diagnosis; Missing data; Decision path; Atomic and Molecular Physics and Optics; Random forest; Tree (data structure); Data mining; Algorithms |
Source: | Sensors (Basel, Switzerland), Vol 21, Iss 20, p 6715 (2021) |
ISSN: | 1424-8220 |
Description: | Fault detection and diagnosis (FDD) has received considerable attention with the advent of big data. Many data-driven FDD procedures have been proposed, but most of them may lose accuracy when some data are missing. This paper therefore proposes an improved random forest (RF) based on decision paths, named DPRF, which uses correction coefficients to compensate for the influence of incomplete data. In the DPRF model, intact training samples are first used to grow all the decision trees in the RF. Then, for each test sample that may contain missing values, the decision paths and the importance scores of the corresponding nodes are obtained, so that a reliability score for the sample can be inferred for each tree in the RF. The prediction of each decision tree for the sample is thus assigned a reliability score. The final prediction is obtained by majority voting that combines the per-tree predictions with their reliability scores. To demonstrate the feasibility and effectiveness of the proposed method, it is tested on the Tennessee Eastman (TE) process. Compared with other FDD methods, the proposed DPRF model performs better on incomplete data. |
Database: | OpenAIRE |
External link: |
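
The reliability-weighted voting outlined in the description can be illustrated with a minimal sketch. The record does not give the paper's actual correction coefficients, so everything concrete here is an assumption made for illustration: the `Split`/`Leaf` tree layout, the per-node `fallback` value used when a feature is missing, the reliability formula (one minus the share of path importance carried by nodes whose feature is missing), and the toy forest itself.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple, Union


@dataclass
class Leaf:
    label: str


@dataclass
class Split:
    feature: int       # index of the feature tested at this node
    threshold: float   # go left if value <= threshold, otherwise right
    importance: float  # node importance score (assumed to come from training)
    fallback: float    # substitute for a missing value (assumption, e.g. a training mean)
    left: "Node"
    right: "Node"


Node = Union[Leaf, Split]
Sample = Sequence[Optional[float]]  # None marks a missing measurement


def predict_with_reliability(tree: Node, x: Sample) -> Tuple[str, float]:
    """Follow the decision path of x and return (label, reliability in [0, 1])."""
    total = 0.0    # importance of every split node on the path
    missing = 0.0  # importance of path nodes whose feature is missing in x
    node = tree
    while isinstance(node, Split):
        value = x[node.feature]
        total += node.importance
        if value is None:
            missing += node.importance
            value = node.fallback
        node = node.left if value <= node.threshold else node.right
    # Hypothetical reliability score: share of path importance backed by observed data.
    reliability = 1.0 if total == 0 else 1.0 - missing / total
    return node.label, reliability


def reliability_weighted_vote(forest: List[Node], x: Sample) -> str:
    """Sum each tree's reliability under its predicted label; return the top label."""
    scores = defaultdict(float)
    for tree in forest:
        label, reliability = predict_with_reliability(tree, x)
        scores[label] += reliability
    return max(scores, key=scores.get)


# Toy forest: two trees test feature 0, one tree tests feature 1.
tree_a = Split(0, 0.5, 0.6, 0.0, Leaf("normal"), Leaf("fault"))
tree_b = Split(0, 0.4, 0.8, 0.0, Leaf("normal"), Leaf("fault"))
tree_c = Split(1, 10.0, 0.7, 0.0, Leaf("normal"), Leaf("fault"))
forest = [tree_a, tree_b, tree_c]

sample = [None, 12.0]  # feature 0 is missing
print(reliability_weighted_vote(forest, sample))
```

In this toy case, two of the three trees vote "normal" only because they fall back on a substitute for the missing feature, so an unweighted majority vote would return "normal". Weighting each vote by its reliability lets the single tree that relies on observed data decide the result.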