Numerical Estimation of Drinking Water Quality Index Using Tree Methods and Combined Wavelet Approaches and Principal Component Analysis
Author: | M.T. Sattari, S. Javidan |
---|---|
Language: | Persian |
Publication year: | 2023 |
Subject: | |
Source: | مجله آب و خاک, Vol 36, Iss 6, Pp 695-709 (2023) |
Document type: | article |
ISSN: | 2008-4757 2423-396X |
DOI: | 10.22067/jsw.2022.78452.1196 |
Description: | Introduction The quality of surface water and groundwater is one of the world's most important environmental concerns. Over the last few decades, rapid population growth has increased water demand and, with it, the pollutant load entering water bodies. Among the many methods for classifying the quality of groundwater and surface water according to the type of consumption, quality indices are among the most widely used. Given the limited facilities available at water quality monitoring stations and the need to save time and money, modern data mining methods offer a useful alternative for predicting and classifying water quality. Water extraction for domestic use, agricultural production, mineral and industrial production, electricity generation, and other uses can degrade water quality and quantity, which affects the aquatic ecosystem, that is, the community of organisms that live and interact there. Evaluating surface water quality is therefore very important in water-environment management and in monitoring pollutant concentrations in rivers. The aim of this research was to estimate the numerical values of the drinking water quality index (WQI) using a tree-based method and to investigate the effect of the wavelet transform, the Bagging method, and principal component analysis. Materials and Methods The WQI was calculated from quality parameters measured at the Bagh Kalaye hydrometric station over a 23-year period (1998-2020): total hardness (TH), acidity (pH), electrical conductivity (EC), total dissolved solids (TDS), calcium (Ca), sodium (Na), magnesium (Mg), potassium (K), chloride (Cl), carbonate (CO3), bicarbonate (HCO3), and sulfate (SO4). The calculated WQI values were taken as the target outputs. 
Input combinations were determined using the Relief and correlation methods. The random tree (RT) method was used to estimate the numerical values of the WQI. The capability of combined approaches based on the wavelet transform, principal component analysis (PCA), and the Bagging method with a random tree base algorithm was then evaluated. To compare the values estimated by the data mining methods with the calculated WQI values, four evaluation criteria were used: the correlation coefficient (R), root mean square error (RMSE), mean absolute error (MAE), and the modified Willmott index (Dr). Results and Discussion The wavelet transform and the Bagging method both improved the modeling results. Because the Bagging method with a random tree base algorithm combines the results of several random trees, it increased the accuracy of the RT model. In general, it was concluded that wavelet transformation and ensemble (Bagging) methods increase accuracy and reduce error. The best scenario, with the highest accuracy and the lowest error, was scenario 10 of the W-B-RT model, which used total hardness, electrical conductivity, total dissolved solids, sulfate, calcium, bicarbonate, magnesium, chloride, sodium, and potassium as inputs. The results showed that pH has less effect on the estimated WQI value than the other parameters. When principal component analysis was applied, the eigenvalues decreased from F1 to F12, and the contribution of each successive factor decreased with them; as a result, F1, F2, and F3 were selected as the principal components. Modeling with these 3 factors gave R=0.98, RMSE=2.17, MAE=1.52, and Dr=0.97. 
In general, the results showed that the PCA method, while reducing the dimension of the input vectors and simplifying the model, can improve the model's accuracy and speed, and it is proposed as the best method for estimating the numerical value of the WQI. Conclusion The results of this research showed that the wavelet transform, Bagging, and PCA methods improved the results and increased accuracy. In estimating the numerical values of the WQI, the PCA-B-RT method with 3 principal factors had the highest accuracy, with a correlation coefficient of 0.98, a root mean square error of 2.17, a mean absolute error of 1.52, and a modified Willmott index of 0.97. Since all the methods used to estimate the quantitative values had acceptable accuracy, appropriate and acceptable results can be obtained from a limited number of parameters and data mining methods when data are scarce or not all chemical parameters are available. |
Database: | Directory of Open Access Journals |
External link: |
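
The description above names four evaluation criteria (R, RMSE, MAE, and Dr) but gives no formulas. The following is a minimal sketch of how they might be computed, not the authors' implementation. It assumes that Dr is the refined index of agreement of Willmott et al. (2012) with scaling constant c = 2; the record does not confirm this. The function names and the sample values are hypothetical and chosen only for illustration.

```python
import math


def pearson_r(obs, pred):
    """Pearson correlation coefficient (R) between observed and predicted values."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov / math.sqrt(var_o * var_p)


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)


def willmott_dr(obs, pred, c=2.0):
    """Refined index of agreement (assumed definition of Dr), bounded by -1 and 1.

    Undefined when all observations are equal and the predictions are exact,
    because both the error sum and the spread term are then zero.
    """
    mean_o = sum(obs) / len(obs)
    abs_err = sum(abs(p - o) for o, p in zip(obs, pred))
    spread = c * sum(abs(o - mean_o) for o in obs)
    if abs_err <= spread:
        return 1.0 - abs_err / spread
    return spread / abs_err - 1.0


# Hypothetical WQI values, not taken from the study.
observed = [50.0, 60.0, 70.0, 80.0]
predicted = [52.0, 58.0, 71.0, 83.0]

for name, fn in [("R", pearson_r), ("RMSE", rmse), ("MAE", mae), ("Dr", willmott_dr)]:
    print(f"{name} = {fn(observed, predicted):.3f}")
```

R and Dr approach 1 for a good fit, while RMSE and MAE approach 0 and are expressed in the units of the WQI itself. This is why the description reports all four together.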