Assessing the environmental determinants of micropollutant contamination in streams using explainable machine learning and network analysis.
Authors: | Ban MJ; Department of Civil and Environmental Engineering, Dongguk University-Seoul, Seoul, 04620, South Korea., Lee DH; Department of Civil and Environmental Engineering, Dongguk University-Seoul, Seoul, 04620, South Korea., Lee BT; Central Research Facilities, Gwangju Institute of Science and Technology, Gwangju, South Korea. Electronic address: btlee@gist.ac.kr., Kang JH; Department of Civil and Environmental Engineering, Dongguk University-Seoul, Seoul, 04620, South Korea. Electronic address: joohyon@dongguk.edu. |
---|---|
Language: | English |
Source: | Chemosphere [Chemosphere] 2024 Dec 31; Vol. 370, pp. 144041. Date of Electronic Publication: 2024 Dec 31. |
DOI: | 10.1016/j.chemosphere.2024.144041 |
Abstract: | Even at trace concentrations, micropollutants, including pesticides and pharmaceuticals, pose considerable ecological risks, and the increasing presence of synthetic chemical substances in aquatic systems has emerged as a growing concern. Moreover, few machine-learning (ML) approaches exist for analyzing environmental data, and the increasing complexity of ML models has made it difficult to understand predictor-outcome relationships, particularly the complex interactions among multiple variables. This study integrates explainable ML techniques and network analysis to identify the sources of micropollutants in a large watershed and determine the factors affecting micropollutant levels. We assessed how well four ML algorithms, namely support vector machine, random forest, extreme gradient boosting (XGB), and autoencoder-XGB, predict micropollutant levels based on the spatial characteristics of the watershed. We applied the synthetic minority oversampling technique to address the data imbalance. The XGB model demonstrated superior predictive performance, particularly for high concentration levels, achieving an accuracy of 87%-99%. Shapley additive explanations (SHAP) analysis identified temperature and rainfall as significant factors. Moreover, agricultural activities contributed to pesticide pollution, whereas urban activities contributed to pharmaceutical contamination. The network analysis corroborated the SHAP findings and revealed event-specific contamination characteristics, including distinct discharge pathways during a dry summer event and shared pathways during a wet winter event. This approach improves understanding of contamination sources and pathways and thereby aids in developing control measures and making informed policy decisions to preserve water quality in mixed land-use areas. Competing Interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. (Copyright © 2024 Elsevier Ltd. All rights reserved.) |
Database: | MEDLINE |