Exploring the potential of machine learning to understand the occurrence and health risks of haloacetic acids in a drinking water distribution system.

Authors: Yu Y; School of Environmental Science and Engineering, Xiamen University of Technology, Xiamen 361024, China; Drinking Water Science and Technology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China; Key Laboratory of Water Resources Utilization and Protection, Xiamen city, Xiamen 361005, China., Hossain MM; Department of Civil and Environmental Engineering, South Dakota School of Mines and Technology, Rapid City, SD 57701, USA., Sikder R; Department of Civil and Environmental Engineering, South Dakota School of Mines and Technology, Rapid City, SD 57701, USA., Qi Z; Drinking Water Science and Technology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China., Huo L; Drinking Water Science and Technology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China., Chen R; School of Environmental Science and Engineering, Zhejiang Gongshang University, Hangzhou 310018, Zhejiang, China. Electronic address: chenruya2021@163.com., Dou W; Key Laboratory of Industrial Pollution Control and Reuse of Jiangsu Province, College of Environmental Engineering, Xuzhou University of Technology, Xuzhou 221018, China., Shi B; Drinking Water Science and Technology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China., Ye T; Department of Civil and Environmental Engineering, South Dakota School of Mines and Technology, Rapid City, SD 57701, USA. Electronic address: Tao.Ye@sdsmt.edu.
Language: English
Source: The Science of the total environment [Sci Total Environ] 2024 Nov 15; Vol. 951, pp. 175573. Date of Electronic Publication: 2024 Aug 15.
DOI: 10.1016/j.scitotenv.2024.175573
Abstract: Determining the occurrence of disinfection byproducts (DBPs) in drinking water distribution systems (DWDSs) remains challenging. Predicting DBPs from readily available water quality parameters can help to understand DBP-associated risks and capture the complex interrelationships between water quality and DBP occurrence. In this study, we collected drinking water samples from a distribution network throughout a year and measured the related water quality parameters (WQPs) and haloacetic acids (HAAs). Twelve machine learning (ML) algorithms were evaluated. Random Forest (RF) achieved the best performance (i.e., R² of 0.78 and RMSE of 7.74) for predicting HAA concentrations. Instead of using cytotoxicity or genotoxicity separately as the surrogate for evaluating toxicity associated with HAAs, we created a health risk index (HRI), calculated as the sum of the cytotoxicity and genotoxicity of HAAs following the widely used Tic-Tox approach. Similarly, ML models were developed to predict the HRI, and the RF model performed best, obtaining R² of 0.69 and RMSE of 0.38. To further explore advanced ML approaches, we developed three models using uncertainty-based active learning. Our findings revealed that the Categorical Boosting Regression (CAT) model developed through active learning substantially outperformed the other models, achieving R² of 0.87 and 0.82 for predicting concentration and the HRI, respectively. Feature importance analysis with the CAT model revealed that temperature, ions (e.g., chloride and nitrate), and DOC concentration in the distribution network had a significant impact on the occurrence of HAAs. Meanwhile, chloride ion, pH, ORP, and free chlorine were identified as the most important features for HRI prediction. This study demonstrates that ML has potential for predicting HAA occurrence and toxicity. By identifying key WQPs impacting HAA occurrence and toxicity, this research offers valuable insights for targeted DBP mitigation strategies.
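Illustrative note: the abstract reports model quality as R² and RMSE and describes the HRI as a sum of cytotoxicity and genotoxicity contributions. The sketch below shows how such quantities might be computed. It is not the authors' code. The function names, the dictionary layout, and the assumption that each HAA contributes its concentration multiplied by a potency index are hypothetical, and the record gives no potency values, so none are supplied here.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean squared error, in the same units as the target."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def health_risk_index(concentrations, cyto_index, geno_index):
    """Hypothetical HRI: concentration-weighted cytotoxicity plus
    genotoxicity, summed over all HAA species.

    All three arguments are dicts keyed by HAA species name. The potency
    indices must come from published toxicity data; this function only
    performs the weighted sum (an assumed reading of the abstract).
    """
    return sum(
        conc * (cyto_index[name] + geno_index[name])
        for name, conc in concentrations.items()
    )
```

With measured and predicted HAA concentrations held in two equal-length lists, `r_squared(measured, predicted)` and `rmse(measured, predicted)` give values comparable in kind to those quoted in the abstract. R² is unitless, while RMSE carries the units of the predicted quantity.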
Competing Interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
(Copyright © 2024 Elsevier B.V. All rights reserved.)
Database: MEDLINE