Prediction of arsenic concentration in groundwater of Chapainawabganj, Bangladesh: machine learning-based approach to spatial modeling.

Authors: Khatun MF; Department of Geology and Mining, University of Rajshahi, Rajshahi, Bangladesh., Reza AHMS; Department of Geology and Mining, University of Rajshahi, Rajshahi, Bangladesh. sreza69@yahoo.com., Sattar GS; Department of Geology and Mining, University of Rajshahi, Rajshahi, Bangladesh., Khan AS; Asia Arsenic Network (AAN), Jashore, Bangladesh., Khan MIA; Department of Computer Science and Engineering, University of Rajshahi, Rajshahi, Bangladesh.
Language: English
Source: Environmental science and pollution research international [Environ Sci Pollut Res Int] 2024 Jul; Vol. 31 (33), pp. 46023-46037. Date of Electronic Publication: 2024 Jul 09.
DOI: 10.1007/s11356-024-34148-2
Abstract: Groundwater in northwestern parts of Bangladesh, mainly in the Chapainawabganj District, has been contaminated by arsenic. This research documents the geographical distribution of arsenic concentrations utilizing machine learning techniques. The study aims to enhance the accuracy of model predictions by precisely identifying occurrences of groundwater arsenic, enabling effective mitigation actions and yielding more beneficial results. The reductive dissolution of arsenic-rich iron oxides/hydroxides is identified as the primary mechanism responsible for the release of arsenic from sediment into groundwater. The study reveals that in the research region, alongside elevated arsenic concentrations, significant levels of sodium (Na), iron (Fe), manganese (Mn), and calcium (Ca) were present. Statistical analysis was employed for feature selection, identifying pH, electrical conductivity (EC), sulfate (SO4), nitrate (NO3), Fe, Mn, Na, K, Ca, Mg, bicarbonate (HCO3), phosphate (PO4), and As as features closely associated with arsenic mobilization. Subsequently, various machine learning models, including Naïve Bayes, Random Forest, Support Vector Machine, Decision Tree, and logistic regression, were employed. The models utilized normalized arsenic concentrations categorized as high concentration (HC) or low concentration (LC), along with physicochemical properties as features, to predict arsenic occurrences. Among all machine learning models, the logistic regression and support vector machine models demonstrated high performance based on accuracy and confusion matrix analysis. In this study, a spatial distribution prediction map was generated to identify arsenic-prone areas. The prediction map also shows that Baroghoria Union and the Rajarampur region of Chapainawabganj municipality are high-risk areas, whereas Maharajpur Union and Baliadanga Union are comparatively low-risk areas within the research area. This map will facilitate researchers and legislators in implementing mitigation strategies. Logistic regression (LR) and support vector machine (SVM) models will be utilized to monitor arsenic concentration values continuously.
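The evaluation step described in the abstract (HC/LC labeling, normalization, accuracy, and confusion matrix) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The 0.05 mg/L cutoff, the function names, and all sample values are assumptions introduced here for illustration; the abstract does not state the threshold or report the underlying data.

```python
# Illustrative sketch only: the cutoff, names, and values below are assumptions,
# not taken from the study.

# Assumed HC/LC cutoff in mg/L (the abstract does not state the value used).
HC_THRESHOLD_MG_L = 0.05


def label_sample(arsenic_mg_l, threshold=HC_THRESHOLD_MG_L):
    """Return 'HC' when the concentration exceeds the threshold, else 'LC'."""
    return "HC" if arsenic_mg_l > threshold else "LC"


def min_max_normalize(values):
    """Rescale a list of numbers to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def confusion_matrix(y_true, y_pred, positive="HC"):
    """Count TP, FP, FN, TN, treating `positive` as the positive class."""
    cm = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            cm["TP" if truth == positive else "FP"] += 1
        else:
            cm["FN" if truth == positive else "TN"] += 1
    return cm


def accuracy(cm):
    """Fraction of samples classified correctly."""
    total = sum(cm.values())
    return (cm["TP"] + cm["TN"]) / total if total else 0.0


# Hypothetical arsenic measurements (mg/L) and hypothetical classifier output.
measured = [0.01, 0.08, 0.12, 0.03, 0.06, 0.02]
y_true = [label_sample(v) for v in measured]
y_pred = ["LC", "HC", "LC", "LC", "HC", "HC"]

cm = confusion_matrix(y_true, y_pred)
print(cm, round(accuracy(cm), 3))
```

In practice the predictions would come from a trained classifier such as logistic regression or a support vector machine fitted on the normalized physicochemical features; the hand-written `y_pred` list here only stands in for that output.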
(© 2024. The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.)
Database: MEDLINE