Hybrid FCMG-OP-FIS model approach to convert regression into classification data for machine learning-based AQI prediction

Authors: K.M. Ordenshiya, G.K. Revathi
Language: English
Publication year: 2024
Subject:
Source: Heliyon, Vol 10, Iss 21, Pp e39759- (2024)
Document type: article
ISSN: 2405-8440
DOI: 10.1016/j.heliyon.2024.e39759
Description: Air pollution from vehicle emissions, industrial activities, and medical facilities poses significant health risks in urban areas, underscoring the necessity for robust air quality index (AQI) monitoring. This paper presents a novel method for AQI prediction by integrating a fuzzy centre merge graph with an optimal value-based fuzzy inference system (FCMG-OP-FIS) and machine learning (ML). Traditional ML techniques encounter difficulties when converting regression datasets into classification formats, particularly when the dataset cannot be labelled using the traditional method. The proposed FCMG-OP-FIS model efficiently converts regression data into a classification framework. Unlike traditional AQI prediction methods that rely solely on pollutant data, this approach incorporates both pollutant and meteorological data to improve prediction accuracy. The innovative fuzzy centre merge graph (FCMG) balances the dataset for optimal solutions and facilitates input grouping for Simulink, simplifying rule management. The FCMG-OP-FIS model generates a regression output for AQI, which is subsequently classified into levels (healthy, moderate, or unhealthy) using IF-THEN rules. To enhance accuracy further, a random forest classifier (RFC) is trained on the FCMG-OP-FIS classified output data. The regression output of the FCMG-OP-FIS model is validated using metrics such as RMSE (0.48), MSE (0.23), MAE (0.23), and MAPE (1.77%). Additionally, the classification output from the RFC model is validated using stratified shuffle validation, grid search cross-validation, and confusion matrix analysis, achieving an accuracy of 99%, with the F1 score, precision, and recall all at 99%. These results demonstrate the effectiveness of the proposed model in accurately labelling data for classification and predicting AQI through ML, highlighting its potential for practical application in environmental monitoring and management.
Database: Directory of Open Access Journals
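
The description mentions two steps that a short sketch may make more concrete: labelling a continuous AQI output with IF-THEN rules, and scoring a regression output with RMSE, MSE, MAE, and MAPE. The Python below is only an illustration under stated assumptions, not the authors' implementation. The cut-off values `HEALTHY_MAX` and `MODERATE_MAX`, the function names, and the sample numbers are all hypothetical, since the record does not state the thresholds or the data used.

```python
import math

# Hypothetical cut-offs: the record does not give the thresholds
# the authors used to separate the three AQI levels.
HEALTHY_MAX = 50.0
MODERATE_MAX = 100.0


def label_aqi(aqi):
    """Map a continuous AQI value to a class label with IF-THEN rules."""
    if aqi <= HEALTHY_MAX:
        return "healthy"
    if aqi <= MODERATE_MAX:
        return "moderate"
    return "unhealthy"


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE, and MAPE (in percent) for paired values.

    MAPE divides by the true values, so y_true must not contain zeros.
    """
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "MAPE": mape}


# Made-up sample values, for illustration only.
observed = [40.0, 80.0, 120.0]
predicted = [42.0, 79.0, 118.0]

labels = [label_aqi(v) for v in predicted]
metrics = regression_metrics(observed, predicted)
```

Because RMSE is the square root of MSE, the two reported figures can be checked against each other: the square root of 0.23 is roughly 0.48, which agrees with the values quoted in the description. Labels produced this way could then serve as targets for a classifier such as the random forest the description mentions.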