TY - JOUR
T1 - Monitoring the Industrial waste polluted stream - Integrated analytics and machine learning for water quality index assessment
AU - Ejaz, Ujala
AU - Khan, Shujaul Mulk
AU - Jehangir, Sadia
AU - Ahmad, Zeeshan
AU - Abdullah, Abdullah
AU - Iqbal, Majid
AU - Khalid, Noreen
AU - Nazir, Aisha
AU - Svenning, Jens Christian
N1 - Publisher Copyright: © 2024 Elsevier Ltd
PY - 2024/4
Y1 - 2024/4
N2 - The Water Quality Index (WQI) is a primary metric for evaluating and categorizing surface water quality, and it plays a crucial role in the management of freshwater resources. Machine learning (ML) modeling offers potential insights into WQI prediction. This study employed ML models to predict the WQI of the Aik-Stream, an industrially polluted natural water resource in Pakistan, using 19 input water quality variables and aligning them with surrounding land use and anthropogenic activities. Six ML algorithms, i.e. Adaptive Boosting (AdaBoost), K-Nearest Neighbors (K-NN), Gradient Boosting (GB), Random Forests (RF), Support Vector Regression (SVR), and Bayesian Regression (BR), were employed as benchmark models to predict the WQI values of the polluted stream. For model calibration, 80% of the dataset was reserved for training and 20% was set aside for testing. In the comparative analysis of predictive models, the Gradient Boosting (GB) model stood out as the most precise: using a combination of just seven parameters (chemical oxygen demand, total organic carbon, oil & grease, ammonia-nitrogen, arsenic, nickel and zinc), it surpassed the other models in both training (R2 = 0.88, RMSE = 7.24) and testing (R2 = 0.85, RMSE = 8.67). Feature importance analysis showed that all the selected variables except NO3-N, TDS and temperature had an impact on the accuracy of the models' predictions. It is concluded that applying machine learning to assess water quality in polluted environments enhances accuracy and facilitates real-time tracking, enabling proactive risk mitigation.
AB - The Water Quality Index (WQI) is a primary metric for evaluating and categorizing surface water quality, and it plays a crucial role in the management of freshwater resources. Machine learning (ML) modeling offers potential insights into WQI prediction. This study employed ML models to predict the WQI of the Aik-Stream, an industrially polluted natural water resource in Pakistan, using 19 input water quality variables and aligning them with surrounding land use and anthropogenic activities. Six ML algorithms, i.e. Adaptive Boosting (AdaBoost), K-Nearest Neighbors (K-NN), Gradient Boosting (GB), Random Forests (RF), Support Vector Regression (SVR), and Bayesian Regression (BR), were employed as benchmark models to predict the WQI values of the polluted stream. For model calibration, 80% of the dataset was reserved for training and 20% was set aside for testing. In the comparative analysis of predictive models, the Gradient Boosting (GB) model stood out as the most precise: using a combination of just seven parameters (chemical oxygen demand, total organic carbon, oil & grease, ammonia-nitrogen, arsenic, nickel and zinc), it surpassed the other models in both training (R2 = 0.88, RMSE = 7.24) and testing (R2 = 0.85, RMSE = 8.67). Feature importance analysis showed that all the selected variables except NO3-N, TDS and temperature had an impact on the accuracy of the models' predictions. It is concluded that applying machine learning to assess water quality in polluted environments enhances accuracy and facilitates real-time tracking, enabling proactive risk mitigation.
KW - Artificial Intelligence
KW - Assessment and monitoring
KW - Industrial wastewater
KW - Machine learning
KW - Water Quality Index
UR - http://www.scopus.com/inward/record.url?scp=85189547260&partnerID=8YFLogxK
U2 - 10.1016/j.jclepro.2024.141877
DO - 10.1016/j.jclepro.2024.141877
M3 - Journal article
AN - SCOPUS:85189547260
SN - 0959-6526
VL - 450
JO - Journal of Cleaner Production
JF - Journal of Cleaner Production
M1 - 141877
ER -