WAPOT: Data Driven Approach for Water Potability Detection using Machine Learning
Abstract
Grading the potability of water is crucial to public health and safety. Regulatory authorities and water treatment facilities are responsible for guaranteeing that individuals have access to safe, potable drinking water, a basic human right. Potability classification serves as a preventative measure that detects impurities or contaminants likely to cause adverse health effects when ingested. This study examines a machine learning approach (WAPOT) for classifying the potability of drinking water using ensemble learning methods such as Stacking classifiers. In the reported experiments, Stacking, a form of ensemble learning, consistently outperforms standalone classifiers and the approaches of existing research works, reaching an accuracy of 97% in potability classification. The findings underscore the capacity of machine learning to contribute significantly to monitoring and managing water treatment processes.
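As a rough illustration of the stacking idea described in the abstract, the following minimal sketch trains a stacked ensemble with scikit-learn. It is not the authors' WAPOT pipeline: the choice of base learners (random forest and k-nearest neighbours), the logistic-regression meta-learner, the min-max scaling step, and the synthetic nine-feature dataset are all assumptions made for illustration, and the accuracy it prints has no relation to the 97% reported in the study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler


def build_stacking_model(random_state=0):
    """Build a scaled stacking pipeline (illustrative choice of learners)."""
    # Base learners: their cross-validated predictions become the
    # input features of the meta-learner.
    base_learners = [
        ("rf", RandomForestClassifier(n_estimators=50, random_state=random_state)),
        ("knn", KNeighborsClassifier(n_neighbors=5)),
    ]
    stack = StackingClassifier(
        estimators=base_learners,
        final_estimator=LogisticRegression(max_iter=1000),  # meta-learner
        cv=5,  # internal folds used to produce out-of-fold base predictions
    )
    # Min-max scaling is an assumed preprocessing step; it mainly helps KNN.
    return make_pipeline(MinMaxScaler(), stack)


def run_demo(random_state=0):
    """Fit on synthetic data and return held-out accuracy."""
    # Synthetic stand-in for a potability table: 9 numeric features and a
    # binary label (1 = potable, 0 = not potable). Not real water data.
    X, y = make_classification(
        n_samples=400, n_features=9, n_informative=5, random_state=random_state
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=random_state
    )
    model = build_stacking_model(random_state)
    model.fit(X_train, y_train)
    return accuracy_score(y_test, model.predict(X_test))


if __name__ == "__main__":
    print(f"Held-out accuracy on synthetic data: {run_demo():.3f}")
```

The meta-learner is fitted on out-of-fold predictions from the base learners rather than on their training-set predictions, which is what lets a stacked model combine classifiers without simply memorising their training fit.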
Copyright (c) 2025 Saleem Raja Abdul Samad, Maria Rajesh Antony, Pradeepa Ganesan, Sathya Ramasamy, Madhubala Radhakrishnan, Sajithabanu S

This work is licensed under a Creative Commons Attribution 4.0 International License.