Analysis of water quality indices and machine learning techniques for rating water pollution: a case study of Rawal Dam, Pakistan

Water Quality Index (WQI) is a unique and effective rating technique for assessing the quality of water. Nevertheless, most of the indices are not applicable to all water types as these are dependent on core physico-chemical water parameters that can make them biased and sensitive towards specific a...

Bibliographic Details
Main Authors: Mehreen Ahmed, Rafia Mumtaz, Syed Mohammad Hassan Zaidi
Format: article
Language: EN
Published: IWA Publishing 2021
Subjects:
iot
Online Access: https://doaj.org/article/91a2d66715d743dbbde8c414382274df
id oai:doaj.org-article:91a2d66715d743dbbde8c414382274df
record_format dspace
spelling oai:doaj.org-article:91a2d66715d743dbbde8c414382274df
2021-11-06T10:08:59Z
Analysis of water quality indices and machine learning techniques for rating water pollution: a case study of Rawal Dam, Pakistan
1606-9749
1607-0798
10.2166/ws.2021.082
https://doaj.org/article/91a2d66715d743dbbde8c414382274df
2021-09-01T00:00:00Z
http://ws.iwaponline.com/content/21/6/3225
https://doaj.org/toc/1606-9749
https://doaj.org/toc/1607-0798
Mehreen Ahmed
Rafia Mumtaz
Syed Mohammad Hassan Zaidi
IWA Publishing
article
decision tree
iot
machine learning
physico-chemical
water quality index
water quality monitoring
Water supply for domestic and industrial purposes
TD201-500
River, lake, and water-supply engineering (General)
TC401-506
EN
Water Supply, Vol 21, Iss 6, Pp 3225-3250 (2021)
institution DOAJ
collection DOAJ
language EN
topic decision tree
iot
machine learning
physico-chemical
water quality index
water quality monitoring
Water supply for domestic and industrial purposes
TD201-500
River, lake, and water-supply engineering (General)
TC401-506
description Water Quality Index (WQI) is a unique and effective rating technique for assessing the quality of water. Nevertheless, most of the indices are not applicable to all water types, as they depend on core physico-chemical water parameters that can make them biased and sensitive to specific attributes, including: (i) the time, location and frequency of data sampling; (ii) the number, variety and weight allocation of parameters. Therefore, there is a need to evaluate these indices to eliminate uncertainties that make them unpredictable and that may lead to manipulation of the water quality classes. The present study calculated five WQIs for two temporal periods: (i) June to December 2019, obtained in real time (using Internet of Things (IoT) nodes) at the inlet and outlet streams of Rawal Dam; (ii) 2012–2019, obtained from the Rawal Dam Water Filtration Plant and collected through GIS-based grab sampling. The computed WQIs categorized the collected datasets as ‘Very Poor’, primarily owing to the uneven distribution of the water samples, which has led to class imbalance in the data. Additionally, this study investigates the classification of water quality using the following machine learning algorithms: Decision Tree (DT), k-Nearest Neighbor (KNN), Logistic Regression (LogR), Multilayer Perceptron (MLP) and Naive Bayes (NB), based on parameters including pH, dissolved oxygen, conductivity, turbidity, fecal coliform and temperature. The classification results showed that the DT algorithm outperformed the other models, with a classification accuracy of 99%. Although WQI is a popular method for assessing water quality, there is a need to address the uncertainties and biases introduced by the limitations of data acquisition (such as specific location/area, type and number of parameters, or water type) that lead to class imbalance. This can be achieved by developing a more refined index that considers various other factors, such as topographical and hydrological parameters with spatio-temporal variations, combined with machine learning techniques, to contribute effectively to the estimation of water quality for all regions.
HIGHLIGHTS
Evaluated five WQIs based on six physico-chemical parameters to analyze their sensitivity to the selected location, type and frequency of data sampling.
Computed WQIs categorized the dataset as ‘Very Poor’ because of the uneven distribution of water samples, leading to class imbalance.
Five ML models were used, among which the Decision Tree achieved a classification accuracy of 99%.
For a refined index, topographical and hydrological parameters should be considered.
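The sensitivity of an index to its weight allocation, noted in the description above, can be illustrated with a minimal sketch. The record does not say which five WQIs were computed or which weights they use, so the code below assumes a generic weighted arithmetic mean of sub-index scores. The function names, the sample scores, both weight sets and the rating bands are hypothetical and are not taken from the article.

```python
from typing import Dict, List, Tuple

# Hypothetical rating bands as (upper bound, label); NOT taken from the article.
BANDS: List[Tuple[float, str]] = [
    (25.0, "Excellent"),
    (50.0, "Good"),
    (75.0, "Poor"),
    (100.0, "Very Poor"),
]


def weighted_wqi(sub_indices: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted arithmetic mean of sub-index scores (0 = best, 100 = worst)."""
    total_weight = sum(weights[name] for name in sub_indices)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(score * weights[name] for name, score in sub_indices.items())
    return weighted_sum / total_weight


def rate(wqi: float) -> str:
    """Map an index value to the first band whose upper bound it does not exceed."""
    for upper, label in BANDS:
        if wqi <= upper:
            return label
    return "Unsuitable"


# Hypothetical sub-index scores for the six parameters named in the description.
sample = {
    "pH": 20.0,
    "dissolved_oxygen": 40.0,
    "conductivity": 30.0,
    "turbidity": 80.0,
    "fecal_coliform": 95.0,
    "temperature": 10.0,
}

# Two hypothetical weight allocations applied to the same sample.
equal_weights = {name: 1.0 for name in sample}
skewed_weights = dict(equal_weights, turbidity=3.0, fecal_coliform=4.0)

for label, weights in (("equal", equal_weights), ("skewed", skewed_weights)):
    value = weighted_wqi(sample, weights)
    print(label, round(value, 1), rate(value))
# equal 45.8 Good
# skewed 65.5 Poor
```

Under these assumed numbers the same sample is rated ‘Good’ with equal weights and ‘Poor’ once turbidity and fecal coliform are weighted more heavily, which is the kind of weight-dependent class shift the description refers to.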
format article
author Mehreen Ahmed
Rafia Mumtaz
Syed Mohammad Hassan Zaidi
title Analysis of water quality indices and machine learning techniques for rating water pollution: a case study of Rawal Dam, Pakistan
title_sort analysis of water quality indices and machine learning techniques for rating water pollution: a case study of rawal dam, pakistan
publisher IWA Publishing
publishDate 2021
url https://doaj.org/article/91a2d66715d743dbbde8c414382274df