Water is essential for life, as it supports bodily functions, nourishes crops, and maintains ecosystems. Drinking water is crucial for maintaining good health and can also contribute to economic development by reducing healthcare costs and improving productivity. In this study, we employed five different machine learning algorithms – logistic regression (LR), decision tree classifier (DTC), extreme gradient boosting (XGB), random forest (RF), and K-nearest neighbors (KNN) – to analyze the dataset, and their prediction performance was evaluated using four metrics: accuracy, precision, recall, and F1 score. Physicochemical parameters of 30 groundwater samples were analyzed to determine the Water Quality Index (WQI) of Pano Aqil city, Pakistan. The samples were categorized into the following four classes based on their WQI values: excellent water, good water, poor water, and unfit for drinking. The WQI scores showed that only 43.33% of the samples were deemed acceptable for drinking, indicating that the majority (56.67%) were unsuitable. The findings suggest that the DTC and XGB algorithms outperform all other algorithms, achieving overall accuracies of 100% each. In contrast, RF, KNN, and LR exhibit overall accuracies of 88, 75, and 50%, respectively. Researchers seeking to enhance water quality using machine learning can benefit from the models described in this study for water quality prediction.

  • Groundwater quality is evaluated using the Water Quality Index method.

  • Machine learning algorithms are used for forecasting groundwater quality.

  • The predictive capabilities of decision tree classifier, extreme gradient boosting, logistic regression, random forest, and K-nearest neighbors models have been evaluated and compared.

dS/m: deci Siemens per meter

DTC: decision tree classifier

EC: electrical conductivity

GIS: geographic information system

KNN: K-nearest neighbors

LR: logistic regression

Mg: magnesium

mg/l: milligrams per liter

pH: power of hydrogen

RF: random forest

TDS: total dissolved solids

TH: total hardness

UC: Union Council

WHO: World Health Organization

WQI: Water Quality Index

XGB: extreme gradient boosting

Water consumption helps to sustain bodily processes and avoids dehydration, making it crucial for human existence. In many areas, groundwater is a significant supply of drinking water, but it must be managed properly to avoid pollution. Drinking water can include dangerous bacteria, viruses, and chemicals that, if not adequately handled, can cause waterborne diseases and other health issues. Both natural and man-made influences, which alter the physical and chemical properties of groundwater, have a significant impact on the degradation of groundwater quality (Nordin et al. 2021). Nations rely on clean drinking water for social stability, economic growth, and public health. A healthy and thriving society and the achievement of sustainable development goals both depend on having access to clean water. Many people still need a basic supply of drinking water (Anigrou et al. 2022). Groundwater quality is a significant aspect and a basic need of many countries, including Pakistan (Panhwar et al. 2022; Solangi et al. 2022). It is estimated that ∼33% of the global population uses groundwater for drinking, irrigation, agriculture, and industrial purposes due to the ease with which less contaminated groundwater is obtained compared to surface water (Solangi et al. 2017, 2018; Jamali et al. 2023). However, once groundwater is contaminated, it is quite difficult to restore its previous purity or quality (Arulbalaji & Gurugnanam 2017). With expanded research regarding the importance of drinking water quality to people and raw water quality to amphibian life, there is an urgent need to assess groundwater quality (Ouyang 2005). It is critical that the global availability of groundwater resources be fully considered (Subramani et al. 2010).

The type of soil and chemical composition plays a significant role in groundwater quality (Abbasnia et al. 2018). Synthetic organic compounds such as pharmaceuticals and personal care products, pesticides and herbicides, and industrial chemicals, and heavy metals such as iron (Fe), zinc (Zn), copper (Cu), nickel (Ni), chromium (Cr), cadmium (Cd), mercury (Hg), lead (Pb), and arsenic (As) are among the anthropogenic components that have the greatest impact on the quality and accessibility of safe drinking water (Ali et al. 2023). Water quality is currently a major research focus for a number of scientists all around the world. As a result of these developments and analyses, a number of numerical and factual models have been developed and used to assess the quality of both surface and groundwater in various parts of the world. Pollution Index of Groundwater (PIG), Synthetic Pollution Index, Integrated Water Quality Index, Overall Index of Pollution, hierarchical cluster analysis, and Aquatic Life Water Index are some of the methodologies that are now being used in water quality evaluations. The Water Quality Index (WQI) is frequently used to determine the impact of water pollutants on the quality of water. It defines whether water is suitable for drinking or not (Solangi et al. 2019a). Various researchers such as Dede et al. (2013), Shabbir & Ahmed (2015), Sener et al. (2017), Solangi et al. (2019a), and Jamali et al. (2023) with some statistical variations of various physicochemical parameters have applied WQIs across the globe. To manage water resources and create spatial databases for the distribution of groundwater quality throughout the world, researchers around the globe also apply the geographical information system (GIS) (Sener et al. 2017; Jamali et al. 2023).

Environmental engineering also uses machine learning (ML) in a variety of ways, including examining massive datasets to spot patterns and trends and leveraging more recent developments in ML techniques to create models that can forecast future environmental conditions (Tahmasebi et al. 2020). ML is now widely used for predicting groundwater quality, and its ability to handle complex hydrological data makes it popular (Nordin et al. 2021). Selecting appropriate input parameters can enhance its efficiency. Haghiabi et al. (2018) used several ML models, including support vector regression (SVR), artificial neural network (ANN), and decision tree (DT), and observed that the SVR model outperformed the other models in predicting water quality parameters. El Bilali et al. (2021) utilized four ML models (ANN, DT, random forest (RF), and SVR) and found RF to be the most accurate model for groundwater quality forecasting. The integration of ML, GIS, and WQI yields significant information about the quality of water (Solangi et al. 2019b; Jamali et al. 2023).

Thus, in the present study, five ML models, namely decision tree classifier (DTC), extreme gradient boosting (XGB), K-nearest neighbors (KNN), RF, and logistic regression (LR), were implemented in Python, and their performance was compared in terms of making predictions about the physicochemical properties of groundwater in a troubled area and assessing its suitability for drinking purposes. In addition, the widely applied WQI and GIS were used to assess and map the overall quality of the groundwater of Pano Aqil city, Sindh, Pakistan. This investigation also took advantage of ML for predicting and monitoring the groundwater quality of the study area so that the possibility of contamination, which could affect the water quality, can be reduced.

Study area description

Pano Aqil city is located in Sindh province, Pakistan, covering an area of approximately 182 km². It is situated along the banks of the Indus River, which is a crucial water source for irrigation and other daily activities. The population of Pano Aqil is estimated to be around 436,372 people as of the latest census. The water available in Pano Aqil is primarily obtained from groundwater wells and tube wells, which are commonly used by the local population for domestic, agricultural, and commercial purposes. In the present study, 30 georeferenced groundwater samples were gathered at random from the various hand pumps/wells situated throughout the city (Figure 1). The samples were taken in the afternoon at the start of the month and sent to the laboratory on the same day via overnight delivery. The depth of the boreholes varies from 40 to 60 feet. However, the increasing demand for water in Pano Aqil has led to the overexploitation of groundwater, resulting in a significant decline in the water table as well as in the quality of water. This excessive use of groundwater is a growing concern for the area as it threatens the sustainability of the available water resources. Moreover, the excessive pumping of groundwater has also led to water quality degradation, making the water unsuitable for domestic use. The study area experiences a hot and arid climate, with temperatures ranging from 45 to 50 °C in the summer months. The high temperatures, coupled with the water crisis, make it challenging for the local population to carry out their daily activities.
Figure 1

Study area and location of samples.


Data set and sample analysis

The 30 collected water samples were tested for a variety of physicochemical properties, including electrical conductivity (EC), pH, total dissolved solids (TDS), calcium (Ca), magnesium (Mg), total hardness (TH), chloride (Cl), nitrates, nitrites, and sulfates. The results for the water quality parameters were compared to the WHO drinking water quality benchmarks from 2011. An integrated index technique, namely the WQI model, was used to assess the overall quality of the groundwater, and ML algorithms were then employed to predict the water quality. A visual summary of the distribution of the dataset, including the presence of outliers and the degree of skewness, is presented in Figure 2. Outliers can make models overfit and result in less precise results. Owing to the dataset limitations in our investigation, outliers had to be included in the analysis.
Figure 2

Box plot displaying the spread of data for individual parameters, indicating the lowest and highest values, as well as the median (represented by a central horizontal line) and outliers present.


Calculating the WQI

The WQI provides a systematic way to determine the appropriateness of water for a variety of uses. The specific index of groundwater, on the other hand, is more difficult to calculate (Singh et al. 2015; Jamali et al. 2023). WQIs have been shown to be a valuable and efficient tool for determining the overall quality of water; presently, researchers and water quality management teams all around the world are much more familiar with these types of indices. As a result, this research relies on the WQI of groundwater to assess water quality:

$$W_i = \frac{w_i}{\sum_{i=1}^{n} w_i}$$

$$q_i = \frac{C_i}{S_i} \times 100$$

$$\mathrm{WQI} = \sum_{i=1}^{n} W_i \, q_i$$

where $W_i$ is the relative weight of a water quality parameter, $w_i$ is the weight assigned to the ith parameter, $n$ is the total number of parameters, $q_i$ is the water quality rating, $C_i$ is the observed concentration of each water quality parameter, and $S_i$ is the permissible level proposed by WHO for drinking water purposes.

In the present study, each water quality parameter was assigned a weight (wi) from 1 to 5 depending upon its significance in water quality evaluation for human health (Shabbir & Ahmed 2015; Sener et al. 2017; Solangi et al. 2019b, 2019c; Jamali et al. 2023). The respective relative weights were then calculated by dividing each assigned weight by the overall sum of the assigned weights of all parameters (Shabbir & Ahmed 2015; Sener et al. 2017; Solangi et al. 2019b, 2019c) (Tables 1 and 2).
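
The weighted-sum calculation described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' code: the function names `relative_weights`, `wqi`, and `classify` are hypothetical, the rating is assumed to take the form q_i = 100 × C_i / S_i, and the weights and sample concentrations are invented for demonstration. The standards are the Table 2 values for TDS, nitrates, and chloride, and the class boundaries follow Table 1.

```python
def relative_weights(weights):
    """Divide each assigned weight w_i by the sum of all assigned weights."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def wqi(concentrations, standards, weights):
    """Weighted-sum WQI, assuming the rating q_i = 100 * C_i / S_i."""
    rel = relative_weights(weights)
    return sum(rel[p] * 100.0 * concentrations[p] / standards[p] for p in weights)


def classify(score):
    """Map a WQI score to the water classes listed in Table 1."""
    if score < 50:
        return "Excellent water"
    if score <= 100:
        return "Good water"
    if score <= 200:
        return "Poor water"
    if score <= 300:
        return "Very poor water"
    return "Unfit for drinking"


# Hypothetical three-parameter example (weights and concentrations are invented).
weights = {"TDS": 4, "Nitrates": 5, "Chloride": 4}               # assumed w_i
standards = {"TDS": 500.0, "Nitrates": 10.0, "Chloride": 250.0}  # mg/l
sample = {"TDS": 400.0, "Nitrates": 5.0, "Chloride": 125.0}      # mg/l

score = wqi(sample, standards, weights)
print(round(score, 2), classify(score))
```

Keeping the relative-weight step separate from the rating step means a different set of parameters or standards can be swapped in without touching the classification thresholds.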

ML prediction models

ML is a powerful tool for monitoring and predicting water quality in specific areas. Groundwater contamination can be costly to remediate, but ML can help by alerting authorities of changes in water quality. In this study, various ML models were utilized to predict water quality.

DTC

A DT is a type of ML algorithm that is used for supervised learning. It is made up of a hierarchical structure with root, branches, and leaf nodes. The algorithm uses entropy to determine which variable should be used at the root node, and then uses the values of other attributes to create the branches and leaf nodes. Essentially, a DT is a flowchart-like structure that helps to make decisions based on the values of different attributes. Impurity metrics are used by DTs to judge the quality of a split. Gini impurity and entropy are two common impurity measurements. Gini impurity quantifies the likelihood of misclassifying a randomly selected element if it were labeled randomly. Gini impurity ‘Gini(t)’ is calculated for a node t with J classes as:

$$\mathrm{Gini}(t) = 1 - \sum_{j=1}^{J} P_j^2$$

Entropy is a measure of a node's disorder or unpredictability. Entropy ‘Entropy(t)’ is determined for a node t with J classes as follows:

$$\mathrm{Entropy}(t) = -\sum_{j=1}^{J} P_j \log_2 P_j$$

where $P_j$ is the proportion of samples of class j in node t. There are four classes in our case. To compute $P_j$, we count the number of data points in node t that belong to each class. Let us call these counts $N_j$ for j = 1, 2, …, J. The total number of data points in node t is calculated and denoted as $N_t$. Then, compute

$$P_j = \frac{N_j}{N_t}$$
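
The two impurity measures can be illustrated with a short Python sketch. The helper names and the example node below are hypothetical and are not taken from the study; the code simply counts the class labels in a node, converts the counts into proportions, and evaluates the Gini impurity and the entropy (in bits).

```python
import math
from collections import Counter


def class_proportions(labels):
    """Return P_j = N_j / N_t for each class present in the node."""
    n_t = len(labels)
    return [n_j / n_t for n_j in Counter(labels).values()]


def gini(labels):
    """Gini impurity: 1 - sum_j P_j^2."""
    return 1.0 - sum(p * p for p in class_proportions(labels))


def entropy(labels):
    """Shannon entropy in bits: -sum_j P_j * log2(P_j)."""
    return -sum(p * math.log2(p) for p in class_proportions(labels))


# Hypothetical node holding 8 samples drawn from the four water-quality classes.
node = [0, 0, 1, 1, 1, 1, 2, 3]
print(gini(node), entropy(node))
```

Both measures are zero for a node that contains a single class and grow as the classes become more evenly mixed, which is why a split that lowers them is preferred.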

XGB

XGB is a popular ensemble learning technique that uses DTs as predictors, built sequentially with each tree learning from the residuals of the previous ones. This boosting process reduces errors gradually, leading to a more accurate and robust model. XGB is known for its high accuracy and efficiency in handling large datasets (Jing et al. 2020).

The objective function of XGB combines the loss function and the regularization term:

$$\mathrm{Obj} = \sum_{k} L\big(y_k, F(x_k)\big) + \Omega$$

The specific loss function can be represented as L(y, F(x)), where y is the true target, and F(x) is the prediction of the XGB model for input x.

Regularization term (Ω): The regularization term consists of both L1 (Lasso) and L2 (Ridge) regularization components and can be written as

$$\Omega = \lambda \sum_{i} w_i^2 + \alpha \sum_{i} |w_i|$$

Here, λ and α are hyperparameters that control the strength of L2 and L1 regularization, respectively. The $w_i$ terms represent the leaf weights of the trees in the ensemble.

KNN

KNN is a non-parametric ML algorithm that works by finding the k-nearest data points to a given input point in the feature space. Here, k refers to the number of nearest neighbors to consider, which is a hyperparameter that can be adjusted by the user. The algorithm is slow when working with large datasets. However, it is still a useful algorithm for various ML tasks (Bramer 2007).

Let $y_i$ be the class label of the i-th nearest neighbor to the new data point x in a classification task, and let K be the number of neighbors chosen. The predicted class label for x, denoted by $y_{\mathrm{pred}}$, may be calculated as follows:

$$y_{\mathrm{pred}} = \underset{y}{\arg\max} \sum_{i=1}^{K} \delta(y_i, y)$$

where $y_i$ is the class label of the i-th nearest neighbor; y represents the possible class labels; $\delta(y_i, y)$ is the Kronecker delta function, which equals 1 when $y_i$ is equal to y and 0 otherwise; and argmax returns the class label that appears most frequently among the K nearest neighbors.
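
A minimal Python sketch of this voting rule is given below. It assumes Euclidean distance and uses an invented two-feature training set; the function name `knn_predict` is hypothetical, and when two classes tie the label encountered first among the nearest neighbors is returned.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, x, k=3):
    """Predict the label of x by majority vote among its k nearest neighbors."""
    # Euclidean distance from x to every training point, paired with its label.
    distances = sorted(
        (math.dist(features, x), label)
        for features, label in zip(train_x, train_y)
    )
    # Count how often each class label appears among the k closest points.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-feature training set (invented values, two of the four classes).
train_x = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
train_y = [0, 0, 0, 3, 3, 3]

print(knn_predict(train_x, train_y, (0.5, 0.5), k=3))
print(knn_predict(train_x, train_y, (5.5, 5.5), k=3))
```

Because the prediction depends on raw distances, features measured on very different scales (for example TDS in mg/l and pH) would normally be standardized first.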

RF

RF is a powerful ML algorithm that uses an ensemble of classification and regression trees to make accurate predictions. The algorithm works by building multiple DTs, each using a different bootstrap sample from the original dataset (Voyant et al. 2017).

In an RF, the final classification is usually decided by majority vote. Each DT in the ensemble makes a prediction, and the class with the most votes from all trees is the final predicted class.

This may be expressed mathematically as

$$\hat{y} = \mathrm{Mode}\{h_1(x), h_2(x), \ldots, h_T(x)\}$$

where $h_t(x)$ is the class predicted by the t-th tree for input x and T is the number of trees. The ‘Mode’ operator returns the class prediction that appears most frequently among the individual trees.
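
The two ingredients described in this section, bootstrap sampling and majority voting, can be sketched as follows. The helper names are hypothetical and the tree predictions are invented; no trees are actually trained here.

```python
import random
from statistics import mode


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement, as is done for each tree."""
    return [rng.choice(rows) for _ in rows]


def majority_vote(tree_predictions):
    """Return the class predicted by the largest number of trees."""
    return mode(tree_predictions)


rng = random.Random(42)               # fixed seed so the sketch is repeatable
rows = list(range(30))                # stand-ins for the 30 sample indices
sample = bootstrap_sample(rows, rng)  # some indices repeat, others are left out

# Hypothetical predictions from five trees for a single water sample.
print(majority_vote([2, 2, 3, 2, 1]))
```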

LR

LR models the probability that a data point belongs to a class by passing a linear combination of the input features through the logistic function. Unlike linear regression, which is fitted by least squares, the coefficients of LR are estimated by maximum likelihood, i.e., by minimizing the log loss between the predicted probabilities and the actual class labels (Khademi et al. 2016).

The LR prediction formula is based on the logistic (sigmoid) function and the coefficients learned during the training phase. The following is the prediction formula:

$$P(y = 1) = \frac{1}{1 + e^{-(b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_n x_n)}}$$

where P(y = 1) is the likelihood that the data point belongs to class 1; e is the base of the natural logarithm, approximately equal to 2.71828; $b_0$ is the intercept term, commonly known as the bias term; $b_1, b_2, \ldots, b_n$ are the coefficients learned during the LR model's training phase; and $x_1, x_2, \ldots, x_n$ are the feature values of the data point for which a forecast is generated.

A threshold (usually 0.5) is applied to this likelihood to produce a binary classification decision:

$$\hat{y} = \begin{cases} 1, & P(y = 1) \geq 0.5 \\ 0, & \text{otherwise} \end{cases}$$

To assign probabilities to each class in multi-class classification, we may utilize the softmax function:

$$P(y = i) = \frac{e^{b_i}}{\sum_{j=1}^{C} e^{b_j}}$$

where P(y = i) represents the likelihood that the data point belongs to class i, $b_i$ represents the score or logit associated with class i, and C represents the total number of classes.
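
The sigmoid, threshold, and softmax steps can be illustrated with the following Python sketch. The coefficients, feature values, and logits are invented for demonstration and are not fitted to the study data; the function names are hypothetical.

```python
import math


def sigmoid_probability(intercept, coefficients, features):
    """P(y = 1) = 1 / (1 + exp(-(b0 + b1*x1 + ... + bn*xn)))."""
    z = intercept + sum(b * x for b, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-z))


def predict_binary(probability, threshold=0.5):
    """Apply the decision threshold to obtain a 0/1 label."""
    return 1 if probability >= threshold else 0


def softmax(logits):
    """Convert per-class scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(b - shift) for b in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical coefficients and feature values (not fitted to the study data).
p = sigmoid_probability(intercept=-1.0, coefficients=[0.5, 0.25], features=[2.0, 0.0])
print(p, predict_binary(p))  # the linear term is 0 here, so p is 0.5

probs = softmax([2.0, 1.0, 0.5, 0.1])  # one invented logit per water-quality class
print(probs.index(max(probs)))         # index of the most probable class
```
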

Table 1

WQI range for quality of water

| WQI range | Water classification |
|---|---|
| < 50 | Excellent water |
| 50–100 | Good water |
| > 100–200 | Poor water |
| > 200–300 | Very poor water |
| > 300 | Unfit for drinking |

Table 2

| Parameter | WHO standard | Weight (wi) | Relative weight (Wi) |
|---|---|---|---|
| pH | 8.6 | 3 | 0.10 |
| Chloride | 250 (mg/l) | 4 | 0.13 |
| Nitrates | 10 (mg/l) | 5 | 0.16 |
| Nitrites | 0.02 (mg/l) | 2 | 0.06 |
| Calcium | 50 (mg/l) | 2 | 0.06 |
| Sulfates | 400 (mg/l) | 2 | 0.06 |
| TDS | 500 (mg/l) | 4 | 0.13 |
| TH | 500 (mg/l) | 3 | 0.10 |
| EC | 0.7 (dS/m) | 4 | 0.13 |
| Mg | 75 (mg/l) | 2 | 0.06 |
| | | Σwi = 31 | |

Correlation analysis

Correlation analysis is a statistical technique that measures the strength and direction of the relationship between two variables. In the present study, the significance of the values of the correlation coefficients has been assessed on the Pearson p-value. Values closer to 1 indicate a positive correlation, values closer to −1 indicate a negative correlation, and values closer to 0 indicate no correlation. Correlation analysis can be used to identify which variables are strongly related to the target variable. This can help us to understand the underlying patterns in the data and make more accurate predictions.

  • The WQI is our target variable, and Table 3 shows the correlation coefficients between the WQI and the other water quality parameters. The WQI has its strongest positive correlation with nitrites (0.953), followed by TDS (0.425), EC (0.388), TH (0.353), magnesium (0.344), pH (0.326), and sulfates (0.310). Because a higher WQI denotes poorer water (Table 1), this indicates that as these variables increase, the water quality tends to worsen, and vice versa.

  • The other variables have weaker correlations with the WQI, with correlation coefficients ranging from 0.074 (nitrates) to 0.229 (lime calcium). None of the parameters is negatively correlated with the WQI.

Table 3

A correlation analysis table showing water quality parameter relationships and their impact on the WQI

| Parameters | pH | Chloride | Nitrates | Nitrites | Lime calcium | Sulfates | TDS | TH | EC | Mg | WQI |
|---|---|---|---|---|---|---|---|---|---|---|---|
| pH | 1.000 | | | | | | | | | | |
| Chloride | 0.418 | 1.000 | | | | | | | | | |
| Nitrates | −0.150 | 0.129 | 1.000 | | | | | | | | |
| Nitrites | 0.165 | −0.028 | 0.030 | 1.000 | | | | | | | |
| Lime calcium | 0.277 | 0.066 | −0.011 | 0.013 | 1.000 | | | | | | |
| Sulfates | 0.571 | 0.309 | 0.065 | 0.043 | 0.680 | 1.000 | | | | | |
| TDS | 0.615 | 0.362 | 0.063 | 0.151 | 0.697 | 0.894 | 1.000 | | | | |
| TH | 0.517 | 0.355 | 0.030 | 0.068 | 0.740 | 0.887 | 0.887 | 1.000 | | | |
| EC | 0.513 | 0.365 | 0.042 | 0.103 | 0.725 | 0.867 | 0.911 | 0.962 | 1.000 | | |
| Mg | 0.498 | 0.338 | 0.038 | 0.059 | 0.747 | 0.882 | 0.880 | 0.994 | 0.968 | 1.000 | |
| WQI | 0.326 | 0.139 | 0.074 | 0.953 | 0.229 | 0.310 | 0.425 | 0.353 | 0.388 | 0.344 | 1.000 |

Correlation analysis suggests that about seven parameters have a stronger effect on the WQI.
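
For reference, the Pearson coefficient reported in Table 3 can be computed as in the sketch below. The function name `pearson_r` and the input values are hypothetical; the significance test (p-value) mentioned in the text is not reproduced here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations and the two sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Invented values: a perfectly increasing and a perfectly decreasing relation.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))
```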

Model construction process

Python programming was used to create, construct, and train the models. The process of constructing a model typically involves three stages: training, validation, and testing. During the training stage, the model is exposed to a set of input–output patterns. In the validation stage, the model's performance is evaluated on patterns it has not seen before. Finally, during the testing stage, the model's performance is evaluated on unknown patterns that it has not been trained on or validated with (Ahmed et al. 2019). For the current study, the majority of the data is used for training the model, while smaller portions are set aside for testing and validation. In this specific case, the training dataset consists of 55% of the original data, while the testing dataset is 25% and the validation dataset is 20%.
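
A three-way split with these proportions could be sketched as follows. This is an assumption-laden illustration rather than the authors' procedure: the function name `three_way_split`, the seed, and the rounding rule are hypothetical, and in practice scikit-learn's `train_test_split` applied twice is a common alternative.

```python
import random


def three_way_split(items, test_frac=0.25, val_frac=0.20, seed=0):
    """Shuffle items and split them into train, validation, and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Sizes of the held-out parts; whatever remains is used for training.
    n_test = round(test_frac * len(shuffled))
    n_val = round(val_frac * len(shuffled))
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


sample_ids = range(1, 31)  # 30 groundwater samples, identified here by number
train, val, test = three_way_split(sample_ids)
print(len(train), len(val), len(test))
```

With only 30 samples the parts are very small, so each misclassified test sample moves the reported accuracy by several percentage points.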

Groundwater quality statistical analysis

The statistical summary of the physicochemical analysis of groundwater of the study area in terms of permissible limit, minimum, maximum, average value, and standard deviation is described in Table 4.

Table 4

Statistical summary of various physicochemical parameters of groundwater

| Parameters | Permissible limit | Minimum | Maximum | Average value | Standard deviation |
|---|---|---|---|---|---|
| pH | 8.5 | 7.30 | 8.90 | 7.85 | 0.37 |
| Chlorides | 250 mg/l | 34 | 2,220 | 331.23 | 560.51 |
| Nitrates | 10 mg/l | 0 | 20 | 4.60 | 6.90 |
| Nitrites | 0.02 mg/l | 0 | 5 | 0.59 | 1.32 |
| Lime calcium | 75 mg/l | 15 | 590 | 81.93 | 93.82 |
| Sulfates | 400 mg/l | 100 | 1,700 | 390.67 | 459.27 |
| TDS | 1,000 mg/l | 180 | 5,100 | 954.33 | 1,199.13 |
| TH | 500 mg/l | 40 | 2,050 | 383.50 | 446.08 |
| EC | 0.7 dS/m | 0.24 | 8.03 | 1.55 | 1.86 |
| Mg | 50 mg/l | 39 | 1,730 | 305.90 | 380.27 |

pH

The pH of water determines whether it is acidic or basic; pH is also an indicator of the corrosivity of water (the lower the pH, the higher the chance of corrosion). The recommended pH of drinking water is reported as between 7.0 and 8.5 and between 6.5 and 8.5 (WHO 2004). Human health is affected when the pH of water is beyond the allowable range, as it may damage the mucous membrane; the water supply system may also be damaged (Shakoor et al. 2022). In the present study, the pH of the groundwater samples ranged between 7.3 and 8.9, with a mean value of 7.85. These pH results are similar to those reported by Shakoor et al. (2022) for the groundwater of Piryaloi Union Council (UC), Sindh, Pakistan.

EC

EC determination aids in quickly assessing mineralization, changes in natural water, and wastewater, as well as in the determination of the chemical reagents to be mixed in a water sample (Panhwar et al. 2022). A greater EC reflects a higher concentration of salts and indicates a more saline water. The EC of water also varies with temperature: as the temperature increases, the EC increases. Solangi et al. (2019b) reported that the EC of drinking water should be less than 0.7 dS/m. The higher the concentration of dissolved ions in water, the higher the EC (Mohsin et al. 2013; Ali et al. 2023). The EC of the analyzed groundwater ranged from 0.24 to 8.03 dS/m, with a mean value of 1.55 dS/m (Table 4). A similar trend of EC values for the groundwater of Sukkur city in Sindh, Pakistan was reported by Ansari et al. (2021).

TDS

TDS is a main parameter for evaluating the suitability of water for domestic, industrial, and irrigation purposes (Solangi et al. 2019a). The mixture of dissolved inorganic and organic salts present in water is termed TDS. TDS is highly dependent on the EC of water as the rise in EC increases the solubility of water which in turn increases the amount of dissolved solids in water. TDS in water can come from a variety of natural and anthropogenic sources, including soil nature, sewage, and industrial waste. The weight of the residue left after evaporating the water to dryness, by heating for 1 h at 180 °C, gives the amount of dissolved matter. In the study area, the TDS values ranged between 180 and 5,100 mg/l, with a mean value of 954.33 mg/l. When the TDS concentration in water is beyond 1,000 mg/l, the water is considered unsuitable for drinking and causes gastrointestinal irritation to consumers. However, the desirable limit of TDS in drinking water is 500 mg/l. The observed values of TDS in the groundwater of the study area are higher than the values prescribed by WHO. A similar trend of TDS values for the groundwater of Sukkur city, Sindh, Pakistan (a neighboring city of the present study area), and Larkana taluka, Sindh, Pakistan was reported by Ansari et al. (2021) and Jamali et al. (2023), respectively.

Magnesium

The association of calcium and magnesium ions is responsible for water hardness (Solangi et al. 2019b). It has been reported that a lack of magnesium in the human body may cause various diseases. However, an excessive magnesium content, i.e., higher than 125 mg/l, is dangerous for human health and may cause laxative effects. The cations and anions of magnesium, calcium, carbonate, sulfate, chlorides, and bicarbonate are the main causes of temporary hardness in water. The allowable limit of Mg in drinking water according to the WHO (2006) is 50 mg/l. Rosanoff (2013) described that universal drinking water and beverages containing moderate-to-high amounts of Mg could prevent more than 4 million cases of heart disease and deaths due to stroke annually. The Mg content of the groundwater of the study area ranged between 39 and 1,730 mg/l, with a mean value of 305.90 mg/l. Ansari et al. (2021) reported a similar trend of Mg concentration in the groundwater of Sukkur city, Sindh, Pakistan.

Nitrates and nitrites

Nitrites and nitrates are salts that occur in groundwater either naturally or as a result of chemical and other artificial inputs. Nitrites are mostly found in fertilizers, runoff water, mineral deposits, and sewage. In our study, the nitrite content of the groundwater ranged from 0 to 5 mg/l, with a mean value of 0.59 mg/l, whereas the nitrate content ranged from 0 to 20 mg/l, with a mean value of 4.60 mg/l. A similar trend of nitrate and nitrite concentrations in the groundwater of Sukkur city was reported by Ansari et al. (2021).

Lime calcium

The lime calcium content ranged from 15 to 590 mg/l, with a mean value of 81.93 mg/l. WHO (2006) has suggested a desirable value of 75 mg/l. According to some evidence, taking calcium supplements may help to prevent heart diseases (Li et al. 2018). A similar trend of lime calcium concentration in the groundwater of Piryaloi UC was reported by Shakoor et al. (2022).

Chloride

Chloride is a mixture of chlorine gas, metal, and some small earth crust materials, and it is one of the principal dissolved minerals in most natural waters. The WHO guideline for the chloride content of potable water is 250 mg/l. An excessive concentration in water causes high blood pressure and may lead to kidney and heart diseases. Chloride can harm freshwater bodies and lakes because it dissolves in water from a variety of sources, including heavy industrial waste and waste from treatment plants. In this study area, the chloride content ranged from 34 to 2,220 mg/l, with a mean value of 331.23 mg/l. A similar trend of chloride concentration was reported by Ansari et al. (2021) and Jamali et al. (2023) for the groundwater of Sukkur city and Larkana taluka, respectively.

Total hardness

The amount of calcium and magnesium in the water is referred to as TH. In general, surface water is softer than groundwater. The WHO recommended a hardness limit of 500 mg/l. In the study area, the hardness ranged from 40 to 2,050 mg/l, with a mean value of 383.50 mg/l. A similar trend of TH concentration in the groundwater of Larkana taluka and Sukkur city was reported by Jamali et al. (2022) and Ansari et al. (2021), respectively.

Sulfate

Sulfate can be either natural or man-made; naturally, it comes from rocks or soil, while man-made sulfate comes from fertilized land runoff. One of the most critical nutrients for plants is sulfur. The WHO recommends a sulfate limit of 400 mg/l. In this study, the sulfate content ranged from 100 to 1,700 mg/l, with a mean value of 390.67 mg/l. Ansari et al. (2021) reported similar trends of sulfates in the groundwater of Sukkur city in Sindh province, Pakistan.

Spatial distribution of category of water

The water quality parameters of the samples were thoroughly evaluated, and the WQI was calculated for each of them. The WQI values ranged from 40.4 to 1,578.06, with a mean and standard deviation of 315.58 and 427.62, respectively. Out of the 30 samples, 3 were categorized as excellent water, 10 as good water, 9 as poor water, and 8 as unfit for drinking. Notably, there were no samples categorized as very poor water. The spatial distribution, as depicted in the GIS map (Figure 3), indicates that 13 samples (43.33%) fell in the excellent or good water categories, whereas the remaining 17 samples (56.67%) were poor or unfit for drinking. The findings suggest that although some of the samples tested in this study had acceptable water quality, the majority did not.
Figure 3

GIS map displaying spatial distribution of WQI.


Performance metrics and measures of accuracy

To evaluate how effectively an ML model performs on classification tasks, a confusion matrix is typically utilized. It shows how many of the model's predictions were correct and incorrect when compared to the actual outcomes. To develop a confusion matrix to test model accuracies, a popular ML toolkit in Python called ‘Scikit-learn’ was utilized.

In a four-class classification scenario (Table 5), the diagonal elements in the matrix show the number of correct predictions for each class, while the off-diagonal elements represent the number of misclassifications. The letter codes TN, TPJ, TPK, and TPL denote the number of true negatives for class 0 and the number of true positives for classes 1, 2, and 3, respectively. The codes FPI, FPII, FPIII, FJII, FJIII, FKIII, FJ, FK, FKII, FL, FLII, and FLIII indicate the misclassifications between predicted and actual classes, which could arise due to either false positives or false negatives.

Table 5

Confusion matrix for a four-class classification

| Actual class | Predicted: Class 0 | Predicted: Class 1 | Predicted: Class 2 | Predicted: Class 3 |
|---|---|---|---|---|
| Class 0 | TN | FPI | FPII | FPIII |
| Class 1 | FJ | TPJ | FJII | FJIII |
| Class 2 | FK | FKII | TPK | FKIII |
| Class 3 | FL | FLII | FLIII | TPL |

The confusion matrix for a multi-class classification can be transformed into a series of binary-class confusion matrices, one per class, in which the class of interest is treated as positive and all remaining classes as negative. This transformation allows metrics such as precision, recall, accuracy, and F1 score to be computed for each class. The following binary-class confusion matrix (Table 6) is used for each class.

Table 6

Binary classification confusion matrix

| Actual class | Predicted negative | Predicted positive |
|---|---|---|
| Negative | TN | FP |
| Positive | FN | TP |

True Positive (TP) indicates the number of correctly predicted positive instances; False Positive (FP) indicates the number of incorrectly predicted positive instances; True Negative (TN) indicates the number of correctly predicted negative instances; False Negative (FN) indicates the number of incorrectly predicted negative instances.
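The reduction from a four-class matrix to per-class binary counts can be sketched in a few lines. The example below is a minimal illustration in plain Python, not the authors' code: the function names `confusion_matrix` and `one_vs_rest` and the eight labels are hypothetical, chosen only to show the bookkeeping. Scikit-learn offers equivalent functionality through `sklearn.metrics.confusion_matrix` and `sklearn.metrics.multilabel_confusion_matrix`.

```python
from typing import List, Sequence, Tuple


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     n_classes: int) -> List[List[int]]:
    """Count predictions; rows are actual classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for actual, predicted in zip(y_true, y_pred):
        matrix[actual][predicted] += 1
    return matrix


def one_vs_rest(matrix: List[List[int]], k: int) -> Tuple[int, int, int, int]:
    """Collapse a multi-class matrix to (TN, FP, FN, TP) with class k as positive."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                 # actual k, predicted as another class
    fp = sum(row[k] for row in matrix) - tp  # other classes predicted as k
    tn = total - tp - fn - fp
    return tn, fp, fn, tp


# Hypothetical labels for eight samples (not the study's data).
y_true = [0, 1, 1, 2, 2, 2, 3, 3]
y_pred = [0, 1, 2, 2, 2, 2, 3, 1]

cm = confusion_matrix(y_true, y_pred, n_classes=4)
for k in range(4):
    tn, fp, fn, tp = one_vs_rest(cm, k)
    print(f"class {k}: TN={tn} FP={fp} FN={fn} TP={tp}")
```

For each class, the four counts always sum to the total number of samples, which is a quick sanity check on the reduction.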

Accuracy, precision, recall, and F1 score are measures used in ML to assess a model's performance.

Accuracy

Accuracy is the fraction of all predictions on a given dataset that the model gets right. It is simple to interpret, and it enables a direct comparison of different models evaluated on the same dataset:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

Precision

Precision is the fraction of positive predictions that are actually correct; it measures how well the model avoids false positives:

$$\text{Precision} = \frac{TP}{TP + FP}$$

Recall

Recall is a crucial metric for evaluating a classification model, particularly in applications where missing an instance of a particular class is more costly than raising a false positive. It measures the fraction of actual positive labels that were correctly predicted:

$$\text{Recall} = \frac{TP}{TP + FN}$$

F1 score

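F1 score is a popular evaluation metric in ML because it combines precision and recall into their harmonic mean and so provides a balanced evaluation of the model's performance:

$$\text{F1 score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$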
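As a worked illustration of the four measures, the sketch below computes them from the counts of a single binary confusion matrix. It is a minimal example in plain Python, not the implementation used in the study. The counts are hypothetical, and returning 0.0 when a denominator is zero is an assumed convention.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def precision(tp: int, fp: int) -> float:
    """Share of positive predictions that are correct (0.0 if none were made)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Share of actual positives that were found (0.0 if there are none)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical counts for one class treated as positive (not the study's data).
tp, tn, fp, fn = 3, 4, 1, 0

p = precision(tp, fp)
r = recall(tp, fn)
print(f"accuracy={accuracy(tp, tn, fp, fn):.3f} precision={p:.3f} "
      f"recall={r:.3f} f1={f1_score(p, r):.3f}")
```

With these counts the class is never missed (recall of 1.0), but one of the four positive predictions is wrong, so precision drops to 0.75 and the F1 score falls between the two.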

The labels in this study are divided into four classes: 0 = excellent water, 1 = good water, 2 = poor water, and 3 = unfit for drinking. The overall accuracies are 100, 100, 88, 75, and 50% for DTC, XGB, RF, KNN, and LR, respectively.

The confusion matrices for each algorithm, generated with Scikit-learn, are displayed in Figure 4.
Figure 4

(a–e) Confusion matrix for each model.


Algorithms results

The groundwater data were classified based on WQI thresholds using a scale from 0 to 300. The WQI defines five classes (0 = excellent water, 1 = good water, 2 = poor water, 3 = very poor water, and 4 = unsuitable for drinking), but no sample fell in the very poor water category (WQI between 200 and 300). Consequently, only four classes were used as target variables, and the WQI class was predicted using five ML algorithms: LR, DTC, XGB, RF, and KNN. Each ML model has its own predictive capabilities. Each model produced predictions for all four classes, and the performance of each model for each class is shown in Figure 5.
Figure 5

(a) 3D visualization box graph for the LR model, (b) 3D visualization box graph for the KNN model, (c) 3D visualization box graph for DTC, (d) 3D visualization box graph for the RF model, and (e) 3D visualization box graph for the XGB model.
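The step from WQI values to integer class labels can be sketched as follows. This is a hedged illustration rather than the authors' procedure: the text confirms only that very poor water lies between 200 and 300 and that the scale runs from 0 to 300, so the cut-offs at 50 and 100, the handling of boundary values, and the names `wqi_category` and `encode_present` are all assumptions.

```python
from typing import Dict, Iterable

# Ordered from best to worst quality.
CATEGORIES = [
    "excellent water",
    "good water",
    "poor water",
    "very poor water",
    "unsuitable for drinking",
]


def wqi_category(wqi: float) -> str:
    """Map a WQI value to a category (the 50 and 100 cut-offs are assumed)."""
    if wqi < 50:
        return CATEGORIES[0]
    if wqi < 100:
        return CATEGORIES[1]
    if wqi < 200:
        return CATEGORIES[2]
    if wqi <= 300:
        return CATEGORIES[3]
    return CATEGORIES[4]


def encode_present(categories: Iterable[str]) -> Dict[str, int]:
    """Assign consecutive integer labels to the categories that actually occur."""
    present = set(categories)
    ordered = [c for c in CATEGORIES if c in present]
    return {c: i for i, c in enumerate(ordered)}


# 40.4 and 1578.06 are the reported extremes; 75.0 and 150.0 are hypothetical.
values = [40.4, 75.0, 150.0, 1578.06]
labels = [wqi_category(v) for v in values]
mapping = encode_present(labels)
print([mapping[c] for c in labels])
```

Because no value falls in the very poor range, the encoder skips that category and unsuitable for drinking receives label 3, matching the four-class labelling used in this study.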


F1 score, recall, and precision are useful for assessing water quality forecasts, particularly in classification tasks that assign water samples to quality classes (e.g., from excellent water to unsuitable for drinking). These indicators help evaluate the efficacy of the prediction model; high scores suggest that the model predicts water quality well.

The LR model achieved perfect precision but low recall for class 0, indicating that every sample it assigned to that class truly belonged to it, but that it missed many of the samples that actually belonged to the class. For class 1, the model had low precision, indicating that many negative instances were misclassified as positives, but relatively high recall, indicating that many of the positive instances were correctly identified. Class 2 achieved the highest performance, with relatively high precision, recall, and F1 score. The model failed to identify any positive instances in class 3. Overall, the model's accuracy was 50%, which indicates poor performance.

The KNN model performed well, with high precision, recall, and F1 score for most classes, except for class 1, where it achieved moderate performance. All classes have a perfect F1 score except for class 2, which has an F1 score of 0.50. The DTC and XGB models performed perfectly, achieving a precision, recall, and F1 score of 1.00 for all classes. The RF model has high precision, recall, and F1 score for classes 0, 1, and 2, but its recall and F1 score for class 3 are 0, indicating poor performance for that class, as shown in Figure 5.
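To make the per-class interpretation concrete, the sketch below builds a per-class report from a small set of labels. The labels are hypothetical and were chosen only to reproduce the kinds of patterns discussed above (perfect precision with low recall, low precision with high recall, and a class that is never predicted); they are not the study's predictions, and `per_class_report` is an illustrative name.

```python
from typing import Dict, Sequence


def per_class_report(y_true: Sequence[int], y_pred: Sequence[int],
                     classes: Sequence[int]) -> Dict[int, Dict[str, float]]:
    """Precision, recall, F1 and support for each class, treated one-vs-rest."""
    report = {}
    for k in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == k and p == k)
        predicted_k = sum(1 for p in y_pred if p == k)  # TP + FP
        actual_k = sum(1 for t in y_true if t == k)     # TP + FN (support)
        prec = tp / predicted_k if predicted_k else 0.0
        rec = tp / actual_k if actual_k else 0.0
        f1 = 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0
        report[k] = {"precision": prec, "recall": rec, "f1": f1,
                     "support": actual_k}
    return report


# Hypothetical labels for eight samples (not the study's data).
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 1, 1, 1, 2, 2, 1, 2]

report = per_class_report(y_true, y_pred, classes=[0, 1, 2, 3])
for k, row in report.items():
    print(f"class {k}: precision={row['precision']:.2f} "
          f"recall={row['recall']:.2f} f1={row['f1']:.2f}")
```

In this toy case, class 0 has perfect precision but a recall of 0.50 because one of its two samples was assigned elsewhere, while class 3 is never predicted and therefore scores 0 on every measure.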

Weighted average accuracy is a statistic that accounts for a model's performance in predicting distinct classes of water quality while taking the class distribution into account. It computes the average accuracy for each class and then weights these accuracies depending on the proportions of the classes.

Mathematically, the weighted average accuracy is the sum over all classes of (proportion of each class × accuracy of that class). A high weighted average accuracy indicates that the model makes accurate predictions across all water quality classes. Class-weighted averages are used to evaluate the models' performance metrics because the classes contain different numbers of samples. The comparison of the weighted averages of the models (for precision, recall, and F1 score) is shown in Figure 6. With a weighted average of 100%, the XGB and DTC models produce excellent predictions across all classes and metrics: a weighted average of 100% indicates that the model makes no mistakes and predicts every water quality class with perfect F1 score, precision, and recall.
Figure 6

Weighted average comparison of classification models.
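The weighting described above amounts to a support-weighted mean of the per-class scores. The sketch below shows the arithmetic in plain Python; the per-class scores and class sizes are hypothetical, and `weighted_average` is an illustrative name rather than the authors' code. In Scikit-learn, the same weighting is available through `average="weighted"` in `sklearn.metrics.precision_recall_fscore_support`.

```python
from typing import Sequence


def weighted_average(scores: Sequence[float], supports: Sequence[int]) -> float:
    """Mean of per-class scores, each weighted by its class's share of samples."""
    total = sum(supports)
    return sum(score * n for score, n in zip(scores, supports)) / total


def macro_average(scores: Sequence[float]) -> float:
    """Unweighted mean of per-class scores, shown for contrast."""
    return sum(scores) / len(scores)


# Hypothetical per-class recall values and class sizes (not the study's data).
recalls = [1.0, 0.5, 1.0, 0.0]
supports = [1, 3, 3, 1]

print(f"weighted={weighted_average(recalls, supports):.4f}")
print(f"macro={macro_average(recalls):.4f}")
```

The two averages differ whenever the classes are unequal in size. When the per-class score is recall, the weighted average equals the overall accuracy, since each class contributes its correctly predicted samples divided by the total number of samples.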


The weighted average accuracy of LR is quite low, indicating that the model's performance is poor. It implies that the model misclassifies a substantial share of the samples across the four water quality classes. This low accuracy might be attributed to factors such as an unsuitable choice of model or insufficient training data.

Analysis of the groundwater based on physicochemical parameters revealed that the EC ranged between 0.24 and 8.03 dS/m, with an average of 1.55 dS/m. The TDS ranged between 180 and 5,100 mg/l, with a mean of 954.33 mg/l. The Mg concentrations varied from 39 to 1,730 mg/l, with a mean of 305.90 mg/l. The calcium concentrations ranged between 15 and 590 mg/l, with a mean of 81.93 mg/l. The chloride concentrations varied from 34 to 2,220 mg/l, with a mean of 331.23 mg/l. The hardness varied from 40 to 2,050 mg/l, with a mean of 383.5 mg/l. Most of the samples (∼53%) had EC, TDS, calcium, magnesium, hardness, and chloride concentrations beyond the WHO drinking water quality guidelines. Based on the WQI estimations, 43.33% of the samples were found suitable for drinking, while the majority (56.67%) were found unsuitable for drinking purposes. The comparison of the ML algorithms used to forecast the groundwater quality index indicated that DTC and XGB were the most accurate models, followed by RF and KNN, with LR being the least accurate. The precision, recall, and F1 scores for the different classes were also summarized using weighted averages; the DTC and XGB models had the highest weighted averages, while LR had the lowest. Further analysis based on additional water quality parameters would be beneficial, because each attribute currently has only 30 values, which may make it difficult to train ML algorithms. Overall, the analysis revealed that groundwater in most areas of Pano Aqil city is unsuitable for drinking and should therefore be treated before it is used for drinking.

All relevant data are included in the paper or its Supplementary Information.

The authors declare there is no conflict.

Abbasnia A., Alimohammadi M., Mahvi A. H., Nabizadeh R., Yousefi M., Mohammadi A. A., Pasalari H. & Mirzabeigi M. 2018 Assessment of groundwater quality and evaluation of scaling and corrosiveness potential of drinking water samples in villages of Chabahr city, Sistan and Baluchistan province in Iran. Data in Brief 16, 182–192.
Ahmed A. N., Othman F. B., Afan H. A., Ibrahim R. K., Fai C. M., Hossain M. S., Ehteram M. & Elshafie A. 2019 Machine learning methods for better water quality prediction. Journal of Hydrology 578, 124084.
Ali Z., Bilal M., Panhwar S., Azhar M. S., Asif M. B., Subhani H. & Hassan S. S. 2023 Detailed evaluation of physicochemical properties and microbial activities of Hanna Lake and Spin Karez. Water Practice & Technology 18 (6), 1500.
Anigrou Y., Bahlami A. & El Khlifi M. 2022 Methodology for an ecological solution of subsurface flow constructed wetlands used in the treatment of greywater. Water Practice & Technology 17 (12), 2581–2597.
Ansari M. H., Solangi G. S., Bhatti N. B., Akram P., Panhwar S. S., Shah F. A. & Ansari S. 2021 An integrated indexical assessment of groundwater quality of Sukkur City, Pakistan. International Journal of Economy, Energy and Environment 6 (5), 91–97.
Bramer M. 2007 Principles of Data Mining, Vol. 180. Springer, London, p. 2.
El Bilali A., Taleb A. & Brouziyne Y. 2021 Groundwater quality forecasting using machine learning algorithms for irrigation purposes. Agricultural Water Management 245, 106625.
Haghiabi A. H., Nasrolahi A. H. & Parsaie A. 2018 Water quality prediction using machine learning methods. Water Quality Research Journal 53 (1), 3–13.
Jamali M. Z., Solangi G. S., Keerio M. A., Keerio J. A. & Bheel N. 2023 Assessing and mapping the groundwater quality of Taluka Larkana, Sindh, Pakistan using water quality indices and geospatial tools. International Journal of Environmental Science and Technology 20, 8849–8862.
Jing W., Zhao X., Yao L., Di L., Yang J., Li Y., Guo L. & Zhou C. 2020 Can terrestrial water storage dynamics be estimated from climate anomalies? Earth and Space Science 7 (3), e2019EA000959.
Khademi F., Jamal S. M., Deshpande N. & Londhe S. 2016 Predicting strength of recycled aggregate concrete using artificial neural network, adaptive neuro-fuzzy inference system and multiple linear regression. International Journal of Sustainable Built Environment 5 (2), 355–369.
Li K., Wang X. F., Li D. Y., Chen Y. C., Zhao L. J., Liu X. G., Guo Y. F., Shen J., Lin X., Deng J. & Zhou R. 2018 The good, the bad, and the ugly of calcium supplementation: A review of calcium intake on human health. Clinical Interventions in Aging 13, 2443–2452.
Mohsin M., Safdar S., Asghar F. & Jamal F. 2013 Assessment of drinking water quality and its impact on residents health in Bahawalpur city. International Journal of Humanities and Social Science 3 (15), 114–128.
Nordin N. F. C., Mohd N. S., Koting S., Ismail Z., Sherif M. & El-Shafie A. 2021 Groundwater quality forecasting modelling using artificial intelligence: A review. Groundwater for Sustainable Development 14, 100643.
Panhwar M. Y., Panhwar S., Keerio H. A., Khokhar N. H., Shah S. A. & Pathan N. 2022 Water quality analysis of old and new Phuleli Canal for irrigation purpose in the vicinity of Hyderabad, Pakistan. Water Practice & Technology 17 (2), 529–536.
Sener S., Sener E. & Davraz A. 2017 Evaluation of water quality using water quality index (WQI) method and GIS in Aksu River (SW-Turkey). Science of the Total Environment 584–585, 131–144.
Shakoor A., Solangi G. S., Babar M. M., Naila Gul Shaikh N. G. & Brohi R. Z. 2022 Groundwater quality assessment of U.C Piryaloi, District Khairpur Mir's, Sindh, Pakistan. International Research Journal of Modernization in Engineering Technology and Science 4 (6), 5864–5869.
Solangi G. S., Siyal A. A., Babar M. M. & Siyal P. 2017 Groundwater quality mapping using geographic information system: A case study of District Thatta, Sindh. Mehran University Research Journal of Engineering and Technology 36, 1059–1072.
Solangi G. S., Siyal A. A., Babar M. M. & Siyal P. 2019a Groundwater quality evaluation using the water quality index (WQI), the synthetic pollution index (SPI), and geospatial tools: A case study of Sujawal district, Pakistan. Human and Ecological Risk Assessment: An International Journal 26 (6), 1529–1549.
Solangi G. S., Siyal A. A., Siyal Z., Siyal P., Panhwar S., Keerio H. A. & Bhatti N. B. 2022 Social and ecological climate change vulnerability assessment in the Indus Delta, Pakistan. Water Practice & Technology 17 (8), 1666.
Subramani T., Rajmohan N. & Elango L. 2010 Groundwater geochemistry and identification of hydrogeochemical processes in a hard rock region, Southern India. Environmental Monitoring and Assessment 162 (1), 123–137.
Tahmasebi P., Kamrava S., Bai T. & Sahimi M. 2020 Machine learning in geo-and environmental sciences: From small to large scale. Advances in Water Resources 142, 103619.
Voyant C., Notton G., Kalogirou S., Nivet M. L., Paoli C., Motte F. & Fouilloy A. 2017 Machine learning methods for solar radiation forecasting: A review. Renewable Energy 105, 569–582.
World Health Organization, & WHO 2004 Guidelines for Drinking-Water Quality, Vol. 1. World Health Organization, Geneva.
World Health Organization 2006 The World Health Report 2006: Working Together for Health. World Health Organization, Geneva.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).