The objective of this research was to improve the assessment of surface water quality in the Constantine region. The work compares three classical indices, WQINSF (National Sanitation Foundation Water Quality Index), WQICCME (Canadian Council of Ministers of the Environment Water Quality Index) and WQIAP (weighted arithmetic Water Quality Index), develops a new index, and predicts the WQI values with an artificial neural network (ANN). Principal component analysis (PCA) was used to select the 10 parameters entering the calculation of the classical WQIs, and the eight principal components used as inputs to the newly proposed index (regularized WQI, WQIR). The ANN was then applied to build prediction models for both the classical and the newly developed WQI. The results show that, among the classical indices, WQIAP gives the best assessment of water quality, and that the regularized WQI improves the assessment further. WQIR shows that, after the pollution peak, the water quality does not return to its initial state. ANN modeling offers an effective alternative for predicting the WQI; the ANN predicts the new index WQIR (R² = 0.999) better than the classical index WQIAP (R² = 0.99).

  • The first principal components with an eigenvalue greater than 1 are used as input in the calculation of the newly developed index (WQI regularized).

  • The regularized WQI improves the assessment of the water quality of the Constantine catchment area compared to the classical indices WQIAP (weighted arithmetic), WQINSF and WQICCME.

  • ANN prediction of the classical and regularized WQI is evaluated using six fitness criteria: R, RMSE, MAE, NE, IOS and RE%.

  • The classical ANN model is mainly influenced by temperature, OS, NO3 and BOD5.

  • The regularized ANN model is mainly influenced by Component 2 and Component 6: Component 2 is closely related to NO3, Tu, pH and temperature, while Component 6 is positively correlated with NO2 and OS.

Graphical Abstract

Surface water quality assessment is important in hydro-environmental management. Many organizations and agencies have adopted water quality indices (WQIs) as a tool for water quality assessment and management. A water quality index (WQI) is a single-valued numerical expression that assesses the quality of a given body of water at a specific location and time, based on several water quality parameters. WQIs also provide classification values that describe the quality status of surface waters, allowing pollutant loads to be categorized and quality classes to be designated (Khuan et al. 2002; Sujana Prajithkumar & Mane 2014; Sutadian et al. 2016).

Water Quality Indices (WQIs) reduce a large number of parameters into a simpler expression to allow easy and efficient interpretation of monitoring data (Sujana Prajithkumar & Mane 2014).

The quality parameters commonly measured to assess water quality are divided into physical parameters (temperature, electrical conductivity, taste, total suspended solids (TSS), turbidity, odor, color and total dissolved solids (TDS)), chemical parameters (pH, biochemical oxygen demand (BOD), chemical oxygen demand (COD), dissolved oxygen (DO), total hardness, phosphates, pesticides, nitrates, surfactants and heavy metals) and biological parameters (bacteria such as fecal coliforms and Escherichia coli, protozoa such as Cryptosporidium and Giardia lamblia, viruses, fungi and parasitic worms).

Most studies analyzing water quality indices using biological parameters are focused on marine systems (Liu et al. 2015; Frena et al. 2019) rather than rivers (Crabill et al. 1999; Lee et al. 2016; Seo et al. 2019). Often these biological parameters are treated as water quality factors (Lin & Ganesh 2013; Seo et al. 2019; Kothari et al. 2021).

The concept of the WQI was first introduced in Germany in 1848, where the presence or absence of certain elements in water was used as an indicator of water quality (Abbasi & Abbasi 2012). Horton (1965) developed the first modern WQI. Since then, many variants and applications have been proposed: Yidana et al. (2010) combined GIS with a multivariate statistical method to calculate a WQI, and Achary (2017) used the WQI to assess groundwater quality for drinking in the Bhubaneswar region of Odisha state, India.

Mukaite et al. (2019) developed the IWQI, a WQI that takes into account both the desirable limit (DL) and the maximum allowable limit of each water parameter. In contrast, the WQImin index suggested by Nong et al. (2020) uses only five crucial parameters, namely total phosphorus, fecal coliform, mercury, water temperature and dissolved oxygen. The model was constructed using stepwise multiple linear regression analysis and showed excellent performance in assessing water quality.

The water pollution index (WPI) developed by Hossain & Patra (2020) is based on the standard permissible limits of groundwater parameters recommended by the BIS and WHO. It aims to evaluate the degree of pollution of groundwater used for drinking from the measured water quality parameters.

Islam et al. (2020) proposed a modified integrated WQI (MIWQI) based on principal component analysis (PCA) and compared it with the entropy WQI (EWQI). These indices mainly rely on four steps: selection of parameters, assignment of weights and relative weights, conversion to a common scale (i.e. calculation of sub-indices), and aggregation of the sub-indices (Brown et al. 1970; Abbasi & Abbasi 2012; Sutadian et al. 2016; Hossain & Patra 2020). The aforementioned work indicates that, apart from the number and choice of parameters, the sample size and, in particular, the weighting of the parameters are of great importance in the assessment of water quality. The weighting can be fixed or variable, and is sometimes subjective, depending on the opinions of experts (Abbasi & Abbasi 2012; Rezaei et al. 2017; Tripathi & Singal 2019).

To better understand water quality and to determine the parameters that directly influence the estimation of the indices, we used surface water quality modeling. Surface water quality is difficult to model because water quality data are limited and monitoring is costly. Artificial intelligence offers a practical solution for several types of environmental problems, as the computations are very fast and require far fewer parameters and input conditions than deterministic models (Hameed et al. 2016). Its application to the simulation of water quality parameters is cost-effective, fast and reliable.

Common methods for water quality prediction include artificial neural network (ANN) techniques, which are intensively used for model comparison and optimization (Sujana Prajithkumar & Mane 2014), and for data compression and prediction (Hameed et al. 2016; Sahaya Vasanthi & Adish Kumar 2019; Singh et al. 2021).

Various ANN applications are used to predict water quality and to determine the index from independent variables. Sujana Prajithkumar & Mane (2014) used modular neural networks and radial basis function networks to create two models for predicting the WQI of the Panava River in India. Nourani et al. (2013) used a neural network to calculate the WQI and found that it performed better than other conventional methods.

Hameed et al. (2016) used the ANN application to predict tropical water quality parameters in the Langat and Klang river basins in Malaysia. Two different models were applied to examine and imitate the relationship of WQI with water quality variables, namely the back propagation neural network (BPNN) and the radial basis function neural network (RBFNN).

Sahaya Vasanthi & Adish Kumar (2019) predicted the WQI of Parakai Lake, Tamil Nadu, India, and showed that the ANN model performs better and is more accurate than the multiple linear regression (MLR) model. Isiyaka et al. (2018) used an ANN and a multivariate statistical technique to reduce the number of parameters and the number of water quality monitoring stations in the Kinta River, Malaysia. Sahoo et al. (2015) proposed an adaptive neuro-fuzzy inference system (ANFIS) for water quality prediction in the Brahmani river. Kouadri et al. (2021) used an ANN to predict the WQI of groundwater in the El Merk region (South-East Algeria), using mineralization, TH, NO3 and NO2 as inputs.

Singh et al. (2021) used neural-based soft computing techniques, an ANN and generalized regression neural network (GRNN), and a hybrid soft computing technique, an ANFIS with four membership functions to predict WQIs in the Khorramabad, Biranshahr, and Alashtar subwatersheds in Iran.

In practice, there is no globally accepted methodology for improving a WQI. The main causes of model weakness are the omission or poor choice of parameters, insufficient knowledge of the relative importance of the parameters and the resulting unadjusted weighting, and the small size of the databases.

This work proposes an alternative approach for developing a new index, with the aim of improving the assessment and prediction of water quality in the Constantinois coastal watershed. The approach combines multivariate statistics (PCA) with ANN regression to improve on the evaluation given by the classical indices. The newly developed WQIR index exploits the ability of principal components to summarize the information in a reduced number of variables, and was able to reveal pollution points that were eclipsed in the classical indices. The work proceeds in four steps:

  • In the first step, the application of PCA is performed to select the variables to be involved in the calculation of the classical WQI, and the components to be used as input to the developed model.

  • In the second step, the focus is on finding the best assessment of surface water quality in the basin by comparing three traditional WQIs: the Weighted Arithmetic Index (WQIAP), the National Sanitation Foundation Index (WQINSF), and the Canadian Council of Ministers of the Environment Water Quality Index (WQICCME).

  • In the third step, we sought to improve the evaluation of the retained WQI by proposing a new WQI index: the WQIR (regularized WQI, based on principal components). Particular attention was given to the weighting of the parameters.

  • In the fourth step, the neural network model prediction of the WQIAP indices and the new WQIR index is detailed.

Study area

The ‘Coastal Constantinois’ watershed is located on the northern coast of eastern Algeria (Figure 1). It is bounded to the north by the Mediterranean Sea, to the east by the Tunisian border, to the west by the ‘Algeria Hodna-Soummam’ basin, and to the south by the ‘Kebir-Rumel, Sybous-Medjerdah’ basins. The Constantinian Coastal Basin covers an area of 11,509 km² and is subdivided into three sub-basins:
  • western Constantinian coasts, with an area of 2,424 km²;

  • central Constantinian, with an area of 5,582 km²;

  • eastern Constantinian coasts, with an area of 3,203 km².

Figure 1

Map of the geographical situation of the Coastal Constantinois watershed.

Data description

Seven sampling stations (Table 1) were chosen to monitor water quality in the Constantinois coastal watershed. Sampling was carried out monthly over a period of 9 years (January 2010–December 2018) by agents of the National Water Resources Agency, Eastern region (Algeria); only one sample per month was taken into account for the temporal follow-up of water quality. The data set suffers from gaps, missing pollution parameters and small sample sizes.

Table 1

Location of the measuring stations in the ‘Coastal Constantinois’ watershed

| Code | Sub-basin | State | X (m) Lambert | Y (m) Lambert |
|---|---|---|---|---|
| Bge. Mexa, St.031609 | Oued Kebir East | El Tarf (Bougous) | 1,007,932 | 398,848 |
| Bge. Zit Emba, St.031102 | Oued Kebir Hammem | Skikda (Bekkouche Lakhdar) | 909,629 | 383,821 |
| Bge. Zerdezas, St.030902 | Oued Safsaf | Skikda (Zerdezas) | 875,820 | 373,112 |
| Bge. Guenitra, St.030701 | Oued Guebli | Skikda (Oum Toub) | 851,771 | 385,930 |
| Bge. Bni Zid, St.030711 | Oued Guebli | Skikda (Bni Zid) | 836,630 | 406,182 |
| Bge. Cheffia, St.031501 | Cotiers Bounamoussa | El Tarf (Cheffia) | 977,367 | 380,540 |
| Bge. El Agreme, St.030303 | Cotiers Jijel | Jijel (Kaous) | 779,250 | 385,450 |

Source: National River Basin Agency.

Twenty-three physical and chemical parameters were measured: turbidity (Tu), suspended solids (TSS), temperature (T), electrical conductivity (Cond), pH, phosphates (PO4), calcium (Ca), magnesium (Mg), sulfates (SO4), chlorides (Cl), bicarbonates (HCO3), sodium (Na), potassium (K), nitrates (NO3), nitrites (NO2), biochemical oxygen demand (BOD5), chemical oxygen demand (COD), dissolved oxygen (DO), oxygen saturation (OS), organic matter (OM), ammonium (NH4), total alkalinity (TA) and dry residue (Rs).

Normalization was performed for these 23 parameters for the different monitoring stations. The variability of these parameters is shown as box plots in Figure 2.
Figure 2

Z-score variability of the seven station data set.

Each data set is summarized in Figure 2 by the extremes (minimum and maximum), the median, the quartiles and, where present, outlying values (coded *). The top border of the box represents the 75th percentile and the bottom border the 25th percentile. The vertical length of the box represents the interquartile range, while the centerline shows the median.
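As a minimal sketch of the two operations behind Figure 2, the snippet below standardizes a series to z-scores and extracts the quantities a box plot draws. The nitrate readings are hypothetical values chosen for illustration, and the quartile convention (`method="inclusive"`) is an assumption; the plotting software used in the study may interpolate quartiles differently.

```python
import statistics

def z_scores(values):
    """Standardize measurements to zero mean and unit sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def box_plot_summary(values):
    """Return the quantities drawn by a box plot: extremes, quartiles and IQR."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        "iqr": q3 - q1,  # vertical length of the box
    }

# Hypothetical monthly nitrate readings (mg/L), for illustration only.
no3 = [2.0, 4.0, 4.0, 5.0, 6.0, 9.0, 12.0]
z = z_scores(no3)
summary = box_plot_summary(z)
```

A small `iqr` relative to the whisker range corresponds to the "squashed" boxes described in the text.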

The box plots show that for OM, turbidity, NH4, PO4, NO2, K and TSS the box is small (the rectangles are squashed) and the whiskers are short, which indicates that the values are more uniform and less scattered, i.e. closer to the median. The distribution is skewed towards the maximum values for BOD5 and towards the minimum values for COD. For NO3, SO4, pH, DO, Cond, TA, HCO3, Na and temperature, the median (Q2, the 50th percentile) lies at the center of the box; these parameters are homogeneous and the box shape is symmetrical. The box plots also reveal distant (extreme) values, which denotes the great variability of the data.

Principal component analysis

There is no general rule for the selection of parameters for WQI models. Experts propose several approaches, including the Delphi method, which organizes the consultation of a group of experts in order to obtain a final, convergent group opinion (Saha 2014). The other commonly used approach relies on statistical methods, including Pearson's correlation coefficient and principal component/factor analysis (PCA/FA) (Abbasi & Abbasi 2012; Tripathi & Singal 2019). PCA reduces the dimensionality of the data without losing much information (Diallo et al. 2014; Reggam et al. 2015; Bouslah 2017; Tripathi & Singal 2019; Islam et al. 2020). According to Liu et al. (2021), the PCA method is more accurate than the Delphi method, and applying it in the calculation of WQIs is reasonable and scientifically sound. In this study, PCA is applied to reduce the dimensionality of the data set, to avoid bias in parameter selection, to identify the main possible sources of pollution, and to select the inputs used to develop the artificial neural network prediction model. All analyses were performed using R software (R x64, version 4.1.2).
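The study ran its analyses in R; purely for illustration, the sketch below shows the same kind of workflow in Python with NumPy: standardize the variables, diagonalize the correlation matrix, and keep the components whose eigenvalue exceeds 1. The function name `pca_kaiser` and the synthetic data are assumptions made here, not part of the original work.

```python
import numpy as np

def pca_kaiser(X):
    """PCA on the correlation matrix; retain components with eigenvalue > 1."""
    X = np.asarray(X, dtype=float)
    # Standardize each variable (column) to z-scores.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]        # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()      # share of total variance
    keep = eigvals > 1.0                     # eigenvalue-greater-than-1 rule
    scores = Z @ eigvecs[:, keep]            # component scores per sample
    return eigvals, explained, scores

# Synthetic data: two correlated pairs of variables plus one noise variable.
rng = np.random.default_rng(0)
base = rng.normal(size=(100, 2))
X = np.column_stack([
    base[:, 0], base[:, 0] + 0.1 * rng.normal(size=100),
    base[:, 1], base[:, 1] + 0.1 * rng.normal(size=100),
    rng.normal(size=100),
])
eigvals, explained, scores = pca_kaiser(X)
```

Because the analysis is done on standardized variables, the eigenvalues sum to the number of variables, which is why an eigenvalue above 1 means a component carries more information than any single original variable.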

Artificial neural network

Artificial neural networks have been used to overcome many engineering problems, notably in the environmental field of water quality, because they avoid ambiguity and the eclipsing effect of variables (Behboudian et al. 2014; Hameed et al. 2016; Isiyaka et al. 2018; Garcia et al. 2019). An ANN is characterized by its capacity to model complex and non-linear processes without any prior knowledge of the relationship between input and output variables (Hameed et al. 2016).

In this work, a multilayer perceptron (MLP) ANN with a single hidden layer (Figure 3) is used with a back-propagation algorithm. The hyperbolic tangent sigmoid function (TANSIG) is selected as the transfer (activation) function. It is written as follows:
$$f(x) = \frac{2}{1 + e^{-2x}} - 1 \qquad (1)$$
Figure 3

The schematic representation of an artificial neural network.

The signals are transmitted from the input layer (independent variables) to the hidden layer via a system of weighted connections for processing, and finally to the output layer (dependent variable). The network is trained extensively to optimize the number of hidden nodes.
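The forward pass described above can be sketched in a few lines. This is an illustration under stated assumptions: the weights and inputs are hypothetical, the output neuron is taken to be linear (the text does not specify the output activation), and training by back-propagation is not shown.

```python
import math

def tansig(x):
    """Hyperbolic tangent sigmoid, 2 / (1 + exp(-2x)) - 1, equal to tanh(x)."""
    # For very large negative x, math.tanh is the numerically safer choice.
    return 2.0 / (1.0 + math.exp(-2.0 * x)) - 1.0

def forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """Single-hidden-layer MLP: inputs -> tansig hidden layer -> linear output."""
    hidden = [
        tansig(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

# Hypothetical network: 3 standardized inputs, 2 hidden neurons, 1 output.
inputs = [0.2, -0.1, 0.4]
w_hidden = [[0.5, -1.0, 0.2], [1.5, 0.3, -0.4]]  # one row per hidden neuron
b_hidden = [0.1, -0.2]
w_out = [2.0, -1.0]
b_out = 0.5
prediction = forward(inputs, w_hidden, b_hidden, w_out, b_out)
```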

The data were divided into training (80%) and validation (20%) sets. The training set is used to adjust the weights, i.e. to estimate the model parameters, while the validation set is used to evaluate the performance of the trained network (Thurston et al. 2011; Isiyaka et al. 2018). In this study, the networks are trained and validated based on the formula proposed by Caudill (1988) (see Benbouras et al. 2021).

The statistical performance indicators are mean absolute error (MAE), root mean square error (RMSE), Pearson correlation coefficient (R), index of dispersion (IOS), Nash–Sutcliffe coefficient (NE) and relative error (RE%) (Hameed et al. 2016; Benbouras et al. 2021).
(2)
(3)
(4)
(5)
(6)
(7)
where WQIa is the observed value of the WQI, WQIp is the predicted value of the WQI, and N is the number of samples. The highest R together with the lowest RMSE and MAE indicates the best predictive ability of the ANN model (Hameed et al. 2016; Isiyaka et al. 2018). To find the best structure of the ANN model, simulations using different numbers of neurons are performed. The performance of each model is recorded and compared in order to decide which architecture is the most suitable (Kouadri et al. 2021).
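For orientation, the sketch below computes the six criteria using their commonly published forms. MAE, RMSE, Pearson R and the Nash–Sutcliffe coefficient follow the standard definitions; the forms used for IOS (RMSE divided by the mean observed value) and RE% (mean absolute relative error, in percent) are assumptions, since the exact expressions of Equations (2)–(7) are not reproduced here. The sample values are hypothetical.

```python
import math

def fit_metrics(observed, predicted):
    """Compare observed (WQIa) and predicted (WQIp) index values."""
    n = len(observed)
    errors = [a - p for a, p in zip(observed, predicted)]
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    sq_err = sum(e * e for e in errors)
    var_obs = sum((a - mean_obs) ** 2 for a in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    cov = sum((a - mean_obs) * (p - mean_pred)
              for a, p in zip(observed, predicted))
    rmse = math.sqrt(sq_err / n)
    return {
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": rmse,
        "R": cov / math.sqrt(var_obs * var_pred),
        "NE": 1.0 - sq_err / var_obs,
        # Assumed form: RMSE scaled by the mean observed value.
        "IOS": rmse / mean_obs,
        # Assumed form: mean absolute relative error, in percent.
        "RE%": 100.0 * sum(abs(e) / abs(a)
                           for e, a in zip(errors, observed)) / n,
    }

# Hypothetical observed and predicted WQI values.
observed = [50.0, 60.0, 70.0, 80.0]
predicted = [52.0, 58.0, 71.0, 79.0]
metrics = fit_metrics(observed, predicted)
```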

To validate the model and to estimate the generalization capacity of the trained model, we use the K-fold cross-validation approach. This approach gives a more accurate and robust assessment of the ability of the optimal model to avoid over-fitting and under-fitting. It relies on dividing the database into K equal splits (folds). For each split, K − 1 folds are used for the training phase and the remaining one for validation. This procedure is repeated until every fold has been used once for the validation step (Benbouras et al. 2021).
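The splitting scheme can be sketched as follows. The helper name `k_fold_indices`, the shuffling seed and the sample count are illustrative assumptions; the value of K used in the study is not stated here.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (training, validation) index lists for K-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of (nearly) equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i
                    for idx in fold]
        yield training, validation

# Example: 20 samples, 5 folds -> each fold validates 4 samples.
splits = list(k_fold_indices(n_samples=20, k=5))
```

Each sample is used for validation exactly once and for training in the other K − 1 rounds.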

We used the connection weight approach for the importance analysis of the input variables on the prediction of the WQI. The computational procedure is to multiply the value of the connection weight of the hidden-output neurons, for each hidden neuron, by the values of the connection weights of the input-hidden layer. By doing this for each input neuron, we identified its contribution to the output (Sekiou 2014).
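A minimal sketch of the connection weight computation for a network with one hidden layer and one output neuron is given below. The weights and the input labels are hypothetical and serve only to show the arithmetic.

```python
def connection_weight_importance(w_input_hidden, w_hidden_output):
    """Sum, over hidden neurons, of (input-hidden weight x hidden-output weight).

    w_input_hidden[h][i] is the weight from input i to hidden neuron h;
    w_hidden_output[h] is the weight from hidden neuron h to the output.
    """
    n_inputs = len(w_input_hidden[0])
    return [
        sum(w_input_hidden[h][i] * w_hidden_output[h]
            for h in range(len(w_hidden_output)))
        for i in range(n_inputs)
    ]

# Hypothetical trained weights for three inputs and two hidden neurons.
names = ["T", "OS", "NO3"]
w_ih = [[0.5, -1.0, 0.2], [1.5, 0.3, -0.4]]
w_ho = [2.0, -1.0]
importance = connection_weight_importance(w_ih, w_ho)

# Rank inputs by the magnitude of their contribution; the sign gives direction.
ranking = sorted(zip(names, importance), key=lambda item: abs(item[1]),
                 reverse=True)
```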

Calculation of WQI

The WQI is an index that expresses the overall quality of the water at a certain location and time (Saha 2014), thus indicating whether the water in question is of good quality. WQIs are used by many water quality monitoring agencies and managers as a tool to address several environmental issues (Noori et al. 2019; Tripathi & Singal 2019; Kouadri et al. 2021).

In this work, we tested three classical indices (WQINSF, WQICCME and WQIAP) for the best evaluation of the surface water quality of the Constantinois basin. The three indices are presented in Table 2.

Table 2

Use, aggregation form and interpretation of WQI

| Name | Use | Aggregation form | Interpretation | Reference |
|---|---|---|---|---|
| WQINSF | General surface water quality evaluation | $WQI_{NSF} = \prod_{i=1}^{n} SI_i^{w_i}$ (weighted geometrical average), where $SI_i$ is the sub-index of parameter $i$ and $w_i$ is the weight of parameter $i$ | 0: very bad; 100: excellent | Brown et al. (1970), Noori et al. (2019) |
| WQICCME | General surface water quality evaluation | $WQI_{CCME} = 100 - \sqrt{F_1^2 + F_2^2 + F_3^2}\,/\,1.732$, where $F_1$ = (number of failed parameters/total number of parameters) × 100, $F_2$ (frequency) = (number of failed tests/total number of tests) × 100, and $F_3$ (amplitude) is calculated as described below the table | 0: bad; 100: excellent | CCME (2001), Haile & Gabbiye (2021) |
| WQIAP (weighted arithmetical) | General surface water quality evaluation | $WQI_{AP} = \sum_i W_i Q_i / \sum_i W_i$ with $Q_i = 100\,(C_i - C_0)/(S_i - C_0)$, where $Q_i$ is the quality rating scale for each parameter, $W_i$ is its weight, $C_i$ is the concentration of the parameter, $C_0$ is the ideal value of this parameter in pure water (except …), and $S_i$ is the standard value of the parameter | 0: excellent; 100: bad | Abbasi & Abbasi (2012) |

The measure for amplitude, $F_3$, is calculated as follows. The excursion is the number of times by which an individual concentration is greater than (or less than, when the objective is a minimum) the objective. When the test value must not exceed the objective:

$$excursion_i = \frac{\text{failed test value}_i}{\text{objective}_j} - 1$$

For cases in which the test value must not fall below the objective:

$$excursion_i = \frac{\text{objective}_j}{\text{failed test value}_i} - 1$$

The collective amount by which individual tests are out of compliance is calculated by summing the excursions of individual tests from their objectives and dividing by the total number of tests (both those meeting objectives and those not meeting objectives). This variable, referred to as the normalized sum of excursions (nse), is calculated as:

$$nse = \frac{\sum_{i=1}^{n} excursion_i}{\text{number of tests}}$$

$F_3$ is then calculated by an asymptotic function that scales the normalized sum of the excursions from objectives (nse) to yield a range between 0 and 100:

$$F_3 = \frac{nse}{0.01\,nse + 0.01}$$
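As a worked illustration of the CCME aggregation, the sketch below follows the published CCME (2001) definitions of F1, F2, F3 and the normalized sum of excursions. The data layout, the parameter names and the objective values are hypothetical; they are not the objectives used in this study.

```python
import math

def ccme_wqi(data):
    """CCME WQI from {parameter: {"objective", "kind", "values"}}.

    kind is "max" when values must not exceed the objective and "min" when
    they must not fall below it (values are assumed to be non-zero).
    """
    failed_params = 0
    failed_tests = 0
    total_tests = 0
    excursion_sum = 0.0
    for spec in data.values():
        objective, kind = spec["objective"], spec["kind"]
        param_failed = False
        for value in spec["values"]:
            total_tests += 1
            if kind == "max" and value > objective:
                excursion_sum += value / objective - 1.0
            elif kind == "min" and value < objective:
                excursion_sum += objective / value - 1.0
            else:
                continue  # test meets its objective
            failed_tests += 1
            param_failed = True
        failed_params += param_failed
    f1 = 100.0 * failed_params / len(data)   # scope
    f2 = 100.0 * failed_tests / total_tests  # frequency
    nse = excursion_sum / total_tests        # normalized sum of excursions
    f3 = nse / (0.01 * nse + 0.01)           # amplitude
    return 100.0 - math.sqrt(f1 ** 2 + f2 ** 2 + f3 ** 2) / 1.732

# Hypothetical objectives and readings, for illustration only.
sample = {
    "NO3": {"objective": 50.0, "kind": "max", "values": [10.0, 20.0, 100.0, 30.0]},
    "DO": {"objective": 5.0, "kind": "min", "values": [6.0, 7.0, 8.0, 2.5]},
}
index = ccme_wqi(sample)
```

In this example both parameters fail at least once (F1 = 100), two of eight tests fail (F2 = 25) and each failure has an excursion of 1, giving nse = 0.25 and F3 = 20.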

Development of a new index: the regularized WQI, a principal component-based index (WQIR)

To investigate further the estimation of water quality by indices, we propose a new index whose inputs are the scores of the principal components with eigenvalues greater than 1. This variant of the index is intended to improve the assessment of water quality, particularly where data are lacking or scarce. The new index is based on the final aggregation of the weighted arithmetic index, with redeveloped weights. Four steps are necessary to calculate the new WQIR index.

  • 1. Selection of the input parameters for the WQIR calculation

This initial step consists in determining the matrix of new input parameters (principal components), which is calculated using Equation (8):

$$
\begin{aligned}
PC_1 &= c_{11}X_1 + c_{12}X_2 + c_{13}X_3 + \cdots + c_{1p}X_p \quad (\text{axis } Y_1)\\
PC_2 &= c_{21}X_1 + c_{22}X_2 + c_{23}X_3 + \cdots + c_{2p}X_p \quad (\text{axis } Y_2)\\
&\;\;\vdots\\
PC_p &= c_{p1}X_1 + c_{p2}X_2 + c_{p3}X_3 + \cdots + c_{pp}X_p \quad (\text{axis } Y_p)
\end{aligned}
\qquad (8)
$$

where $c_{kj}$ is the coefficient of the principal component score of variable $X_j$ (one of the 23 variables) on component $PC_k$ (axis $Y_k$). The resulting matrix has one row per surface water sample and one column per principal component with an eigenvalue greater than 1.
  • 2. Calculation of the parameter rating scale

According to this new approach, the water quality rating is mainly based on the observed concentration and the standard allowable concentration of the new parameters, and can be adjusted to the total number of applied variables as desired by the user.

The quality rating scale for each new parameter is calculated from the following quantities:
  • the estimated concentration of the new parameter in the analyzed water;

  • the recommended standard value of the parameter;

  • the number of principal components.

  • 3. Calculation of weights

The third step for WQIR development is the establishment of weights.

In general, weights are assigned to parameters based on their relative importance and influence on the final index value (Sutadian et al. 2016). We propose to use the contribution of the component in the total variance as weights in order to avoid any subjectivity of choice and the arbitrary selection of weights.

  • 4. Calculation of the final index

In this model, the rating of each water quality component is multiplied by its weighting factor, and the products are then aggregated using the simple arithmetic mean to obtain the final index.

The WQI was divided into five water quality classes based on the weighted arithmetic WQI method.
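The weighting idea of step 3 can be illustrated with the explained-variance values reported in Table 3. The sketch below is only a hedged illustration: normalizing the weights so that they sum to 1, the `weighted_index` helper and the component ratings are assumptions, and the exact aggregation expression of the WQIR is not reproduced here.

```python
# Explained variance (%) of the eight retained components, copied from Table 3.
explained_variance = [26.344, 9.506, 8.162, 6.312, 5.304, 5.207, 4.855, 4.494]

# Assumption: each weight is the component's share of the retained variance.
total = sum(explained_variance)
weights = [v / total for v in explained_variance]

def weighted_index(ratings, weights):
    """Weighted arithmetic aggregation of per-component quality ratings."""
    return sum(q * w for q, w in zip(ratings, weights)) / sum(weights)

# Hypothetical quality ratings for one water sample, one per component.
ratings = [40.0, 55.0, 30.0, 70.0, 25.0, 60.0, 45.0, 35.0]
wqir_sketch = weighted_index(ratings, weights)
```

With weights derived from the variance shares, PC1 (mineralization) contributes roughly 38% of the index in this sketch, so no expert judgment is needed to rank the components.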

Determination of WQI entries

The results of the PCA presented in this study relate to the first stage of WQI development, i.e. parameter selection. Water quality parameters are divided into physical and chemical parameters and biological parameters (coliforms, heterotrophic bacteria); the latter are important indicators of water quality related to human health. Researchers have used these parameters in the determination of indices and have shown their great usefulness, but most studies treat coliforms as water quality factors (Lin & Ganesh 2013; Seo et al. 2019; Kothari et al. 2021).

The number of parameters used varies from one model to another: from just five or six parameters for the Malaysian index models (Hameed et al. 2016) up to 47 parameters for the BCWQI. In general, the 10 most used parameters are temperature, turbidity, pH, TDS, fecal coliforms (FC), dissolved oxygen (DO), biochemical oxygen demand (BOD5), chemical oxygen demand (COD), nitrite (NO2) and ammoniacal nitrogen (NH3).

The PCA was carried out for the seven stations and 23 variables: COD, BOD5, OM, DO, NH4, NO2, NO3, PO4, SO4, Cl, Tu, TSS, Rs, Cond, TA, Mg, Na, Ca, T, pH, HCO3, K and OS.

Following the application of PCA, eight principal components with eigenvalues greater than 1 were extracted according to Kaiser's rule; together they explain more than 70% of the total variance, as shown in Table 3 (Abbasi & Abbasi 2012).

Table 3

Results of determination of eigenvalues and explained variances

| | PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | PC7 | PC8 | PC9 |
|---|---|---|---|---|---|---|---|---|---|
| Eigenvalue | 6.06 | 2.186 | 1.877 | 1.452 | 1.220 | 1.197 | 1.116 | 1.033 | 0.942 |
| Variability (%) | 26.344 | 9.506 | 8.162 | 6.312 | 5.304 | 5.207 | 4.855 | 4.494 | 4.094 |
| Cumulative (%) | 26.344 | 35.851 | 44.013 | 50.325 | 55.629 | 60.837 | 65.692 | 70.186 | 74.280 |
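The figures in Table 3 can be cross-checked with a few lines of arithmetic. The sketch assumes the PCA was run on the 23 standardized variables, in which case the eigenvalues sum to 23 and each component's share of the variance is its eigenvalue divided by 23; small differences from the table come from rounding of the printed eigenvalues.

```python
# Eigenvalues of the first nine components, copied from Table 3.
eigenvalues = [6.06, 2.186, 1.877, 1.452, 1.220, 1.197, 1.116, 1.033, 0.942]
n_variables = 23  # assumption: PCA on the correlation matrix of 23 variables

# Kaiser's rule: retain components with an eigenvalue greater than 1.
retained = [ev for ev in eigenvalues if ev > 1.0]

# Share of total variance per component, and the running total.
variance_pct = [100.0 * ev / n_variables for ev in eigenvalues]
cumulative = []
running = 0.0
for pct in variance_pct:
    running += pct
    cumulative.append(running)
```

Eight components pass the threshold, and their cumulative share is just above 70%, in line with the table.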

Table 4 shows the loadings of the eight extracted principal components, and Figure 4(b) presents the correlation matrix between the variables and the components. The first component (PC1) reflects the mineralization of the waters of the basin. Components PC2 and PC6 mainly reflect the nitrate and nitrite load of the water, which indicates nitrogen pollution. The third and fourth components (PC3 and PC4) reflect organic pollution, while the eighth component (PC8) reflects the degree of eutrophication of surface waters. These pollutants originate from anthropogenic activities, such as discharges of biodegradable organic matter, domestic wastewater and industrial effluents.

Table 4

Matrix of principal components

| | PC1 | PC2 | PC3 | PC4 | PC5 | PC6 | PC7 | PC8 |
|---|---|---|---|---|---|---|---|---|
| BOD5 | 0.093 | 0.070 | −0.465 | 0.606 | −0.229 | 0.069 | 0.221 | 0.223 |
| COD | 0.135 | 0.402 | 0.417 | 0.473 | 0.109 | −0.163 | −0.215 | 0.102 |
| OM | 0.033 | 0.022 | 0.199 | 0.140 | 0.334 | −0.031 | 0.077 | 0.513 |
| NH4 | 0.077 | 0.301 | 0.203 | 0.103 | −0.110 | 0.320 | 0.016 | −0.546 |
| NO2 | 0.333 | 0.376 | −0.082 | 0.150 | 0.030 | 0.520 | −0.218 | 0.141 |
| NO3 | 0.101 | 0.577 | −0.043 | −0.151 | −0.239 | 0.219 | −0.279 | 0.154 |
| OS | −0.232 | −0.191 | 0.758 | −0.105 | 0.049 | 0.414 | 0.138 | 0.110 |
| PO4 | −0.001 | 0.142 | −0.170 | 0.360 | 0.264 | 0.317 | 0.374 | −0.416 |
| SO4 | 0.839 | 0.111 | −0.005 | −0.031 | 0.194 | 0.109 | −0.101 | 0.132 |
| Cl | 0.725 | −0.109 | 0.274 | 0.211 | −0.067 | −0.235 | 0.246 | −0.148 |
| Tu | −0.327 | 0.493 | 0.000 | 0.367 | 0.226 | −0.098 | 0.012 | 0.006 |
| TSS | 0.167 | 0.117 | −0.452 | −0.273 | 0.376 | 0.017 | 0.281 | 0.070 |
| Rs | 0.954 | −0.010 | 0.063 | −0.024 | 0.096 | −0.059 | 0.067 | 0.001 |
| pH | −0.025 | −0.526 | −0.162 | 0.120 | 0.050 | 0.394 | 0.359 | 0.260 |
| DO | −0.276 | 0.459 | 0.525 | −0.287 | −0.184 | 0.089 | 0.479 | 0.167 |
| Cond | 0.927 | 0.008 | 0.152 | 0.000 | 0.081 | −0.122 | 0.046 | −0.016 |
| Ca | 0.701 | 0.205 | −0.035 | 0.005 | 0.245 | 0.100 | −0.121 | 0.072 |
| Mg | 0.778 | −0.192 | 0.022 | −0.030 | −0.162 | 0.006 | −0.009 | 0.040 |
| Na | 0.835 | −0.109 | 0.252 | 0.063 | 0.029 | −0.193 | 0.161 | −0.104 |
| TA | 0.742 | 0.074 | −0.126 | −0.237 | −0.158 | 0.165 | −0.012 | −0.024 |
| HCO3 | 0.522 | −0.018 | −0.227 | −0.259 | −0.180 | 0.209 | −0.033 | −0.007 |
| T | −0.026 | −0.716 | 0.236 | 0.227 | 0.215 | 0.271 | −0.414 | −0.082 |
| K | 0.217 | −0.165 | 0.041 | 0.361 | −0.653 | 0.015 | 0.034 | 0.144 |

Figure 4

(a) Projection of the variables on the principal plane 1–2. (b) Correlation matrix between the variables and the principal components.


Figure 4(a) shows a homogeneous distribution of the variables in the 1–2 projection plane (Dim1–Dim2) and no size effect, which indicates that all the pollution parameters are well represented.

The analysis of the factorial plane F1–F2 (Dim1–Dim2) shows that 36% of the information is expressed. The F1 (Dim1) axis accounts for 26.3% of the variance and characterizes the mineralization of the waters. It is determined by Rs, Cond, TA, SO4, Cl, Mg, Na and Ca, which are strongly correlated with each other and positively correlated with F1, since their vectors point in the same direction. They present, among others, the following correlations: Cond&Rs (0.919), Na&Rs (0.852), Na&Cond (0.817), Cond&SO4 (0.799), Cond&Cl (0.726) and Mg&Rs (0.702). The correlations between these variables are stronger when the variables are positioned at the ends of the axis defined by Principal Component 1.
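
The grouping on F1 can be checked against the variable-component correlations listed above. The sketch below is only an illustration: the 0.7 cut-off is an assumption made here (the text does not state one), the two unlabeled rows of the table are left out, and the Tu value is read as −0.327. With that cut-off, the selected variables coincide with the mineralization group named in the text.

```python
# PC1 correlations transcribed from the table above (labeled rows only).
PC1_LOADINGS = {
    "BOD5": 0.093, "COD": 0.135, "MO": 0.033, "NH4": 0.077,
    "NO2": 0.333, "NO3": 0.101, "OS": -0.232, "PO4": -0.001,
    "SO4": 0.839, "Cl": 0.725, "Tu": -0.327, "TSS": 0.167,
    "RS": 0.954, "pH": -0.025, "DO": -0.276, "Cond": 0.927,
    "Ca": 0.701, "Mg": 0.778, "Na": 0.835, "TA": 0.742,
    "HCO3": 0.522,
}


def strongly_loaded(loadings, threshold=0.7):
    """Return variables with |loading| >= threshold, strongest first.

    The threshold is a hypothetical choice, not a value from the study.
    """
    selected = [(name, value) for name, value in loadings.items()
                if abs(value) >= threshold]
    selected.sort(key=lambda item: -abs(item[1]))
    return [name for name, _ in selected]


mineralization_group = strongly_loaded(PC1_LOADINGS)
print(mineralization_group)
```

Lowering the threshold to 0.5 would also admit HCO3 (0.522), which shows how sensitive such a grouping is to the cut-off chosen.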

The F2 (Dim2) axis represents only 9.5% of the information and is considered an axis characterizing organic and agricultural pollution. It is determined by NO3, NO2, NH4, COD, DO, Tu, PO4, pH and temperature (Figure 4(a) and 4(b)). These variables are probably better explained by principal components other than PC1 and PC2.

On the basis of the 23 variables initially chosen, the use of PCA led us to use the following 10 variables: Cond, DO, BOD5, COD, NO3, NO2, PO4, TSS, pH and T (Figure 5) because of their high level of impact on the determination of water quality characteristics and their high correlations with the first principal components.
Figure 5

Pairs of physico-chemical parameters.

The linear relations between the 10 parameters are evaluated by calculating the correlation matrix (Figure 6(a)). At the 5% significance threshold, the critical correlation coefficient is Rc = 0.19 (Sekiou 2014). Positive correlations are shown in blue and negative correlations in red. The intensity of the color and the size of the circles are proportional to the correlation coefficients. To the right of the correlogram, the color legend shows the correlation coefficients and the corresponding colors. Non-significant correlations are marked with a cross.
Figure 6

(a) Correlation matrix of the 10 variables. (b) Combination between the correlogram and the significance test.


Several significant correlations could be identified (Figure 6(b)): OS with BOD5 and TSS; pH with temperature, NO3, COD and BOD5; NO2 with NO3 and conductivity; and TSS with OS and temperature. This shows the significant role that these elements play in determining the salt load of these waters.
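
The 5% threshold Rc = 0.19 used for the significance test can be approximated with Fisher's z transformation, as in the minimal sketch below. The sample size is not reported in this section, so `N_HYPOTHETICAL = 107` is back-calculated here as an assumption that reproduces a threshold of about 0.19. The function names are illustrative.

```python
import math
from statistics import NormalDist


def critical_r(n_samples, alpha=0.05):
    """Approximate two-sided critical |r| via Fisher's z transformation."""
    if n_samples <= 3:
        raise ValueError("Fisher's approximation needs more than 3 samples")
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return math.tanh(z_crit / math.sqrt(n_samples - 3))


def is_significant(r, n_samples, alpha=0.05):
    """True when |r| exceeds the approximate critical value."""
    return abs(r) > critical_r(n_samples, alpha)


N_HYPOTHETICAL = 107  # assumed sample size, not taken from the paper
print(round(critical_r(N_HYPOTHETICAL), 2))
print(is_significant(0.25, N_HYPOTHETICAL), is_significant(0.10, N_HYPOTHETICAL))
```

Under this assumption a coefficient of 0.25 would be kept in the correlogram, while one of 0.10 would be crossed out.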

Assessment of water quality by calculating WQIAP, WQINSF, WQICCME and WQIR

The WQIAP, WQINSF and WQICCME calculated for the seven stations show notable similarities and dissimilarities in the assessment of water quality, as shown in Figure 7.

According to Figure 7, the WQICCME and WQINSF indices vary in a similar way at the seven stations: both follow the same trend over the period from 2010 to 2013, and their values lie between 51 and 80.
Figure 7

Comparison between the three Indices used in the assessment of surface water quality in the Constantine Watershed from 2010 to 2018.


By contrast, the weighted arithmetic index WQIAP revealed pollution peaks in summer 2013 at the Mixa station, in autumn 2011 at the Zit El-Emba station, and in autumn 2012 at the Zerdezas and Guenitra stations. The same index also shows pollution in autumn 2016 and in winter and spring 2017 at the Zerdezas station, and in summer 2018 at the Guenitra station. These pollution peaks were not clearly visible with the WQICCME and WQINSF indices.

The high WQIAP values recorded at all stations probably originate from the large volume of river flow in the wet season and the volume of storm flooding in the summer season, in accordance with the comments of Noori et al. (2019), and also from the bypassing of wastewater to the natural environment instead of its delivery to the treatment plant.

Contrary to the other indices, the WQIAP showed that the overall quality of the waters at all the stations studied was, apart from some good values, considerably degraded (‘bad to very bad’), with a significant number of values unsuitable for consumption, whereas the WQICCME and WQINSF indices rate these waters as of fair and good quality, respectively. This result agrees with House (1989) and Gao et al. (2020), who note that the weighted arithmetic index WQIAP provides the best results for indexing general water quality (Fernandez et al. 2005).

The differences between the three indices, which are based on the same calculation parameters, are due a priori to the final aggregation formula, which may give a different evaluation, to the specificity of each index to the geographical region where it was developed, and to their water quality classification ranges (Fernandez et al. 2005).
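
The effect of the aggregation formula can be illustrated with two generic rules applied to the same sub-indices. This is a sketch only: the functions below are a plain weighted arithmetic mean and a weighted geometric mean, not the exact WQIAP, WQINSF or WQICCME formulas, and the scores and weights are hypothetical.

```python
import math


def weighted_arithmetic(sub_indices, weights):
    """Weighted arithmetic mean of the sub-index scores."""
    total_weight = sum(weights)
    return sum(q * w for q, w in zip(sub_indices, weights)) / total_weight


def weighted_geometric(sub_indices, weights):
    """Weighted geometric mean; requires strictly positive scores."""
    total_weight = sum(weights)
    log_sum = sum(w * math.log(q) for q, w in zip(sub_indices, weights))
    return math.exp(log_sum / total_weight)


# Hypothetical sub-index scores on a common 0-100 scale and their weights.
scores = [90.0, 85.0, 10.0, 80.0]
weights = [0.3, 0.3, 0.2, 0.2]

arithmetic_value = weighted_arithmetic(scores, weights)
geometric_value = weighted_geometric(scores, weights)
print(round(arithmetic_value, 2), round(geometric_value, 2))
```

With one very poor sub-index among otherwise good ones, the geometric rule yields a noticeably lower aggregate than the arithmetic rule, so the same measurements can fall into different quality classes depending on the aggregation step alone.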

To further investigate water quality, we developed a new index based on principal components by replacing the classical index entries with the first eight principal components with eigenvalues greater than 1.
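
The eigenvalue criterion can be related to the variance shares quoted earlier. The sketch below assumes that the PCA was run on the 23 standardized variables (correlation matrix), so that the eigenvalues sum to 23; this assumption is not stated explicitly in this section, and the function names are illustrative.

```python
N_VARIABLES = 23  # number of variables entering the PCA, as stated in the text


def eigenvalue_from_share(share_percent, n_variables=N_VARIABLES):
    """Eigenvalue implied by a share of explained variance (correlation PCA)."""
    return share_percent / 100.0 * n_variables


def kaiser_threshold_percent(n_variables=N_VARIABLES):
    """Smallest variance share (in %) whose eigenvalue exceeds 1."""
    return 100.0 / n_variables


for label, share in (("PC1", 26.3), ("PC2", 9.5)):
    print(label, round(eigenvalue_from_share(share), 3))
print("threshold %:", round(kaiser_threshold_percent(), 2))
```

Under this assumption, retaining components with eigenvalues greater than 1 is equivalent to retaining those that each explain more than about 4.35% of the total variance.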

A comparison of WQIAP and WQIR was made for the seven stations; the trend of the two indices is presented in Figure 8. The WQIR index shows more oscillations and more points indicating pollution than the WQIAP and, unlike the classical WQIAP, reveals pollution in the following seasons: spring 2010 and winter 2011 at the Zerdezas and Mixa stations; winter 2014 and spring 2018 at the Zit El-Emba station; autumn 2010 and summer 2014 at the Guenitra station; winter 2016 at the El Egreme and Chafia stations; and summer 2014 at the BniZid station.
Figure 8

Variation of the WQIAP and the WQIR of the surface waters of the Constantinois Coastal Watershed from 2010 to 2018.


Also, unlike WQIAP, the evolution of WQIR shows that after the pollution peaks (summer 2014, winter 2010 and autumn 2011) the water quality at the Mixa, Zit El Emba and El Egreme stations, respectively, did not return to its initial state.

The difference in evaluation between the classical WQIAP and the WQIR probably comes down to the fact that the WQIAP uses 10 parameters (determined by PCA), whereas the WQIR uses a summary of the 23 parameters provided by the first eight principal components, which allowed the detection of the eclipsed pollution points.

Furthermore, the variations among the WQIs were confirmed by the ANOVA results. The ANOVA indicated that WQIAP differed significantly from WQINSF and WQICCME (P = 0.015 < 0.05) and very highly significantly from WQIR (P < 0.001). The Tukey test, which separates the means into groups, confirms that the means differ significantly; the highest variance was recorded for WQIR, the intermediate for WQICCME and WQINSF, and the lowest for WQIAP.

Prediction of the classical weighted arithmetic index WQIAP and the regularized index WQIR

Two ANN prediction models were proposed. The first was built with the 10 variables DO, BOD5, COD, NO3, NO2, PO4, TSS, Cond, T and pH as input parameters and WQIAP as the output parameter; for the second, the first eight principal components are the inputs and the WQIR (regularized weighted arithmetic index) is the output.

Table 5 shows the performance obtained. The (10.8.1) model, with 10 neurons in the input layer (the number of water quality parameters), eight neurons in the hidden layer and one neuron in the output layer (WQIAP as the target), is the best combination for the first model, with R2 = 0.99 and RMSE = 2.8565 (Figure 9).
Table 5

Tested ANN models of the first case

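| Model | Inputs | Hidden neurons | Outputs | MAE | RMSE | IOS | R | NE |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| M1 | 10 | | | 11.6197 | 5.1003 | 0.0977 | 0.9842 | 0.9687 |
| M2 | 10 | | | 11.3513 | 4.0292 | 0.0772 | 0.9901 | 0.9805 |
| M3 | 10 | | | 9.7247 | 2.9393 | 0.0563 | 0.9948 | 0.9896 |
| M4 | 10 | 8 | 1 | 11.5119 | 2.8565 | 0.0547 | 0.9951 | 0.9902 |
| M5 | 10 | | | 11.4436 | 4.0187 | 0.0769 | 0.9908 | 0.9806 |
| M6 | 10 | 10 | | 10.0134 | 3.6638 | 0.0702 | 0.9921 | 0.9838 |
| M7 | 10 | 11 | | 10.1393 | 4.4953 | 0.0862 | 0.9877 | 0.9756 |
| M8 | 10 | 12 | | 11.378 | 4.4286 | 0.0848 | 0.9882 | 0.9771 |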
Figure 9

Architecture of the ANN model (10.8.1).
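
To make the (10.8.1) notation concrete, the following minimal sketch runs one forward pass through a fully connected network with 10 inputs, eight hidden neurons and one output. The tanh hidden activation, the linear output, the zero biases and the random weights are assumptions for illustration; the trained weights and activation functions of the study are not given in this section.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Create random weights (n_out rows of n_in values) and zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(inputs, hidden_layer, output_layer):
    """Forward pass: tanh hidden layer followed by a single linear output."""
    hidden_weights, hidden_biases = hidden_layer
    output_weights, output_biases = output_layer
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(hidden_weights, hidden_biases)
    ]
    return sum(w * h for w, h in zip(output_weights[0], hidden)) + output_biases[0]


rng = random.Random(0)              # fixed seed so the sketch is repeatable
hidden_layer = init_layer(10, 8, rng)   # 10 water quality parameters -> 8 neurons
output_layer = init_layer(8, 1, rng)    # 8 hidden neurons -> 1 index value

standardized_sample = [0.0] * 10    # hypothetical, already standardized inputs
print(forward(standardized_sample, hidden_layer, output_layer))
```

In practice the weights would be fitted by a training algorithm on observed index values; the sketch only shows how the three layer sizes determine the shape of the computation.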


This result is consistent with Singh et al. (2021), who note that the (10.8.1) model produces good agreement of the predicted values, with R2 and RMSE equal to 0.9810 and 0.1324, so the two sets of results are similar. Hameed et al. (2016), however, report that for a tropical water index the RBFNN model showed the best performance criteria R2, RMSE and NE (0.9872, 0.0157 and 0.9871, respectively) with six input layer neurons (number of water quality parameters), eight hidden layer neurons and one output layer neuron (WQI as target) (6.8.1). In contrast, Isiyaka et al. (2018), using a multilayer ANN, show that a model with 14 physicochemical parameters as inputs and 10 nodes in the hidden layer with a target output (14.10.1.1) gives the best performance criteria (highest R2 = 0.998 with lowest RMSE = 0.432).

The results indicate that, apart from the number and choice of parameters, the nature of the learning algorithm and the type and architecture (structure) of the model (number of hidden layers) are of great importance to achieve the best prediction model.

For the second model, it was found that the model (8.4.1), eight input layer neurons (number of principal components with eigenvalues greater than 1), four hidden layer neurons, and one output layer neuron (WQIR as target) (Figure 10), yields the best model for predicting the WQI (Table 6).
Table 6

Performance of the selected ANN model

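| Model | Inputs | Hidden neurons | Outputs | MAE | RMSE | IOS | R | NE |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| M1 | | | | 0.2034 | 0.70852 | 0.00361 | 0.99994 | 0.9987 |
| M2 | | | | 0.2693 | 1.09485 | 0.00558 | 0.99986 | 0.9969 |
| M3 | 8 | 4 | 1 | 0.2026 | 0.35369 | 0.00180 | 0.99998 | 0.9996 |
| M4 | | | | 0.3175 | 1.76558 | 0.00901 | 0.99965 | 0.9922 |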
Figure 10

Architecture of the ANN model (8.4.1).


For this (8.4.1) model, the eight inputs are the principal components with eigenvalues greater than 1 multiplied with the raw data; it predicts the WQI with R2 = 0.9998 and RMSE = 0.155 (Table 7).

The selected model is less complex and more efficient (a single hidden layer, high R2 and low RMSE) than the two models proposed by Isiyaka et al. (2018), whose architectures are more complex and contain two hidden layers (Table 7).

Table 7

Performance of the best model in the different phases

| Model | Architecture | R2 | RMSE |
| --- | --- | --- | --- |
| Selected Model | 8.4.1 | 0.9998 | 0.155 |
| Model 1 | 14.8.1.1 | 0.999 | 0.159 |
| Model 2 | 6.4.1.1 | 0.950 | 2.351 |

This indicates that the model with reduced data set dimensionality has the best combination of inputs and outputs capable of predicting WQI with high accuracy. The alternative approach that we proposed to develop a new index (WQIR) gave commendable results compared to the classical WQI because of the advantages of principal components in summarizing all the information in a reduced number of variables.
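
One way to make the reduction in model size concrete is to count the trainable parameters of the two architectures. The sketch assumes fully connected layers with one bias per neuron, which is the usual reading of the (inputs.hidden.outputs) notation but is not spelled out in the text.

```python
def count_parameters(layer_sizes):
    """Weights plus biases of a fully connected feed-forward network."""
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


classical_model = count_parameters([10, 8, 1])   # model predicting WQIAP
regularized_model = count_parameters([8, 4, 1])  # model predicting WQIR
print(classical_model, regularized_model)        # 97 41
```

Under this assumption the principal-component model has fewer than half the adjustable parameters of the 10-variable model.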

Table 8 summarizes the training, validation and overall performance of the two ANN prediction models. The results show the high performance of the ANN models in predicting the WQIAP and WQIR indices, with a marked improvement in favor of the WQIR prediction model, which is parsimonious and achieves a low RMSE and a high R.

Table 8

Performance of the best model in the different phases

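| Model | Criterion | Training | Validation | All |
| --- | --- | --- | --- | --- |
| M1 | RMSE | 2.7045 | 3.3929 | 2.8565 |
| M1 | R | 0.9953 | 0.99501 | 0.99512 |
| M2 | RMSE | 0.3805 | 0.2156 | 0.3536 |
| M2 | R | 0.99998 | 0.99999 | 0.99998 |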
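
The fitness criteria reported in the tables can be computed as in the sketch below. The definitions are the common textbook ones and are assumptions here: NE is taken as the Nash-Sutcliffe efficiency, and IOS as RMSE divided by the mean observed value, which is consistent with the RMSE/IOS ratios in Tables 5 and 6. The observed and predicted values are hypothetical.

```python
import math


def mean(values):
    return sum(values) / len(values)


def rmse(observed, predicted):
    """Root mean square error."""
    return math.sqrt(mean([(p - o) ** 2 for o, p in zip(observed, predicted)]))


def mae(observed, predicted):
    """Mean absolute error."""
    return mean([abs(p - o) for o, p in zip(observed, predicted)])


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    mo, mp = mean(observed), mean(predicted)
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def nash_sutcliffe(observed, predicted):
    """NE = 1 - (sum of squared errors) / (variance of observations * n)."""
    mo = mean(observed)
    sse = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    return 1.0 - sse / sum((o - mo) ** 2 for o in observed)


def index_of_scattering(observed, predicted):
    """IOS, assumed here to be RMSE normalized by the mean observation."""
    return rmse(observed, predicted) / mean(observed)


observed_wqi = [40.0, 50.0, 60.0, 70.0]    # hypothetical index values
predicted_wqi = [42.0, 49.0, 61.0, 68.0]   # hypothetical model output
for metric in (rmse, mae, pearson_r, nash_sutcliffe, index_of_scattering):
    print(metric.__name__, round(metric(observed_wqi, predicted_wqi), 4))
```

Applying the same functions separately to the training and validation subsets would give the kind of breakdown shown in Table 8.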

Figure 11(a) and 11(b) plot the observed WQI values against the predicted WQI values. The points align on the first bisector, and the correlations of the two models are very good (R2 = 0.99 for the simple ANN model and R2 = 0.999 for the ANN model based on the principal components). The ANN model based on principal components performs best: its explanatory and predictive power is superior to that of the simple ANN model.
Figure 11

Correlation between predicted WQI and calculated WQI. (a) ANN model using 10 variables. (b) ANN model using eight principal components.

The relative error (ER %) plotted in Figure 12(a) and 12(b) shows that, for the first simple ANN model (Figure 12(a)), the relative error of 90% of predictions is between −32% and +23%, while for the second model the relative error of 90% of predictions is between 0.005% and 0.01% (Figure 12(b)).
Figure 12

Relative error distribution. (a) ANN model using 10 variables. (b) ANN model using eight principal components.
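
The interval containing 90% of the relative errors can be obtained as sketched below. The relative error is taken here as 100 × (predicted − observed) / observed and the bounds as nearest-rank 5th and 95th percentiles; both conventions are assumptions, since the exact definitions are not given in this section, and the data are hypothetical.

```python
import math


def relative_errors(observed, predicted):
    """Signed relative error of each prediction, in percent."""
    return [100.0 * (p - o) / o for o, p in zip(observed, predicted)]


def percentile(sorted_values, q):
    """Nearest-rank percentile of an ascending list, q in (0, 100]."""
    rank = max(1, math.ceil(q * len(sorted_values) / 100.0))
    return sorted_values[rank - 1]


def central_band(values, coverage=90.0):
    """Lower and upper bounds enclosing the central share of the values."""
    ordered = sorted(values)
    tail = (100.0 - coverage) / 2.0
    return percentile(ordered, tail), percentile(ordered, 100.0 - tail)


errors = relative_errors([100.0, 200.0], [110.0, 190.0])
print(errors)
print(central_band([float(v) for v in range(1, 101)]))
```

A narrow band centered near zero indicates predictions that are both accurate and unbiased, which is the pattern described for the second model.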

The analysis of the input variables according to the connection weights method (see Figure 13) shows the clear influence of temperature, OS, NO3, BOD5 and TSS on the simple ANN model for WQI prediction, while nitrite (NO2) has the weakest effect. For the second model, component 2, which is closely related to NO3, Tu, pH and temperature, and component 6, which is positively correlated with NO2 and OS, have a large effect on the ANN model for WQIR prediction; components PC5 and PC8 show the weakest effects (see Figure 14).
Figure 13

Relative importance of input variables on the ANN prediction model of WQIAP.

Figure 14

Relative importance of input variables on the ANN model for predicting WQIR.
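
Relative importances of this kind are often derived from the trained weights by a Garson-type partitioning. The sketch below shows that calculation for a network with one hidden layer and one output; it is an assumption that the study used this particular variant of the connection weights method, and the weights are hypothetical.

```python
def garson_importance(input_hidden, hidden_output):
    """Relative importance (%) of each input for a one-hidden-layer network.

    input_hidden[j][i] is the weight from input i to hidden neuron j;
    hidden_output[j] is the weight from hidden neuron j to the output.
    """
    n_inputs = len(input_hidden[0])
    contributions = [0.0] * n_inputs
    for row, out_weight in zip(input_hidden, hidden_output):
        row_total = sum(abs(w) for w in row)
        if row_total == 0.0:
            continue  # a hidden neuron with no incoming weight contributes nothing
        for i, w in enumerate(row):
            # share of input i in hidden neuron j, scaled by the output weight
            contributions[i] += abs(w) / row_total * abs(out_weight)
    total = sum(contributions)
    return [100.0 * c / total for c in contributions]


# Hypothetical weights: 2 inputs, 2 hidden neurons, 1 output.
example_importance = garson_importance([[1.0, 3.0], [2.0, 2.0]], [1.0, 1.0])
print(example_importance)  # [37.5, 62.5]
```

Because absolute values are used, the result ranks the inputs by strength of influence but says nothing about the direction of their effect.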


This study is part of the search for the best evaluation and prediction of the water quality of the Constantine coastal watershed. It uses principal component analysis (PCA) in combination with ANN regression to improve the evaluation of data by classical indices, to develop a new WQI and to establish predictive ANN models.

Data collection and analyses are discussed, along with the various water quality parameters and indices for assessing the environmental aspects of surface water resources.

Principal component analysis reduced the initial WQI models, constructed from 23 variables, to only 10 variables (Cond, DO, BOD5, COD, NO3, NO2, PO4, TSS, pH and T) and extracted eight principal components explaining over 70% of the variance, which were used to develop a new WQIR index and a regularized ANN regression model.

Of the three indices CCME-WQI, NSF-WQI and AP-WQI, the last adapts best to the geographical region and better assesses the water quality of the coastal Constantinois watershed. It reveals more pollution peaks and seasons, which are mainly due to the large volume of river flow in the wet season, the volume of storm floods in the summer season, and the bypassing of wastewater to the natural environment instead of to the treatment plant.

Generally, the ANN is a technique for predicting surface water quality in support of proper management of the river (threshold), so that adequate measures can be taken to keep pollution within permissible limits.

For the best assessment, we developed the WQIR, whose inputs are the first principal components with eigenvalues greater than 1. This index relies on principal components to condense the data into a small number of variables, and it reveals water quality during and after pollution peaks.

This model is a new technique based on a flexible mathematical structure that is capable of identifying complex non-linear relationships between input and output data, in contrast to other classical modeling techniques.

Hence, it should be noted that the ANN model can describe the behavior of water quality parameters more accurately than linear regression models, and we observed that the WQIR is better than the classical WQIAP.

The WQIR is therefore a suitable approach that meets the demand of researchers, since it allows easy and efficient interpretation of monitoring data and thereby helps to ensure a sustainable and green environment.

In fact, the use of multivariate statistics in combination with artificial intelligence techniques allowed us to better assess water quality, to reveal eclipsed pollution points, to develop a new and more effective index and to establish parsimonious prediction models.

Our findings constitute a first research work on the development of a new WQI suited to the data-poor study region.

The authors are thankful to the ANRH for providing the necessary data for the study reported in the paper free of charge; and to the anonymous reviewers for their careful reading and precious comments.

The authors are not affiliated with or involved with any organization or entity with any financial interest or nonfinancial interest in the subject matter or materials discussed in this paper.

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

All relevant data are included in the paper or its Supplementary Information.

The authors declare there is no conflict.

Abbasi T. & Abbasi S. A. 2012 Water Quality Indices. Elsevier Science, Burlington, MA, 375 pp.
Achary G. S. 2017 International Journal of Current Engineering and Technology 7, 1745–1749.
Behboudian S., Tabesht M., Falahnezhad M. & Ghavanini F. A. 2014 A long-term prediction of domestic water demand using preprocessing in artificial neural network. Journal of Water Supply: Research and Technology-Aqua 63, 31–42. doi:10.2166/aqua.2013.085.
Benbouras M. A., Petrisor A.-I., Zedira H., Ghelani L. & Lefilef L. 2021 Forecasting the bearing capacity of the driven piles using advanced machine-learning techniques. Applied Sciences 11, 10908. doi:10.3390/app112210908.
Bouslah S. 2017 Study of the Quality of Water Stored Upstream and Infiltration Water Downstream of Embankment Dams in Algeria. University Badji Mokhtar Annaba, Algeria.
Brown R. M., McClelland N. I., Deininger R. A. & Tozer R. G. 1970 A water quality index: do we dare? Water and Sewage Works 117 (10), 339–343.
Fernández N., Ramírez A. & Solano F. 2004 Physicochemical water quality indices - a comparative review. GeoJournal. doi:10.24054/01204211.v1.n1.2004.9
Caudill M. 1998 Neural networks primer, part 3. AI Expert, pp. 59–61.
CCME 2001 CCME Index User's Manual, Canadian Water Quality Guidelines. Ministry of Environment and Resource Management, Saskatchewan, Canada.
Crabill C., Donald R., Snelling J., Foust R. & Southam G. 1999 The impact of sediment fecal coliform reservoirs on seasonal water quality in Oak Creek, Arizona. Water Research 33, 2163–2171. https://doi.org/10.1016/S0043-1354(98)00437-0.
Diallo A. D., Ibno Namr K., N'diaye A. D., Garmes H., KanKou M. & Wane O. 2014 The interest of statistical analysis methods in the management of the monitoring of the physicochemical quality of the water of the right bank of the Senegal River. Larhyss Journal 17, 101–114.
Fernandez N., Ramirez A. & Solano F. 2005 Physico-chemical water quality indices - a comparative review. rev_bis_vol2_num1_art3 1, 19–30.
Frena M., Santos A. P. S., Souza M. R. R., Carvalho S. S., Madureina L. A. S. & Alexandre M. R. 2019 Sterol biomarkers and fecal coliforms in a tropical estuary: seasonal distribution and sources. Marine Pollution Bulletin 139, 111–116. https://doi.org/10.1016/j.marpolbul.2018.12.007.
Gao Y., Qian H., Ren W., Wang H., Liu F. & Yang F. 2020 Hydrogeochemical characterization and quality assessment of groundwater based on integrated-weight water quality index in a concentrated urban area. Journal of Cleaner Production 260, 121006. https://doi.org/10.1016/j.jclepro.2020.121006.
Garcia C. A. B., Silva I. S., Mendonça M. C. S. & Garcia H. L. 2018 Evaluation of water quality indices: use, evolution and future perspectives. In: Advances in Environmental Monitoring and Assessment. IntechOpen. https://doi.org/10.5772/intechopen.79408.
Haile D. & Gabbiye N. 2021 The applications of Canadian water quality index for ground and surface water quality assessments of Chilanchil Abay watershed: The case of Bahir Dar city waste disposal site. Water Supply 22 (1), 89–109. https://doi.org/10.2166/ws.2021.286.
Hameed M., SharqiS S. S., Yaseen Z. M., Afan H. A., Hussain A. & Elshafie A. 2016 Application of artificial intelligence (AI) techniques in water quality index prediction: a case study in tropical region, Malaysia. Neural Computing and Applications 28, 893–905. doi:10.1007/s00521.016-2404-7.
Horton R. K. 1965 An index number system for rating water quality. Journal of the Water Pollution Control Federation 37, 300–306.
Hossain M. & Patra P. K. 2020 Water pollution index – a new integrated approach to rank water quality. Ecological Indicators 117. https://doi.org/10.1016/j.ecolind.2020.106668.
House M. A. 1989 A water quality index for river management. Journal of the Institute of Water & Environmental Management 3, 336–344. https://doi.org/10.1111/j.1747-6593.1989.tb01538.x.
Isiyaka H. A., Mustapha A., Juahir H. & Phil-Eze P. 2018 Water quality modelling using artificial neural network and multivariate statistical techniques. Modeling Earth Systems and Environment 5, 583–593. doi:10.1007/s40808-018-0551-9.
Islam A. R. M. T., Al Mamun A., Rahman M. M. & Zahid A. 2020 Simultaneous comparison of modified-integrated water quality and entropy weighted indices: implication for safe drinking water in the coastal region of Bangladesh. Ecological Indicators 113, 106229. https://doi.org/10.1016/j.ecolind.2020.106229.
Khuan L. Y., Hamzah N. & Jailani R. 2002 Prediction of water quality index (WQI) based on artificial neural network. Student Conference on Research and Development Proceedings, Shah Alam, 2002. doi:10.1109/SCORED.2002.1033081
Kothari V., Vij S., Sharma S. & Gupta N. 2021 Correlation of various water quality parameters and water quality index of districts of Uttarakhand. Environmental and Sustainability Indicators 9. https://doi.org/10.1016/j.indic.2020.100093
Kouadri S., Kateb S. & Zegait R. 2021 Spatial and temporal model for WQI prediction based on back-propagation neural network, application on El Merk region (Algerian southeast). Journal of the Saudi Society of Agricultural Sciences 20 (5), 324–336. https://doi.org/10.1016/j.jssas.2021.03.004.
Lee H.-J., Park H.-K., Lee J. H., Park A. R. & Cheon S.-U. 2016 Coliform pollution status of Nakdong River and tributaries. Journal of Korean Society on Water Environment 32, 271–280. http://dx.doi.org/10.15681/KSWE.2016.32.3.271.
Lin J. & Ganesh A. 2013 Water quality indicators: bacteria, coliphages, enteric viruses. International Journal of Environmental Health Research 23, 484–506. http://dx.doi.org/10.1080/09603123.2013.769201.
Liu W. C., Chan W. T. & Young C. C. 2015 Modeling fecal coliform contamination in a tidal Danshuei River estuarine system. Science of the Total Environment 502, 632–640. http://dx.doi.org/10.1016/j.scitotenv.2014.09.065.
Mukaite S., Wagh V., Panaskar D., Jacobs J. A. & Sawant A. 2019 Development of new integrated water quality index (IWQI) model to evaluate the drinking suitability of water. Ecological Indicators 101, 348–354. https://doi.org/10.1016/j.ecolind.2019.01.034.
Nong X., Shao D., Zhong H. & Liang J. 2020 Evaluation of water quality in the South-to-North Water Diversion Project of China using the water quality index (WQI) method. Water Research 178, 115781. https://doi.org/10.1016/j.watres.2020.115781.
Noori R., Berndtsson R., Hosseinzadeh M., Adamowski J. F. & Abyaneh M. R. 2019 A critical review on the application of the National Sanitation Foundation Water Quality Index. Environmental Pollution 244, 575–587. https://doi.org/10.1016/j.envpol.2018.10.076.
Nourani V., Khanghan T. R. & Sayydi M. 2013 Application of the Artificial Neural Network to monitor the quality of treated water. International Journal of Management & Information Technology 3 (01), 38–45.
Reggam A., Bouchelaghem H. & Houhamdi M. 2015 Physico-chemical quality of the waters of the Oued Seybouse (Northeastern Algeria): Characterization and Principal Component Analysis. Journal of Materials and Environmental Science JMES 6 (5), 1417–1425.
Rezaei A., Hassani H. & Jabbari N. 2017 Evaluation of groundwater quality and assessment of pollution indices for heavy metals in North of Isfahan Province, Iran. Sustainable Water Resources Management 5, 491–512. doi:10.1007/s40899-017-0209-1.
Sahaya Vasanthi S. & Adish Kumar S. 2019 Application of artificial neural network techniques for predicting the water quality index in the Parakai Lake, Tamil Nadu, India. Applied Ecology and Environmental Research 17, 1947–1958. http://dx.doi.org/10.15666/aeer/1702_19471958.
Sahoo M. M., Patra K. C. & Khatua K. K. 2015 Inference of water quality index using ANFIA and PCA. Aquatic Procedia 4, 1099–1106. doi:10.1016/j.aqpro.2015.02.139.
Sekiou F. 2014 Modeling of Flocculation Efficiency Effects of Solid Phase Characteristics and Operating Parameters. Doctoral Thesis, University Mohamed Khider-Beskra, Algeria.
Singh B., Sihag P., Singh V. P., Sepahvand A. & Singh K. 2021 Soft computing technique-based prediction of water quality index. Water Supply 21, 4015–4029. doi:10.2166/ws.2021.157.
Sujana Prajithkumar D. S. V. & Mane S. J. 2014 Prediction of water quality index of Pavna river using ANN model. International Journal of Engineering Research & Technology (IJERT) 3 (12), 121–125.
Sutadian D., Muttil N., Yilmaz A. & Perera C. 2016 Development of river water quality indices – a review. Environmental Monitoring and Assessment 188 (1), 58. https://doi.org/10.1007/s10661-015-5050-0.
Thurston G. D., Ito K. & Lall R. 2011 A source apportionment of U.S. fine particulate matter air pollution. Atmospheric Environment 45 (24), 3924–3936. https://doi.org/10.1016/j.atmosenv.2011.04.070.
Yidana S. M., Banoeng-Yakubo B. & Akabzaa T. M. 2010 Analysis of groundwater quality using multivariate and spatial analyses in the Keta basin, Ghana. Journal of African Earth Sciences 58, 220–234. doi:10.1016/j.jafrearsci.2010.03.003.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY-NC-ND 4.0), which permits copying and redistribution for non-commercial purposes with no derivatives, provided the original work is properly cited (http://creativecommons.org/licenses/by-nc-nd/4.0/).