In this study, two artificial intelligence techniques are used to predict velocity and pressure for the Gadhra (DMA-5) real water distribution network (WDN) in the East Singhbhum district of Jharkhand, India: (1) artificial neural networks (ANNs) trained with the Levenberg–Marquardt (LM), Bayesian Regularization (BR), and Scaled Conjugate Gradient (SCG) algorithms and (2) the Adaptive Neuro-Fuzzy Inference System (ANFIS). In case 1, flow rate and diameter are used as independent variables to predict velocity. In case 2, elevation and demand are used as independent variables to predict pressure. 80% of the data are used to train, test, and validate the ANN and ANFIS prediction models, while 20% of the data are used to evaluate the data-driven models. Sensitivity analysis is performed in ANN-LM to understand the relationship between the independent and dependent variables. The performance indices RMSE, MAE, and R2 are evaluated for ANN and ANFIS for different combinations. The ANN-LM with a 2-16-1 architecture is found to be superior for predicting velocity, and the ANN-LM with a 2-17-1 architecture is found to be superior for predicting pressure. ANN-LM gives the best prediction in estimating velocity (RMSE = 0.0189, MAE = 0.0122, R2 = 0.9568) and pressure (RMSE = 0.3244, MAE = 0.2176, R2 = 0.9773).

  • Hydraulic simulation is performed in WaterGEMS for Gadhra WDN (DMA-5).

  • Predictions of the data-driven models are performed in MATLAB using ANFIS and ANN using LM, BR, and SCG.

  • Based on statistical performance, a best model is selected for sensitivity analysis.

  • Sensitivity analysis is done using ANN-LM to evaluate the impact of independent variables: diameter, flowrate, elevation, and demand on velocity and pressure.

Challenges faced by water distribution systems (WDSs), including system deterioration, leakage, pipeline disruptions, insufficient capacity to meet demand, unreliability, and mismanagement, have emphasized the necessity of replacing conventional techniques with precise and efficient computer software and methods for designing WDSs (Vairavamoorthy et al. 2001; Longe et al. 2010). The modeling of WDSs has emerged as a critical aspect, facilitating hydraulic assessments to ensure that these systems can effectively meet demand and quality standards. The water distribution network (WDN) is modeled for velocity and pressure to predict the optimum diameter required in a WDN, and the total cost of the network is then determined based on this optimum diameter. Several other factors also play a vital role while modeling a WDN, such as elevation, demand, and the height of the Elevated Service Reservoir (ESR). Recognizing the importance of modeling WDSs, multidisciplinary teams comprising professionals, researchers, scholars, engineers, and programmers have joined forces to develop software specifically designed for the design and modeling of WDSs. These advanced hydraulic simulation and modeling software tools enable thorough analysis of water behavior (Sonaje & Joshi 2015).

In recent years, artificial neural networks (ANNs) have emerged as a promising and efficient technique for modeling and forecasting, with applications across various engineering fields. One notable early example of backpropagation ANNs was demonstrated by Crommelynck et al. (1992), who applied ANNs to daily and hourly water demand forecasts in select communities of Paris, France, compared the performance of the ANN models against statistical models, and found that the ANN models performed better. According to Miaou (1990), the data used to model water demand forecasts can be categorized into two main classes: socio-economic variables and climatic variables. Socio-economic variables, including population, income, water price, and housing characteristics, primarily influence the long-term patterns of water demand. On the other hand, climatic variables such as rainfall and maximum air temperature play a significant role in short-term seasonal fluctuations in water demand.

Jain et al. (2001) utilized ANNs to model short-term water demand at the Indian Institute of Technology (IIT) in Kanpur, India. They developed and compared six neural network models, five regression models, and two time series models. The results indicated that the neural network models consistently outperformed the other models. Bougadis et al. (2005) conducted comparable analyses to predict both peak daily demand and weekly water demand; their findings revealed that ANN models outperformed time series and regression models, offering superior results. Furthermore, numerous other researchers, such as Chang & Makkeasorn (2006), Zhang et al. (2004) and Lui et al. (2003), among others, have also acknowledged and reported the success of ANNs in water demand prediction and forecasting.

Almheiri et al. (2020) predicted the average time to failure of water mains using three techniques: ANN, ridge regression, and decision trees (DT). After applying these methods to a real case study, the authors recommended DT because of its simplicity and computational efficiency. Spatial clustering is commonly used to identify regions with high failure rates (De Oliveira et al. 2011); this technique usually serves as a support to other predictive models, providing additional input information. Giraldo & Rodriguez (2020) used k-means clustering to create groups of pipes with similar characteristics and then estimated the total number of failures of each group using three regression models: linear regression, Poisson regression, and evolutionary polynomial regression (EPR), with Poisson regression showing superior accuracy compared to the other two models. Chen & Guikema (2020) merged spatial clustering and regression models to predict the number of pipe breaks in a real water network in the USA. Experts in the field have expressed their commitment to improving water network databases, aided mainly by advances in GIS (Barton et al. 2022).

Unlike in many hydrological applications, comparative studies using ANN and ANFIS to model pipeline failure trends remain limited. Tabesh et al. (2009) employed ANN and ANFIS to model pipe failure rate, considering five input parameters, and compared the results with a multivariate regression approach. Jafar et al. (2010) demonstrated an application of ANN in modeling the failure rate of 4,862 urban mains using a 14-year database from a city in northern France. Their study also involved estimating the optimal replacement time for individual pipes in an urban WDS. Ansari et al. (2020) employed a combination of the ANFIS model with GA and PSO optimization algorithms to enhance the accuracy of predicting influential variables in wastewater treatment plants. In addition to the use of ANN models, FIS models are also extensively employed in water science. Researchers have investigated the risk of water quality failure, pipe failure in the WDN, and the potential for leakage within the network using FIS models (Sadiq et al. 2007; Fares & Zayed 2010; Islam et al. 2011; Valis 2013; Zangenehmadar & Moselhi 2016; Pandey et al. 2020).

Numerous studies have been conducted to explore the application of statistical and soft models in various areas, including leak detection, calibration of the pipe roughness coefficient, water quality prediction, and prediction of network pipe failure rate. The findings from these investigations demonstrate that hybrid models can achieve highly accurate predictions of diverse phenomena (Kapelan et al. 2003; Tu et al. 2005; Berardi et al. 2008; Xu et al. 2011, 2013; Soltani & Tabari 2012; Farmani et al. 2017). Despite these previous studies, very few have used data-driven methods (ANN and ANFIS) to predict velocity and pressure in a WDN.

In this paper, advanced artificial intelligence (AI) techniques are used, including ANFIS, ANN-LM, ANN-BR, and ANN-SCG, to predict velocity and pressure. For case 1, pipe diameter and flow rate are selected as independent variables, and velocity as the dependent variable. For case 2, node demand and node elevation are selected as independent variables, and pressure as the dependent variable. 80% of the data is used to train, test, and validate the ANN and ANFIS models, while the remaining 20% is used to evaluate the models. The best model is selected based on statistical performance measures, namely the root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (R2). Sensitivity analysis is performed using the best model identified by this statistical evaluation to assess the influence of each independent variable on velocity and pressure.

Gadhra is a census town in the East Singhbhum district of Jharkhand, India. It is located at a longitude of 86.24° E and a latitude of 22.74° N, and has a total area of 4.48 km2. The total population of Gadhra is 18,801 as per the 2011 census. The location map of the Gadhra study area is illustrated in Figure 1. The source of water for the Gadhra network is the Swarnrekha River. The distribution network is drawn in WaterGEMS as shown in Figure 2. The Gadhra network (DMA-5) consists of 74 links, 73 demand nodes, 1 ESR, and 1 isolation valve. Data required for the study are taken from the Drinking Water and Sanitation (DW&S) Department, Jharkhand Government.
Figure 1: Study area – Gadhra.

Figure 2: Real WDN – DMA-05 (Gadhra).

The pipe and node details of Gadhra WDN (DMA-5) are given in Supplementary material, Appendix A (Tables A1 and A2).

The dependent variable in case 1 is the velocity (V), which is influenced by the independent variables flowrate (Q) and diameter (D); adequate velocity must be maintained to prevent sediment accumulation in pipes.
(1)
The dependent variable in case 2 is the pressure (P), which is influenced by the independent variables elevation and demand; pressure must be limited to prevent pipe bursts.
(2)
The first set of constraints assumes that mass is preserved at each node, which implies that the flow entering and leaving a given node, minus any external demand at that node, must be zero.
(3)

The second set of constraints is based on the law of energy conservation, which states that the sum of friction loss and the minor loss due to valves minus the grade difference between the two points of known energy minus the energy added to the liquid by the pump must equal zero.

The third set of restrictions states that the pressure at each node must be higher than or equal to the minimum required pressure.
(4)
Each pipe diameter must belong to [D], where [D] is the set of commercially available diameters.
(5)
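The velocity constraint above reflects the standard continuity relation between flow rate, diameter, and mean velocity, and the pressure constraint is a simple threshold check on each node. As a minimal illustration (not the study's WaterGEMS computation, and with a purely assumed minimum-pressure value), these two checks can be sketched as follows:

```python
import math

def pipe_velocity(flow_m3s: float, diameter_m: float) -> float:
    """Mean velocity from continuity: V = Q / A = 4Q / (pi * D^2)."""
    area = math.pi * diameter_m ** 2 / 4.0
    return flow_m3s / area

def pressure_ok(pressure_m: float, p_min_m: float) -> bool:
    """Third constraint: nodal pressure must be >= the minimum required pressure."""
    return pressure_m >= p_min_m

# Illustrative values only (not Gadhra data): 0.01 m^3/s through a 0.10 m pipe
print(round(pipe_velocity(0.01, 0.10), 2))   # ~1.27 m/s
print(pressure_ok(10.0, p_min_m=7.0))        # 7 m minimum pressure is an assumed value
```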

Artificial neural network

The information exchange process that occurs in the human brain served as the inspiration for the ANN, a data-driven learning process. It is distinct and adaptable in finding accurate solutions to underlying complex nonlinear relationships, which promotes quicker learning. Artificial neurons are the fundamental building blocks of every neural network. A neuron is a component of the process that receives an input signal and, after applying the associated weights and an activation function, produces an output for a neighboring process. The architecture of a simple artificial neuron is shown in Figure 3. The independent variables (x1, x2,…, xn) form the input signal that the neuron receives, and the output (Y) is obtained as shown in Equation (6) (Moreira et al. 2021).
(6)
(7)
Figure 3: Architecture of an artificial neuron.

The quantity “v” denotes the weighted sum of the inputs and the bias, where the weights for each connection are w1, w2,…, and wn. The input vector is given by (X = x1, x2, …, xn), the weight vector is given by (W′ = w1, w2,…, wn), and “b” stands for the bias, as shown in Equation (7). The activation function is represented by “f”.
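As a minimal sketch of the neuron described by Equations (6) and (7), the following computes the weighted sum of the inputs plus the bias and passes it through an activation function; the sigmoid activation and all numerical values are illustrative assumptions, not parameters from the study:

```python
import numpy as np

def neuron(x: np.ndarray, w: np.ndarray, b: float) -> float:
    """Single artificial neuron: v = w.x + b (Equation (7)), y = f(v) (Equation (6))."""
    v = np.dot(w, x) + b                 # weighted sum of inputs plus bias
    return 1.0 / (1.0 + np.exp(-v))      # sigmoid chosen here as the activation f

x = np.array([0.25, 3.9])    # two inputs, e.g. diameter and flow rate (illustrative)
w = np.array([0.4, -0.1])    # connection weights w1, w2 (illustrative)
print(neuron(x, w, b=0.05))
```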

Feedforward backpropagation process

The process of training the model involves two steps in order to achieve the desired output by minimizing the error. The first step entails the forward computation of input weights, while the second step involves the backward process of updating weights based on the obtained error (Shaik et al. 2020). The fundamental structure of the Feedforward network comprises an input layer, a hidden layer, an output layer, and their connections, which are governed by adjustable synaptic weights. Each layer consists of a specific number of neurons or nodes that facilitate information exchange and decision-making. Each node receives inputs from its preceding nodes, calculates the weighted sum of the inputs along with the added bias, and then passes the result through an activation function at each layer to generate the intended prediction. The process is iteratively performed using randomly generated weights until the desired output is achieved. Due to its nature as a black box model (Ibnelouad et al. 2021), this technique effectively discovers the intricate and valuable relationship between the input and output variables in order to attain the target output.

The backpropagation learning process is employed to address issues such as overfitting, convergence at local maxima, and the convergence rate, thereby reducing their impact. The process begins from the output layer and progresses toward the first layer by updating the weights. At each step, the network generates an output based on the input signal. Subsequently, the error is calculated by comparing the obtained output with the actual target. In order to minimize the error and achieve the desired output, the weights are adjusted and modified using a training algorithm. This iterative process continues until the error reaches a predetermined range. The general equation used to calculate the partial derivative of the error at the nth layer is as follows.
(8)
Here, Yn represents the actual output at the nth layer, Xn denotes the targeted output at the nth layer, and Zn denotes the error generated at the nth layer. Figures 4 and 5 illustrate the schematic of the multilayer FFBP-ANN process used in the study, with 2 input features, 16 neurons, and velocity as the output for case 1, and with 2 input features, 17 neurons, and pressure as the output for case 2.
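For readers who want to see the forward and backward passes together, the following is a minimal sketch of a 2-hidden-1 feedforward network trained by plain gradient-descent backpropagation on synthetic data. It only illustrates the mechanism described above; the study itself used MATLAB's LM, BR, and SCG trainers, whose update rules differ from this simple one:

```python
import numpy as np

rng = np.random.default_rng(0)

def train_ffbp(X, y, hidden=16, lr=0.05, epochs=2000):
    """Minimal 2-hidden-1 feedforward network trained with plain gradient-descent
    backpropagation (forward pass, error calculation, backward weight update)."""
    W1 = rng.normal(0, 0.5, (X.shape[1], hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.5, (hidden, 1));          b2 = np.zeros(1)
    for _ in range(epochs):
        h = np.tanh(X @ W1 + b1)                       # forward: hidden activations
        y_hat = h @ W2 + b2                            # forward: linear output neuron
        err = y_hat - y                                # error w.r.t. the target
        dW2 = h.T @ err / len(X);  db2 = err.mean(axis=0)   # backward pass
        dh = (err @ W2.T) * (1.0 - h ** 2)             # chain rule through tanh
        dW1 = X.T @ dh / len(X);   db1 = dh.mean(axis=0)
        W2 -= lr * dW2; b2 -= lr * db2                 # update weights and biases
        W1 -= lr * dW1; b1 -= lr * db1
    return W1, b1, W2, b2

# Synthetic data: 2 inputs standing in for diameter and flow rate, 1 output
X = rng.uniform(0.0, 1.0, (60, 2))
y = (0.3 * X[:, :1] + 0.6 * X[:, 1:]) ** 2
weights = train_ffbp(X, y, hidden=16)
```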
Figure 4: FFBP-ANN for velocity.

Figure 5: FFBP-ANN for pressure.

Figure 6: ANFIS layout structure (Source: Mehta & Jain 2009).

Levenberg–Marquardt algorithm

The Levenberg–Marquardt (LM) algorithm was specifically developed to achieve second-order training speed without the need to calculate the Hessian matrix. In cases where the performance function can be expressed as a sum of squares, an approximation of the Hessian matrix can be utilized, and the gradient can be computed following the approach described by Hagan & Menhaj (1994) and Kisi & Uncuoghlu (2005), as shown in Equations (9) and (10).

The Jacobian matrix, denoted as J, comprises the first-order derivatives of the network errors with respect to the weights and biases. Meanwhile, the vector e represents the network errors. Computing the Jacobian matrix involves utilizing a standard backpropagation technique, which is significantly simpler compared to calculating the Hessian matrix. The LM algorithm employs this approximation of the Hessian matrix and incorporates it into the Newton-like update (11), where the connection weights are represented by x.
(9)
(10)
(11)
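A compact sketch of the Newton-like update in Equation (11), with the Hessian approximated by JᵀJ and the gradient by Jᵀe as described above, is shown below; the Jacobian, error vector, weights, and damping factor mu are illustrative placeholders, not values from the trained networks:

```python
import numpy as np

def lm_step(J: np.ndarray, e: np.ndarray, x: np.ndarray, mu: float) -> np.ndarray:
    """One Levenberg-Marquardt update: x_new = x - (J^T J + mu*I)^(-1) J^T e."""
    H_approx = J.T @ J                                   # Gauss-Newton Hessian approximation
    g = J.T @ e                                          # gradient of the squared-error cost
    step = np.linalg.solve(H_approx + mu * np.eye(len(x)), g)
    return x - step

# Tiny example: 3 residuals, 2 adjustable weights (all values are placeholders)
J = np.array([[1.0, 0.5], [0.2, 1.3], [0.7, 0.1]])
e = np.array([0.05, -0.02, 0.01])
x = np.array([0.3, -0.1])
print(lm_step(J, e, x, mu=0.01))
```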

Bayesian regularization algorithm

The Bayesian Regularization (BR) training algorithm incorporates the principles of LM optimization (MacKay 1992; Foresee & Hagan 1997) to update the weights and bias values. It aims to minimize a combination of squared errors and weights, seeking the optimal combination that leads to a well-generalizing network (Pan et al. 2013). Additionally, BR introduces network weights into the training objective function, referred to as F(ω) in Equation (12).

In the BR framework, the objective function combines a term representing the sum of the squared network weights and a term representing the sum of the squared network errors, with α and β serving as the parameters of the objective function. Within this framework, the network weights are perceived as random variables, and the distribution of the weights and training set follows a Gaussian distribution. The α and β factors are defined using Bayes’ theorem, which establishes a relationship between two variables or events, A and B. This relationship is based on their prior or marginal probabilities and posterior or conditional probabilities, as described in Equation (13) (Li & Shi 2012).

To determine the optimal weight space, it is necessary to minimize the objective function in Equation (12), which is equivalent to maximizing the probability function denoted in Equation (14).

Here, P(A|B) represents the posterior probability of A given B, P(B|A) represents the prior probability of B given A, and P(B) represents the non-zero prior probability of event B, serving as a normalizing constant.

The factors α and β need to be optimized, while D represents the weight distribution and M represents the specific neural network architecture. The normalization factor is denoted as P(D|M), and P(α,β|M) corresponds to the uniform prior density for the regularization parameters. The likelihood function of D given α, β, M is represented by P(D|α,β,M). Maximizing the posterior function P(α,β| D,M), is equivalent to maximizing the likelihood function P(D|α,β,M). Consequently, the process yields optimal values for α and β within a given weight space.
(12)
(13)
(14)
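The regularized objective that BR minimizes, combining the sum of squared errors and the sum of squared weights, can be sketched as follows; the fixed alpha and beta values here are illustrative, whereas in BR they are re-estimated from the data via Bayes' theorem during training:

```python
import numpy as np

def br_objective(weights: np.ndarray, errors: np.ndarray,
                 alpha: float, beta: float) -> float:
    """Regularized objective of the form minimized by Bayesian Regularization:
    F(w) = beta * (sum of squared errors) + alpha * (sum of squared weights)."""
    e_d = np.sum(errors ** 2)    # data misfit term
    e_w = np.sum(weights ** 2)   # weight-decay (regularization) term
    return beta * e_d + alpha * e_w

w = np.array([0.4, -0.2, 0.1])   # illustrative network weights
e = np.array([0.03, -0.01])      # illustrative network errors
print(br_objective(w, e, alpha=0.5, beta=1.0))
```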

Scaled conjugate gradient

The fundamental backpropagation algorithm modifies the weights by moving in the direction of steepest descent, which corresponds to the negative gradient. In most CG algorithms, the step size is adapted during each iteration. The search is conducted along the conjugate gradient direction to find the step size that minimizes the performance function along that particular path. Initially, all CG algorithms begin by searching in the direction of steepest descent in the first iteration as shown in Equation (15). Frequently, CG algorithms employ line search techniques, approximating the step size without calculating the Hessian matrix to determine the optimal distance for movement along the current search direction as shown in Equation (16). Subsequently, the subsequent search direction is chosen to be conjugate to the previous search direction as shown in Equation (17). The general procedure for determining the new search direction involves combining the new steepest descent direction with the previous search direction (Hagan et al. 1996).

For the Scaled Conjugate Gradient (SCG) algorithm, the calculation of the scaling factor and the direction of the new search are shown in Equations (18) and (19) (Møller 1993).
(15)
(16)
(17)
(18)
(19)
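A minimal sketch of how conjugate-gradient training builds its search directions, starting from steepest descent and combining each new negative gradient with the previous direction, is given below. The Fletcher-Reeves form of the beta factor is used purely for illustration; SCG additionally scales the step size without a line search, which is not reproduced here:

```python
import numpy as np

def cg_directions(grads):
    """Sketch of conjugate-gradient search directions: the first direction is
    steepest descent, and each later direction combines the new negative gradient
    with the previous direction via a beta factor (Fletcher-Reeves form)."""
    p = -grads[0]                                       # first iteration: steepest descent
    directions = [p]
    for k in range(1, len(grads)):
        g_new, g_old = grads[k], grads[k - 1]
        beta = (g_new @ g_new) / (g_old @ g_old)        # Fletcher-Reeves beta factor
        p = -g_new + beta * p                           # conjugate to the previous direction
        directions.append(p)
    return directions

# Illustrative sequence of gradient vectors
grads = [np.array([1.0, 2.0]), np.array([0.5, -0.4]), np.array([0.1, 0.2])]
for d in cg_directions(grads):
    print(d)
```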

ANFIS

Figure 6 illustrates the layout structure of ANFIS, comprising five distinct layers. Layer 1 is responsible for identifying the input and output variables and determining their descriptors. Layer 2 defines the membership functions for each input and output variable. Layer 3 constructs the rule base. Layer 4 performs rule evaluation, and the final layer, Layer 5, conducts defuzzification.

In the first layer, every node “i” is an adaptive node with a node membership function (MF).
(20)
(21)

Fuzzy MFs take different shapes, such as Gaussian, triangular, and trapezoidal.

Layer 2: Calculates the firing strength of a rule via product operation.
(22)
Layer 3: The role of the third layer is to normalize the computed firing strengths by dividing each value by the total firing strength.
(23)
Layer 4: Each node represents the consequent part of a fuzzy rule. The linear coefficients of the rule consequent are trainable.
(24)
Layer 5: Performs defuzzification of the consequent part of the rules by summing the outputs of all the rules.
(25)
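A minimal sketch of a forward pass through the five layers of a first-order Sugeno ANFIS with two inputs and three Gaussian MFs per input (hence nine rules, as in this study) is given below. All MF centers, widths, and consequent coefficients are illustrative assumptions, not the parameters identified for the Gadhra models:

```python
import numpy as np

def gauss_mf(x, c, sigma):
    """Gaussian membership function (Layer 1)."""
    return np.exp(-0.5 * ((x - c) / sigma) ** 2)

def anfis_forward(x1, x2, centers1, centers2, sigma, conseq):
    """Forward pass of a first-order Sugeno ANFIS with 2 inputs and 3 Gaussian
    MFs per input (9 rules), mirroring the five layers described in the text."""
    mu1 = gauss_mf(x1, centers1, sigma)          # Layer 1: memberships for input 1
    mu2 = gauss_mf(x2, centers2, sigma)          # Layer 1: memberships for input 2
    w = np.outer(mu1, mu2).ravel()               # Layer 2: rule firing strengths (product)
    w_norm = w / w.sum()                         # Layer 3: normalized firing strengths
    rule_out = conseq[:, 0] * x1 + conseq[:, 1] * x2 + conseq[:, 2]  # Layer 4: linear consequents
    return np.sum(w_norm * rule_out)             # Layer 5: weighted sum (defuzzified output)

centers1 = np.array([0.1, 0.25, 0.45])           # e.g. diameter MF centers (illustrative)
centers2 = np.array([0.5, 4.0, 12.0])            # e.g. flow-rate MF centers (illustrative)
conseq = np.zeros((9, 3)); conseq[:, 2] = 0.2    # constant-only consequents for the demo
print(anfis_forward(0.25, 3.9, centers1, centers2, sigma=1.0, conseq=conseq))
```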

Statistical performance

A two-way analysis of variance (ANOVA) examines the influence of two independent variables on an outcome. In the present study, the effect of two independent variables (i.e., diameter and flow rate) on the dependent variable (velocity) for case 1 and the effect of two independent variables (i.e., elevation and demand) on the dependent variable (pressure) for case 2 are determined by a two-way ANOVA test at a 5% significance level (α), as shown in Table 1.

In the present study, the F-value obtained from the ANOVA test is greater than the critical F-value and the p-value is less than 0.05 at the 5% significance level, indicating that the null hypothesis is rejected.

Statistical measure

The following are the calculations for the statistical indices used:
(26)
(27)
(28)
where RMSE is the root mean square error, MAE is the mean absolute error, RSS is the residual sum of squares, and TSS is the total sum of squares.

In WaterGEMS, the simulated output represents the results obtained from the hydraulic simulation, and its mean is also computed for use in the indices above. In MATLAB, the target output represents the desired results after optimization, and its mean is computed in the same way. The value of n represents the number of data points used in the hydraulic simulation for pipe links and nodes, which are used to calculate velocity and pressure, respectively.
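A short sketch of the three indices, with R2 computed as 1 − RSS/TSS as defined above, could look like the following; the sample values are illustrative only, not the Gadhra results:

```python
import numpy as np

def performance_indices(simulated: np.ndarray, target: np.ndarray):
    """RMSE, MAE and R2 as defined in Equations (26)-(28): R2 = 1 - RSS/TSS."""
    residuals = simulated - target
    rmse = np.sqrt(np.mean(residuals ** 2))
    mae = np.mean(np.abs(residuals))
    rss = np.sum(residuals ** 2)                      # residual sum of squares
    tss = np.sum((target - target.mean()) ** 2)       # total sum of squares
    return rmse, mae, 1.0 - rss / tss

sim = np.array([0.27, 0.23, 0.26, 0.22, 0.28])        # illustrative predictions
obs = np.array([0.26, 0.24, 0.27, 0.21, 0.29])        # illustrative targets
print(performance_indices(sim, obs))
```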

Table 1. Statistical analysis using two-way analysis of variance (ANOVA)

Source of variation | p-value | F | Fcr | P < α | F > Fcr | Significant
Velocity | 0.000157 | 15.07 | 3.91 | True | True | Yes
Pressure | 0.00000245 | 4.82 | 3.91 | True | True | Yes

ANN approach to predict velocity and pressure

A neural network was used to predict velocity and pressure using backpropagation algorithms, including LM, BR, and SCG. The parameters used in the model were collected from the hydraulic design and distribution network data and are shown in Table 2. The model has three layers: an input (independent variable) layer, a hidden layer, and an output (dependent variable) layer. Neurons in the hidden layer use weights (w) and biases (b) to compute the results of the neural network. The input layer provides the hidden layer with data, and the hidden layer then passes its output to the output layer, as shown in Figures 4 and 5.

Table 2. Symbols used for hydraulic design and distribution network data

Parameters used | Unit | Symbol
Diameter | m | d
Flowrate | cumec | Q
Velocity | m/s | V
Demand | MLD | D
Elevation | m | Elev.
Pressure | m | P

The neural network was trained using the training data. The results were then adapted according to any changes identified in the model during ongoing training. To obtain the best possible ANN predictive models for velocity and pressure, different numbers of neurons in the hidden layer and different learning algorithms were tested. The results for training, testing, and all data are shown in Tables 3 and 4 for the FFBP-ANN considering the LM, BR, and SCG algorithms.
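The selection of the best architecture amounts to a sweep over hidden-layer sizes and training algorithms. The sketch below illustrates such a sweep on synthetic data using scikit-learn's MLPRegressor; since the LM, BR, and SCG trainers are MATLAB-specific, an L-BFGS-trained network stands in here purely to show the search loop, not the study's actual workflow:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, (74, 2))          # 2 inputs standing in for diameter and flow rate
y = 0.2 + 0.5 * X[:, 0] * X[:, 1]           # synthetic target standing in for velocity

train, test = slice(0, 59), slice(59, 74)   # roughly an 80/20 split
best = None
for n_hidden in (4, 5, 8, 9, 12, 13, 16, 17):          # candidate hidden-layer sizes
    model = MLPRegressor(hidden_layer_sizes=(n_hidden,), solver="lbfgs",
                         max_iter=2000, random_state=0)
    model.fit(X[train], y[train])
    rmse = mean_squared_error(y[test], model.predict(X[test])) ** 0.5
    if best is None or rmse < best[1]:
        best = (n_hidden, rmse)
print("best structure: 2-%d-1, test RMSE = %.4f" % best)
```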

Table 3. Performance evaluation of velocity using ANN-LM, ANN-BR, and ANN-SCG

Learning algorithm | Structure | Train RMSE | Train MAE | Train R2 | Test RMSE | Test MAE | Test R2 | All RMSE | All MAE | All R2
ANN-LM | 2-4-1 | 0.022 | 0.015 | 0.941 | 0.016 | 0.011 | 0.929 | 0.020 | 0.014 | 0.947
ANN-LM | 2-8-1 | 0.041 | 0.033 | 0.793 | 0.049 | 0.034 | 0.787 | 0.044 | 0.033 | 0.749
ANN-LM | 2-12-1 | 0.021 | 0.013 | 0.818 | 0.062 | 0.022 | 0.807 | 0.040 | 0.016 | 0.815
ANN-LM | 2-16-1 | 0.021 | 0.014 | 0.974 | 0.012 | 0.009 | 0.853 | 0.019 | 0.012 | 0.955
ANN-LM | 2-5-1 | 0.046 | 0.037 | 0.747 | 0.043 | 0.032 | 0.736 | 0.045 | 0.035 | 0.738
ANN-LM | 2-9-1 | 0.022 | 0.016 | 0.955 | 0.013 | 0.009 | 0.946 | 0.019 | 0.016 | 0.952
ANN-LM | 2-13-1 | 0.031 | 0.024 | 0.535 | 0.101 | 0.040 | 0.549 | 0.063 | 0.030 | 0.566
ANN-LM | 2-17-1 | 0.035 | 0.026 | 0.849 | 0.036 | 0.024 | 0.851 | 0.035 | 0.026 | 0.839
ANN-BR | 2-4-1 | 0.021 | 0.014 | 0.947 | 0.014 | 0.010 | 0.944 | 0.019 | 0.013 | 0.953
ANN-BR | 2-8-1 | 0.021 | 0.011 | 0.917 | 0.013 | 0.010 | 0.951 | 0.019 | 0.012 | 0.953
ANN-BR | 2-12-1 | 0.021 | 0.014 | 0.947 | 0.014 | 0.010 | 0.954 | 0.019 | 0.013 | 0.953
ANN-BR | 2-16-1 | 0.021 | 0.013 | 0.948 | 0.013 | 0.010 | 0.952 | 0.019 | 0.012 | 0.953
ANN-BR | 2-5-1 | 0.020 | 0.013 | 0.949 | 0.013 | 0.009 | 0.944 | 0.019 | 0.017 | 0.953
ANN-BR | 2-9-1 | 0.020 | 0.013 | 0.939 | 0.013 | 0.010 | 0.935 | 0.019 | 0.012 | 0.953
ANN-BR | 2-13-1 | 0.021 | 0.013 | 0.917 | 0.058 | 0.020 | 0.735 | 0.019 | 0.016 | 0.831
ANN-BR | 2-17-1 | 0.021 | 0.014 | 0.917 | 0.138 | 0.010 | 0.904 | 0.019 | 0.012 | 0.953
ANN-SCG | 2-4-1 | 0.049 | 0.056 | 0.749 | 0.114 | 0.691 | 0.823 | 0.048 | 0.038 | 0.698
ANN-SCG | 2-8-1 | 0.050 | 0.049 | 0.730 | 0.117 | 0.095 | 0.688 | 0.049 | 0.038 | 0.682
ANN-SCG | 2-12-1 | 0.053 | 0.059 | 0.674 | 0.145 | 0.110 | 0.689 | 0.077 | 0.050 | 0.681
ANN-SCG | 2-16-1 | 0.047 | 0.054 | 0.726 | 0.117 | 0.094 | 0.711 | 0.046 | 0.037 | 0.723
ANN-SCG | 2-5-1 | 0.058 | 0.081 | 0.760 | 0.123 | 0.111 | 0.722 | 0.085 | 0.070 | 0.755
ANN-SCG | 2-9-1 | 0.072 | 0.087 | 0.744 | 0.087 | 0.062 | 0.718 | 0.081 | 0.069 | 0.749
ANN-SCG | 2-13-1 | 0.050 | 0.046 | 0.643 | 0.127 | 0.105 | 0.655 | 0.051 | 0.039 | 0.662
ANN-SCG | 2-17-1 | 0.046 | 0.057 | 0.641 | 0.101 | 0.075 | 0.668 | 0.066 | 0.049 | 0.642
Table 4. Performance evaluation of pressure using ANN-LM, ANN-BR, and ANN-SCG

Learning algorithm | Structure | Train RMSE | Train MAE | Train R2 | Test RMSE | Test MAE | Test R2 | All RMSE | All MAE | All R2
ANN-LM | 2-4-1 | 0.447 | 0.372 | 0.959 | 0.612 | 0.474 | 0.930 | 0.507 | 0.406 | 0.940
ANN-LM | 2-8-1 | 0.500 | 0.388 | 0.958 | 0.489 | 0.401 | 0.942 | 0.496 | 0.392 | 0.950
ANN-LM | 2-12-1 | 0.615 | 0.435 | 0.940 | 0.490 | 0.395 | 0.945 | 0.577 | 0.422 | 0.934
ANN-LM | 2-16-1 | 0.400 | 0.308 | 0.966 | 0.424 | 0.349 | 0.954 | 0.408 | 0.321 | 0.961
ANN-LM | 2-5-1 | 0.975 | 0.546 | 0.750 | 1.33 | 0.513 | 0.710 | 1.280 | 0.533 | 0.700
ANN-LM | 2-9-1 | 0.395 | 0.297 | 0.967 | 0.362 | 0.293 | 0.968 | 0.384 | 0.296 | 0.966
ANN-LM | 2-13-1 | 0.453 | 0.328 | 0.958 | 0.294 | 0.224 | 0.960 | 0.407 | 0.294 | 0.964
ANN-LM | 2-17-1 | 0.324 | 0.227 | 0.983 | 0.326 | 0.198 | 0.989 | 0.324 | 0.218 | 0.979
ANN-BR | 2-4-1 | 0.455 | 0.377 | 0.946 | 0.573 | 0.498 | 0.922 | 0.489 | 0.410 | 0.944
ANN-BR | 2-8-1 | 0.458 | 0.380 | 0.948 | 0.564 | 0.490 | 0.920 | 0.488 | 0.411 | 0.944
ANN-BR | 2-12-1 | 0.438 | 0.369 | 0.947 | 0.575 | 0.502 | 0.941 | 0.489 | 0.407 | 0.944
ANN-BR | 2-16-1 | 0.445 | 0.367 | 0.947 | 0.563 | 0.485 | 0.942 | 0.487 | 0.405 | 0.945
ANN-BR | 2-5-1 | 0.442 | 0.367 | 0.948 | 0.554 | 0.461 | 0.927 | 0.490 | 0.404 | 0.944
ANN-BR | 2-9-1 | 0.445 | 0.372 | 0.949 | 0.553 | 0.478 | 0.924 | 0.491 | 0.412 | 0.944
ANN-BR | 2-13-1 | 0.441 | 0.361 | 0.949 | 0.589 | 0.502 | 0.925 | 0.492 | 0.413 | 0.942
ANN-BR | 2-17-1 | 0.446 | 0.366 | 0.948 | 0.562 | 0.475 | 0.926 | 0.486 | 0.402 | 0.945
ANN-SCG | 2-4-1 | 1.595 | 1.309 | 0.438 | 1.450 | 1.236 | 0.430 | 1.549 | 1.285 | 0.444
ANN-SCG | 2-8-1 | 0.735 | 0.588 | 0.900 | 0.799 | 0.688 | 0.853 | 0.756 | 0.621 | 0.886
ANN-SCG | 2-12-1 | 0.667 | 0.458 | 0.907 | 0.548 | 0.424 | 0.912 | 0.631 | 0.447 | 0.910
ANN-SCG | 2-16-1 | 0.514 | 0.419 | 0.941 | 0.598 | 0.432 | 0.802 | 0.543 | 0.423 | 0.862
ANN-SCG | 2-5-1 | 0.650 | 0.532 | 0.910 | 0.748 | 0.570 | 0.901 | 0.684 | 0.545 | 0.892
ANN-SCG | 2-9-1 | 0.593 | 0.457 | 0.934 | 0.675 | 0.554 | 0.901 | 0.621 | 0.489 | 0.911
ANN-SCG | 2-13-1 | 0.909 | 0.645 | 0.841 | 0.595 | 0.489 | 0.872 | 0.819 | 0.594 | 0.852
ANN-SCG | 2-17-1 | 0.678 | 0.502 | 0.903 | 0.635 | 0.557 | 0.871 | 0.734 | 0.586 | 0.883

The ANN-LM with 16 neurons in the hidden layer had the best predictive performance for velocity, as shown in Table 3. The ANN-LM model with 17 neurons in the hidden layer had the best predictive performance for pressure, as shown in Table 4. Figures 7 and 8 show the model structure of the best possible predictive model for velocity and pressure. They represent how input and output data are linked through hidden layers. Figures 7 and 8 also describe the mechanism of calculating output values based on independent variables that have different weights and biases determined by neurons in hidden layers.
Figure 7: Architecture of the proposed neural network (to predict velocity).

Figure 8: Architecture of the proposed neural network (to predict pressure).

To predict velocity, Figure 9 depicts the regression analysis using the LM algorithm with 41 training, 10 testing, and 10 validation samples, which together make up 80% of the total pipe data. To predict pressure, Figure 10 depicts the regression analysis using the LM algorithm with 41 training, 10 testing, and 10 validation samples, which together make up 80% of the node data. The values for each algorithm are clearly shown in the figures. The better the regression among the LM, BR, and SCG algorithms in Figures 9 and 10, the more accurate the prediction model is expected to be.
Figure 9: Plots of ANN-LM regression performance with 16 neurons during training, testing, and validation (Case 1 – Velocity).

Figure 10: Plots of ANN-LM regression performance with 17 neurons during training, testing, and validation (Case 2 – Pressure).

ANFIS approach to predict velocity and pressure

ANFIS is trained using a backpropagation algorithm. The backpropagation algorithm adjusts the parameters of the ANFIS network to minimize the error between the network's output and the desired output. ANFIS uses membership functions to map independent and associated variables to produce the necessary output. ANFIS has five layers: an input layer, an input MF layer, a rule layer, an output MF layer, and an output layer. In both case 1 and case 2, nine rules are generated using the two input variables to create the required ANFIS model, as shown in Figure 11.
Figure 11: Structure of the ANFIS model for velocity and pressure.

Prior to running ANFIS on the dataset, the 74 pipe/link data and 73 node/junction data are divided into training and checking datasets. This ensured that the range of input and output data in the training dataset is representative of the range of input and output data in the checking dataset. This is important because it ensured that the checking samples could accurately represent the population and generate generalized ANFIS velocity and pressure models that could accurately predict velocity and pressure in the WDN.

The model generation process began after the dataset was classified. In the current scenario, for case 1, the 74 data points are divided into 41 training samples, 10 testing samples, and 10 checking samples, which together make up 80% of the total pipe data. For case 2, the 73 data points are divided into 41 training, 10 testing, and 10 checking samples, which together make up 80% of the total node data. The training dataset is used to generate the ANFIS model, and the checking dataset is used to assess the validity of the model and calculate the error in velocity and pressure prediction for case 1 and case 2, respectively.
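A simple sketch of this kind of split (41 training, 10 testing, and 10 checking samples out of 74, with the remainder held out for final evaluation) is shown below; the random seed is arbitrary, and note that the representativeness check described above is a separate step that a plain random split does not guarantee:

```python
import numpy as np

rng = np.random.default_rng(42)

def split_dataset(n_samples: int):
    """Shuffle sample indices and split them into training, testing and checking
    subsets (41/10/10), leaving the remainder as the final evaluation set."""
    idx = rng.permutation(n_samples)
    return idx[:41], idx[41:51], idx[51:61], idx[61:]

train, test, check, holdout = split_dataset(74)          # 74 pipe/link records in case 1
print(len(train), len(test), len(check), len(holdout))   # 41 10 10 13
```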

After loading the training dataset, the subtractive clustering method is used to generate the ANFIS model, as shown in Figures 12 and 13. The generated model requires further tuning to minimize error using a hybrid algorithm. This algorithm combines least-squares and backpropagation techniques to fine-tune the ANFIS model. By minimizing the errors associated with calculating the weights of each independent variable, the hybrid algorithm adjusts the model in both the forward and backward passes, ultimately producing the desired output value. The relationship between epochs and error is depicted in Figures 12 and 13, which provides insight into the limitations of the tuning process. The graphs show that there is a threshold beyond which tuning can lead to overfitting, which is characterized by a persistent increase in error over successive iterations. The type and number of input MFs and the type of output MF affect the regression performance of the model, as shown in Tables 5(a)–5(c) and 6(a)–6(c).
Table 5. Results obtained from different types of ANFIS structures and their performance evaluation in case of velocity – (a) training; (b) testing; and (c) all

MF output | MF input | 3 MF RMSE | 3 MF MAE | 3 MF R2 | 5 MF RMSE | 5 MF MAE | 5 MF R2
(a) Training
Constant | Tri | 0.040 | 0.027 | 0.801 | 0.040 | 0.021 | 0.822
Constant | Trap | 0.040 | 0.027 | 0.807 | 0.041 | 0.027 | 0.817
Constant | Gauss | 0.040 | 0.026 | 0.825 | 0.022 | 0.021 | 0.886
Linear | Tri | 0.041 | 0.024 | 0.820 | 0.041 | 0.027 | 0.821
Linear | Trap | 0.040 | 0.027 | 0.792 | 0.040 | 0.029 | 0.780
Linear | Gauss | 0.080 | 0.051 | 0.796 | 0.089 | 0.062 | 0.823
(b) Testing
Constant | Tri | 0.045 | 0.029 | 0.767 | 0.046 | 0.027 | 0.778
Constant | Trap | 0.047 | 0.022 | 0.762 | 0.042 | 0.026 | 0.759
Constant | Gauss | 0.046 | 0.028 | 0.834 | 0.021 | 0.026 | 0.898
Linear | Tri | 0.045 | 0.027 | 0.781 | 0.045 | 0.028 | 0.769
Linear | Trap | 0.045 | 0.027 | 0.778 | 0.046 | 0.028 | 0.776
Linear | Gauss | 0.082 | 0.053 | 0.764 | 0.090 | 0.061 | 0.798
(c) All
Constant | Tri | 0.044 | 0.026 | 0.773 | 0.044 | 0.026 | 0.773
Constant | Trap | 0.044 | 0.026 | 0.773 | 0.044 | 0.028 | 0.773
Constant | Gauss | 0.047 | 0.025 | 0.820 | 0.020 | 0.022 | 0.875
Linear | Tri | 0.044 | 0.026 | 0.784 | 0.044 | 0.026 | 0.773
Linear | Trap | 0.043 | 0.026 | 0.774 | 0.041 | 0.026 | 0.775
Linear | Gauss | 0.080 | 0.050 | 0.776 | 0.088 | 0.061 | 0.781
Figure 12: Generated ANFIS velocity model using the hybrid algorithm.

Figure 13: Generated ANFIS pressure model using the hybrid algorithm.

The model's validation is conducted using a separate checking dataset in its final stage. Figures 14 and 15 illustrate the graphical disparity between the velocity and pressure values generated by the ANFIS model (represented by red asterisks) and the actual values collected in the field (represented by blue plus symbols). The user provides the observed velocity and pressure values to the model for comparison. The error is quantified using the RMSE, which provides a numerical measure of the disparity between the model's predictions and the actual values.
Table 6. Results obtained from different types of ANFIS structures and their performance evaluation in case of pressure – (a) training; (b) testing; and (c) all

MF output | MF input | 3 MF RMSE | 3 MF MAE | 3 MF R2 | 5 MF RMSE | 5 MF MAE | 5 MF R2
(a) Training
Constant | Tri | 0.550 | 0.410 | 0.946 | 0.591 | 0.360 | 0.929
Constant | Trap | 0.747 | 0.596 | 0.866 | 0.988 | 0.541 | 0.811
Constant | Gauss | 0.550 | 0.448 | 0.932 | 0.822 | 0.431 | 0.924
Linear | Tri | 0.611 | 0.509 | 0.821 | 1.039 | 0.451 | 0.761
Linear | Trap | 1.514 | 0.571 | 0.717 | 1.387 | 0.497 | 0.677
Linear | Gauss | 0.638 | 0.404 | 0.919 | 0.566 | 0.348 | 0.929
(b) Testing
Constant | Tri | 0.556 | 0.406 | 0.933 | 0.575 | 0.349 | 0.922
Constant | Trap | 0.758 | 0.597 | 0.859 | 0.981 | 0.562 | 0.791
Constant | Gauss | 0.559 | 0.455 | 0.941 | 0.799 | 0.422 | 0.896
Linear | Tri | 0.629 | 0.516 | 0.818 | 1.046 | 0.458 | 0.764
Linear | Trap | 1.515 | 0.579 | 0.711 | 1.401 | 0.493 | 0.670
Linear | Gauss | 0.644 | 0.406 | 0.923 | 0.622 | 0.344 | 0.933
(c) All
Constant | Tri | 0.549 | 0.400 | 0.934 | 0.574 | 0.356 | 0.926
Constant | Trap | 0.759 | 0.609 | 0.865 | 0.982 | 0.542 | 0.793
Constant | Gauss | 0.552 | 0.447 | 0.934 | 0.795 | 0.423 | 0.898
Linear | Tri | 0.632 | 0.516 | 0.822 | 1.045 | 0.455 | 0.770
Linear | Trap | 1.500 | 0.577 | 0.713 | 1.398 | 0.492 | 0.672
Linear | Gauss | 0.645 | 0.401 | 0.920 | 0.579 | 0.348 | 0.925
Figure 14: Validation of the ANFIS model using the checking dataset (Case 1: velocity).

Figure 15: Validation of the ANFIS model using the checking dataset (Case 2: pressure).

Performance of ANN and ANFIS

To compare the performance of the ANN-LM, ANN-BR, ANN-SCG, and ANFIS models, Figures 16 and 17 present a comparison between the observed and predicted values of velocity and pressure. The performance of the models is further evaluated using quantitative metrics, namely the MAE, RMSE, and coefficient of determination (R2). The results are summarized in Tables 7(a) and 7(b), which reveal that the ANN-LM model outperforms the ANN-BR, ANN-SCG, and ANFIS models, as it exhibits the lowest errors (RMSE and MAE) and the highest R2 values.
Table 7. Comparison of performance indices for (a) velocity and (b) pressure of ANN-LM, ANN-BR, ANN-SCG, and ANFIS

(a) Velocity
Performance index | ANN-LM | ANN-BR | ANN-SCG | ANFIS
Structure | 2-16-1 | 2-5-1 | 2-5-1 | 3-3-Gauss-const
RMSE | 0.0189 | 0.0188 | 0.0847 | 0.0201
MAE | 0.0122 | 0.0166 | 0.0695 | 0.0215
R2 | 0.9568 | 0.9528 | 0.7549 | 0.8745

(b) Pressure
Performance index | ANN-LM | ANN-BR | ANN-SCG | ANFIS
Structure | 2-17-1 | 2-13-1 | 2-16-1 | 3-3-Gauss-const
RMSE | 0.3244 | 0.4923 | 0.5433 | 0.5520
MAE | 0.2176 | 0.4128 | 0.4231 | 0.4472
R2 | 0.9773 | 0.9415 | 0.8617 | 0.9336
Figure 16: Comparison between velocity observed and predicted by ANN-LM, ANN-BR, ANN-SCG, and ANFIS.

Figure 17: Comparison between pressure observed and predicted by ANN-LM, ANN-BR, ANN-SCG, and ANFIS.

Sensitivity analysis in ANN-LM

Sensitivity analysis plays a crucial role in decision-making processes by providing valuable insights into the reliability and robustness of a model's predictions. In this study, sensitivity analysis is conducted using the NN-Edit toolbox of MATLAB to evaluate the impact of the individual independent variables, namely diameter, flowrate, elevation, and demand, on velocity and pressure. The analysis focuses on examining the influence of each independent variable while keeping all other variables constant. Specifically, separate sensitivity analyses are performed for velocity and pressure. The outcomes of the sensitivity analysis for the independent variables are visually presented in Figures 18–21, and the results are summarized in Tables 10 and 11. These findings offer valuable insights into the relative importance of each independent variable and their effect on velocity and pressure in the system under investigation.
Figure 18: Effect of the flowrate on velocity.

Figure 19: Effect of the diameter on velocity.

Figure 20: Effect of the demand on pressure.

Figure 21: Effect of the elevation on pressure.

During the sensitivity analysis of an independent variable, all other independent variables are set to their mean values, as indicated in Tables 8 and 9. The slope of the graph relating the independent variable to the predicted output is used to assess the individual effect of a one-unit change in that variable on velocity and pressure when all other variables are held constant. The results reveal that flowrate exhibits a positive correlation with velocity, while diameter demonstrates a negative correlation. Similarly, pressure shows a positive correlation with demand and a negative correlation with elevation. Tables 10 and 11 provide a comprehensive summary of the magnitude of change in velocity and pressure corresponding to a unit change in the value of each independent variable. Of all the data-driven models, ANN-LM performs best for velocity and pressure in terms of statistical performance; therefore, the sensitivity analysis is carried out for ANN-LM.
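The one-at-a-time procedure described above can be sketched generically as follows: one input is varied across its range while the other is held at its mean, the trained model is evaluated, and the slope of a fitted line gives the rate of change. The surrogate model and numerical ranges below are hypothetical stand-ins for the trained ANN-LM, which in the study was exercised through MATLAB's NN-Edit toolbox:

```python
import numpy as np

def oat_sensitivity(predict, varied_range, fixed_mean):
    """One-at-a-time sensitivity: vary one input across its range while the other
    input is held at its mean value, then return the slope of a straight line
    fitted to the predictions (rate of change per unit of the varied input)."""
    xs = np.linspace(varied_range[0], varied_range[1], 50)
    preds = np.array([predict(x, fixed_mean) for x in xs])
    return np.polyfit(xs, preds, 1)[0]

# Hypothetical surrogate standing in for the trained ANN-LM velocity model
demo_model = lambda flow, diameter: 0.05 + 0.02 * flow - 0.8 * diameter
print(oat_sensitivity(demo_model, varied_range=(0.0, 16.0), fixed_mean=0.2))
```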

Table 8. Hydraulic design variables at the WDN (for pipes)

Sl. No. | DIA (m) | Flow rate (MLD) | Velocity (m/s), ANN-LM
1 | 0.25 | 3.888 | 0.27
2 | 0.25 | 3.784 | 0.23
3 | 0.25 | 3.784 | 0.26
4 | 0.25 | 3.784 | 0.22
5 | 0.25 | 3.784 | 0.28
6 | 0.25 | 3.784 | 0.28
7 | 0.1 | 0.018 | 0.06
8 | 0.1 | 0.009 | 0.07
9 | 0.1 | 0.003 | 0.02
10 | 0.1 | 0.021 | 0.06
11 | 0.1 | 0.04 | 0.03
12 | 0.15 | 0.779 | 0.18
13 | 0.35 | 8.668 | 0.16
14 | 0.45 | 16.015 | 0.12
Table 9. Hydraulic design variables at the WDN (for nodes)

Sl. No. | Elevation (m) | Demand (MLD) | Pressure (m), ANN-LM
1 | 152.27 | 0.0018 | 10.05
2 | 151.9 | 0.0036 | 10.04
3 | 146.41 | 0.0043 | 11.73
4 | 145.86 | 0.0035 | 11.62
5 | 144.45 | 0.0036 | 11.94
6 | 154.31 | 0.0019 | 9.17
7 | 148.86 | 0.0073 | 9.82
8 | 148.05 | 0.0023 | 10.10
9 | 146.13 | 0.0029 | 10.91
10 | 151.71 | 0.0095 | 9.15
11 | 148.44 | 0.0035 | 10.04
12 | 148.34 | 0.0029 | 9.90
13 | 147.07 | 0.0023 | 10.34
14 | 148.36 | 0.0028 | 9.89
Table 10. Results of sensitivity analysis in ANN-LM for velocity

S. No. | Independent variable | Rate of change
1 | Flowrate | ▴V/▴Q = 0.016 m/s per cumec
2 | Diameter | ▴V/▴d = (−)a 0.01 m/s per 0.01 m

a: A negative correlation is indicated by a (−) sign.

Table 11. Results of sensitivity analysis in ANN-LM for pressure

S. No. | Independent variable | Rate of change
1 | Demand | ▴P/▴D = 0.03 m per 0.0001 MLD
2 | Elevation | ▴P/▴Elev. = (−)a 0.51 m per m

a: A negative correlation is indicated by a (−) sign.

Figures 18 and 19 show that velocity has a positive correlation of 0.016 m/s per cumec with flow rate and a negative correlation of (−)0.01 m/s per 0.01 m with diameter when these are used as independent variables for predicting velocity. Figures 20 and 21 show that pressure has a positive correlation of 0.03 m per 0.0001 MLD with demand and a negative correlation of (−)0.51 m per m with elevation when these are used as independent variables for predicting pressure.

The effective management of WDSs plays a crucial role in enhancing water use efficiency in residential areas. Hence, this study introduces a seldom-used approach to assess the statistical performance of WDSs using data-driven models such as ANN-LM, ANN-BR, ANN-SCG, and ANFIS. The Gadhra WDN was employed to train and validate these models, and the data were collected from the DW&S Department, Jharkhand Government.

The results of this study can be summarized as follows:

  • (1)

    In the ANN modeling, the ANN-LM model outperforms the ANN-BR and ANN-SCG models in predicting velocity and pressure. The number of hidden layer neurons and the type of transfer functions used in the hidden and output layers significantly impact the performance of the ANN. The ANN-LM method exhibits the best prediction accuracy for estimating velocity (RMSE = 0.0189, MAE = 0.0122, R2 = 0.9568) and pressure (RMSE = 0.3244, MAE = 0.2176, R2 = 0.9773) when considering both training and testing data.

  • (2)

    The ANFIS model also demonstrates satisfactory performance in predicting velocity and pressure. By utilizing the Gaussian MF instead of the triangle and trapezoidal functions and increasing the number of membership functions in the univariate output mode, the model's performance improves. The ANFIS method yields reliable predictions for velocity (RMSE = 0.0201, MAE = 0.0215, R2 = 0.8745) and pressure (RMSE = 0.5520, MAE = 0.4472, R2 = 0.9336) when considering both training and testing data.

  • (3)

    The findings of the sensitivity analysis are significant as they identify potential parameters that influence velocity and pressure and provide numerical estimates of the magnitude of each independent variable's impact on these quantities. The sensitivity analysis reveals the following relationships for the parameters (flow rate, diameter, demand, and elevation): ▴V/▴Q = 0.016 m/s per cumec, ▴V/▴d = (−)0.01 m/s per 0.01 m, ▴P/▴D = 0.03 m per 0.0001 MLD, and ▴P/▴Elev. = (−)0.51 m per m. Utilizing this prediction model will aid in adjusting the values of these variables, facilitating effective management of velocity and pressure.

Limitations

The prediction models obtained for velocity and pressure show good statistical performance using four independent parameters (i.e., diameter, flow rate, elevation, and demand). The performance of the models could have been further enhanced if head loss had also been predicted in addition to velocity and pressure.

Future scope

This study demonstrates the reliability and practicality of ANN and ANFIS models for evaluating the performance of WDSs. While various data-driven models, including ANN-LM, ANN-BR, ANN-SCG, and ANFIS, performed well, ANN-LM exhibited lower errors and higher accuracy than all the other models. Therefore, it is advisable to prioritize the use of the ANN-LM model in future studies. The findings of this study serve as a valuable guide for selecting an appropriate model to assess the performance of WDNs. Consequently, instead of relying on time-consuming and complex conventional methods, it is recommended to employ ANN-LM, ANN-BR, ANN-SCG, and ANFIS models for evaluating the performance of WDNs.

The authors are thankful to the reviewers for their valuable suggestions, which have enhanced the quality of this manuscript. The authors are thankful to the DW&S Department of the Jharkhand government for providing the details of the Gadhra Water Distribution Network.

All authors contributed to the study conception and design. Data collection was performed by A.R. and S.K. Analysis and optimization were performed by A.R. and S.K. The first draft of the manuscript was written by A.R. All authors read and approved the final manuscript.

Funding not received from any agency.

All analyses were made by licensed software WaterGEMS and MATLAB.

Authors gave their permission.

Authors gave their permission.

All relevant data are included in the paper or its Supplementary Information.

The authors declare there is no conflict.

References

Almheiri Z., Meguid M. & Zayed T. 2020 Intelligent approaches for predicting failure of water mains. Journal of Pipeline Systems Engineering and Practice 11 (4), 1–15.

Ansari M., Othman F. & El-Shafie A. 2020 Optimized fuzzy inference system to enhance prediction accuracy for influent characteristics of a sewage treatment plant. Science of The Total Environment 722, 137878.

Berardi L., Giustolisi O., Kapelan Z. & Savic D. A. 2008 Development of pipe deterioration models for water distribution systems using EPR. Journal of Hydroinformatics 10 (2), 113–126.

Bougadis J., Adamowski K. & Diduch R. 2005 Short-term municipal water demand forecasting. Hydrological Processes 19, 137–148.

Chang N. & Makkeasorn A. 2006 Water demand analysis in urban regions by neural network models. In 8th Annual Water Distribution Systems Analysis Symposium. ASCE Library, Cincinnati.

Chen T. Y. & Guikema S. D. 2020 Prediction of water main failures with the spatial clustering of breaks. Reliability Engineering and System Safety 203, Article 107108, 1–25.

Crommelynck V., Duquesne C., Mercier M. & Miniussi C. 1992 Daily and hourly water consumption forecasting tools using neural networks. In Proc. of the AWWA's Annual Computer Specialty Conference, Nashville, Tennessee, pp. 665–676.

De Oliveira D. P., Garrett J. H. & Soibelman L. 2011 A density-based spatial clustering approach for defining local indicators of drinking water distribution pipe breakage. Advanced Engineering Informatics 25 (2), 380–389.

Fares H. & Zayed T. 2010 Hierarchical fuzzy expert system for risk of failure of water mains. Journal of Pipeline Systems Engineering and Practice 1 (1), 53–62.

Farmani R., Kakoudakis K., Behzadian M. K. & Butler D. 2017 Pipe failure prediction in water distribution systems considering static and dynamic factors. Procedia Engineering 186, 117–126.

Foresee F. D. & Hagan M. T. 1997 Gauss-Newton approximation to Bayesian learning. In Proceedings of the International Conference on Neural Networks (ICNN'97), Houston, TX, USA, IEEE, 3, 1930–1935.

Hagan M. T. & Menhaj M. 1994 Training feed-forward networks with the Marquardt algorithm. IEEE Transactions on Neural Networks 5 (6), 989–993.

Hagan M. T., Demuth H. B. & Beale M. H. 1996 Neural Network Design. PWS Publishing, Boston, MA.

Ibnelouad A., Kari A. E. L., Ayad H. & Mjahed M. 2021 Multilayer artificial approach for estimating optimal solar PV system power using the MPPT technique. Studies in Informatics and Control 30 (4), 109–120.

Islam M. S., Sadiq R., Rodriguez M. J., Francisque A., Najjaran H. & Hoorfar M. 2011 Leakage detection and location in water distribution systems using a fuzzy-based methodology. Urban Water Journal 8 (6), 351–365.

Jafar R., Shahrour I. & Juran I. 2010 Application of artificial neural networks to model the failure of urban water mains. Mathematical and Computer Modelling 51, 1170–1180.

Jain A., Varshney A. K. & Joshi U. C. 2001 Short-term water demand forecast modelling at IIT Kanpur using artificial neural networks. IEE Transactions on Water Resources Management 15 (1), 299–321.

Kapelan Z. S., Savic D. A. & Walters G. A. 2003 A hybrid inverse transient model for leakage detection and roughness calibration in pipe networks. Journal of Hydraulic Research 41 (5), 481–492.

Kisi O. & Uncuoghlu E. 2005 Comparison of three backpropagation training algorithms for two case studies. Indian Journal of Engineering & Materials Sciences 12, 434–442.

Li G. & Shi J. 2012 Applications of Bayesian methods in wind energy conversion systems. Renewable Energy 43, 1–8.

Longe E. O., Omole D. O., Adewumi I. K. & Ogbiye S. A. 2010 Water resources use, abuse and regulations in Nigeria. Journal of Sustainable Development in Africa 12 (2), 35–45.

Lui J., Savenije H. & Xu J. 2003 Forecast of water demand in Weinan city in China using WDF-ANN model. Physics and Chemistry of the Earth 28, 219–224.

MacKay D. J. C. 1992 Bayesian interpolation. Neural Computation 4 (3), 415–447.

Mehta R. & Jain S. K. 2009 Optimal operation of a multi-purpose reservoir using neuro-fuzzy technique. Water Resources Management 23, 509–529.

Moreira M. O., Balestrassi P. P., Paiva A. P., Ribeiro P. F. & Bonatto B. D. 2021 Design of experiments using artificial neural network ensemble for photovoltaic generation forecasting. Renewable and Sustainable Energy Reviews 135, 1–14.

Pan X., Lee B. & Zhang C. 2013 A comparison of neural network backpropagation algorithms for electricity load forecasting. In IEEE International Workshop on Intelligent Energy Systems (IWIES), Vienna, Austria, 22–27.

Sadiq R., Kleiner Y. & Rajani B. 2007 Water quality failures in distribution networks risk analysis using fuzzy logic and evidential reasoning. Risk Analysis: An International Journal 27 (5), 1381–1394.

Shaik N. B., Pedapati S. R., Taqvi S. A. A., Othman A. R. Z. & Dzubir F. A. A. 2020 A feed-forward back propagation neural network approach to predict the life condition of crude oil pipeline. Processes 8, 1–13.

Soltani J. & Tabari M. 2012 Determination of effective parameters in pipe failure rate in water distribution system using the combination of artificial neural networks and genetic algorithm. Journal of Water and Wastewater 23 (83), 2–15.

Sonaje N. P. & Joshi M. G. 2015 A review of modeling and application of water distribution networks (WDN) softwares. International Journal of Technical Research and Application 3 (5), 174–178.

Tabesh M., Soltani J., Farmani R. & Savic D. 2009 Assessing pipe failure rate and mechanical reliability of water distribution networks using data driven modelling. Journal of Hydroinformatics 11 (1), 1–17.

Tu M. Y., Tsai F. & Yeh W. 2005 Optimization of water distribution and water quality by hybrid genetic algorithm. Journal of Water Resources Planning and Management 131 (6), 431–440.

Vairavamoorthy K., Akinpelu E., Lin Z. & Ali M. 2001 Design of sustainable system in developing countries. In Proceedings of World Water and Environmental Resources Challenges, Environmental and Water Resources Institute of ASCE, 20–24 May 2001, Orlando, Florida.

Valis K. 2013 Application of fuzzy logic for failure risk assessment in water supply system management. CEST.

Xu Q., Chen Q., Li W. & Ma J. 2011 Pipe break prediction based on evolutionary data-driven methods with brief recorded data. Reliability Engineering & System Safety 96 (8), 942–948.

Zhang Q. J., Cudrak A. A., Shariff R. & Stanley S. J. 2004 Implementing artificial neural network models for real-time water colour forecasting in a water treatment plant. Journal of Environmental Engineering and Science 3 (1), 15–23.

Zangenehmadar Z. & Moselhi O. 2016 Application of neural networks in predicting the remaining useful life of water pipelines. In Pipelines 2016, pp. 292–308.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).

Supplementary data