ABSTRACT
In this research, a method developed from the generalized structure of the group method of data handling (GSGMDH) is compared with the self-adaptive extreme learning machine (SAELM) for finding the location and amount of leakage in a water distribution network. The method removes some limitations of artificial intelligence techniques, including the restriction on the number of output layers of machine-learning models, so that the model can predict several outputs at the same time. The implementation process, which is relatively time-consuming in most of these techniques when dealing with large water distribution networks, is also significantly shortened in the GSGMDH model. Both methods are implemented on a sample looped water distribution network whose data are available. The results indicate that the developed method, which preserves the dependence of the leakage rate on pressure, identifies the position of the leaking nodes and accurately predicts the location and amount of leakage even in the node with the smaller leak, while requiring only a minimal amount of nodal pressure data. This method can replace many expensive and time-consuming conventional methods as well as tools and hardware in urban water distribution networks.
HIGHLIGHTS
Designing a new structure of GSGMDH with multi-layer output prediction capability.
Development of GSGMDH model to detect the amount and location of multiple leaks in water distribution networks.
Extracting the optimal arrangement of pressure measurement points in urban water networks.
INTRODUCTION
Water supply in developing countries often faces multiple challenges such as institutional limitations, insufficient financial resources, rapid population growth, and poorly maintained, worn-out infrastructure. These problems make it difficult to provide reliable drinking water (Zvobgo & Do 2020). In several studies, water distribution networks are designed solely on hydraulic criteria to reduce the implementation cost and provide network pressure, an approach that cannot identify the location and amount of leakage in the network (Zarei et al. 2022). In the design of urban water supply networks, a quantity called unaccounted-for water (UFW) is integrated into the calculations. Leakage in urban water supply networks is one of the main components of UFW. Considering the existing limitations in supplying water and the high cost of maintaining water systems and resources, water needs and shortages can be managed properly by using the available water resources in a principled and scientific manner and by preventing water losses (Bohorquez et al. 2020; Negharchi & Shafaghat 2021; Fallahi et al. 2023).
Reducing leakage in water supply systems can be considered one of the main goals of treated water supply and distribution organizations (Gupta & Kulat 2018). Leakage rates in water distribution systems vary widely between countries, regions, and systems, from 3% to 7% in regulated systems in the Netherlands and up to 50% in some underdeveloped countries or underserved systems (Beuken et al. 2008).
The occurrence of leakage in the water network causes uneven pressure distribution and inefficient energy use, resulting in the loss of energy used for water supply (Puust et al. 2010). Different techniques, based on hardware equipment and on software methods, have been developed to identify the location and extent of leakage. Network leakage estimation methods include water balance methods, component analysis, and minimum night flow (MNF) analysis. These rely on some hypothetical parameters, which makes it difficult to estimate leakage in the network (Negharchi & Shafaghat 2020; Amoatey et al. 2022).
Hardware-based methods detect the position of leaking pipes using hardware tools and equipment. Although the accuracy of these methods is acceptable, they have many shortcomings: the range of detection is restricted, leaks in large pipe networks cannot be found quickly, and the operating cost of monitoring large areas is usually high. Consequently, much research has focused on finding more efficient and less costly techniques. Software-based leak detection methods are usually based on an algorithm or some kind of model. Because these methods rely on network pressure and flow data instead of noise and leakage sound data, they work well for pipes of any type and material. Software methods for leakage detection mainly comprise numerical models and artificial intelligence. The first category (methods based on the analysis of pressure waves and hydraulic model analysis methods) requires precise data on the geometric characteristics of the pipes and related parameters such as the roughness coefficient. Usually, these data are not fully available for building numerical models. For this reason, and with the development of technology in recent years, the use of another type of leakage detection approach, i.e. non-numerical modeling, has gradually increased. In these models, observational data are analyzed in combination with tools such as data mining or artificial intelligence algorithms, and areas with possible leaks are then identified based on specific rules or patterns obtained from the data (Mutikanga et al. 2013; Li et al. 2015).
According to the conducted research, several factors are involved in the occurrence of bursts in a water distribution system, among them the depth, diameter, and age of the pipes and, most importantly, the pressure (Zanfei et al. 2022; Du et al. 2024; Jazayeri & Moeini 2024; Leite et al. 2024). For this reason, many studies in the field of water supply network leakage detection and network analysis consider only the pressure factor and ignore the others (Majidi Khalilabad et al. 2018; Fang et al. 2019).
The use of system optimization and simulation tools for calibrating water supply networks based on pressure and demand in nodes has always been of interest to researchers (Nasirian et al. 2013; Do et al. 2016; Negharchi & Shafaghat 2022; Leinæs et al. 2024).
The use of methods based on fuzzy and neuro-fuzzy systems to identify the location and amount of leakage has been widely considered by researchers (Mounce et al. 2009; Islam et al. 2011; Wachla et al. 2015).
The support vector machine (SVM) method is one of the methods that have been used to classify and detect the leakage values and determine the optimal pressure monitoring points in an urban water distribution network (Zhang et al. 2016; Kang et al. 2017; Quiñones-Grueiro et al. 2018; Yan et al. 2020; Boudhaouia & Wira 2021).
In some studies, the use of hybrid models by combining optimization algorithms and different artificial intelligence models for leak detection in water distribution networks has increased the accuracy in predicting the amount and location of leaks in these networks (Alvisi & Franchini 2009; Lijuan et al. 2012; Jin & Zhou 2014; Shekofteh et al. 2020).
One of the most widely used methods for estimating the amount and location of leakage is the multi-layer perceptron (MLP). Compared with other artificial intelligence methods, it has consistently shown good accuracy owing to its multi-layer structure and the possibility of coupling it with metaheuristic algorithms (Jang & Choi 2017; Rojek & Studzinski 2019; Shravani et al. 2019; Fallahi et al. 2021; Pérez-Pérez et al. 2021).
Fan et al. (2021) developed autoencoder-based multi-layer perceptron neural network models with the help of spectral data to overcome the problems of high dimensionality and data imbalance in leakage detection and in detecting irregular pressure patterns in the water network. Tsai et al. (2022) investigated the application of a convolutional autoencoder neural network (CNN). After the model analyzed the spectral data, the location and characteristics of the leakage, along with the confidence level of its occurrence, were sent to the mobile phone of the relevant personnel. The accuracy and ease of use of the method indicated its high efficiency and reliability. Farah et al. (2022) predicted the UFW volume using MLP and radial basis function (RBF) neural networks optimized by the genetic algorithm, and also using the autoregressive integrated moving average (ARIMA) model. Based on the findings, the MLP model gave a more accurate prediction than the ARIMA model.
Methods based on machine learning are widely used in solving problems related to water resources (Shabanlou 2018; Gharib et al. 2020; Zarei et al. 2020; Alizadeh et al. 2021; Esmaeili et al. 2021; Jalilian et al. 2022; Moghadam et al. 2022; Soltani & Azari 2022; Jalili et al. 2023; Dehbalaei et al. 2023).
Literature review indicates the various applications of software methods and artificial intelligence models in identifying the location and amount of leakage in urban distribution networks. In most artificial intelligence models used so far, the network structure design is limited to a single output layer. This limitation causes the model to fail to simultaneously identify the location and amount of leakage at different nodes of the water distribution system.
So, the goal of this research is to build a machine-learning model to detect the amount and location of multiple leakages in water distribution networks to solve some of the limitations of using artificial intelligence techniques, including the number of output layers of machine-learning models. The model developed in this research is able to predict multiple leakages using multiple output layers simultaneously. Also, the implementation process, which is relatively time-consuming in most of these techniques when dealing with large water distribution networks, has been significantly reduced in the GSGMDH model.
METHODS AND MATERIALS
Relations and study water distribution system
The leakage detection approach in this research is based on nodal pressure measurement in water distribution networks and on trial and error until the optimal pressure measurement points are reached, since the choice of these points affects how well the location and exact amount of leakage can be found. In order to set up the model, all the physical and hydraulic characteristics of the network must be available and the calibration must be done correctly and accurately.
Pressure changes in different nodes versus different leakage values in node number (2).
Generalized structure of group method of data handling (GSGMDH)




This equation creates the vector of coefficients for all M sets of three. In the training phase, the coefficients of the neurons in the hidden and output layers are determined based on the level of significance and the confidence interval that the researcher defines at the start of the program. The coefficients and equations of the neurons are optimized in this phase, and a data-screening mechanism eliminates the variables that show low correlation. As a result, large-scale calculations become practically solvable and the system of normal equations is kept in a suitable, solvable condition (Ahmadi et al. 2019; Naderpour et al. 2020; Park et al. 2020). The main advantage of the GMDH compared with conventional neural networks is that it obtains and presents a mathematical model in terms of polynomials for the process under investigation (Madala & Ivakhnenko 1994). Further advantages are the high ability of the GMDH to analyze multi-parameter data sets (Dodangeh et al. 2020), the removal of inputs that have less impact on the calculation of the output value (Park et al. 2020), and the automatic determination of the model structure and its parameters (Rayegani & Onwubolu 2014; Stepashko et al. 2017).
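As a rough illustration of how the coefficient vector of a single classical GMDH neuron can be obtained, the sketch below fits the standard two-input quadratic polynomial by least squares over M training triples (x_i, x_j, y). The function names and the synthetic data are assumptions made for illustration; this is not the authors' MATLAB program, and it omits the significance-based screening and the GSGMDH extensions.

```python
import numpy as np

def quadratic_design_matrix(xi, xj):
    """Build the columns [1, xi, xj, xi*xj, xi^2, xj^2] for M samples."""
    xi = np.asarray(xi, dtype=float)
    xj = np.asarray(xj, dtype=float)
    return np.column_stack([np.ones_like(xi), xi, xj, xi * xj, xi**2, xj**2])

def fit_gmdh_neuron(xi, xj, y):
    """Least-squares estimate of the six coefficients of one quadratic neuron."""
    A = quadratic_design_matrix(xi, xj)
    coeffs, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
    return coeffs

def predict_gmdh_neuron(coeffs, xi, xj):
    """Evaluate the fitted polynomial on new input pairs."""
    return quadratic_design_matrix(xi, xj) @ coeffs

# Synthetic check: data generated from known coefficients (hypothetical values).
rng = np.random.default_rng(0)
xi = rng.uniform(-1.0, 1.0, 50)
xj = rng.uniform(-1.0, 1.0, 50)
true_coeffs = np.array([1.0, 2.0, -1.0, 0.5, 0.3, -0.7])
y = quadratic_design_matrix(xi, xj) @ true_coeffs
coeffs = fit_gmdh_neuron(xi, xj, y)
```

In a full GMDH network this fit would be repeated for every candidate input pair, and only the best-performing neurons would be passed on to the next layer.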
Despite the advantages of classical GMDH, including automatic selection of the most effective input variables, automatic determination of the model structure, and simultaneous consideration of accuracy and simplicity to avoid overfitting, it also has limitations that make the method difficult to use. The most important are: (1) the degree of the polynomial is limited to two, (2) the number of inputs of each neuron is limited to two, and (3) the inputs of each neuron are limited to the neurons of the adjacent layer. To achieve a simple model with high accuracy by overcoming these drawbacks, a new computer program, known as the generalized structure of the group method of data handling (GSGMDH), is coded in the MATLAB software; it eliminates the three main limitations of the classical GMDH. In the GSGMDH, the degree of the polynomials can be two or three. In addition, model inputs are not limited to the adjacent layer, and the number of inputs per neuron can be two or three. According to the description provided, the structure of each virtual variable is one of the following classes:
Self-adaptive extreme learning machine (SAELM)
Using a differential evolution algorithm in a self-adaptive way overcomes existing limitations such as the manual choice of the algorithm's control parameters and of the trial vector generation strategy. On this basis, the self-adaptive extreme learning machine (SAELM) algorithm was presented by Cao et al. (2012) to optimize the network input weights and hidden node biases. Given the training data set, the number of hidden nodes L, and the activation function g(x), the SAELM algorithm can be formulated. First, an initial population of NP vectors is generated, each of which contains the parameters of the hidden nodes.
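To make the underlying extreme learning machine step concrete, the following minimal sketch trains a basic ELM with several outputs: the input weights and hidden biases are drawn at random, and the output weights are solved in one step with the Moore-Penrose pseudoinverse. It deliberately leaves out the self-adaptive differential evolution that SAELM uses to tune the input weights and biases. The three inputs, 30 outputs, 45 hidden neurons, and hyperbolic tangent activation mirror figures quoted elsewhere in the paper; the data, function names, and weight ranges are assumptions.

```python
import numpy as np

def train_elm(X, T, n_hidden=45, seed=0):
    """Basic ELM: random hidden layer, least-squares output weights."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))  # input weights
    b = rng.uniform(-1.0, 1.0, size=n_hidden)                # hidden biases
    H = np.tanh(X @ W + b)                                   # hidden outputs
    beta = np.linalg.pinv(H) @ T                             # output weights
    return W, b, beta

def predict_elm(model, X):
    """Apply a trained ELM to new inputs."""
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta

# Synthetic data (assumption): 3 normalized nodal pressures -> 30 targets.
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(200, 3))
T = X @ rng.normal(size=(3, 30))
model = train_elm(X, T)
pred = predict_elm(model, X)
```

Because the output weights come from a single linear solve, all 30 outputs are fitted at once, which is what allows one network to return a leakage estimate for every node.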
Evaluating the performance of artificial intelligence models
In this research, due to the lack of actual data and information recorded on leakage in networks, leakage values are hypothetically considered in several nodes of the network and the goal is to predict the location and amount of leakage in the network by training the artificial intelligence model. So, these hypothetical leakage values considered in some nodes of the network essentially play the role of observed leakage values, and artificial intelligence must be able to find the location and amount of these leaks in the network. (For detailed explanations on how to generate leakage scenarios, calculate emitter coefficients, set input and output data, and train the AI model, please see Appendix 1 in the Supplementary Material.)
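The statistical indices reported later in the results (R, RMSE, NRMSE, and NSE) can be computed as in the sketch below. The definitions of R, RMSE, and NSE are the standard ones; normalizing the RMSE by the mean of the observed values is an assumption, since the normalization used by the authors is not visible in this text. The sample values are taken from the leakage rates assumed in the two scenarios and serve only as a demonstration.

```python
import numpy as np

def evaluate(observed, predicted):
    """Return R, RMSE, NRMSE and NSE for paired observed/predicted values."""
    o = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    rmse = float(np.sqrt(np.mean((o - p) ** 2)))
    return {
        "R": float(np.corrcoef(o, p)[0, 1]),   # Pearson correlation
        "RMSE": rmse,
        "NRMSE": rmse / float(np.mean(o)),     # assumption: mean-normalized
        "NSE": 1.0 - float(np.sum((o - p) ** 2) / np.sum((o - o.mean()) ** 2)),
    }

# Demonstration values (L/s); the +1 L/s offset is hypothetical.
obs = np.array([12.5, 12.5, 4.0, 46.0])
perfect = evaluate(obs, obs)
shifted = evaluate(obs, obs + 1.0)
```

A constant offset leaves R at 1 while RMSE and NSE degrade, which is why the correlation coefficient alone is not enough to judge a leakage prediction.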




Leakage modeling in hydraulic software
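EPANET represents pressure-dependent outflow at a node with an emitter, whose flow follows q = C·p^γ, where C is the emitter coefficient, p is the nodal pressure, and γ is the pressure exponent. The sketch below shows how an emitter coefficient could be back-calculated for a target leak. It assumes EPANET's default exponent of 0.5, because the exponent used in the study is not stated in this text, and the 25 m pressure is a hypothetical value chosen for the example.

```python
def emitter_flow(coefficient, pressure, exponent=0.5):
    """Leak flow from an emitter: q = C * p**exponent (zero if p <= 0)."""
    if pressure <= 0.0:
        return 0.0
    return coefficient * pressure ** exponent

def emitter_coefficient(leak_flow, pressure, exponent=0.5):
    """Emitter coefficient that yields leak_flow at the given pressure."""
    if pressure <= 0.0:
        raise ValueError("pressure must be positive")
    return leak_flow / pressure ** exponent

# A 12.5 L/s leak at a hypothetical nodal pressure of 25 m.
c = emitter_coefficient(12.5, 25.0)
# The same emitter leaks less when the pressure drops to 16 m.
reduced = emitter_flow(c, 16.0)
```

Training on emitter coefficients rather than fixed leak flows keeps the leak tied to the pressure, so the leak flow changes whenever the network pressure fluctuates.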



Optimal arrangement of pressure sensor points
In the neural network training phase, the pressure of one or more nodes of the water network model should be given as input to the neural network system so that it can predict the location and amount of leakage. It is very important to choose and determine the position of these pressure sensors in the water network, because in the case of the inappropriate selection of a node for pressure measurement, the correct answers might not be obtained. If the selected points for the pressure sensor are chosen correctly, it can be expected that optimal results will be obtained by using the neural network and the minimum number of pressure sensors. In order to be able to determine the state of the hydraulic pressure of the network at different times and points, the pressure measurement operation in the water supply network should be performed simultaneously and at different points.
Critical points, which are among the suitable points for the pressure measurement operation, become waterless in the case of pressure drops. These points are usually high points of an area or points far away from the water source of the system, which have large fluctuations and hydraulic pressure drops. Therefore, all these critical points in the water supply network should be investigated and their information should be recorded. By analyzing this information, it can be determined which of these points are the most critical. In this research, 20% of the Poulakis network nodes, i.e. up to six nodes, are subjected to pressure measurement as suspected leakage nodes. These points are determined based on the results of the analysis of the first pressure measurement and also based on the amount of leakage predicted by the neural networks.
Choosing the location of the pressure sensor in the network
In the initial modeling of the network, a leakage value is assumed in the network in order to determine the suitable points for pressure measurement in the water distribution network. Based on this assumption, the pressure changes in a particular node of the water network can be plotted for different combinations of dual leakages. For example, if the total leakage in the Poulakis water distribution network (Figure 1) is assumed to be 50 L/s, then Figure 4 shows the pressure fluctuations in three specific nodes of the network for different combinations of dual 25 L/s leakages.
(a) Pressure fluctuations of node (1) for different dual leakage combinations, (b) pressure fluctuations of node (2) for different dual leakage combinations, and (c) pressure fluctuations of node (6) for different dual leakage combinations.
RESULTS
Sensitivity analysis of neural network models
Table 1. Details of optimal configuration of SAELM and GSGMDH models
Model | Number of hidden layers | Number of neurons in hidden layer | Activation function | Number of iterations (epochs) |
---|---|---|---|---|
SAELM | 1 | 45 | Hyperbolic tangent | 1,000 |
GSGMDH | 1 | 90 | Triangular base | 1,000 |
The RMSE values obtained from the analysis of the leakage scenario considered using different activation functions and the number of neurons in the GSGMDH neural network.
Evaluation of the efficiency of models
Next, the efficiency of each model in predicting leakage points and amounts is tested. At this step, it is assumed that some nodes of the network have leaks, and the GSGMDH and SAELM models are developed to find the location and amount of these leaks. In the first scenario, it is assumed that four nodes of the water distribution network, i.e. nodes (11), (6), (5), and (12), are leaking, each with a leakage of 12.5 L/s. In this scenario, node number (6) is selected as the first pressure measurement node. The results of the node leakage prediction in this case are given in Table 2. A summary of that table, listing the six nodes with the largest amount of leakage, is presented in Table 3.
Table 2. Analysis results of scenario number (2) after the first pressure measurement in node (6) by GSGMDH model
Node number | Leakage amount (L/s) | Node number | Leakage amount (L/s) |
---|---|---|---|
1 | 0.62 | 16 | 0.01 |
2 | 0.37 | 17 | 2.25 |
3 | 0.35 | 18 | 5.73 |
4 | 0.29 | 19 | 0.00 |
5 | −0.04 | 20 | 0.00 |
6 | 1.73 | 21 | −0.01 |
7 | 0.00 | 22 | 0.00 |
8 | 0.00 | 23 | 2.31 |
9 | 0.00 | 24 | 4.51 |
10 | −0.01 | 25 | 0.00 |
11 | −0.08 | 26 | 0.00 |
12 | 25.41 | 27 | −0.01 |
13 | 0.00 | 28 | −0.01 |
14 | 0.00 | 29 | 2.34 |
15 | 0.00 | 30 | 4.31 |
Table 3. Nodes suspected of leaking according to the first pressure measurement in node number (6)
Node number | Leakage amount (L/s) |
---|---|
12 | 25.41 |
18 | 5.73 |
24 | 4.51 |
30 | 4.31 |
29 | 2.34 |
23 | 2.31 |
According to Table 3, node number (12) is selected for the pressure measurement of the second stage. The results can be seen in Table 4.
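The step from the full prediction to the short list of suspects, and then to the next pressure measurement node, can be sketched as a simple ranking. The leakage values below are those reported for the first pressure measurement at node (6); the function names and the rule "measure next at the highest-ranked node not yet measured" are an illustrative reading of this step, not the authors' code, and later stages in the paper also involve judgment.

```python
def top_suspects(predicted_leakage, count=6):
    """Return the `count` nodes with the largest predicted leakage."""
    ranked = sorted(predicted_leakage.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:count]

def next_sensor(suspects, already_measured):
    """First suspect node that has no pressure measurement yet."""
    for node, _ in suspects:
        if node not in already_measured:
            return node
    return None

# Predicted leakage (L/s) for all 30 nodes after measuring at node (6).
predicted = {
    1: 0.62, 2: 0.37, 3: 0.35, 4: 0.29, 5: -0.04, 6: 1.73, 7: 0.0, 8: 0.0,
    9: 0.0, 10: -0.01, 11: -0.08, 12: 25.41, 13: 0.0, 14: 0.0, 15: 0.0,
    16: 0.01, 17: 2.25, 18: 5.73, 19: 0.0, 20: 0.0, 21: -0.01, 22: 0.0,
    23: 2.31, 24: 4.51, 25: 0.0, 26: 0.0, 27: -0.01, 28: -0.01, 29: 2.34,
    30: 4.31,
}
suspects = top_suspects(predicted)         # six nodes, i.e. 20% of 30
chosen = next_sensor(suspects, {6})
```

With these values the ranking returns nodes (12), (18), (24), (30), (29), and (23), and node (12) is picked for the second measurement.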
Table 4. Leaky nodes according to pressure measurement in node numbers (6) and (12)
Node number | Leakage amount (L/s) |
---|---|
12 | 0.8 |
18 | −0.19 |
24 | 1.25 |
30 | 3.61 |
29 | 0.58 |
23 | −0.03 |
6 | 22.24 |
11 | 19.59 |
Node numbers (12), (18), (29), and (23) are removed from the list because they take negative or close-to-zero values after the pressure measurement in node (12). With this pressure measurement, the predicted leakage in node numbers (6) and (11) jumps to 22.24 and 19.59 L/s, respectively. Unless these nodes are eliminated by further pressure measurements, they should replace the removed nodes in the next steps. Node number (30) is selected as the next pressure measurement node. The results can be seen in Table 5.
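The elimination of nodes with negative or close-to-zero predictions can be written as a threshold filter, sketched below on the values obtained after measuring at nodes (6) and (12). The cut-off of 1.0 L/s is an assumption chosen so that the filter reproduces the removals described above; the paper itself does not state a numerical threshold.

```python
def prune_suspects(predicted_leakage, threshold=1.0):
    """Keep only nodes whose predicted leakage reaches the threshold (L/s)."""
    return {node: q for node, q in predicted_leakage.items() if q >= threshold}

# Predicted leakage (L/s) after pressure measurement in nodes (6) and (12).
after_second_measurement = {
    12: 0.8, 18: -0.19, 24: 1.25, 30: 3.61,
    29: 0.58, 23: -0.03, 6: 22.24, 11: 19.59,
}
kept = prune_suspects(after_second_measurement)
removed = sorted(set(after_second_measurement) - set(kept))
```

Nodes (24), (30), (6), and (11) survive this filter, while nodes (12), (18), (23), and (29) are dropped.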
Table 5. The results of leak detection after pressure measurement in node numbers (6), (12), and (30)
Node number | Leakage amount (L/s) |
---|---|
24 | −0.25 |
30 | 0.00 |
6 | 24.13 |
11 | 5.74 |
12 | 7.6 |
5 | 4.28 |
With the pressure measurement at this stage, node numbers (24) and (30) are excluded from the list. Node numbers (6) and (11) remain in the list with positive leakage values. Also, the predicted leakage in node number (5) increases with each successive stage and reaches 4.28 L/s. According to Table 5, only three pressure measurements are enough to identify the positions of the four leaky nodes. However, the number of pressure sensors should be increased in order to find the exact leakage rate of these nodes. To confirm the existence and amount of leakage in nodes (11) and (5), these two nodes are considered in the next calculations. The results after adding the pressure measurement in node (11) are presented in Table 6. According to this table, none of the four leaky nodes in the previous list is removed, which further strengthens the possibility of leakage in these nodes. To be sure, node (5) is subjected to pressure measurement as the fifth node. The results of this pressure measurement are presented in Table 7.
Table 6. Leak detection results after pressure measurement in nodes (6), (12), (30), and (11)
Node number | Leakage amount (L/s) |
---|---|
6 | 17.73 |
11 | 7.48 |
12 | 11.45 |
5 | 15.46 |
Table 7. Leak detection results after pressure measurement in nodes (6), (12), (30), (11), and (5)
Node number | Leakage amount (L/s) |
---|---|
6 | 12.3 |
11 | 10.13 |
12 | 13.27 |
5 | 9.37 |
Referring to the values in Table 7, the answers converge to the four nodes (5), (6), (11), and (12) with a correlation coefficient of R = 0.99. Further investigation with additional pressure measurement nodes did not improve the answers and in fact reduced the accuracy of the results. So, it can be concluded that only these four nodes are leaky. The results of the last stage of the pressure measurement can be seen in Table 8.
Table 8. Final predicted leakage rate by neural network models along with the percentage of relative leakage error
Leaky node | GSGMDH predicted leakage (L/s) | GSGMDH observed leakage (L/s) | GSGMDH relative error (%) | SAELM predicted leakage (L/s) | SAELM observed leakage (L/s) | SAELM relative error (%) |
---|---|---|---|---|---|---|
5 | 9.37 | 12.5 | 25.04 | 10.35 | 12.5 | 1.72 |
6 | 12.3 | 12.5 | 1.6 | 13.05 | 12.5 | 4.4 |
11 | 10.13 | 12.5 | 18.96 | 9.94 | 12.5 | 20.48 |
12 | 13.27 | 12.5 | 6.16 | 10.55 | 12.5 | 16.25 |
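The relative error column can be checked with a one-line formula, sketched here for the GSGMDH predictions of the first scenario. The definition used, the absolute difference divided by the observed leakage, is the usual one and is assumed rather than quoted from the paper; the function name is hypothetical.

```python
def relative_error_percent(predicted, observed):
    """Absolute relative error of a prediction, in percent."""
    return abs(predicted - observed) / observed * 100.0

# GSGMDH predicted leakage (L/s) per leaky node; observed leakage is 12.5 L/s.
gsgmdh_predicted = {5: 9.37, 6: 12.3, 11: 10.13, 12: 13.27}
observed = 12.5
errors = {
    node: round(relative_error_percent(q, observed), 2)
    for node, q in gsgmdh_predicted.items()
}
```

Under this definition the four GSGMDH errors come out as 25.04, 1.6, 18.96, and 6.16%.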
Table 9. The accuracy of results obtained from neural network models based on statistical indices
Model | Train R | Train RMSE | Train NRMSE | Train NSE | Test R | Test RMSE | Test NRMSE | Test NSE |
---|---|---|---|---|---|---|---|---|
GSGMDH | 0.86 | 1.03 | 0.12 | 0.73 | 0.99 | 0.28 | 0.03 | 0.98 |
SAELM | 0.88 | 0.95 | 0.11 | 0.77 | 0.99 | 0.28 | 0.03 | 0.98 |
Test data graph including observed and predicted emitter coefficients by SAELM neural network.
Test data graph including observed and predicted emitter coefficients by GSGMDH neural network.
In the second scenario, it was assumed that the two leaking nodes were relatively far apart and that their leakage rates were different. In this scenario, nodes (25) and (12) were assumed to be the two leaking nodes and their leakage rates were considered to be 46 and 4 L/s, respectively. In this situation, due to the distance between the two nodes, it would be difficult to find both leaks (their amount and location) that occurred simultaneously in the system.
In the second scenario, as in the previous one, the pressure measurement points in the system were added to the calculations gradually. Node number (6) was selected as the first pressure measurement node. Then, after repeated executions, nodes (20) and (19) were added as pressure measurement points. Finally, based on the pressure measurements at these three nodes, the efficiency of the GSGMDH and SAELM models in finding the amount and location of the leaks in this scenario was compared. The leak values predicted for the second scenario by the two models, along with the relative error percentages, are presented in Table 10. Table 10 shows that the SAELM model was unable to identify either the location or the amount of leakage of the node whose leak was very small. Increasing the number of pressure measurements did not improve the answers obtained by SAELM. For this reason, the SAELM model was set aside and the GSGMDH model was introduced as the superior model.
Table 10. Leakage value predicted by GSGMDH and SAELM models along with relative error percentage (scenario 2)
Leaky node | GSGMDH predicted leakage (L/s) | GSGMDH observed leakage (L/s) | GSGMDH relative error (%) | SAELM predicted leakage (L/s) | SAELM observed leakage (L/s) | SAELM relative error (%) |
---|---|---|---|---|---|---|
12 | 3.36 | 4 | 16 | – | 4 | – |
25 | 44.83 | 46 | 2.5 | 45.06 | 46 | 2.04 |
The results showed that in the second scenario, the GSGMDH model is able to identify and predict the exact location of both leaking nodes and their leakage amount with high accuracy. However, the SAELM model was only able to identify the location and leakage amount of one of the nodes that had a much higher leakage amount than the other node. Therefore, in the SAELM model, the node that was assumed to have less leakage was not identified. The results showed that the GSGMDH performs better than the SAELM network in finding leaks with small flow rates.
From a statistical point of view, the greater the number of points (samples) examined and pressure gauged, the more reliable will be the results obtained from the analysis of the hydraulic condition of the water network. On the other hand, considering the existing facilities and equipment, expertise, and manpower, as well as the problems in installing, controlling, and accurately reading these sets of equipment at regular time-intervals, especially in the semi-automatic and manual type, this will cause a limitation in the number of points. Also, increasing the number of pressure measurement points in water distribution networks, especially in large and complex networks, will significantly increase costs. So, the goal of this research was to develop a model that can estimate the amount and location of leaks in the network with appropriate accuracy using the minimum number of pressure measurement points.
CONCLUSION
Conventional methods of identifying the location and amount of leakage based on complex models as well as existing tools and hardware that are used today are relatively complex, time-consuming, and very expensive. In most cases, the use of these methods requires special expertise, and they have a limited range of application and accuracy in large-scale monitored areas. Using artificial intelligence to identify the location and amount of point leakages, as well as to estimate the total leakage flow rate in urban water distribution networks, is of great importance among the existing methods. In this research, two artificial neural network models, i.e. SAELM and GSGMDH including multi-layer outputs, were developed in order to find the location and extent of multiple leaks. Unlike conventional artificial intelligence methods, which include only one output layer, the developed models have 30 output layers related to the possible values of flow rate and leakage in 30 different nodes of the water distribution network. These models are able to predict 30 output layers related to possible leakage values in 30 network nodes based on three input layers including different pressure values in three pressure measurement nodes in the network. In order to preserve the nature of water leakage in the pipe network whose values depend on the pressure and change with the fluctuations of the network pressure, this phenomenon was simulated by the orifice exponential equation. The emitter coefficients and nodal pressure were used to train artificial neural networks, and the values predicted by these networks were also from emitter coefficients. With the help of a string of codes, the neural network algorithm in the MATLAB environment was linked to the EPANET 2.2 software so that it was possible to continue the hydraulic calculations in order to predict the discharge values of each node based on the emitter coefficients. 
To show the effectiveness of the artificial neural networks used in the present research, two leakage scenarios with specific conditions and various locations of leakage in the water supply system were developed and analyzed. In the first scenario, the leakage in four adjacent nodes was assumed to be 12.5 L/s. In the second scenario, it was assumed that the two leaking nodes were relatively far apart and that their leakage rates were different. The results showed that with this method, it is possible to predict the amount and location of multiple leaks in the network in the shortest time by just monitoring the pressure in three nodes of the water distribution network. The important result that can be obtained from the first scenario is that, although in terms of prediction accuracy, the GSGMDH neural network obtains relatively similar results compared with the SAELM in most cases, in terms of execution time, the GSGMDH network has a higher operating speed. In other words, the duration of the leakage prediction, which is considered an important and influential factor in the volume of water wastage, is far less than in the SAELM model. The results showed that the GSGMDH neural network, in which the number of iterations and hidden layer neurons is much higher than in SAELM for each scenario, was implemented 50 times faster. This is an important advantage of using the GSGMDH model to detect leaks in complex urban networks with more nodes and pipes. The results showed that in the second scenario, the GSGMDH model was able to identify and predict the exact location of both leaking nodes far apart with various leakage amounts with high accuracy. However, the SAELM model was only able to identify the location and leakage amount of one of the nodes that had a much higher leakage amount than the other node. In the SAELM model, the node that was assumed to have less leakage was not identified. 
So, the GSGMDH performs better than the SAELM network in finding leaks with small flow rates. The GSGMDH as the superior model can be used in large and complex urban water distribution systems. Municipalities or water utility companies can use this method to quickly detect leaks in the system by determining and placing a minimum number of pressure measurement points in urban water distribution networks. This technology could be scaled and implemented at different levels of urban infrastructure. Although the focus is on operational efficiency, the environmental benefits of the research are significant. Preventing water leakage not only reduces water waste but also minimizes the energy and costs associated with treating and pumping water. This could be a key point for stakeholders looking to adopt more sustainable practices. One limitation of this method is that in complex real networks, actual leakage information must be recorded at some points in the network to be used for the model validation. This information is not routinely available in many networks. The method developed in this research can be used in other water distribution networks due to its high speed and accuracy in detecting the amount and location of leaks in urban water distribution networks based on the minimum number of pressure measurement points. Due to the increasing number of variables in large and complex water distribution networks with a large number of nodes and pipes, the developed code can be used in multi-core computer systems with a parallel structure to reduce the problem-solving time.
DATA AVAILABILITY STATEMENT
Data cannot be made publicly available; readers should contact the corresponding author for details.
CONFLICT OF INTEREST
The authors declare there is no conflict.