## Abstract

Reasons for occasional, random pipe bursts in water distribution networks (WDNs) may come from numerous factors (e.g. pH value of the soil, pipeline material). Still, the isolation of the damaged section is inevitable. While the corresponding area is segregated by closing the isolation valves, there is a shortfall in drinking water service. This paper analyses the vulnerability of segments of WDNs from the viewpoint of the consumers that is the product of the failure rate and the relative demand loss. Real pipe failure database, pipe material and pipe age data are used to increase the accuracy of the failure rate estimation for 27 real-life WDNs from Hungary. The vulnerability analysis revealed the highly exposed nature of the local vulnerabilities; the distribution of local vulnerability values follows a power-law distribution. This phenomenon is also found by investigating the artificial WDNs from the literature using N rule in terms of isolation valve layout, namely the ky networks, with similar results.

## HIGHLIGHTS

Definition of vulnerability is further generalized with actual pipe burst statistics.

Vulnerability analysis is performed in artificial and real-life WDNs.

The local vulnerability of all WDNs follow power-law trend.

### Graphical Abstract

## INTRODUCTION

Water distribution networks (WDNs) provide essential drinking water to residential areas and industry. Due to the continuous growth of the urban population, WDNs are constantly growing by laying new pipelines or connecting operating networks. While nowadays such heterogeneous systems as WDNs are highly supervised by numerous sensors (e.g. pressure, flow meters, quality control), and by utilising detailed hydraulic models, the risk and the consequence of a random pipe burst is still challenging to predict. Engineers and researchers are exploiting the advantages of vast computational resources provided by CPU or GPU clusters by applying sophisticated methods from the field of deep learning. Due to the fact that these highly sophisticated deep learning techniques are complex and unique methods compared to the ‘classical’ hydraulic approaches, and their application on the WDNs just appeared widely in the last decade, the lack of experience and practical benefits of such an algorithm are yet difficult to utilise, and direct methods are still dominating.

During the last decades, the complex network theory brought a decisive discovery regarding real-life networks (e.g. social networks, protein connections); that is, the traditional random graph model cannot accurately model the exposed behaviour of real networks. In a previous study Weber *et al.* (2020), the authors published a topological and vulnerability analysis of the segment graphs of real-life WDNs. It revealed that segment graphs show the nature of random, connected planar graphs, i.e. strictly from a topological viewpoint, they can be considered robust, as the removal of several nodes (isolation of segments) cannot significantly impact the overall integrity of the graph. However, the vulnerability analysis from the demand perspective, where the hydraulics is also considered, highlighted the exposed nature of real-life WDNs, i.e. some segments have a prominent role in terms of the reliable functionality of the system. This paper follows the idea of vulnerability and further extends it.

Throughout that study, the failure rate was considered equal for every meter of pipeline, while the present paper introduces how to cope with the properties of pipes and with actual pipe burst statistics. Dealing with additional input data allows the method to perform a more accurate analysis of the vulnerability. Besides the real-life WDNs from Hungary, the artificial ky networks from the literature are also investigated in similar aspects.

### The analysed networks

All of the analysed real-life networks are located in the region of Western Hungary. The hydraulic states for every network were determined using the yearly average demands from the billing system. Table 1 presents the ranges of the principal topological and hydraulic parameters of the networks.

Property . | Min . | Max . |
---|---|---|

Number of nodes | 301 | 7188 |

Number of segments | 12 | 717 |

Overall pipe length (km) | 4.59 | 112 |

Total nominal consumption (m^{3}/h) | 0.4 | 540 |

Property . | Min . | Max . |
---|---|---|

Number of nodes | 301 | 7188 |

Number of segments | 12 | 717 |

Overall pipe length (km) | 4.59 | 112 |

Total nominal consumption (m^{3}/h) | 0.4 | 540 |

According to Table 1, this analysis embraced networks with highly different sizes. In order to present the structural differences of these networks, four networks were randomly selected from the analysed 27 real-life water distribution networks. Figure 1 shows these networks, where the different colours indicate the different segments. These segments are identified according to Huzsvár *et al.* 2019.

As Figure 1 depicts, the group of the analysed networks are beholding rural (Network 24), urban (Network 10), and compound networks (Networks 7 and 3), which cannot be classified to one of the categories. Table 2 presents the hydraulically and topologically specific parameters of these networks.

ID . | Overall pipe length (km) . | Total nominal consumption (m^{3}/h)
. | Number of nodes (1) . | Number of segments (1) . |
---|---|---|---|---|

24 | 5.09 | 8.89 | 379 | 11 |

10 | 17.62 | 6.15 | 1294 | 65 |

7 | 31.92 | 36.07 | 1992 | 179 |

3 | 36.24 | 50.47 | 2941 | 222 |

ID . | Overall pipe length (km) . | Total nominal consumption (m^{3}/h)
. | Number of nodes (1) . | Number of segments (1) . |
---|---|---|---|---|

24 | 5.09 | 8.89 | 379 | 11 |

10 | 17.62 | 6.15 | 1294 | 65 |

7 | 31.92 | 36.07 | 1992 | 179 |

3 | 36.24 | 50.47 | 2941 | 222 |

### Hydraulic simulation

The EPANET solver has dominated hydraulic modelling (Rossman 2000) for decades, the scientific field, as well as the industrial, e.g. WaterGEMS (Wu *et al.* 2007), and MIKE (Ekklesia *et al.* 2015) programs, are also applying the open-source code of EPANET. However, this study utilised an in-house solver, called STACI, that was already used to present several results to the literature (Huzsvár *et al.* 2019; Weber *et al.* 2020). The hydraulic results were validated, ensuring that the differences in the outputs (e.g. pressure, volume flow rate) are below 1% compared to the EPANET solver. Besides the traditional hydraulic solver, two essential tools were required, on the one hand, demand-driven analysis, on the other hand hydraulic calculations, while some parts of the network are isolated. Although the standard EPANET solver is not capable of the latter one, commercially available programs are, see Kanakoudis & Gonelas (2015) and Abdel-Mottaleb & Walski (2020).

*Q*stands for the volume flow rate, ‘in’ denotes the pipelines pointing inwards, while ‘out’ refers to the outward-directed pipes, and the

*c*indicates the nodal consumption that may depend on the pressure. A properly operating WDN should provide enough pressure to fulfil the drinking water demands; however, in some exceptional cases, such as the isolating parts of the network, the pressure may drop drastically, and the actual consumption will be below the nominal demand. Although, even if the network pressure is low, some portion of the demands is volumetric and can be adequately supplied during outages (Schuück & Lansey 2018). Moreover, some parts of the demands will always depend on the pressure, e.g. leakages (Giustolisi

*et al.*2008). Based on the literature, several different approaches are available for the accurate modelling of pressure-dependent demands; during this study, the following function is used, following Abdy Sayyed

*et al.*(2015):where

*d*denotes the nominal demand (when the consumer is fully served),

*p*and

_{min}*p*represent the lower and upper boundaries for pressure (in m.w.c.), respectively, while

_{des}*m*is the exponent. During this study, the following parameters were used:

*p*= 10 m,

_{min}*p*= 25 m and

_{des}*m*= 2.5.

*p*denotes the pressure at the starting node of the edge, while

_{s}*p*at the ending node and

_{e}*f*is usually a nonlinear function of the volume flow rate, depending on the exact type of the edge. The most common element is a single pipeline, where function

*f(Q)*contains the geodetic height difference and the friction losses that the Darcy–Weisbach or the Hazen–Williams can model (Rossman 2000).

*N*+

*M*number of nonlinear, algebraic equations (with

*N*being the number of unknown nodal pressures and

*M*is the number of unknown edge flow rates), which can be rearranged to:where

*x**=*[

*]*

**p**,**Q**^{T}

*=*[

*p*]

_{1},p_{2},…,p_{N},Q_{1},Q_{2},…,Q_{M}^{T}contains the unknown hydraulic variables, i.e. the volume flow rates and nodal pressures. This set of equations is solved using Newton's technique that is a general algebraic equation solver:that is a linear set of equations where

**is the Jacobian matrix:**

*J**J*

_{i,j}*=*

*∂F*. Note that

_{i}/∂x_{j}**is a sparse matrix, meaning that there are only a few non-zero elements in each row. For solving sparse, linear equations efficiently in the C ++ environment, the Eigen library (Gael 2010) was implemented. This solution method differs from the EPANET, where a more efficient, unique iteration is implemented. As a result of the generalised algebraic equation solver, the presented approach is universal, and different extensions (e.g. pressure-dependent demands) or models can be added straightforwardly without the need to modify the solver.**

*J*### Vulnerability analysis

In the scientific literature, several similar definitions exist for the quantification of vulnerability (Tornyeviadzi *et al.* 2021), reliability (Zhuang *et al.* 2013; Kanakoudis 2004a) or resilience (Hosseini *et al.* 2016) of utility networks. Based on the experience from the literature, this study (similarly to the previous study (Weber *et al.* 2020) focuses on vulnerability analysis from the viewpoint of its consumers. The goal is to define a proper metric for quantifying the effect of a random pipe burst in the network using real-life pipe failure statistics.

After a pipeline failure, the corresponding utility company depressurises the damaged segments by closing isolation valves; meanwhile, some parts of the demands cannot be served with drinking water. This causes discomfort to the inhabitants and possible income loss to the industry. Furthermore, we assumed that every isolation valve is appropriately operating, thus for the isolation of a single pipe break, only the smallest island (segment) must be depressurised. This hypothesis may not be valid in general, and it can be an individual purpose of a different study (see Kanakoudis 2004b; Liu *et al.* 2017; Abdel-Mottaleb & Walski 2020). Although some papers (Sweetapple *et al.* 2019) suggest that pipe failures tend to accumulate, only a single segment loss is considered originating from a single pipe burst during this analysis. The analysis is performed using a single hydraulic state, which is typical with an average operation state with mean demands; however, the method can be carried out with any particular circumstances, e.g. morning peak consumption.

Although a purely topological metric, e.g. Maiolo *et al.* (2018) allows a comprehensive parametric study due to the computational efficiency, at first, a hydraulic approach is considered in this paper. The proposed vulnerability is also dimensionless, ensuring the comparison of networks of different sizes. Since, in the case of a pipe burst, the whole segment is isolated, the vulnerability value, according to this definition, will be uniform for all elements in a segment. First, the failure rate (*α*) needs to be determined for each segment, representing the probability of having a pipe failure in that segment. This quantity must be normalised, i.e. *∑α _{i}*

*=*

*1*. Numerous different issues can increase the chance of pipe bursts, e.g. pipe material, pipe age, pH value of the surrounding soil, increased traffic on the surface, or construction works around the pipelines (Bogárdi & Fülöp 2011). During this study, three of them are investigated.

**Relative pipeline length**The first assumption is based on the relative pipeline length found in a segment, i.e. it is assumed that each meter of pipelines has an equal chance to break (Creaco*et al.*2010).**Pipeline material**The corresponding utility company for the analysed real-life WDNs archived every pipe burst in the last 28 years, while pipe material data is also available from the database. Table 3 shows the overall length and number of pipe bursts for each material. The last column contains a relative failure representing how much pipe failure occurred on average each year along 100 km of pipelines. The failure rate for each pipeline can be determined as the product of its length and the relative failure based on the material of the pipeline. Finally, the*α*for each segment can be calculated as the sum of the pipeline failure rates._{i}**Pipeline age**Similarly to the material data, the same calculations were performed using real-life pipe failure statistics, but in this case, using the pipe age values. Figure 2 presents the original pipe failure probability and the processed (or averaged) data.

Material . | Sum length (km) . | Failures (pc.) . | Failure rate (pc./100 km/year) . |
---|---|---|---|

Asbestos cement | 249.43 | 1,068 | 15.29 |

Cast iron | 17.17 | 168 | 34.95 |

Spheroidal graphite cast iron | 25.50 | 3 | 0.42 |

High-density polyethylene | 377.83 | 735 | 6.95 |

Polyvinyl chloride KM | 42.70 | 1 | 0.08 |

Polyvinyl chloride | 4.92 | 24 | 17.41 |

Stainless steel | 0.43 | 1 | 8.32 |

Steel | 2.51 | 27 | 38.37 |

Summary | 720.49 | 2,027 | 10.05 |

Material . | Sum length (km) . | Failures (pc.) . | Failure rate (pc./100 km/year) . |
---|---|---|---|

Asbestos cement | 249.43 | 1,068 | 15.29 |

Cast iron | 17.17 | 168 | 34.95 |

Spheroidal graphite cast iron | 25.50 | 3 | 0.42 |

High-density polyethylene | 377.83 | 735 | 6.95 |

Polyvinyl chloride KM | 42.70 | 1 | 0.08 |

Polyvinyl chloride | 4.92 | 24 | 17.41 |

Stainless steel | 0.43 | 1 | 8.32 |

Steel | 2.51 | 27 | 38.37 |

Summary | 720.49 | 2,027 | 10.05 |

*Note* that the ‘KM’ stands for PVC pipelines with higher quality.

The first approach may be an inaccurate estimation, but it does not require any additional data; thus, it can be easily evaluated and used as a first guess for the failure rate. For artificial networks, where additional data is not feasible, this approach can be utilised. Even the second idea may contain some inaccuracies, e.g. the failure rate of polyvinyl chloride KM material is surprisingly low since the pipe age is neglected and these pipes were recently laid. The third approach uses the pipe age data. Still, if additional data is available, e.g. for pH value of the surrounding soil, the accuracy of such analysis can be further increased.

*i*th segment is calculated using the 1D hydraulic solver, STACI, i.e.where

*d*indicates the nominal demand and

_{i}*c*represents the actual amount of served water according to the model. For calculating this, one must be able to cope with:

_{i}demands that cannot be served inside the isolated segment,

there may be some parts of the network that are segregated unintentionally, i.e. outage segments (Walski

*et al.*2006),furthermore even in the case where a segment is topologically connecting to the main network, the hydraulic conditions may change, and the pressure of a segment drop drastically, so it is not capable of providing enough water to fulfil the nominal demand, i.e. the hydraulic solver must be able to handle pressure-dependent consumptions.

According to this dimensionless metric, the local vulnerability of a segment is high if it contains a significant amount of pipelines (presumably with poor quality material), and the relative loss in the drinking water service is also high in the case of its isolation. Let us consider two extreme situations: first, a segment closes to a major source where most of the water inflows, but there are only a few meters of pipelines, the local vulnerability is low, as the probability of having a random pipe burst is small. Second, there is a segment at the edges of a large WDN with hundreds of meters of pipes; although its isolation affects only a few consumers, thus the vulnerability is still negligible. A hydraulic simulation is required for the complete vulnerability analysis with every segment closed individually to determine the relative demand losses, i.e. *β*_{i} values. This means that for a large WDN, thousands of calculations are necessary to perform, making it computationally inefficient.

#### Real-life WDNs

During the examination of the spatial maps of local vulnerabilities, in each WDN, there were only a few critical segments, while the rest of the network was far less exposed. Therefore, a detailed analysis was performed on the distributions of local vulnerabilities by creating the probability density function (PDF) according to Weber *et al.* (2020), or also called frequency distribution. This representation shows similarities to the histogram, but there are also some discrepancies. The main advantage of this graphical illustration is (compared to the traditional statistical metrics, e.g. average, standard deviation or box plot) that it depicts the small details by indicating the dispersion between the minimum and maximum values of the data structure. Moreover, it can visualise the number and the location of stagnation points (where the data points tend to accumulate). It also unveils the elementary tendencies of the data and highlights the connections of the values. The steps of creating such a graph are the following:

The whole range between the minimum and the maximum values were split into

*r*number of bins, where*r**=**N*if_{seg}*N*_{seg}*<**100*and*r**=**log*_{2}(N_{seg})*+*1 otherwise, where*N*is the number of segments. It means that each network is represented using_{seg}*r*number of points in the diagram between four and ten for these WDNs. Note that there are similar rules of thumb in the literature suggesting similar*r*values.The boundaries of the bins were defined by sorting the same amount of data into each bin, i.e. the frequencies were equally distributed.

While determining the Y coordinate of each point, it is considered that the product of the horizontal width of the interval and the height of the point is equal to the relative frequency; i.e. these points share the same size, which also ensures that the area below the density function equals to 1.

Figure 3 depicts the results for all 27 real-life WDNs with a logarithmic scale on both axes if the failure rate is based on the material (left-side) or the pipe age (right-side) of the pipelines. All the points are located along a linear line that implies power-law probability distribution in both figures. Moreover, the range between the minimum and the maximum values embrace more than seven orders of magnitude. It implies that most of the segments do not have significant importance in terms of vulnerability; however, in every WDN, there are some segments with extreme outliers. These segments have a high probability of having a random pipe burst, while their loss could cause a severe outage in the service. During the initial study in Weber *et al.* (2020) the failure rates were approximated using only the relative pipeline lengths, and similar conclusions were drawn. The distribution of household incomes has a similar nature: there are very few billionaires, but the bulk of the population holds modest financial resources, or the magnitude of earthquakes follows a similar distribution.

#### Artificial WDNs

Although the results with the failure rate based on the material were congruent to the previous study in Weber *et al.* (2020), one may question whether this is a unique property of these local WDNs. Additional real-life WDNs were not available; however, there are several artificial networks from the literature, namely the ky networks (Jolly *et al.* 2014). Since these models do not contain any isolation valves originally, valves must have been placed at first. In practice, this is a difficult challenge both in terms of hydraulics and finance. The literature typically suggests either the N rule or the N-1 rule (Walski *et al.* 2006). According to the N rule, in every intersection where N (at least three) number of pipelines meet, N number of isolation valves must be placed along each pipe. In the case of the N-1 rule, one of the valves can be left out. Even though the N rule suggests too dense a layout from a practical perspective (Walski 2011), it is unique in contrast to the N-1 rule (where there is no rule which valve can be left out), isolation valves were placed according to the N rule to the ky networks for this study. Moreover, pipe material data is also unavailable for these WDNs, and the failure rate is approximated based on the pipeline length. It means each meter of pipelines has the same probability for a random pipe burst.

Figure 4 shows the results for the ky networks. Although these networks are completely independent of the real-life WDNs shown previously, the trend of the distribution of the local vulnerabilities is the same. Even the points are located along the same linear, see Figure 5. This figure even strengthens the power-law distribution of the local vulnerabilities. One may notice that the ky networks have far fewer points in the moderately vulnerable and non in the highly vulnerable region. The explanation behind this is two-fold the N rule placed almost twice as many isolation valves as the real-life WDNs have, and the ky networks contain multiple loops. A significant part of the real-life WDNs is close to a tree-like network with negligible topological redundancy.

## CONCLUSIONS

This paper presented the vulnerability analysis of 27 real-life and 14 artificial WDNs. The local vulnerability of a segment is, by definition, the product of the failure rate and the relative consumption loss in the case of the loss of the segment. For real-life WDNs, pipe failure statistics were included in the definition of the vulnerability, modelling the effect of the material of the pipelines. To determine the hydraulics of the networks, an in-house software called STACI was utilised to handle pressure-dependent demands and isolated segments.

The distribution of local vulnerability values is highly unfavourable as it follows a power-law distribution, i.e. some segments in each WDN are significantly exposed to a random pipe burst. It was presented in both cases: real-life WDNs with real pipe burst statistics and material data and the artificial ky networks from the literature with uniform failure rates. This indicates that the power-law nature of the distribution of local vulnerabilities seems to be general for every real-life WDN.

As the scrutiny suggests, there are segments in every WDN where increased efforts are needed to grant the uninterrupted service of the network. The presented vulnerability analysis technique can identify the location of these sections. For segments with high failure rate or with significant demand loss during its malfunction, the following modifications may decrease the local vulnerability:

Through the decrement of the failure rate, e.g. by improving the pipe material or more frequent pipe integrity analysis.

Decreasing the demand loss, e.g. by new segmentation of the area with redistribution of the already built-in, or new isolation valve placement.

## ACKNOWLEDGEMENTS

The research reported in this paper and carried out at BME was supported by the NRDI Fund (TKP2020 IES, Grant No. BME-IE-WAT) based on the charter of bolster issued by the NRDI Office under the auspices of the Ministry for Innovation and Technology. This project was also supported by the OTKA Grant K-135436 of Csaba Hős. The support of the New National Excellence Program of the Ministry for Innovation and Technology is also acknowledged by Richárd Wéber (ÚNKP-20-3-II-BME-271).

## DATA AVAILABILITY STATEMENT

Data cannot be made publicly available; readers should contact the corresponding author for details.