## Abstract

This paper presents a novel algorithm driven by the minimization of the transport function for the partitioning of water distribution networks (WDNs) into district metered areas (DMAs). The algorithm is based on the linear programming (LP) embedded inside a multi-objective genetic algorithm, which enables engineering criteria, such as the minimization of the boundary pipes and the maximization of the uniformity of DMAs, to be considered in the partitioning. Furthermore, the application of the algorithm on the dual network topology based on segments and valves guarantees that configurations of DMAs that respect the real positions of isolation valves for WDN partitioning are obtained. After being described on a small WDN, it is successfully validated on a large size WDN, proving better performance than other algorithms in the scientific literature for the generation of engineeringly appealing DMA configurations, with almost identical hydraulic performance to the unpartitioned WDN.

## HIGHLIGHTS

A clustering based on transport function minimization is proposed.

The dual topology based on segments and valves enables the real valve positions to be considered in WDN partitioning.

The transport function minimization by LP is embedded in the multi-objective optimization to incorporate engineering judgment criteria.

Engineeringly appealing DMA configurations are obtained in the applications to a real WDN.

### Graphical Abstract

## INTRODUCTION

The partitioning of water distribution networks (WDNs) lies in the subdivision of large-scale networks into smaller and more easily manageable systems called district metered areas (DMAs; UK Water Industry Research Limited 1999). This brings various managerial benefits (Cabral *et al.* 2019; Bui *et al.* 2020; Diao 2021), such as the effective monitoring and control of consumption and leakage in the network. In fact, by continuously monitoring inflows and outflows at the boundaries, the total consumption (user demands + leakage) of the generic DMA can be inferred, and a prompt identification of anomalous consumption and pipe bursts can be carried out. Furthermore, a bespoke partitioning can make pressure management for leakage reduction more efficient (e.g., Creaco & Haider 2019). By reducing the number of water paths in the WDN, the partitioning brings benefits in terms of effective placement of sensors for the identification of contamination events and WDN protection (e.g., Ciaponi *et al.* 2019).

Even if WDN partitioning is nowadays a very widespread practice, it has proven to feature some disadvantages, which have caused some water utilities to move away from it. These disadvantages mainly concern the areas of the WDN close to the inter-DMA boundaries, where loss of redundancy/reliability and water quality deterioration can be observed if isolation valves are closed (Bui *et al.* 2020). These unpleasant effects can be avoided if virtual partitioning is carried out by installing flow meters at the boundaries, instead of closing isolation valves, though this results in higher partitioning costs. However, to sum things up, the pros of partitioning seem to outweigh the cons (Saldarriaga *et al.* 2019).

While some exceptions were spotted in the scientific literature (e.g., Alvisi 2015; Creaco & Haider 2019), WDN partitioning is generally carried out in two separate phases, i.e., *clustering* and *dividing*. The first phase aims to group together WDN nodes to form DMAs of uniform size, trying to keep a low number of inter-DMA boundary pipes. In the second phase, the DMAs are separated from each other either physically or virtually. At a certain boundary pipe between two adjacent DMAs, a physical separation is performed by closing the isolation valve in that pipe. A virtual separation lies in the installation of a flow meter, which enables monitoring of the flow exchange at that location. In both cases the dividing carries a cost, associated with the loss of WDN reliability and the deterioration of water quality to some extent or with the purchase/installation of a device, respectively. The dividing is generally carried out by applying optimization techniques to decide in which boundary pipes to install a flow meter or to close an isolation valve, in the search for trade-off solutions between WDN reliability and purchase/installation costs (e.g., Creaco *et al.* 2017; Laucelli *et al.* 2017). The clustering has been tackled mainly by using the graph theory. The scientific literature includes works based on the concept of modularity (e.g., Diao *et al.* 2013; Campbell *et al.* 2016; Zhang *et al.* 2017), which is an index representing the effectiveness of the partitioning (Girvan & Newman 2002), and works based on the spectral analysis (e.g., Nardo *et al.* 2017a, 2017b) or other graph-theory techniques (e.g., Perelman & Ostfeld 2011; Alvisi & Franchini 2014; Alvisi 2015; Hajebi *et al.* 2016; Herrera *et al.* 2016).

In most recent years, WDN partitioning has been one of the most researched topics in the literature of WDNs and has also been the ground of a battle (Saldarriaga *et al.* 2019), in which research teams from all over the world competed to find well-performing DMA configurations in the trade-off between various objectives, including pressure uniformity and water quality. Lately, the focus lies in the set-up of effective methodologies for large real-world WDNs (Pesantez *et al.* 2020; Vasilic *et al.* 2020), while considering numerous objective functions to simultaneously optimize (e.g., Bianchotti *et al.* 2021) and pursuing decentralized (e.g., Vegas Niño *et al.* 2021) and self-adapting configurations (e.g., Giudicianni *et al.* 2020), able to react to urban growth and to critical and abnormal operating conditions, respectively.

Among the various clustering methodologies in the scientific literature, some were developed in the context of the multi-objective approach, to incorporate engineering judgment-based considerations, such as the uniformity of the DMAs in terms of size or total demand and the total number of boundary pipes to be maximized and minimized, respectively (Giustolisi & Ridolfi 2014; Laucelli *et al.* 2017; Creaco & Haider 2019; Creaco *et al.* 2019; Liu & Lansey 2020). Since many of these algorithms were developed based on the concept of modularity, the question arises if clustering algorithms based on other concepts can be similarly efficient and effective, when used in the context of multi-objective optimization. To respond to this question, an algorithm based on the transport function is presented in this work. To the best of the authors' knowledge, no attempt has ever been made to make use of the transport function in the context of multi-objective WDN partitioning, with specific reference to the clustering phase. Considering a certain WDN and a prefixed demand scenario (daily average or peak), the transport function is calculated by summing the product of pipe length and absolute flow rate over the entire WDN (e.g., Stephenson 1984; Suribabu & Neelakantan 2005; Creaco & Franchini 2012; Ciaponi *et al.* 2017). This function has been mainly used in the context of pipe design (Ciaponi *et al.* 2017), in that its minimization enables finding the distribution of pipe flow rates that serve WDN users while covering the shortest path length. This results in small water detention times and better preservation of the water's organoleptic and thermic characteristics. Besides its meaningfulness in the context of WDN design, the transport function can be minimized using a fast algorithm based on the linear programming (LP, Stephenson 1984), which enables subdivision of a multi-source WDN into a system of independent single source branched WDNs. The clustering effect resulting from the minimization of the transport function can help subdivide the WDN into a desired number of DMAs with some initial features, which can be used as starting point for further refinement. Therefore, its application to WDN partitioning, and specifically to the clustering phase, seems a good prospect to be explored in the present work.

In the following sections, the methodology is first presented, followed by its application to a real case study. The paper ends with the discussion and conclusion.

## METHODOLOGY

The key elements of the methodology are described in Subsections 2.1 and 2.2, dedicated to the clustering and dividing phases, respectively. In the clustering phase, starting from a generic source node, an algorithm based on the minimization of the transport function is used to identify the minimum transport spanning tree fed by the source, which determines the extent of DMA surrounding the source itself. In a generic WDN, the simultaneous application of the algorithm to *n*_{DMA} sources enables identifying *n*_{DMA} DMAs, each of which is connected to only one source. If the desired number of DMAs in the WDN is larger than the number of sources, as is often the case with WDN management, additional fictitious source nodes must be considered in the WDN layout. The embedding of the minimum transport algorithm in a multi-objective optimization enables obtaining a configuration of DMAs complying with typically desired technical requirements, i.e., the high uniformity of demands in the DMAs and the low number of inter-DMA boundary pipes. This is accomplished as shown in the flowchart of Figure 1, in which the decisional variables of the multi-objective optimization contribute to the construction of the input for the clustering, performed by means of the LP minimization of the transport function. This results in a layout of DMAs for which the performance in terms of the criteria mentioned above is evaluated. This performance is used for the construction of the objective functions of the multi-objective optimization, as will be described below.

The high uniformity of demands in the DMAs is motivated by the need to equalize the management in the DMAs, whereas the low number of inter-DMA boundary pipes is related to technical/economic considerations. In fact, a boundary pipe is a site where the choice must be made to install a flow meter or to close the isolation valve, resulting in an economic cost or the loss of some network graphical redundance and potential water quality problems, respectively. In the dividing phase, the hydraulic impacts of the partitioning are examined and the number of boundary pipes to be closed are determined.

### Clustering of the WDN into DMAs

#### Overview

As the main novelty of the article, Subsection 2.1.2 presents the optimization problem for the minimization of the transport function and explains how it can be used for partitioning a WDN into a number of DMAs equal to the number of sources. Subsection 2.1.3 presents the generalization of the algorithm presented in Subsection 2.1.2, to account for the following practical aspects of the partitioning:

- 1.
Boundaries between adjacent DMAs must be located in pipes fitted with isolation valves, to enable WDN partitioning.

- 2.
The number of DMAs to design for the WDN is generally different from the number of sources.

- 3.
DMAs must be designed to comply with such technical requirements as the maximization of the uniformity of demand and the minimization of the number of boundary pipes.

As to point 1, it must be remarked that a system of isolation valves is already available in the system, since it is typically designed along with pipe diameters to guarantee an expected level of reliability in the WDN. Isolation valves are not available in the middle of pipes, as is wrongly assumed in most WDN-partitioning methodologies, but at pipe ends. Since the number of isolation valves is typically slightly lower than the number of pipes, the possibility of cutting the WDN in correspondence to any pipe, which is often assumed, is infeasible in real-world applications. Therefore, the design of DMAs must consider the actual position of isolation valves and DMAs must be obtained by merging segments, as was shown by Santonastaso *et al.* (2019), which are the smallest parts of the WDN that can be isolated from the remainder of the WDN for maintenance purposes, by closing suitably identified isolation valves (Walski 1993). Furthermore, the presence of the isolation valve offers advantages also at sites where flow meter installation is chosen instead of isolation valve closure. In fact, the isolation valve is generally installed in a vault that can also be used for flow meter installation, with no extra cost for the water utility besides the purchase/installation of the flow meter.

#### Minimization of the transport function for the WDN clustering

**A**can be constructed to describe the topology, where a number of rows and columns equal to the number of pipes (

*n*) and nodes (

_{p}*n*) is considered respectively, i.e.,

**A**(

*n*×

_{p}*n*) (Todini & Pilati 1988). For the

*i*-th pipe of the WDN, the elements

*i*-th row of

**A**can take values equal to −1, 0 and 1. Specifically, the element

*A*(

*i*,

*j*) is +1 and/or −1 if the flow arbitrarily assumed positive in pipe

*i*enters and/or leaves node

*j*. It is 0 if pipe

*i*is not incident upon node

*j*(Ates 2016, 2017). Matrix

**A**can be split into the two matrixes

**A**and

_{12}**A**, associated with the demand nodes and source nodes, respectively:

_{10}*n*

_{0}columns of

**A**associated with the source nodes are transferred to matrix

**A**(

_{10}*n*×

_{p}*n*

_{0}), whereas the

*n*

_{1}columns associated with the demand nodes are transferred to matrix

**A**(

_{12}*n*×

_{p}*n*

_{1}). If

**q**(

*n*

_{1}× 1) is the vector of nodal demand values, in a generic demand scenario like average or peak demand, the continuity equations at the demand nodes can be written in the following compact form:where

**A**and

_{21}**Q**(

*n*× 1) are the transpose matrix of

_{p}**A**and the vector of pipe flow rates, respectively. Equation (2) means that the algebraic summation of all the inflows and outflows

_{12}*Q*of pipes intersecting at any node is equal to the water output/input or demand/supply

*q*at the related node (Ates 2016, 2017).

*f*for the WDN takes on the following form (Ciaponi

*et al.*2017):where the symbols | | and

*stand for the absolute value applied to each vector element and the transpose matrix, respectively. Vector*

^{T}**L**(

*n*× 1) is the vector of pipe lengths. The calculation of the transport function lies in summing the product of absolute flow rate and length at all WDN pipes. In various contexts, such as the design of WDN pipes, a distribution of meaningful pipe flow rates is obtained by minimizing the function

_{p}*f*of Equation (3) under the constraint of Equation (2). As Ciaponi

*et al.*(2017) explained, the presence of the absolute value in Equation (3) prevents the original optimization problem with constraints and objective function in Equations (2) and (3), respectively, from being solvable by using the LP. To make the LP applicable, a modified topology of the WDN (Ciaponi

*et al.*2017), in which each pipe is replaced with two arcs with opposite flow directions, can be used. The modified topology is described by matrix (2

*n*×

_{p}*n*

_{1}) instead of matrix

**A**. The relationship between and

_{12}**A**is:

_{12}*n*× 1) is the vector of arc flow rates , and (2

_{p}*n*× 1) is the vector of arc lengths. Indeed, the length of either arc associated with the generic pipe is equal to the length of the pipe itself. The solution of the LP problem summarized in Equations (5) and (6) consists of a distribution of arc flow rate values . The arc flow rates associated with the generic of the

_{p}*n*pipes cannot be both larger than 0, since this would clash with the minimization of the transport function. In the solution, a certain number of pipes, equal to

_{p}*n*=

_{l}*n*−

_{p}*n*

_{1}, will feature flow rate values equal to 0 at both arcs, where

*n*represents the number of WDN loops, including the closed circuits and the source interconnecting paths. It is the number of pipes that can be removed from the WDN layout while still guaranteeing that each demand node remains connected to one source. After solving the optimization problem based on Equations (5) and (6), pipe flow rate values

_{l}*Q*of individual pipes can be obtained by summing the flow rate values

*Q*of both related arcs that result from LP solution. These values are equal to those obtainable through a demand-driven simulation of the WDN layout with open loops.

^{d}Let us consider the skeletonized WDN of Santa Maria Di Licodia, Italy (Pezzinga 2008; Creaco & Pezzinga 2015) as an example for the application of the algorithm based on the Equations (5) and (6) (Figure 2(a)).

This WDN includes *n* = 34 nodes (of which *n*_{1} = 32 with unknown head and *n*_{0} = 2 source nodes with fixed head, i.e., nodes 33 and 34) and *n _{p}* = 41 pipes. The daily average demand of all WDN users is around 18.5 L/s with nodal demand values ranging from 0.1156 to 1.1563 L/s.

By applying the algorithm based on Equations (5) and (6), the minimization of the transport function in the daily average demand scenario yields *f* = 20.28 m^{2}/s and *n _{l}* = 9 pipes with null flow rate, namely pipes 8, 13, 15, 17, 22, 25, 32, 34 and 38 (Figure 2(b)).

The removal of null flow rate pipes enables clustering the WDN into two DMAs, i.e., one DMA for each present source. The DMA configuration is shown in Figure 3.

Among the removed pipes, those that do not disrupt the source interconnection paths, i.e., pipes 8, 13, 15, 22, 32, 34 and 38, can be reintroduced for the sake of DMAs' reliability. The others, i.e., pipes 17 and 25, are the boundary pipes between the DMAs, which remain removed, as long as this does not cause unacceptable service pressure reductions and subsequent demand shortfalls.

It must be noted that the connectivity of each demand node to only one source node is an inherent property in the minimization of the transport function, guaranteeing existence and uniqueness of the clustering solution. Also, the choice of the previously removed pipes to be reintroduced is unambiguous: among the previously removed pipes, the ones linking two nodes connected to the same source, i.e., belonging to the same cluster, must be reintroduced. Those linking two nodes belonging to different clusters must be considered as boundary pipes.

The transport function shown in Equations (3) and (5) is based on the product *Q*·*L*. In this context, pipe length *L* is a simplified form of the carrying capacity of the pipe, which also depends on pipe diameter and roughness. Its use is justified here by the fact that, as will be shown below, pipe lengths are used as decision variables inside a genetic optimizer, to drive the methodologies toward interesting solutions of WDN partitioning from the viewpoint of engineering judgment.

#### Incorporating practical considerations into the clustering

Subsection 2.1.2 showed how the minimization of the transport function solves the problem of clustering. To consider the actual position of the isolation valves, the dual topology based on segments and isolation valves instead of nodes and links, respectively, can be constructed (Walski 1993; Santonastaso *et al.* 2019). Segments are the smallest parts of the WDN that can be isolated from the remainder of the WDN for maintenance purposes, by closing suitably identified isolation valves (Walski 1993). As Santonastaso *et al.* (2019) showed, DMAs can be clustered by aggregating adjacent segments.

Let us assume *n _{v}* = 10 isolation valves (V1–V10) to be present in the WDN of Santa Maria di Licodia, as is shown in the following Figure 4(a), resulting in a total of eight segments. The dual topology of the WDN in terms of segments (S1–S8) and valves (V1–V10) is shown in Figure 4(b). The nodes and pipes belonging to each segment are reported in Table 1.

Segment . | Nodes . | Pipes . | Demand (L/s) . |
---|---|---|---|

S1 | 33 | / | 0.0000 |

S2 | 1, 2, 18, 30 | 1, 4, 5, 6, 7 | 1.0407 |

S3 | 3, 4, 5, 6 | 8, 9, 10, 11, 12, 14 | 0.8094 |

S4 | 7, 13, 14, 15 | 13, 15, 16, 17, 24, 25 | 2.1969 |

S5 | 8. 9, 10, 11, 12, 16, 17 | 18, 19, 20, 21, 22, 23, 26, 33 | 4.7985 |

S6 | 29, 31, 32 | 2, 3, 27, 28 | 1.0985 |

S7 | 19, 20, 21, 22, 23, 24, 25, 26, 27, 28 | 30, 31, 32, 34, 35, 36, 37, 38, 39, 40, 41 | 8.5565 |

S8 | 34 | / | 0.0000 |

Segment . | Nodes . | Pipes . | Demand (L/s) . |
---|---|---|---|

S1 | 33 | / | 0.0000 |

S2 | 1, 2, 18, 30 | 1, 4, 5, 6, 7 | 1.0407 |

S3 | 3, 4, 5, 6 | 8, 9, 10, 11, 12, 14 | 0.8094 |

S4 | 7, 13, 14, 15 | 13, 15, 16, 17, 24, 25 | 2.1969 |

S5 | 8. 9, 10, 11, 12, 16, 17 | 18, 19, 20, 21, 22, 23, 26, 33 | 4.7985 |

S6 | 29, 31, 32 | 2, 3, 27, 28 | 1.0985 |

S7 | 19, 20, 21, 22, 23, 24, 25, 26, 27, 28 | 30, 31, 32, 34, 35, 36, 37, 38, 39, 40, 41 | 8.5565 |

S8 | 34 | / | 0.0000 |

The application of the algorithm described in Section 2.1, while considering the dual topology in Figure 4(b), with the sources at segments S1 and S8 and valve lengths set to 1 m, yields two DMAs. These include segments S1, S2, S3 and S4 and segments S5, S6, S7 and S8, respectively. The boundary pipes between the two adjacent segments are V5 (pipe 25 in the original topology) and V6 (pipe 17 in the original topology). Of course, by leaning upon Table 1, it is possible to go back to the ordinary topology of the WDN, therefore identifying nodes and pipes belonging to each DMA (see Figure 5(a)). The overall demand of the two DMAs is 4.047 and 14.4535 L/s, respectively.

A configuration of WDN partitioning, in which every DMA is fed exclusively by a single source with no interconnections between DMAs, may have some advantages in terms of controlling water quality (Scarpa *et al.* 2016). In fact, it would prevent mixing of water coming from different sources when this mixing is not desired under specific field conditions, e.g., when the high hardness of water from one source must be mitigated before user consumption. However, from a general viewpoint, practical considerations based on the size and demand of the single DMA may lead to other choices. To design a partitioning with a larger number of DMAs, additional fictitious sources must be added to the dual topology. An additional fictitious source is then added in S5, chosen here only for explicatory purposes. The application of the algorithm described in Section 2.1.2, while considering the dual topology in Figure 4(b), the sources at segments S1, S5 and S8 and valve lengths set to 1 m, yields three DMAs. DMAs 1, 2 and 3 include segments S1 and S2, segments S3, S4, S5 and S7, and segments S6 and S8, respectively. The boundary pipe between DMA 1 and DMA 2 is V2 (pipe 7 in the original topology) and that between DMA 2 and DMA 3 is V9 (pipe 27 in the original topology) (see Figure 5(b)). Table 1 enables identifying nodes and pipes belonging to each DMA. The overall demand of the three DMAs is 1.0407, 16.3613 and 1.0985 L/s, respectively.

The configurations of *n*_{DMA} = 2 and *n*_{DMA} = 3 DMAs described above and shown in Figure 5 were obtained by minimizing the transport function in the dual topology of the WDN, while considering valve lengths set to 1 m and sources at S1, S8 (and S5). Though the clustering obtained by minimizing the transport function proved to be very fast, it did not give good results of demand uniformity in the DMAs. This is the evidence that the configurations of DMAs obtained by using this approach alone may not comply with the technical requirements typically considered in WDN partitioning, such as the high uniformity of the DMAs in terms of such properties as demand, size or nodal ground elevations, or the low number *n _{b}* of boundary pipes between the DMAs. To meet technical requirements such as those mentioned above, the LP algorithm for the minimization of the transport function was embedded in an external optimization algorithm, which operates by suitably modifying the positions of the (fictitious) sources in the dual topology of the WDN and the fictitious lengths of the valve-fitted pipes before the application of LP. Under conditions of modified source positions and valve lengths, the transport function loses its original physical meaning and can then be renamed ‘equivalent transport function’.

*c*of the DMA demands, ratio of the standard deviation to the arithmetic mean. For a fixed number

_{v}*n*

_{DMA}of DMAs, different configurations of clustering, each featuring its own performance in terms of

*n*and

_{b}*c*, can be obtained in the context of a bi-objective optimization problem formulated as follows:in which

_{v}*S*and

_{i}*L*are the decision variables. For the generic

_{f,j}*i*-th of the

*n*segments,

_{s}*S*is a binary variable indicating the absence or presence of the fictitious source in the segment, when the value is equal to 0 or 1, respectively. For the generic

_{i}*j*-th of the

*n*isolation valves,

_{v}*L*is a real number representing the fictitious length of the valve. A generic combination of values of

_{f,j}*S*and

_{i}*L*can be used to minimize the equivalent transport function by LP in WDN clustering, which is performed as explained in Section 2.1.2. The values of the objective functions are then estimated.

_{f,j}It must be noted that the LP-based clustering algorithm is embedded as a functional part into the bi-objective optimization. Therefore, the overall methodology can be defined as a low-level hybrid algorithm, according to the Talbi (2002) classification.

In the framework of Equation (7), the valve lengths are fictitious variables modified by the optimizer to equalize the demands of the DMAs. The fictious pipe lengths play the role of weights in the minimization of the transport function. However, the research of these weights is performed here by using optimization, marking a difference from the approach proposed in the work of Giustolisi & Ridolfi (2014). In fact, in that work weights are pre-selected by the user of the algorithm to obtain desired properties in the design of DMAs while optimization is used to perform cuts in the WDN.

### WDN dividing

For each pipe lying at the boundary between adjacent DMAs, the options of closing the present isolation valve or installing a flow meter to monitor the flow exchange are considered, to have physical or virtual partitioning, respectively. The dividing of the WDN into the DMAs identified in Section 2.1 can be tackled here by means of an optimization problem (e.g., see Creaco *et al.* 2017, 2019; Santonastaso *et al.* 2019). The two objective functions in the optimization problem are the generalized resilience/failure index GRF of the WDN, expressed using the pressure-driven formulation of Creaco *et al.* (2016), and the number of flow meters *n _{fm}* installed at boundary pipes (surrogate to the partitioning cost), to be maximized and minimized, respectively. In more detail, GRF represents the global conditions of the WDN in terms of service pressure at nodes in a certain operational scenario. It ranges between −1 and 1, with the negative and positive values indicating deficit and surplus conditions, respectively. The generic of the

*n*decision variables encodes the two plausible options for the generic boundary pipe, i.e., installation of the flow meter or closure of the present isolation valve.

_{b}*i*-th of the

*n*boundary pipes,

_{b}*x*is a binary variable indicating the absence or presence of the flow meter in the pipe, when the value is equal to 0 or 1, respectively. In the absence of the flow meter, the present isolation valve is closed. For a generic distribution of installed flow meters and closed isolation valves at the boundary pipes, the objective function

_{i}*n*to minimize can be trivially calculated. Then, a snapshot hydraulic simulation of the WDN can be carried out in meaningful scenarios, such as the peak or average demand, to evaluate GRF.

_{fm}While the problem formulated in Equation (8) can be generally tackled by means of a bi-objective optimization algorithm, the positions of the flow meters can be decided in some cases without resorting to optimization, i.e., basing these on practical considerations, such as the (in)feasible installation at certain sites or the need to keep large diameter boundary pipes open due to reliability considerations.

After an isolation valve is closed, the associated boundary pipe becomes a branched dead-end, in which water quality problems can occur because of low flow velocities. Therefore, considerations concerning water quality could be made in the dividing phase to choose which boundary pipes can be closed. As an example, the number of customers present along the generic boundary pipe could be analyzed, to evaluate flow velocities in single- or multiple-demand scenarios. Then, hydraulic and water quality simulations could also be performed in these scenarios, to investigate the extent to which flow velocities resulting from the potential closure of the isolation valve at the pipe end could cause an exceedingly low detention time and an excessive disinfectant decay. If these unpleasant consequences happen, the option of leaving the isolation valve open could be selected.

## APPLICATIONS

### Case study

The WDN serving a city in northern-central Italy (Alvisi *et al.* 2011) is the case study of the present work (Figure 6). This WDN includes two source nodes, with head set to 30 m, 536 demand nodes with ground elevation of 0 m and 825 pipes with diameters ranging between 25 and 600 mm. In the demand peak scenario, the total demand is around 367 L/s, including 20% of leakage. The overall length of the WDN is about 87 km.

In the WDN in Figure 6, 969 isolation valves are present along the pipes, plus two isolation valves at the exit of the source nodes, subdividing the WDN into 682 segments. As Figure 6 shows, some pipes of the WDN are fitted with one isolation valve; others are fitted with two isolation valves, i.e., one at either end; the remaining ones feature no isolation valve. Therefore, resorting to the dual topology described in Subsection 2.1.3 is indispensable for guaranteeing the presence of the isolation valve at the boundary between two generic adjacent DMAs.

In the applications, the presented methodology was applied considering a number of DMAs equal to 8. To accomplish this, potential locations of eight fictitious sources were assumed in the segments containing the pipes with diameters larger than 200 mm in the WDN, indicated with a red line in Figure 6. This choice guarantees that each DMA is fed by the main pipes of the WDN.

The locations of the fictitious sources and the fictitious lengths of the isolation valves were optimized by means of the multi-objective genetic algorithm NSGAII (Deb *et al.* 2002). For each individual of NSGAII, corresponding to a combination of fictitious sources and valve lengths, the algorithm described in Section 2 was run to search for a configuration of WDN clustering into 8 DMAs.

Inside this algorithm, the generic individual (called ‘chromosome’) was encoded with *n _{s}* +

*n*genes. Each of the first

_{v}*n*genes takes on the integer values 0 or 1, encoding the absence or presence of the fictitious source in the generic segment of the WDN. The subsequent

_{s}*n*genes take on real values, encoding the fictitious lengths of the valve-fitted pipes. For the generic individual inside the genetic algorithm, the minimization of the transport function enables identification of a certain configuration of

_{v}*n*

_{DMA}DMAs, for which the values of the objective functions

*n*and

_{b}*c*are calculated.

_{v}Optimization runs with a number of individuals and generations equal to 1,000 and 500 were carried out. In these optimizations, valve lengths, which play the role of weights in the minimization of the transport function, were generated in the range 0.01–100 m, with an average value equal to 2.58 m. This choice was the result of preliminary optimization runs, which proved the range to enable effective convergence toward optimal solutions in the whole Pareto front. After each crossover and mutation, a correction was applied to the generic individual to guarantee that it respected the right number of fictitious sources (sum of the first *n _{s}* genes equal to 8) and the right average valve length (average of the following

*n*genes equal to 2.58). Specifically, two optimization runs were performed, the former with no penalization in the objective functions and the latter with penalization on individuals with

_{v}*c*> 0.5 to promote the convergence of the algorithm toward solutions with low values of

_{v}*c*, which are more interesting from the viewpoint of engineering judgment.

_{v}### Results

The calculations of this work were carried out in the MATLAB 2018a^{®} environment, by using a single core of an Intel^{®} Xeon^{®} CPU E3-1505M v6 processor with a frequency of 3.00 GHz.

The Pareto front of optimal solutions including the best solutions from the two runs is shown in Figure 7, in comparison with the benchmark front obtained by applying the algorithm of Creaco *et al.* (2019) based on the combined use of bi-objective optimization and the modularity-driven clustering. This algorithm was applied in a single optimization run with 100 individuals and 100 generations, to obtain a similar computational overhead to the overall overhead of the two optimization runs carried out with the novel algorithm. No penalization was carried out on the individuals with high values of *c _{v}*, as this did not improve the results of the algorithm of Creaco

*et al.*(2019).

As expected, the Pareto front contains solutions featuring increasing values of *n _{b}*, as

*c*decreases. In other words, the desirable small values of

_{v}*c*, associated with more uniform DMAs in terms of total demand, are obtained at the cost of more numerous boundary pipes in the WDN. Overall, the graph in Figure 7 shows that the pattern of

_{v}*c*as a function of

_{v}*n*obtained using the minimum transport-driven clustering is similar to that obtained with the algorithm of Creaco

_{b}*et al.*(2019). However, especially for the low values of

*c*, corresponding to engineeringly attractive DMA configurations with uniform partitioning of demand, the minimum transport-driven clustering offers more profitable solutions with fewer boundary pipes.

_{v}As an example of the solutions found by the new methodology, Figure 8 reports the WDN partitioning solution with *n _{b}* = 64 and

*c*= 0.008. In the peak demand scenario, the total demand of each of the eight DMAs, net of leakage, varies in the small range between 36.27 and 37.12 L/s, attesting to the good uniformity of demands in the solution. Figure 8 shows that the physical size of the eight DMAs is different, with DMAs 1 and 5 being the largest and smallest, respectively. This is due to the uneven density of the population in the area. The main features of the DMAs in terms of total peak demand, length, and number of boundary pipes in common with the adjacent DMAs are reported in Table 2.

_{v}DMA . | Demand (L/s) . | Length (km) . | Boundary with DMA1 . | Boundary with DMA2 . | Boundary with DMA3 . | Boundary with DMA4 . | Boundary with DMA5 . | Boundary with DMA6 . | Boundary with DMA7 . | Boundary with DMA8 . |
---|---|---|---|---|---|---|---|---|---|---|

DMA1 | 36.91 | 20.17 | 3 | 8*** | 0 | 0 | 0 | 1* | 7** | |

DMA2 | 36.91 | 7.78 | 3 | 5** | 6 | 0 | 0 | 0 | 3 | |

DMA3 | 37.12 | 8.79 | 8*** | 5** | 0 | 0 | 0 | 0 | 1 | |

DMA4 | 36.91 | 10.22 | 0 | 6 | 0 | 5* | 2* | 1 | 6 | |

DMA5 | 36.49 | 5.33 | 0 | 0 | 0 | 5* | 8* | 3 | 0 | |

DMA6 | 36.27 | 12.28 | 0 | 0 | 0 | 2* | 8* | 1* | 0 | |

DMA7 | 36.27 | 10.01 | 1* | 0 | 0 | 1 | 3 | 1* | 4 | |

DMA8 | 36.70 | 12.70 | 7** | 3 | 1 | 6 | 0 | 0 | 4 |

DMA . | Demand (L/s) . | Length (km) . | Boundary with DMA1 . | Boundary with DMA2 . | Boundary with DMA3 . | Boundary with DMA4 . | Boundary with DMA5 . | Boundary with DMA6 . | Boundary with DMA7 . | Boundary with DMA8 . |
---|---|---|---|---|---|---|---|---|---|---|

DMA1 | 36.91 | 20.17 | 3 | 8*** | 0 | 0 | 0 | 1* | 7** | |

DMA2 | 36.91 | 7.78 | 3 | 5** | 6 | 0 | 0 | 0 | 3 | |

DMA3 | 37.12 | 8.79 | 8*** | 5** | 0 | 0 | 0 | 0 | 1 | |

DMA4 | 36.91 | 10.22 | 0 | 6 | 0 | 5* | 2* | 1 | 6 | |

DMA5 | 36.49 | 5.33 | 0 | 0 | 0 | 5* | 8* | 3 | 0 | |

DMA6 | 36.27 | 12.28 | 0 | 0 | 0 | 2* | 8* | 1* | 0 | |

DMA7 | 36.27 | 10.01 | 1* | 0 | 0 | 1 | 3 | 1* | 4 | |

DMA8 | 36.70 | 12.70 | 7** | 3 | 1 | 6 | 0 | 0 | 4 |

Every ‘*’ indicates a flow meter installed.

Figure 8 also shows the locations of the eight fictitious sources in this solution, which are well scattered over the area of the WDN. As expected, there is one fictitious source for each DMA. In this solution, the fictitious valve lengths range between 0.01 and 55.84 m, with an average value of 2.58 m. The choice of assuming the potential locations of the fictitious sources along the main pipes of the WDN (see Figure 6) guarantees that the DMAs are crossed by the main pipes at some boundaries. This facilitates the selection of the isolation valves to be closed and of the flow meters to be installed in the dividing phase. In Figure 8, the symbol ‘x’ in black indicates 11 boundary pipes to be fitted with a flow meter, to be added to the two flow meters already present at the exit of the two real sources of the WDN. In the other 53 boundary pipes, the isolation valves can be closed. In terms of hydraulic performance, the snapshot simulation of the WDN in the peak and average demand scenarios points out no significant variation in the GRF (equal to 0.33 and 0.22) because of the partitioning. There is also no significant variation in the minimum pressure (equal to 28.22 and 29.16 m in the peak and average demand scenarios, respectively) because of the partitioning.

It must be remarked that, for similar demand uniformity over the DMAs, the values of *n _{b}* obtained using the novel methodology are generally lower than or equal to those obtained using the methodology of Creaco

*et al.*(2019). The high values of

*n*required for obtaining the desirable demand uniformity over the DMAs, e.g.,

_{b}*n*= 64 in Figure 8, are not an undesired effect of the methodology used in this work. Indeed, they are due to the high redundancy of the case study WDN in terms of loops and interconnections. Since the methodology is applied on the dual topology of the WDN based on valves and segments (Subsection 2.1.3), all boundary pipes are equipped with the isolation valve

_{b}*ab initio*, resulting in no additional installation costs.

Water quality simulations were then performed with the software EPANET 2.2 (Rossman 2000) to investigate if the dead-ends resulting from closure of 53 isolation valves caused water quality problems. In these simulations, a chlorine concentration of 2 mg/L was assumed in the source and a large chlorine bulk decay coefficient of 1 day^{−1} was considered as a margin of safety. Two scenarios were considered with nodal users outflows equal to the average and peak demand values, respectively. The demand values were kept constant for 48 h to reach the balance condition on chlorine concentration at all nodes. These simulations pointed out only one violation of the minimum allowed chlorine concentration, assumed equal to 0.2 mg/L, occurring close to the gray ‘x’ in Figure 8. To eliminate this violation, the option of not closing the isolation valve and installing a flow meter in the boundary pipe tagged with the gray ‘x’ was considered. Finally, the dividing then led to 12 flow meters installed and 52 isolation valves closed.

## DISCUSSION

The following remarks on the use of the methodology presented in this work can be made.

In the calculations of this work, the uniformity of demands and the number of boundary pipes were considered as objective functions to optimize. However, other significant variables based on engineering judgment could be adopted, without loss of generality of the approach presented. In fact, the editing of the fictitious valve lengths performed by the optimizer could make the clustering capable of accommodating any desired requirement. In fact, the function *c _{v}* in Equation (7) can be reformulated to include any other properties of the DMAs in the WDN, instead of the demand in a single scenario, including the following:

• Nodal elevations, useful for pressure management;

• Pipe characteristics, useful for WDNs made up of pipes of different ages and materials;

• Time-varying demands in multiple scenarios, useful for WDNs featuring variable loading conditions in their operation.

In all cases, the optimizer would attempt to modify the fictitious valve lengths to minimize *c _{v}*.

In comparison with many other methodologies in the scientific literature, a positive aspect of the novel methodology lies in the possibility to choose *a priori* potential locations, such as the main pipes of the WDN, for the fictitious sources in the algorithm. This guarantees that the DMAs are crossed by the main pipes, thus facilitating selection of the boundary pipes where a flow meter must be installed, without resorting to any additional optimization.

Other remarks can be made as far as the comparison with the methodology of Creaco *et al.* (2019) is concerned. For each individual of NSGAII, i.e., for each combination of fictitious sources and valve lengths, the application of the algorithm presented in this work for the WDN clustering requires a computation time of about 0.2 s. In comparison with the methodology of Creaco *et al.* (2019), which requires about 22 s for each individual of NSGAII, the faster execution of the algorithm pays back the more complex encoding of the individuals. In fact, the optimization of the positions of the fictitious sources in the novel methodology demands for a larger number of individuals and generations by one order of magnitude inside NSGAII, leading globally to similar computation times for the optimization. However, the WDN partitioning results obtained prove the effectiveness and efficiency of the novel approach (Figure 7), which gives better results in the trade-off between DMA uniformity and the number of boundary pipes.

## CONCLUSION

In this paper, a novel algorithm driven by the minimization of the transport function was presented for the partitioning of WDNs into DMAs. The algorithm was first presented in its basic application, which enables clustering a WDN into a number of DMAs equal to the number of sources feeding the WDN. In fact, in a certain demand scenario, the minimization of the transport function by means of the LP results in the identification of pipes with null flow rates, which are as numerous as the sum of cyclic loops and source interconnecting paths. The pipes with null flow rates along the source interconnecting paths can then be considered as the boundary pipes between the DMAs. The algorithm was then extended to create configurations of any number of DMAs, by positioning fictitious sources in the WDN, around which close nodes and pipes were identified to form the DMAs. Furthermore, in order to obtain WDN partitioning configurations equipped with isolation valves at the boundary pipes, it was reformulated in the dual network topology, which is very meaningful for management purposes, in which segments and isolation valves play the role of nodes and pipes, respectively. Finally, the LP algorithm for the minimization of the transport function was embedded inside a multi-objective genetic algorithm considering the location of the fictitious sources and the fictitious length of the valves as decision variables. This enabled simultaneously optimizing the uniformity of DMAs and the number of boundary pipes in the context of WDN partitioning.

The applications to a real case study proved the effectiveness and efficiency of the algorithm, in comparison with results obtained by using a modularity-driven algorithm in the scientific literature and also highlight an applicative advantage. In fact, before the optimization, the user of the algorithm can choose the segments located along the main pipes of the WDN as potential locations for the fictitious sources. This guarantees that the DMAs are crossed by large diameter pipes at some boundaries, thus ensuring the reliability of the WDN partitioning configuration and giving insight into the identification of the locations where flow meters can be installed to monitor inter-DMA flow exchanges.

## FUNDING

No external fund was used to support this work.

## CONFLICTS OF INTEREST

The authors declare no conflict of interest.

## AUTHORS’ CONTRIBUTIONS

Prof. Creaco and Prof. Pezzinga conceived the general idea at the basis of this work. Prof. Creaco implemented the methodology and wrote the first draft of the manuscript. Prof. Pezzinga and Prof. Zheng made valuable comments and revisions in the manuscript.

## DATA AVAILABILITY STATEMENT

Data cannot be made publicly available; readers should contact the corresponding author for details.