This paper presents a novel method using a clustering, detection, and optimization model to devise a solution of fully automatic partitioning in a water distribution system (WDS). First, the Black Hole Clustering Algorithm is employed to divide the WDS into different partitions. Second, two types of outliers are eliminated by multistage iterative processes including traverse, k-Nearest neighbor, and the Warshall algorithm. Finally, the boundary conditions of the partitions are optimized by a Non-dominated Sorting Porcellio Scaber Algorithm to minimize the number of boundary pipes required to balance pressures and reduce leakages. Seven WDSs are employed as case studies to verify the practicability of the method. The Open Water Analytics toolbox is applied to code the hydraulic calculation program. The result demonstrates that average pressure and leakage cost decreases after optimization.

  • A novel partition method based on monitors pressure sensitivity analysis was presented.

  • Partition errors were examined by a new outliers searching model.

  • The minimum number of valves, pressure uniformity function, and leakage uniformity function were employed to optimized WDSs.

Fossil energy is one of the most important resources for humans to survive and develop. About 4% of the yearly total energy power is consumed via the water distribution system (WDS) in developed nations (Pasha & Lansey 2014). Additionally, up to 15% of water from a WDS is lost in developed nations and this loss is even higher, 35%, in developing nations (Zhang et al. 2016). In China, the leakage rate of water was more than 27% (Laiyun 2015). Reducing leakage can save costs and improve the quality, efficiency, and safety of the water supply, promoting socio-economic development. In order to reduce leakage, it is essential to reconstruct and extend old urban WDSs.

The best approach to reconstruction efforts is to determine the optimal partitions in a WDS. Partition means to divide the WDS into different zones, which is widely applied in many cities around the world, and many studies have presented the methods of optimal partition problems.

Malcolm and Farley presented the District Metered Area (DMA) model in 1989. In this model, WDS can be divided into different partitions. In each partition, pressure-reducing valves (PRVs) are installed. Control of these valves can reduce leakage (Ulanicki et al. 2003). However, this is a static model with only one partition solution for any time. With the development of sensor equipment, massive data is more easily acquired. Data analysis technology can be applied in WDS to create an energy-saving condition (Wu & Rahman 2017). An optimal partition can convert the static to a dynamic model by using sensor equipment. Zhang and Shen presented a channel networks model, which is a dynamic embedded model (Zhang & Shen 2007). The layers of partitions are determined and then analyzed layer by layer to set the boundaries of valves and sensors. According to the sensor data, valves that can be opened or closed in a WDS are gradually optimized. Finally, the state of the steady and unsteady flow can be obtained.

Due to limited conditions of economy and manpower, a real WDS cannot directly be used to divide from the start, thus simulation software should be used to precheck the effectiveness of the partition model. EPANET 2.0 is a calculator for the hydraulic model from the U.S. Environmental Protection Agency that can be used for extended period simulation (EPS) (Rossman 2000) and is able to conveniently process real-time hydraulic simulation. EPANET 2.0 can estimate the conditions of WDSs if the model meets any hydraulic limiting requirements (Price & Ostfeld 2016) and can also be re-developed with the Visual studio or Matlab platform. For instance, Porteau 4.0, a modification of the EPANET software, has been developed, which can partition a WDS into zones and optimize pipes, valves, and tanks (Gilbert et al. 2017).

Nowadays, one-key generation technologies are widely developed. A lot of time and manpower can be saved because of the qualities of easy use of the personal computer. Although significant efforts have been made to develop partitioning technologies in WDS, few can realize fully automatic partitioning. Here, a fully automatic partitioning method in WDS is presented to expedite partition speed for engineers. In this method, engineers only need to input a WDS file into a mini software to solve the program.

The full-automatic partitioning methodology is described in the methodology section. A Black Hole Clustering Algorithm was firstly applied to a WDS in the partitioning algorithm subsection. To eliminate generated outliers due to false clustering, three detection methods including traverse, k-Nearest neighbor, and the Warshall model were applied to remove outliers in the outliers detection subsection. Furthermore, the optimization algorithm subsection showed that the Non-dominated Sorting and Porcellio Scaber algorithms were combined to solve a multi-objective optimal problem in WDS. In the case studies section, seven WDSs were employed to verify the practicability of the method, and the results are presented. The conclusion and improvement of this work are presented in the conclusion section.

Partitioning algorithm

The Black Hole Clustering Algorithm (BHCA) was used to divide the WDS into different partitions for the first time. This method was presented by Hatamlou (2013), and references the form of black holes in space. A black hole is created when a giant planet collapses. Its gravitation is so intense that nothing can escape if it enters within the Schwarzschild radius. In other words, anything that enters this radius will disappear forever.

BHCA is based on the black hole theory. It is similar to the k-means and fuzzy c-means algorithm (Han & Liu 2017), a clustering method that classifies data into multiple clusters with varying distances of membership. First, the number of stars – here, the WDS partition index – is represented by l, the maximum number of iterations are set to I, and the star size-population size P is manually set. Next, nodes in WDS weights are considered as variables. In this application, the pressure differences at monitor points between leakage and normal states serve as the stars (weights). The difference equation can be simply expressed as follows:
(1)
where is the pressure difference of the monitor point j between node i leakage and the normal amount at time t; is the pressure of the monitor point j at time t when node i is experiencing leakage; is the pressure of the monitor point j at time t when node i is normally functioning and not leaking.
After calculating all monitor point pressures between normal nodes and leakage states, a difference matrix can be obtained and is treated as a node weight matrix. The form of the matrix is as follows:
(2)
where is a vector of node i weights at time t; m and n are the number of nodes and monitor points, respectively.

Next, the average of the matrix value was calculated as: , where T is the total time.

Then, the star center matrix (partition center) c can be randomly generated, which is similar to the matrix where each value range is greater than the minimum pressure difference and less than the maximum pressure difference in matrix . The formalism of this matrix is as follows:
(3)
where l represents the number of stars, or the WDS partitions number; n is the number of monitor points; and is a vector of the partition center i values.
A distance function, such as Euclidean, Mahalanobis, or Chebychev, is then employed to divide the WDS into different zones. In this study, the Euclidean distance equation is used as follows:
(4)
where is the Euclidean distance value between node i weight and the center of partition (). is a diagonal matrix including n elements whose diagonal elements are , where is the degree of membership of in matrix . Bezdec (1981) defined the calculation of by following equation:
(5)
where m is the power exponent ().
According to Equations (4)–(5), a matrix between the node weight and each partition center distance is obtained as follows:
(6)

For each node, the least distance value is selected in Equation (6), and allocated to divide the node into partitions with respect to the minimum membership degree [].

An objective function based on the sum of the least distances is employed to optimize this process. It can be obtained by the following expression:
(7)
According to Equations (1)–(7), the number of p groups objective function and standard of clustering results are solved. Searching the objective function for the minimum solution in the population size of groups provides the local optimal solution. Then, the black holes with a Schwarzschild radius R can be represented as:
(8)

Next, calculate each distance between the partition centers in groups and the local optimal solution (Farzadnia & Shirazi 2017). If a center distance is greater than the Schwarzschild radius R of the black hole, the stars (partition centers) around it are absorbed. Meanwhile, new stars will be regenerated, randomly.

Then, the local optimal solution is retained and the other center locations of partitions are updated. If the location value is out of range, a new solution will be randomly produced. The updating equation is as follows:
(9)
where i is one of center location label of the local optimal solution.

Next, the objective function minimum solution is recalculated and the center location of partitions is updated until reaching the maximum iterations I. Finally, the best optimal solution with the least objective function value is obtained. WDS is divided into different partitions using Equation (6). The process of clustering is illustrated in Figure 1.

Figure 1

Flowchart of Black Hole Clustering Algorithm (BHCA).

Figure 1

Flowchart of Black Hole Clustering Algorithm (BHCA).

Close modal

Outliers detection

After clustering, WDS is divided into l partitions. However, some outliers exist in each partition. For instance, as shown in Figure 2, a few nodes belonging to partition 1 are falsely allocated into the zone of partition 2, but are not connected with other nodes of partition 2. These nodes are called as the first outliers. In the same way, monitor points are also assumed as first outliers that are falsely allocated to partition 0 (temporary partition).

Figure 2

Outliers of partition 1 in WDS.

Figure 2

Outliers of partition 1 in WDS.

Close modal

To eliminate the first outliers, the connectivity of all of nodes should be determined. To do this, a partition is selected and the connectivity of all nodes are traversed in this zone. If a node is not connected with any other nodes in this zone, this information will be recorded in the database and its partition index is then set to 0 (temporary partition). Second, other partitions are then selected and the nodes are traversed until all partitions are iterated. Then, all nodes in partition 0 are allocated to adjacent zones (excluding partition 0) so that nodes of partition 0 are connected to a partition with nodes with more connections than the other partitions. In this way, the first outliers are eliminated.

As shown in Figure 3, the partition index of two outliers is first set to 0. The connection states of the nodes are then analyzed. Outliers 1 and 2 are connected by two nodes of partition 1 and the zero node of partition 2. Two nodes are allocated to partition 1 because the number of connected nodes in partition 1 is greater than node in partition 2.

Figure 3

The allocation of first outliers: (a) searching the first outlier nodes; (b) allocating outliers nodes to partition 0; (c) evaluating the connection state of nodes and searching for the greatest connection partition index; (d) allocating nodes to the correct partition.

Figure 3

The allocation of first outliers: (a) searching the first outlier nodes; (b) allocating outliers nodes to partition 0; (c) evaluating the connection state of nodes and searching for the greatest connection partition index; (d) allocating nodes to the correct partition.

Close modal

Although the first outliers are eliminated and the partition becomes integrated, nodes of each partition are still not completely connected. Nodes may be isolated because the clustering algorithm does not consider coordinates including the horizontal position (x, y) and the elevation (z) as weights, considered as second outliers, as illustrated in Figure 4.

Figure 4

The second outliers of partition 1 in WDS.

Figure 4

The second outliers of partition 1 in WDS.

Close modal

These outliers are eliminated in two steps:

In the first step, a kind of K-Nearest neighbor (KNN) detection algorithm is employed (Valizadeh et al. 2009), which is a distance detection method based on a few outliers distance where the outliers are far from others. The advantage of KNN is that the distributed method is integrated, making the proximity measure of outliers more easily achieved than by using other statistical distributions methods.

Therefore, the distance factor can be defined:
(10)
where is the nearest neighbor distance i and its coordinate position is ; is the k-nearest value; is the distance of node i and j at the same partition, as follows:
(11)
Then, the average and the standard deviation of all nodes in each partition are then calculated by Equations (12) and (13). If is greater than + or less than , the node i is a second outlier.
(12)
(13)
where n is the number of nodes in partition k.
As illustrated in Table 1, this partition has six nodes and five distances can be acquired. The average of node is 511.93 m and the standard deviation is 214.40 m. If N is equal to 4, four nearest distances, 198.20 m, 484.16 m, 431.32 m, and 596.33 m are selected. According to Equation (10), is calculated as follows:
Table 1

Distances between node 1 and others

d1jDistance (m)
596.33 
789.64 
491.32 
484.14 
198.20 
d1jDistance (m)
596.33 
789.64 
491.32 
484.14 
198.20 
According to the description above, the boundaries of the outliers are:

(427.50 m) belongs to the boundaries; thus, node 1 is not a second outlier.

When all nodes in every partition have been searched, the acquired outliers of partitions index are set to 0.

In the second step, although all outlier nodes have been searched, a few nodes of each partition remain not connected. To address this, the Warshall algorithm (Gao 2014) is employed to judge the connectivity of nodes at each partition.

The Warshall algorithm was presented by Warshall (1962), and is an efficient way to judge undirected graph connectivity. In this algorithm, the reachability of nodes at each partition is expressed by an adjacent matrix, in which rows and columns are used to represent the node numbers. The elements are equal to 1 if two nodes are connected or 0 if they are not connected. This method starts from the first column diagonal element and finds all nodes connected with it in this column, recording the states of connection. By traversing the diagonal element for all columns, an array can be obtained. Next, starting from this array of rows, all connected nodes are found and all the subsets are deleted until the array elements do not change.

An example of an undirected graph is shown in Figure 5.

Figure 5

Example of undirected graph.

Figure 5

Example of undirected graph.

Close modal
Based on the undirected graph, a symmetric adjacent matrix can be obtained. The adjacent matrix A of Figure 5 can be defined as follows:
where is the connection between nodes i and j. For example, is equal to 1 because nodes 3 and 4 are connected. is equal to 0 because nodes 3 and 5 are not connected.

Then, the connected states of the undirected graph are calculated as shown in Figure 6.

Figure 6

The flowchart of the Warshall algorithm: (a) finding all connected nodes; (b) combining subsets of the connection matrix; (c) acquiring connected states of nodes.

Figure 6

The flowchart of the Warshall algorithm: (a) finding all connected nodes; (b) combining subsets of the connection matrix; (c) acquiring connected states of nodes.

Close modal

After judging the connectivity of nodes, the maximum number of connected nodes in a zone is retained for each partition and the other partition index values are set to 0. If a partition has two maximum connected nodes subsets, only the first is retained. Here, 1, 2, 3, and 4 nodes are retained and 5, 6, 7, and 8 of the partition number are set to 0 in Figure 6.

When two steps of the second outliers detection are completed, the first outlier detection process is repeated. The multistage iterative process is performed until no outliers can be detected. The outlier detection process is illustrated in Figure 7.

Figure 7

Flowchart of outlier detection.

Figure 7

Flowchart of outlier detection.

Close modal

Optimization algorithm

After outlier detection, WDS is divided into different partitions without any abnormal nodes, which can be made sure by checking the connectivity of each partition. Next, it is important to decide whether the connection pipes between two partitions, as shown in Figure 8, are open or closed to balance the pressure of the network and reduce leakage. This can be implemented by using valves on these pipes. To solve this problem, a novel method that combines two algorithms into a Non-dominated Sorting Porcellio Scaber Algorithm (NSPSA) was developed as a multi-objective function optimization method.

Figure 8

Three connection pipes.

Figure 8

Three connection pipes.

Close modal

The Non-dominated Sorting (NS) algorithm is a lower complexity procedure than other approaches (Deb et al. 2002). NS requires two parameters to be calculated including the domination count whose solutions dominate p and the domination solutions , which is dominated by variable p. The Porcellio Scaber Algorithm (PSA) is a new method to solve optimization problems, and is inspired by the observation of two survival rules of a species of woodlice, aggregation and propensity (Zhang & Li 2017). PSA is like a genetic algorithm (GA), which is a kind of group optimization algorithm. But it contains some new features compared with GA: one is the optimal swarm solutions for searching in the solution space without crossover and mutation process, which can avoid getting into local optimum and is easy to implement and does not have many parameters to be adjusted.

In this analysis, three objective functions are employed to optimize WDS for balancing pressure and reducing leakage.

One function is the number of connection pipes between two partitions (Zhang et al. 2017):
(14)
where is the connection pipe at partition i when the valve is open, or ; is the sum of the WDS partition boundary pipes; is the sum of the open connection pipes.
In this analysis, all of is divided into l groups for computer analysis. For example, with 12 connection pipes and six partitions, each partition has two connection pipes in WDS. If all of them are open, this can be written as:
Divide this into six variables:
The second function is to minimize the pressure uniformity function (Al-Hemairi & Shakir 2006) equation:
(15)
where n is the number of nodes; is the pressure of node i at time t; is the minimum of all node pressures at time t; is the average pressure of a node at time t; is the pressure uniformity value.
The last function is to minimize leakage uniformity function (Germanopoulos & Jowitt 1989), as follows:
(16)
where is the loss of leakage; is the index of the leakage, (); is the parameter of leakage, ; is the sum of pipes; , , and are the total heads and ground elevation at nodes i, j (connecting with pipe ), respectively.
Two constraint conditions are also considered, in addition to obeying the equation of continuity and the conservation of energy:
(17)
(18)
where is the maximum of all node pressures at time t; is the water level of tank i; and are the minimum and maximum, respectively, of all tank water levels at time t.

For better balanced pressure, we hope to obtain the best solution of under the best solution and . Thus, NSPSA is applied to solve for the minimum values of three objectives, and the process is as follows:

In the first step, parameters were all initialized:

Input all nodes of partition numbers into the database as described in the above section. Next, input all pipes and the node index for all pipes. Find the connection pipes between two partitions by identification of the connection nodes. Next, set the size of the initial position of G for Porcellio Scabers X randomly, and if the state of the pipe is opened or closed. For instance, if G is 3 and the connection pipes are 4, the variable matrix X is:
(19)
where is the value of pipes j for population i.

In the second step, were solved using Equations (14) and (19), and the pressure solved by hydraulic calculation for calculating and . Finally, compare the boundaries of and . If they meet the requirement, retain and , or set them to Inf (infinity).

In the third step, a non-dominated sort (Srinivas & Deb 1994) was done:

After obtaining the values of the objective functions, a matrix of functions is formed. For example, it can be defined as:
(20)
where are the values of the objective functions i in population i.

Then, find all the individuals of whose functions are better than the other individuals in the population and save those into set and mark as rank 1. Next, traverse and find the for each number i. Traverse each of all the individual and then execute . If , save it into set Q. Then, = , and repeat this process until is a null set.

For instance, assign values for Equation (18):

Because all function values in (1) are less than (2), (3), is equal to 0 and (2), (3) are combined to the and .

Then, mark each element of (1) to rank 1. Similarly, (2) is less than (3) but larger than (1), so is equal to 1 and . (3) is larger than (1), (2), so is equal to 2.

is not a null set; traverse . and = 1−1 = 0, so .

Mark element of (2) to rank 2. -1 = 2-1 = 1, reject it. = and traverse . -1-1 = 2-1-1 = 0, so and mark element of (2) to rank 3. = , and traverse .

Then, is a null set, so Q is a null set. = and is null.

Finally, (1), (2) and (3) are ranked as 1, 2, and 3.

After calculating the rank, the crowding distance should be assigned to the whole population. For each population, the items should be sorted in rank, allowing calculation of the crowding distance for the population at each rank and in descending order with distance. The crowding distance equation is as follows:
(21)
where is the crowding distance; is the objection function m.
Next, the new rank and distance should be calculated, for each population position to be updated. The updating equation is as follows:
(22)
where is the update weight; p is a function to map the fitness of Porcellio Scaber to an action strength. The equation p is as follows:
(23)
where is a random vector of the same dimension as

Then, combine half of the old population with half of the new population, which are all half of higher sort elements.

For instance: the old population is:
where [1 2] is rank 1 and [3 4] is rank 2.
The new population is:
where [3 5] is rank 1 and [6 8] is rank 2.
Then, combine [1 2] and [3 5] and reject others. A new combination can be produced:

Finally, update the PSA and execute step (2) to (5) until reaching the maximum iteration I.

The process of clustering is illustrated in Figure 9.

Figure 9

Flowchart of Non-dominated Sorting Porcellio Scaber Algorithm (NSPSA).

Figure 9

Flowchart of Non-dominated Sorting Porcellio Scaber Algorithm (NSPSA).

Close modal

A WDS case study of Exa7 (Bentley Systems 2013), shown in Figure 10, was employed to verify the practicability of the above method. The network has 381 junctions (including nine SCADA pressure monitor points), 469 pipes, three pumps, and one tank. The average water usage is 5,220 m3/h. The leakage discharge is set to 1% of the nodal demand if there is node demand or 3% of the average demand of 156.60 m3/h, equal to 4.70 m3/h. The WDS partition index l is 10. The time interval of hydraulic calculation is set to one hour in EPS and the maximum time is set to 24th hour.

Figure10

Layout of network of city Exa7.

Figure10

Layout of network of city Exa7.

Close modal

The system is as follows: CPU is i7-3410 m; RAM is DDR3-16G; ROM is ssd-256G; OS is Windows 10, and Matlab-2016a, and Open Water Analytics (OWA) toolbox is used.

Multistage iterative full-automatic partitioning method in WDS of Exa7 is performed as follows:

  • 1.
    The Open Water Analytics (OWA) toolbox (McDonnell et al. 2007) is used for hydraulic calculations. This is an EPANET classlib of Matlab developed by the KIOS Research Center for Intelligent Systems and Networks, University of Cyprus. First, monitor points are selected based on the pressure sensitivity analysis model. In Exa7 WDS, we can select monitors via the method of Zhang et al. 2016 for a reference. It will be concisely illustrated in this section. Hydraulic analysis of the normal state is simulated in WDS and pressures are obtained in each node. Then, increase water demand of a node as a leakage state and hydraulic analysis is processed. Each node of water demand is applied in the same way to obtain all leakage states. After calculation, pressures in each leakage state can be solved. Next, difference of unit pressure values can be solved as follows:
    (24)
    where pi and p′i are the pressures of node i in the normal and leakage states, respectively; pk and p’k are the pressures of node k in the normal and leakage states, respectively.
The same matrix formalism can be obtained, as in Equation (2). Then, all the xik-are standardized by:
(25)
where x′ik is the standardization result of xik; is the average of all the unit pressure differences between node k and others; is the standard deviation of node k.
Then, normalization processing is done and the x’’ik is obtained. Finally, euclidean distance is employed to select monitors from value of smallest to biggest following Equation (26). If adjacent monitor points exist, only select the best one.
(26)
where is the euclidean distance; N is the number of nodes.

Second, the initial pressures of monitor points are calculated in the normal state by OWA. The leakage discharge value is added to the demand of a node (without monitor points for difference inspection) to simulate the leakage state, and the pressures of monitor points are hydraulically calculated again. Then, the difference of the two results is determined as the pressure difference of this node. Similarly, the leakage of other nodes can be simulated and the pressure differences are calculated. These results can then be used to construct a difference matrix (372 rows and 9 columns). In the same way, all time difference matrices can be calculated. These can be used to calculate the difference average value across time. The matrix of Equation (2) can be obtained.

  • 2.

    The matrix (10 rows and 9 columns) of the partition center position is randomly produced. The BHCA is used to divide WDS into 10 partitions. The power exponent m is set to 2 in this case. The result of clustering is shown in Figure 11(a).

  • 3.

    Outliers exist in each partition. Thus, two kinds of outliers should be eliminated. First, a partition is selected and the connectivity of all nodes is searched for the first outliers. Then, the other partitions are searched. When all partitions have been searched, the first outlier partition index can be set to zero. Then, the first outliers are allocated to partitions that are connected with them. These nodes connected it number in a partition are greater than the other partitions shown in Figure 11(b). Secondly, the node distance at each partition is evaluated to search for second outliers. The KNN model is used to detect outliers that are far away from groups, and the k-nearest value N is set to 10. Then, the node partition index is set to 0. Next, the connectivity of nodes at each partition is evaluated by Warshall algorithm. If a partition is not connected, the largest connected node number group is retained and the other group index values are set to 0, as shown in Figure 11(c). Finally, the first outlier is subjected to the multistage iterative function until all outliers are removed.

  • 4.

    After outlier detection, this case is divided into different partitions without abnormal nodes, as shown in Figure 11(d). Next, the connection pipes between two partitions were determined, as shown in Figure 12. The number of connection pipes is 35.

Figure 11

Clustering process of city Exa7: (a) BHCA partition; (b) eliminating first ouliers; (c) eliminating second ouliers; (d) multistage iterative final result.

Figure 11

Clustering process of city Exa7: (a) BHCA partition; (b) eliminating first ouliers; (c) eliminating second ouliers; (d) multistage iterative final result.

Close modal
Figure12

Connection pipes between two partitions.

Figure12

Connection pipes between two partitions.

Close modal

Then, the NSPSA model is used to optimize WDS. Three equations (Equations (14)–(16)) are designed to minimize the number of boundary pipes, and make the pressure and leakage balances. The range of solutions was limited by two constraint conditions, the range of node pressure was set to 10 m and 50 m, and the range of water level of tank was set to 10 m and 25 m. The maximum iteration I was set to 100 and the update weight is 0.8. The solution set of the three equations are shown in Figure 13 and the top ten results are shown in Table 2. The run time of this WDS file is 3,468.69s (contained cluster time 24.68s and optimized time 3,444.01s).

Table 2

The top 10 results of Exa7 WDS

Connection pipesPressure uniformityLeakage uniformity
11 444.42 23.65 
13 441.30 23.63 
13 440.57 23.65 
13 440.57 23.65 
15 436.36 23.64 
14 438.72 23.63 
15 437.85 23.63 
Connection pipesPressure uniformityLeakage uniformity
11 444.42 23.65 
13 441.30 23.63 
13 440.57 23.65 
13 440.57 23.65 
15 436.36 23.64 
14 438.72 23.63 
15 437.85 23.63 
Figure 13

The result of NSPSA optimization.

Figure 13

The result of NSPSA optimization.

Close modal

The best solution of the number of connection pipes was selected to minimize the leakage uniformity equation. Thus, the point: (11, 444.42, 23.65) should be considered as the best solution, as shown in Table 3. Pressure reducing valves (PRV) were used to improve pressure management. The pressure and leakage uniformity were reduced by 162.60 and 3.35 respectively compared to the values before (607.03, 27.01), where are calculated by opening all of the boundary pipes.

Table 3

The states of Exa7 WDS

PartitionsNodesPipesConnection pipesConnection pipes (open)
56 67 
81 93 16 
28 31 
40 46 
64 68 
31 34 
37 48 10 
14 14 
10 26 29 
Total 383 437 70/2 = 35 24/2 = 12 
PartitionsNodesPipesConnection pipesConnection pipes (open)
56 67 
81 93 16 
28 31 
40 46 
64 68 
31 34 
37 48 10 
14 14 
10 26 29 
Total 383 437 70/2 = 35 24/2 = 12 

At the end of this model, 24 hour periods are selected to calculate the node average pressure and the pipe average leakage of each partition. The results are compared with those before partition optimization, as illustrated in Figures 14 and 15.

Figure 14

Average pressure of each partition before and after optimization.

Figure 14

Average pressure of each partition before and after optimization.

Close modal
Figure 15

Average leakage of each partition before and after optimization.

Figure 15

Average leakage of each partition before and after optimization.

Close modal

Seven example WDSs were used to test the method and demonstrate that it is universally feasible. The studied WDSs include three Kentucky (KY) systems (Jolly et al. 2014) from the Kentucky statewide WDS database, KY3, KY5, and KY11. The fourth and fifth are BWSN1 and BWSN2, developed by Ostfeld et al. 2008. The sixth is an example system distributed with EPANET called Net3 (Hwang & Lansey 2017). The final one is a case study of City F in China. Monitor points selection is based on a pressure sensitivity analysis model according to Amin et al. 2013 except for BWSN2, which used data of Zhengyi Wu in Ostfeld et al. 2008. The parameters of these WDSs are listed in Table 4 and the optimized setting values are:

Table 4

The parameters of the seven WDSs

WDSNodesPipesTanksReservoirsPumpsMonitors
KY3 273 349 10 
KY5 423 496 11 10 
KY11 805 846 28 23 10 
BWSN1 126 168 10 
BWSN2 12,523 14,822 20 
Net3 92 117 10 
City-F 920 1,032 10 
WDSNodesPipesTanksReservoirsPumpsMonitors
KY3 273 349 10 
KY5 423 496 11 10 
KY11 805 846 28 23 10 
BWSN1 126 168 10 
BWSN2 12,523 14,822 20 
Net3 92 117 10 
City-F 920 1,032 10 

Different WDSs need to set on different parameters. In Ky3, is set to 35 m and 100 m. The range of water levels of three tanks are set to [42.88, 48.98], [44.88, 52.50] and [28.69, 39.36] m, respectively. In Ky5, is set to 30 m and 150 m. The range of water levels of three tanks are set to [17.58, 25.18], [7.18, 17.85], and [8.19, 18.85] m. In Ky11, is set to 40 m and 200 m. The range of water levels of the 28 tanks are set to [0.10, 4.67], [0.00, 3.96], [30.07, 39.21], [12.22, 20.15], [1.87, 11.92], [3.80, 10.50], [13.83, 20.54], [28.02, 35.64], [18.22, 21.27], [22.38, 30.00], [0.20, 9.34], [33.89, 38.46], [30.74, 39.89], [25.99, 33.61], [24.47, 30.56], [0.13, 8.66], [41.03, 51.69], [0.22, 7.54], [18.65, 26.57], [34.33, 43.17], [21.19, 25.76], [0.07, 8.00], [18.91, 25.01], [32.94, 39.65], [15.70, 21.80], [12.18, 19.19], [14.47, 27.28] and [6.35, 10.92] m, respectively. In BWSN1, is set to 15 m and 510 m. The range of water levels of two tanks are set to [0, 9.75] and [0, 12.77] m. In BWSN2, is set to 5 m and 100 m. The range of water levels of tanks are set to [0, 7.17] and [1.22, 5.29] m. In Net3, is set to 10 m and 150 m. The range of water levels of three tanks are set to [0, 9.78], [1.98, 12.28], and [1.22, 10.82] m. The number of partitions is then set to 5. In City F, is set to 10 m and 40 m and there are no tanks in WDS.

The corresponding topological graphs are shown in Figure 16. The best optimized results for each WDS are given in Table 5 and illustrated in Figure 17.

Table 5

The results for seven WDSs

WDSPartitionsConnection pipesConnection pipes (open)Pressure uniformityLeakage uniformityCluster time (s)Optimized time (s)Total time (s)
KY3 10 22 15 0.18 1.03 11.48 111.65 123.13 
KY5 10 27 21 0.11 10.75 17.54 134.27 151.81 
KY11 10 13 12 0.02 0.53 70.32 1,340.21 1,410.53 
BWSN1 14 2.66 0.18 131.28 1,713.03 1,844.31 
BWSN2 10 94 40 31.44 260.35 2,883.54 163,666.67 166,550.21 
Net3 12 2.32 2.07 5.66 33.95 39.61 
City-F 10 36 29 0.13 0.07 71.15 1,229.85 1,301.00 
WDSPartitionsConnection pipesConnection pipes (open)Pressure uniformityLeakage uniformityCluster time (s)Optimized time (s)Total time (s)
KY3 10 22 15 0.18 1.03 11.48 111.65 123.13 
KY5 10 27 21 0.11 10.75 17.54 134.27 151.81 
KY11 10 13 12 0.02 0.53 70.32 1,340.21 1,410.53 
BWSN1 14 2.66 0.18 131.28 1,713.03 1,844.31 
BWSN2 10 94 40 31.44 260.35 2,883.54 163,666.67 166,550.21 
Net3 12 2.32 2.07 5.66 33.95 39.61 
City-F 10 36 29 0.13 0.07 71.15 1,229.85 1,301.00 
Figure 16

Topological graphs: (a) KY3; (b) KY5; (c) KY11; (d) BWSN1; (e) BWSN2; (f) Net3; (g) City F.

Figure 16

Topological graphs: (a) KY3; (b) KY5; (c) KY11; (d) BWSN1; (e) BWSN2; (f) Net3; (g) City F.

Close modal
Figure 17

Multistage iterative partition results: (a) KY3; (b) KY5; (c) KY11; (d) BWSN1; (e) BWSN2; (f) Net3; (g) City F.

Figure 17

Multistage iterative partition results: (a) KY3; (b) KY5; (c) KY11; (d) BWSN1; (e) BWSN2; (f) Net3; (g) City F.

Close modal

The results show that the pressure and leakage uniformity values at each partition for each specific case are reduced and partitions success in division. In other words, energy and economic losses are decreased due to the balanced pressure and reduced leakage. Thus, the multistage iterative fully automatic partitioning method is a practicable method for WDSs control and management.

A multistage iterative methodology with fully automatic partitioning is presented. In this methodology, BHCA is employed to divide WDS into different partitions. First and second outliers are eliminated by multistage iterative methods using traverse, KNN, and Warshall algorithms. Finally, the boundaries of each partition are optimized by NSPSA. The methodology is verified by case studies of real WDSs. The results indicate that the method is a practicability way to partition WDSs, balance pressures and reduce leakages.

Although it can automatically process a WDS file for partition, there are limits to this methodology. Monitor points are selected by pressure sensitivity analysis model, where it might be necessary to manually exclude monitor points before running the program. The second outliers are detected by application of the Warshall algorithm, and outliers are allocated in the clusters with the largest number of nodes, but some nodes might be set to an unfitted zone, which also might change the pressure sensitivity variations of the monitors. Future work should develop a new way to select monitors that meets the requirements above. And second oulier detection methods should be improved to adjust pressure sensitivity variations.

This work was supported by the Shenzhen Science and Technology Plan Program (KJYY20170413170501147) and the Major Science and Technology Program for Water Pollution Control and Treatment (2014ZX07406-003).

All relevant data are included in the paper or its Supplementary Information.

Al-Hemairi
H. A.
Shakir
R. H.
2006
Minimizing leakage rates in water distribution networks through optimal valves settings
. In
World Environmental and Water Resource Congress 2006
,
Reston, VA
.
American Society of Civil Engineers
, pp.
1
13
. .
Amin
S.
Litrico
X.
Sastry
S. S.
Bayen
A. M.
2013
Cyber security of water SCADA systems – part II: attack detection using enhanced hydrodynamic models
.
IEEE Transactions on Control Systems Technology
21
(
5
),
1679
1693
. .
Bentley Systems
2013
WaterGEMS V8i Users'Manual
.
Bentley Institute Press
,
Exton, PA
.
Bezdek
J. C.
1981
Pattern Recognition with Fuzzy Objective Function Algorithms
.
Springer US
,
Boston, MA
.
Deb
K.
Pratap
A.
Agarwal
S.
Meyarivan
T.
2002
A fast and elitist multiobjective genetic algorithm: NSGA-II
.
IEEE Transactions on Evolutionary Computation
6
(
2
),
182
197
. .
Farzadnia
E.
Shirazi
H.
2017
The Black Hole Clustering Algorithm: A MATLAB Simulation
.
Gao
T.
2014
Efficient identification of segments in water distribution networks
.
Journal of Water Resources Planning and Management
140
(
6
),
04014003.1
04014003.7
.
Germanopoulos
G.
Jowitt
P.
1989
Leakage reduction by excess pressure minimization in a water supply network
.
Proceedings of the Institution of Civil Engineers
87
(
2
),
195
214
. .
Gilbert
D.
Abraham
E.
Montalvo
I.
Piller
O.
2017
Iterative multistage method for a large water network sectorization into DMAs under multiple design objectives
.
Journal of Water Resources Planning and Management
143
(
11
),
04017067
. .
Jolly
M. D.
Lothes
A. D.
Sebastian Bryson
L.
Ormsbee
L.
2014
Research database of water distribution system models
.
Journal of Water Resources Planning and Management
140
(
4
),
410
416
. .
Laiyun
S.
(ed.)
2015
China Statistical Yearbook
.
National Bureau of Statistics of China
,
Beijing, China
.
McDonnell
B. E.
Salomons
E.
Uber
J.
Klise
K.
2007
Open Water Analytics (OWA)
. .
Ostfeld
A.
Uber
J. G.
Salomons
E.
Berry
J. W.
Hart
W. E.
Phillips
C. A.
Watson
J.-P.
Dorini
G.
Jonkergouw
P.
Kapelan
Z.
di Pierro
F.
Khu
S.-T.
Savic
D.
Eliades
D.
Polycarpou
M.
Ghimire
S. R.
Barkdoll
B. D.
Gueli
R.
Huang
J. J.
McBean
E. A.
James
W.
Krause
A.
Leskovec
J.
Isovitsch
S.
Xu
J.
Guestrin
C.
VanBriesen
J.
Small
M.
Fischbeck
P.
Preis
A.
Propato
M.
Piller
O.
Trachtman
G. B.
Wu
Z. Y.
Walski
T.
2008
The battle of the water sensor networks (BWSN): a design challenge for engineers and algorithms
.
Journal of Water Resources Planning and Management
134
(
6
),
556
568
. .
Rossman
L.
2000
EPANET 2 User Manual
.
US EPA
,
Washington, DC
.
Ulanicki
B.
Shipley
N.
Prescott
S.
2003
Analysis of district metered area (DMA) performance
. In:
Advances in Water Supply Management
.
Taylor & Francis
. .
Valizadeh
S.
Moshiri
B.
Salahshoor
K.
2009
Leak detection in transportation pipelines using feature extraction and KNN classification
. In
Pipelines 2009
,
Reston, VA
.
American Society of Civil Engineers
, pp.
580
589
. .
Warshall
S.
1962
A theorem on boolean matrices
.
Journal of the ACM (JACM)
9
(
1
),
11
12
.
http://dl.acm.org/doi/10.1145/321105.321107
.
Zhang
Y.
Li
S.
2017
PSA: A Novel Optimization Algorithm Based on Survival Rules of Porcellio Scaber
.
Zhang
Q.
Wu
Z. Y.
Zhao
M.
Qi
J.
Huang
Y.
Zhao
H.
2016
Leakage zone identification in large-scale water distribution systems using multiclass support vector machines
.
Journal of Water Resources Planning and Management
142
(
11
),
04016042
. .
Zhang
Q.
Wu
Z. Y.
Zhao
M.
Qi
J.
Huang
Y.
Zhao
H.
2017
Automatic partitioning of water distribution networks using multiscale community detection and multiobjective optimization
.
Journal of Water Resources Planning and Management
143
(
9
),
04017057
. .