## Abstract

Given the continuous aging of pipeline infrastructure buried underground, water utilities need to make a strategic plan on how to maintain the entire water distribution systems (WDSs) to ensure the required standard of supply. This paper investigates the nexus of three key factors that have a significant impact on the decision-making process of the rehabilitation plan for WDSs. The factors considered include the problem formulations, the pipe selection methods for identifying critical components of WDSs, and the multi-objective evolutionary algorithms (MOEAs). The nexus was revealed by considering all the combinations of two variants for each factor. The optimal rehabilitation problem of the Exeter network was used as a case study. Results exhibit that the problem formulation determined the range of Pareto fronts essentially, which should give the highest priority in the decision-making process. The pipe selection method played a secondary role, mainly affecting the shape of Pareto fronts. Optimization algorithms also had a considerable impact on the optimality of solutions, subject to their characteristics and parameter settings. This paper also highlights that breakthroughs need to focus on these key factors to facilitate a more cost-effective solution to the rehabilitation of large WDSs.

## HIGHLIGHTS

• A hierarchical framework is presented for the strategic rehabilitation planning of WDSs.

• The nexus among three key factors related to the rehabilitation of WDSs is investigated.

• The problem formulation is the most significant factor.

• The pipe selection method for identifying the vulnerable pipes of WDSs plays a secondary role.

• Advanced MOEAs that alleviate the parameterization issues are needed in practice.

### Graphical Abstract

Graphical Abstract
Graphical Abstract

## INTRODUCTION

The aging of water distribution systems (WDSs) is a common issue worldwide, especially in large and mega cities, which can deteriorate the service performance provided by utilities. As a result, WDSs need to be regularly maintained to cope with the deterioration of pipes and other components and meet the increasing demand for sustainable urban development (Sitzenfrei et al. 2013). The rehabilitation plan, either short- or long-term, is a vital resource since it usually involves substantial investment and has significant impacts on the operation of WDSs and citizens’ wellbeing.

Taking the United States as an example, more than two decades ago, the first report to Congress by the US Environmental Protection Agency found that the total infrastructure investment need was $138.4 billion and over 50% (about$77 billion) would be dedicated to transmission and distribution pipelines over the next 20 years (USEPA 1997). In the sixth report, the total 20-year need rises to $472.6 billion, with over 60% expenditure (about$312.6 billion) on water transmission and distribution projects. The need for the rehabilitation, replacement, and upgrade of existing infrastructure continuously escalates and is the cause of the overall increase in total national needs (USEPA 2018). Another outlook report agreed with the same situation in the OECD member states. Infrastructure investment will be challenged by the growing importance of maintenance, upgrading, and rehabilitation of existing infrastructure (OECD 2006).

Rehabilitation of WDSs is a challenging task, as it needs to ensure that substantial yet limited investment will improve the service performance to the required level in the most cost-effective way. As such, many methods and strategies have been proposed in the past decades. Readers are referred to the following review papers for insightful opinions and comments (Engelhardt et al. 2000; Bubtiena et al. 2012; Scheidegger et al. 2015). The rehabilitation task can be even more complicated by taking water demand uncertainty and staged construction into consideration. The system demand may continuously increase due to the effect of urbanization or occasionally decrease due to depopulation. Therefore, this intrinsic uncertainty will have significant impacts on the decisions of WDS rehabilitation.

On the other hand, the engineering construction involved in rehabilitation usually happens sequentially at different intervals, thus being prone to be affected by factors such as macro policies, fluctuations of inflation rates, and labor costs. Several strategic methodologies have been proposed to cope with difficulties mentioned above under long-term, unknown future conditions, including the scenario-based method (Kang & Lansey 2012), the flexible design method (Basupi & Kapelan 2015), the real options-based approach (Marques et al. 2015), and that using multi-criteria decision analysis (Scholten et al. 2014). Compared with the traditional, deterministic design methods, these strategic approaches provide decision-makers with a set of more cost-effective alternatives over the long-term horizons.

From the perspective of the nature of analytical methods, previous studies concerning the decision models for rehabilitation can be generally divided into two categories: (1) prioritization models based and (2) optimization models based. The former usually rely on various types of predictive models of pipe deterioration/failure rates, combining either statistical models or data-driven (mining) techniques with multi-criteria decision-making methods to generate a list of pipes which are of top priority in the rehabilitation program (Yoo et al. 2010; Choi et al. 2015). The latter apply various optimization algorithms, especially those based on multi-objective evolutionary algorithms (MOEAs), to dealing with where system components (mainly focused on pipes) should be considered for rehabilitation (Halhal et al. 1999; Cheung et al. 2003; Bozorg-Haddad et al. 2008; Olsson et al. 2009; Rahmani et al. 2016; Muhammed et al. 2017). Most recently, a combination of both types of models mentioned above has also been reported to be a promising alternative for planning the rehabilitation of WDSs (Vertommen et al. 2018).

The common issues associated with the previous methods include: (1) the prioritization models based methods usually require sufficient and accurate historical data about the leakage or burst of pipes, which are sometimes unavailable for many utilities, especially in developing countries. Also, the statistical or data-driven models without any linkage with hydraulic simulations may lead to missing really critical pipes within WDSs; (2) the optimality of solutions obtained by various MOEAs heavily relies on the settings of associated parameters, which is usually case by case and should be identified/fine-tuned through preliminary runs. This requirement could be time-intensive for larger WDSs and presents a challenging task to practitioners who lack the expertise of MOEAs; (3) most cases for testing/comparative purposes were hypothetical or small networks, which implies that these methods might not be easily expanded to cope with larger, real-world WDSs.

This paper focuses on the deterministic optimization models based method for the rehabilitation task, assuming that utilities have their hydraulic models of WDSs that are well maintained as key assets. We aim to answer the following two questions: (1) what are key factors of the strategic decision-making for the rehabilitation of large WDSs and (2) what is the nexus between these factors? The main contributions of this work are threefold: (1) proposing an analytical framework to facilitate a rational decision-making process, (2) exhibiting the impacts of key factors on the rehabilitation planning, and (3) suggesting the future work directions toward more informed decision-making leading to cost-effective, sustainable, and energy-efficient management of WDSs.

## METHODS

### Overview

No matter the long- or short-term, some fundamental questions need to be considered during the strategic planning on the rehabilitation of WDSs. As shown in Figure 1, those questions generally include (but not limited to) the following three: (i) How to define the problem, i.e., what are the goals of system rehabilitation? (ii) Where are the bottleneck components located? (iii) What kinds of analytical methods are effective and efficient in dealing with the planning task? Those questions can be further organized as a top-down structure, containing three levels of thinking. At the goal level, decision-makers need to determine what objectives are really of concern, and this level implicitly influences the lower levels. A well-maintained WDS should consistently meet its customers’ demand in sufficient quantity (i.e., satisfying minimum pressure heads) and good quality (e.g., no risks of unpleasant smell, discoloration, and recontamination). Importantly, this performance should be achieved in a cost-effective manner, reducing the energy cost and greenhouse gas emissions. The rehabilitation planning often targets the areas with poor performance (e.g., frequent customer complaints) due to aging infrastructure or excessive residence times. At the component level, decision-makers want to identify all the vulnerable components, such as the pipes with high unit head loss or a higher possibility of breakage, as these are the real bottlenecks of the system, which will be repaired, duplicated, or replaced. This level sets the scope of the rehabilitation work, in which only critical elements are taken into account. At the tool level, decision-makers have to rely on some analytical methods to establish a satisfactory solution to the rehabilitation problem defined at the previous two levels. Various simulation and optimization packages are usually necessary and helpful tools in making this task efficient.

Figure 1

Strategic planning on the rehabilitation of WDSs: from brainstorming to hierarchical thinking.

Figure 1

Strategic planning on the rehabilitation of WDSs: from brainstorming to hierarchical thinking.

More relevant questions, which may have a great impact, can be raised during the stage of brainstorming. However, in this paper, we only focus on three common factors (questions) mentioned above and make an effort to demonstrate how these three factors are interconnected with each other. For each factor, we choose two representative variants (i.e., two kinds of problem formulations, two types of selection methods for the identification of critical pipes, and two optimization algorithms) to restrict the analyses in an affordable range. In particular, we consider the tradeoffs between capital costs and two surrogate indicators of system performance, one is the number of nodes with pressure deficiencies (NNPD) (Muhammed et al. 2017), and the other is the network resilience index (NRI) (Prasad & Park 2004). A recently proposed, cluster-based technique for identifying the critical pipes within a WDS is compared with the engineers’ knowledge-based approach (Muhammed et al. 2017). Regarding the optimization methods, we consider the well-known NSGA-II (Deb et al. 2002) and a brand-new, hybrid MOEA (Wang et al. 2017). A brief introduction to these variants of each factor is provided below. Due to space limitations, readers are referred to the corresponding references for more details.

### Problem formulations: Cost vs. NNPD and Cost vs. NRI

The first formulation of the optimal rehabilitation problem of a WDS is given in Equation (1). The objectives are to minimize the capital cost and the NNPD. In a narrow sense, only pipe sizes for duplication are taken as decision variables, which can be chosen from a list of discrete options available in the market. This multi-objective formulation is a relaxed form of the least-cost design of WDSs, in which the constraints regarding the minimum nodal pressure heads are converted as the second objective.
(1)
where C is the capital cost; U(Di) is the unit cost of pipe i which mainly depends on its corresponding diameter Di; Li is the length of pipe i; np is the number of pipes considered for duplication; PDj is a Boolean value (i.e., 0/1) depending on the relationship between the actual head Hj and the minimum required head at node j; nn is the number of nodes within the network; and ns is the number of commercially available pipe sizes.

The benefits of using this kind of formulation are twofold. Firstly, it eliminates the troublesome task (often time-consuming) of setting the penalty factor to discriminate against the infeasible solutions from feasible ones. Secondly, in case of an insufficient budget, this formulation can provide some alternative solutions that only a few nodes suffer from pressure deficiencies but with significant investment savings. The latter point is probably a great advantage in dealing with practical projects. However, an improved formulation, which measures the magnitude of pressure deficiencies, should be considered to ensure the deficiencies are within an acceptable scale (i.e., marginally violating the head requirements).

The second formulation for WDS rehabilitation is shown in Equation (2). The objectives are to minimize the capital cost and maximize the NRI. The NRI is a widely used surrogate indicator of system reliability (Raad et al. 2010; Wang et al. 2015; Liu et al. 2017; Bin Mahmoud & Piratla 2018). It is derived from the law of energy conservation, where the total available power injected into a given WDS is dissipated in pipes (due to pipe wall friction resistance) and delivered to demand nodes (users). Besides the surplus power at demand nodes, the NRI also takes the pipe size uniformity into account. The increase of NRI can improve the energetic redundancy of the network system and ensure reliable loops around demand nodes, which in turn increases the mechanical reliability to overcome potential failures of components (e.g., pipe breakage). However, improving NRI may also introduce an adverse impact on water quality, and this is particularly of concern in the extremities of networks due to longer residence time.
(2)
where Uj is the uniformity of node j; qj is the demand at node j; npj is the number of pipes that connected to node j; di is the diameter of pipe i that connected to node j; Qr and Hr are the discharge and head at reservoir r, respectively; Pk is the energy provided by pump k; is the specific weight of water; and nr and npu are the numbers of reservoirs and pumps, respectively.

Compared with the first problem formulation, minimization of cost and maximization of the NRI simultaneously pose a constrained optimization problem and pay attention to the tradeoff between capital cost and system reliability. This will help decision-makers meet the regulations of supply service strictly and identify more reliable solutions.

### Pipe selection methods: knowledge-based empirical method vs. cluster-based approach

As the aging of a WDS, the hydraulic performance of system components deteriorates accordingly. Therefore, identifying the most critical/vulnerable components plays a crucial role in making a cost-effective rehabilitation plan. In the past, such a plan mainly relied on the expertise of local engineers who were familiar with the topological and operational characteristics of their WDSs. The drawback of the knowledge-based empirical method lies in the fact that the value (accuracy) of engineering judgment may diminish as the system layout changes or expands due to urban development. Also, water utilities may lose such expertise in case of staff turnover. Therefore, the analytical methods that can replace engineering judgment are demanded.

Recently, a novel method based on the graph theory clustering concept was developed to identify the critical pipes within a large WDS (Muhammed et al. 2017). The methodology starts with dividing the entire network into several clusters according to its connectivity properties. Then, pipes that may have a significant impact on system performance are identified and considered as decision variables for the rehabilitation problem through multi-objective optimization. Those pipes include the ones located inside clusters resulting in nodal pressure deficiencies and the ones existing as inter-cluster transmission pipelines or linking paths between water source(s) and clusters. The application of the cluster-based approach mentioned above was applied to the EXNET rehabilitation problem (Farmani et al. 2005) and successfully found better cost-effective solutions than those discovered by engineering judgment.

### Optimization algorithms: NSGA-II vs. hybrid MOEAs

The rehabilitation problems of WDSs described in Equations (1) and (2) are extremely challenging due to the combinatorial nature, which is NP-hard and computationally intensive. MOEAs are suitable for this type of problem due to the characteristics of metaheuristics. Below some key features of NSGA-II and the hybrid MOEA used are described.

NSGA-II is acknowledged as an ‘industry standard’ algorithm that has been successfully applied to a variety of optimization problems related to WDSs (Nicklow et al. 2010; Maier et al. 2014, 2019). The main driving force of this algorithm is the fast non-dominated sorting approach for ranking solutions based on the Pareto-dominance relationship. Solutions with higher rankings survive and are selected as parents for reproduction. In case solutions are non-dominated to each other (i.e., with the same ranking), a secondary sorting criterion known as the crowding distance is used. Solutions with larger crowding distances are preferred.

NSGA-II approximates the Pareto-optimal front of a given problem by generating better offspring iteratively. It implements a binary tournament selection to establish the mating pool (i.e., parents) and applies the simulated binary crossover (SBX) and the polynomial mutation (PM) with designated probabilities to create children from parents (Deb et al. 2002). The implicit elitism strategy ensures that the best solutions ever found in the search history are always retained in the population. This allows a new population to be derived from the combination of parents and their offspring via the fast non-dominated sorting approach. NSGA-II also provides an effective constraint-handling technique to deal with constrained problems.

Compared with NSGA-II, GALAXY, which stands for a genetically adaptive leaping algorithm for approximation and diversity, is a brand-new hybrid optimizer for dealing with the discrete, combinatorial, multi-objective design of WDSs (Wang et al. 2017). It follows the generational framework of MOEAs and employs six search operators for reproduction simultaneously. These operators are selected based on their leaping ability in the objective space from the global and local search perspectives. Three search operators (i.e., Turbulence Factor, Differential Evolution, and Simulated Binary Crossover for Integers) guide the search at the early stage due to excellent leaping abilities in the global sense; in contrast, the other search operators (i.e., Uniform Mutation, Gaussian Mutation, and Dither Creeping) intensify the local search by fine-tuning the solutions found in the near-optimal region.

GALAXY also implements several important strategies that guarantee its performance on the combinatorial optimization problems. The hybrid replacement strategy combines the power of Pareto-dominance and ε-dominance concepts (Laumanns et al. 2002), thus maintaining a good balance between exploration and exploitation and preventing global optima from being lost. A subtle modification to the original genetically adaptive strategy (Vrugt & Robinson 2007) adjusts the best portfolio of six operators and ensures a consistently robust search capability. The global information strategy (Vrugt & Robinson 2007) increases the probability of not-so-good operators to contribute high-quality solutions subsequently. The duplicates handling strategy periodically helps further to improve the convergence and diversity of final solutions.

Another advantage of GALAXY over existing MOEAs is the alleviation of the parameterization issues to a great extent. In practice, users only need to specify the population size and the computational budget (i.e., times of evaluating the objective functions). This significantly simplifies the fine-tuning process of parameters and improves its robustness intrinsically.

### Case study

The same EXNET water distribution system, as used in Muhammed et al. (2017), is considered in this study. The EXNET was set up by the Centre for Water Systems at the University of Exeter as a realistic and challenging rehabilitation problem (Farmani et al. 2004). The network serves a population of approximately 400,000 and needs to be reinforced to meet the projected demand in 2020. It consists of a great number of small pipes and few transmission mains, with a large head-loss range at the extremities of the system. Initially, this model was built as an extended period simulation model. It is simplified here as per Farmani et al. (2005) by replacing two tanks with fixed-head reservoirs and turning the extended period simulation model into a steady-state (snapshot) model, considering the peak demand of the system over 24 h.

The layout of the EXNET is shown in Figure 2, which contains 2,465 pipes, 5 valves, 1,891 junction nodes, and 7 water sources (blue dots). The total consumption is 3,245.81 l/s. Two major reservoirs (node 3001 and 3002) supply water to the system at fixed heads of 58.4 and 62.4 m, respectively. The system is also fed by its neighbor systems via node 3003 to 3007 at fixed rates. Three non-return valves (also known as check valves) are connected to node 3001 and 3002 to control the flow direction into and out of the system. One pressure reducing valve is located in the downstream of node 3004 to maintain the downstream pressure below 58.4 m. One throttle control valve is also in the link downstream of node 3004 to control the system's flow and pressure.

Figure 2

The EXNET's layout and the candidate pipes’ locations selected by Muhammed349 (green) and Engineer567 (blue).

Figure 2

The EXNET's layout and the candidate pipes’ locations selected by Muhammed349 (green) and Engineer567 (blue).

The minimum pressure required at each demand node is 15 m (Muhammed et al. 2017). As seen from Figure 2, one-third of nodes have pressure deficiencies (larger gray dots) evaluated by EPANET 2.0 (Rossman 2000), mainly concentrated in the west and southeast boundaries of the network. There are also some low-pressure nodes scattered near the sources and in the middle part of the system. In particular, some demand nodes suffer from negative pressure (red dots), which are located at the left and right terminals of the network. The reasons for pressure deficiencies in EXNET mainly include a great number of pipes with insufficient diameters, limited transmission mains, and significant differences in the elevations of demand nodes. Therefore, it is necessary to propose a cost-effective rehabilitation plan to improve the performance of this network and meet the projected demand.

Figure 2 depicts the layout of candidate pipes selected for duplication by the two methods described previously. The green and blue lines show the set of pipes identified by the cluster-based approach (denoted as Muhammed349) and the engineering judgment method (denoted as Engineer567), respectively. The numbers ‘349’ and ‘567’ at the subscript positions denote the numbers of critical pipes selected by Muhammed349 and Engineer567. The total length considered for rehabilitation by Muhammed349 and Engineer567 are about 101 and 167 km, which account for 17 and 28% of the pipe length of the entire network (i.e., 594 km), respectively. As shown in Figure 2, the former method has a clear pattern, including the pipes that link water sources to NNPD. In contrast, the pipes considered by the latter method seem to be scattered across the network. More data related to the rehabilitation of EXNET are provided in CWS (2020), including the Colebrook–White friction factors and unit costs of new pipes, which depend on the importance of roads (i.e., major roads or footpaths) and internal diameters.

### Experimental setup

The Exeter network's optimal rehabilitation problem was solved using different combinations of problem formulations, pipe selection methods, and MOEAs to investigate the nexus among the three factors. For simplicity, we considered only two cases for each factor, leading to eight groups of experiments. Ten independent runs were implemented for each experiment to eliminate the impact of randomness imposed at the initialization stage of MOEAs. The same computational budget was applied to each run of optimization; that is, the number of function evaluations (NFEs) was equal to 5 million, and the population size of 100 was fixed between NSGA-II and GALAXY. This computational budget allowed two MOEAs to evolve 50,000 generations, which was observed to achieve a satisfactory convergence in the preliminary runs.

As mentioned before, the GALAXY method has no extra parameters besides the population size and the NFEs. Therefore, to ensure the excellent performance of NSGA-II, a fine-tuning procedure of parameters was conducted, as described in Wang et al. (2019). In particular, the probabilities of SBX and PM were set to 0.9 and the inverse of the number of decision variables, respectively. The distribution indices of SBX and PM were fixed to 20 and 1, respectively.

We performed a posteriori reliability assessment of each rehabilitated network as a complementary and straightforward indicator to the NRI for some representative solutions obtained by different groups of experiments. For clarity, the average demand satisfaction (ADS) hereafter refers to a network system's capability to fulfill the nodal demands under mechanical failures (e.g., pipe breakage). In a narrow sense, we only considered the single-pipe failure scenarios due to the relatively low probabilities of simultaneous breaks of two or more pipes (Liu et al. 2017).

The reliability of a given network, measured by the ADS, is calculated by Equation (3). The nodal demands were evaluated by a pressure-driven analysis (PDA) model built into the recently released EPANET Toolkit 2.2 (McDonnell et al. 2019). The required pressure and the minimum pressure were set to 15 and 0 m, respectively. This means that the nodal demand can be supplied only if the pressure is above 0 m and is fully satisfied if the pressure head is greater than 15 m. This is because PDA is recognized as providing a more realistic picture of the hydraulic behavior of WDSs during pressure-deficient conditions than the traditional demand-driven analysis (Giustolisi & Walski 2012). Specifically, each pipe considered failed was first closed in the EPANET model. This was followed by the hydraulic simulation and the retrieval of the actual demand at each node. This iterative process continued until all candidate pipes were processed, one at a time. Without losing the generality, every single pipe is assumed to be out of service with equal probability. Thus, the ADS exhibits the overall satisfaction rate of nodal demand across all the possible pipe failures.
(3)
where nf is the number of potential single-pipe isolation scenarios; is the actual demand at node j evaluated by the PDA when pipe i is closed; and is the required demand at node j.
Another quantitative metric, termed as the probability of suffering from pressure deficiencies at a specific node (denoted as ), is also used to measure the impact of single-pipe isolation events (see Equation (4)). is equal to the ratio of the cumulative to the number of all the possible single-pipe isolation scenarios.
(4)
where is a Boolean value (i.e., 0/1) depending on the relationship between the actual head and the minimum required head at node j when pipe i is out of service.

## RESULTS AND DISCUSSION

### Impact of problem formulations

Figure 3 shows how the problem formulations impact the Pareto fronts, no matter which pipe selection method and MOEA were used. There exists a considerable gap between the solutions obtained under two formulations in terms of capital costs, despite a small number of overlaps around £3 M. This gap suggests the different landscapes of the search space established by the two problem formulations. In particular, minimizing both the capital cost and NNPD makes the rehabilitation task an unconstrained optimization problem. Only the solutions lying on the x-axis are fully feasible and meet the minimum pressure head requirements across the network. In contrast, minimizing the capital cost and maximizing the NRI transform the rehabilitation of the Exeter network as a constrained problem; thus, each solution on the Pareto front is strictly feasible. Consequently, more investment is required to ensure sufficient pressure heads at demand nodes across the network.

Figure 3

Pareto fronts obtained by different problem formulations (the red dots denote the least-cost solutions obtained under two problem formulations, which are and in Table 1, respectively).

Figure 3

Pareto fronts obtained by different problem formulations (the red dots denote the least-cost solutions obtained under two problem formulations, which are and in Table 1, respectively).

Table 1

Impact of the problem formulation on the least-cost solutions for decision-making

Assessment criteriaCost vs. NNPD
Cost vs. NRI
Cost (£M) 1.6434 1.7158 2.5004 2.5079 2.8540 2.9713 3.0021 4.2809
NRI (–) 0.1915 0.1949 0.1846 0.1808 0.2152 0.2139 0.1900 0.2388
ADS (–) 0.9991 0.9991 0.9991 0.9991 0.9993 0.9993 0.9993 0.9993
No. of duplications 40 42 61 65 111 92 109 186
Duplication length (m) 8,903 9,993 12,160 14,033 21,303 15,279 22,187 37,365
Avg. diam. (mm) 231.7 215.5 248.0 233.2 180.4 233.2 193.0 139.0
(%) = 0 365 365 376 430 367 486 442 432
(0,5] 1,479 1,482 1,475 1,420 1,496 1,380 1,417 1,428
(5,10] 34 33 31 30 16 14 23 21
(10,15]
(15,20]
(20,25]
(25,30]
(30,35]
Assessment criteriaCost vs. NNPD
Cost vs. NRI
Cost (£M) 1.6434 1.7158 2.5004 2.5079 2.8540 2.9713 3.0021 4.2809
NRI (–) 0.1915 0.1949 0.1846 0.1808 0.2152 0.2139 0.1900 0.2388
ADS (–) 0.9991 0.9991 0.9991 0.9991 0.9993 0.9993 0.9993 0.9993
No. of duplications 40 42 61 65 111 92 109 186
Duplication length (m) 8,903 9,993 12,160 14,033 21,303 15,279 22,187 37,365
Avg. diam. (mm) 231.7 215.5 248.0 233.2 180.4 233.2 193.0 139.0
(%) = 0 365 365 376 430 367 486 442 432
(0,5] 1,479 1,482 1,475 1,420 1,496 1,380 1,417 1,428
(5,10] 34 33 31 30 16 14 23 21
(10,15]
(15,20]
(20,25]
(25,30]
(30,35]

Note: represents the least-cost solution obtained by an MOEA and a pipe selection method. The letters ‘N’ and ‘G’ at the superscript positions denote ‘NSGA-II’ and ‘GALAXY’, respectively. The numbers ‘349’ and ‘567’ have the same meanings as mentioned before, which denote the numbers of critical pipes selected by Muhammed349 and Engineer567, respectively. The total number of demand nodes considered for calculating is 1,885.

From the decision-making perspective, the two problem formulations show conflicting information, which would yield significantly increased investment. For instance, the capital costs of the least-cost solutions (marked as red dots in Figure 3) obtained under Cost vs. NNPD and Cost vs. NRI are around £1.64 M and £2.85 M, respectively. The NRI values of the two solutions are about 0.192 and 0.215. This means that increasing the capital cost by 74% leads to only a 12% improvement in the surrogate reliability indicator, which seems to be not cost-effective. Table 1 compares the eight least-cost solutions identified in each experiment. The capital costs of these solutions range from £1.64 M to £4.28 M, implying a sharp increase (more than 160%) in capital costs influenced by problem formulations, pipe selection methods, and optimization algorithms. The solutions discovered under Cost vs. NNPD are always cheaper than their counterparts under Cost vs. NRI, which correspond to the gap between the two problem formulations, as shown in Figure 3.

Additionally, and under Cost vs. NNPD can save 34.5 and 31.4% compared with and , respectively. Meanwhile, the NRI values of and under Cost vs. NNPD are slightly better than and . In the Pareto sense, the former two solutions dominate the latter ones, which may suggest that identifying the critical components by local engineers are more accurate than those by the cluster-based method under Cost vs. NNPD.

To further validate to what extend the network demands would be affected during the pipe failure scenarios, we calculated the ADS and the following Equations (3) and (4), respectively (see Table 1). The ADS of the original network before rehabilitation is 0.9651. No matter which formulation was used, the ADS of the EXNET was improved (about 3.5%) after critical pipes were duplicated. However, the differences in terms of ADS are not as noteworthy as appeared in NRI. Specifically, from under Cost vs. NNPD to under Cost vs. NRI, the ADS has a small increase (0.02%) despite the fact that the capital cost and the NRI are raised by 160 and 24.7%, respectively. Furthermore, from the distribution of across the eight least-cost solutions, solutions obtained under Cost vs. NRI are generally more reliable in the single-pipe isolation events than those under Cost vs. NNPD. However, this advantage is negligible. In particular, 19.4–25.8% of nodes are free of risk, no matter which pipe is failed. 73.2–79.4% of nodes have a very low chance (less than 5%) of facing low-pressure supply. Only 1–2.2% of nodes have to bear pressure deficiencies with a probability higher than 5%. In the absence of information on the importance of different nodes, the improved reliability cannot compensate for the massive increase in capital costs. In other words, the solutions obtained under Cost vs. NNPD are better choices due to considerable savings in expenditure. Without the comparison between two types of problem formulations, decision-makers who follow the multi-objective optimization involving Cost and NRI may incur far less efficient investment with marginal benefit.

Table 1 also shows the number, length, and average diameter of pipes rehabilitated for each least-cost solution obtained in this study. The number (or length) of pipes duplicated plays a key role in rehabilitation cost. Specifically, the rehabilitation cost has a clear positive correlation with the number and length of pipes duplicated. In contrast, the rehabilitation cost seems to have an adverse trend against the average diameter of pipes duplicated. No matter which MOEA and pipe selection method was used, the number and length of pipes rehabilitated under Cost vs. NNPD are much less than those under Cost vs. NRI. This is straightforward because the former formulation tends to meet the minimum pressure heads at the lowest cost. Therefore, duplicating a minimum set of bottleneck pipes with smaller pipes in diameter turns out to be a cost-effective decision. In this sense, the lengths of pipe rehabilitated are within 8.9–14.0 km under Cost vs. NNPD, accounting for less than 2.5% of the total length of the EXNET (i.e., 594 km). When solving the rehabilitation problem from the perspective of Cost vs. NRI, more investment is required to strengthen the reliable loops across the EXNET.

### Impact of pipe selection methods

Figure 4(a) compares the Pareto fronts obtained by NSGA-II using different pipe selection methods under Cost vs. NNPD formulation. The Pareto fronts identified by Engineer567 nearly dominate those by Muhammed349, especially when the capital costs are beyond £0.6 M. The gaps between the two Pareto fronts become larger as the capital costs increase, indicating that the pipe selection method suggested by Engineer567 leads to better converged Pareto front. After optimization, only duplicates 40 existing pipes (7.1% of the 567 candidate pipes), while suggests 65 of 349 old pipes (18.6%) are rehabilitated. This proves Engineer567 effectively covers more critical components compared with those by Muhammed349; thus, money can be spent where it matters most.

Figure 4

Comparisons of Pareto fronts obtained by NSGA-II using different pipe selection methods: (a) Under Cost vs. NNPD formulation and (b) Under Cost vs. NRI formulation.

Figure 4

Comparisons of Pareto fronts obtained by NSGA-II using different pipe selection methods: (a) Under Cost vs. NNPD formulation and (b) Under Cost vs. NRI formulation.

Figure 4(b) compares the Pareto fronts obtained by NSGA-II using different pipe selection methods under Cost vs. NRI formulation. Unlike Figure 4(a), the Pareto fronts obtained by two pipe selection methods have their advantages. The Pareto front obtained by Muhammed349 marginally dominates that by Engineer567 in the range of costs between £3 M and £13 M. Beyond £13 M, the Pareto fronts obtained by Engineer567 outperform those by Muhammed349. The cross of the two Pareto fronts implies that each pipe selection method captures some more critical pipes than its counterpart. In other words, when the capital costs are limited to less than £13 M, Muhammed349 identifies more cost-effective solutions by duplicating the pipes not covered by Engineer567, and this situation reverses when more expenditure is available.

Figure 5(a) shows the comparison of Pareto fronts obtained by GALAXY using different pipe selection methods under Cost vs. NNPD formulation. A very similar pattern can be observed, as seen in Figure 4(a). This confirms that no matter what kind of MOEAs is used, the Pareto fronts identified by Engineer567 largely outperforms those by Muhammed349. In contrast, Figure 5(b) demonstrates minor differences from Figure 4(b). Specifically, the Pareto fronts obtained by Muhammed349 intersect twice with those by Engineer567. In the range of costs between £10 M and £20 M, the solutions discovered by Muhammed349 dominate those by Engineer567. However, if the expenditure is out of this range, the solutions found by Engineer567 are superior to those by Muhammed349, especially when capital costs are over £20 M. The differences between Figures 4(b) and 5(b) are probably attributed to the search capability of two MOEAs, which will be discussed in detail in the subsection below.

Figure 5

Comparisons of Pareto fronts obtained by GALAXY using different pipe selection methods: (a) Under Cost vs. NNPD formulation and (b) Under Cost vs. NRI formulation.

Figure 5

Comparisons of Pareto fronts obtained by GALAXY using different pipe selection methods: (a) Under Cost vs. NNPD formulation and (b) Under Cost vs. NRI formulation.

### Impact of optimization algorithms

Figure 6 presents the comparisons of Pareto fronts discovered by GALAXY and NSGA-II under different combinations of the other two factors, i.e., the problem formulations and the pipe selection methods. Figure 6 is derived from the same data, as shown in Figures 4 and 5, but organized differently. This is to differentiate the performance of two MOEAs under different scenarios. As shown in Figure 6, the Pareto fronts identified by NSGA-II consistently outperform those by GALAXY, except for a small number of solutions observed in the boundary regions in Figure 6(c). In other words, NSGA-II, after a fine-tuning procedure, managed to discover better converged and distributed solutions than GALAXY regardless of problem formulations and pipe selection methods. However, it is worth noting that the performance of NSGA-II varied significantly if different parameter settings were used (see Supplementary Appendix for details). This means that using default settings as specified in Deb et al. (2002) or less effective parameter combinations could result in much worse Pareto fronts compared with those by GALAXY. In contrast, GALAXY is capable of providing robust solutions due to the alleviation of the parameterization issues of MOEAs to a great extent.

Figure 6

Comparisons of Pareto fronts obtained by different optimization algorithms: (a) Cost vs. NNPD using Engineer567, (b) Cost vs. NNPD using Muhammed349, (c) Cost vs. NRI using Engineer567, and (d) Cost vs. NRI using Muhammed349.

Figure 6

Comparisons of Pareto fronts obtained by different optimization algorithms: (a) Cost vs. NNPD using Engineer567, (b) Cost vs. NNPD using Muhammed349, (c) Cost vs. NRI using Engineer567, and (d) Cost vs. NRI using Muhammed349.

As a larger network than other commonly used benchmark networks for the rehabilitation purpose in the literature (e.g., New York Tunnel), EXNET requires more computational time for MOEAs to converge. In particular, we carried out the numerical experiments on a desktop computer with Intel® Core™ i7-6700 CPU @ 3.4 GHz with a RAM of 8 GB. GALAXY and NSGA-II took on average 9.4 and 5.2 h to solve the Cost vs. NNPD problems for 5 million NFEs. Under Cost vs. NRI formulation, the required CPU runtime for GALAXY and NSGA-II increased to 18.1 and 21.7 h, respectively, because calculating NRI is more computationally intensive than NNPD. Admittedly, applying the proposed approach to a real-world rehabilitation problem, including hundreds of thousands of nodes and pipes in a hydraulic model, would be hindered by the computational overhead. A smart workaround will be necessary in such a case. Researchers and practitioners can either resort to high-performance computing techniques and facilities (e.g., using GPU-accelerated computing on a supercomputer) or using a skeletonized network model, eliminating a significant number of components that have minor impacts.

On the other hand, since the comparison of two MOEAs was not the target of this study, we did not pay particular attention to the impact of population size on the quality of Pareto fronts. Using a small population size (i.e., 100) to deal with the optimal rehabilitation of the Exeter network, which contains 349 or 567 decision variables, may not be efficient to steer the exploration out of the massive local optima existing in the landscape of search space. Therefore, by increasing the population size, both NSGA-II and GALAXY may demonstrate improved (or degraded) performance. However, it is out of the scope of this paper.

## CONCLUSIONS

Many factors can have significant impacts on the strategic decision-making process of the rehabilitation of aging WDSs. This paper establishes a three-layer hierarchical framework of decision-making by abstracting some key factors from the technical perspective. The nexus of these key factors is investigated using the EXNET case study.

The three key factors, including the problem formulations, the pipe selection methods, and the MOEAs, have significantly different impacts on the strategic rehabilitation planning, which should not be overlooked by decision-makers. Specifically, the problem formulation has the highest priority in determining where and how the rehabilitation measures (e.g., pipe duplications in this study) should be implemented. This is because the way that how the optimization problem is formulated affects the landscape of search space in essence. Furthermore, this landscape controls where an effective and efficient optimization algorithm can approach, which in turn influences the optimality of the solutions obtained. Second, the pipe selection method plays a secondary role, which mainly dominates the shape of the Pareto fronts obtained. If more bottleneck pipes within a WDS can be identified in a specific pipe selection method, it is more likely to generate better converged and more diversified Pareto fronts than other pipe selection methods. Third, optimization algorithms also have a considerable impact on the optimality of results, despite not as important as the pipe selection methods. The performance of MOEAs is subject to their characteristics and parameter settings provided that the computational budget is chosen appropriately.

The implications of the current study are threefold. First, how to properly define the optimization problem is vital to decision-makers. In particular, many-objective optimization has been demonstrated in some relevant studies (Fu et al. 2013; Reed et al. 2013) to provide a more comprehensive perspective than traditional multi-objective optimization methodologies, with only two or three objectives optimized concurrently. What kind of problem formulation is more convincing and reasonable than most used ones (as considered in this paper) in the rehabilitation of WDSs turns out to be a salient and promising research direction. For instance, the optimal rehabilitation problem can be formulated to discover the tradeoff of diversified objectives at both the design and operation stages (i.e., capital cost, operational cost, network leakage, and water age). This may facilitate the identification of a more sustainable and energy-efficient solution to network rehabilitation, which ensures a better benefit over the long-term horizons.

Second, a more theoretical-sound methodology to identify the bottleneck pipes within a given WDS is practical and beneficial to decision-makers. Some researchers rely on the clustering methods to screen out candidate pipes, which is generally under the philosophy of ‘divide-and-conquer’. However, with the continuously increasing volume of monitoring data and the aid of modern techniques for dealing with big data, for instance, deep learning, it is possible to build up a data-driven model that is more sophisticated in identifying critical pipes within a broad set of candidate ones. The static data such as pipe length, diameter, material, or even length of service, and simulated data based on a calibrated model, such as pressure heads of adjacent nodes, unit head loss, and flow rates within pipes, are valuable assets which can be used to train the data-driven model. This will probably substitute the empirical methods (e.g., engineering judgment).

Last but not least, more in-depth investigations on the existing MOEAs, as shown in Zecchin et al. (2012), Zheng et al. (2016, 2017a, 2017b), and Wang et al. (2020), can provide more insights into the behavior of these optimization tools and improve the search efficiency by reducing the computational budget and/or alleviating the parameterization issues to a great extent.

The progress in all these above research directions will further advance the decision-making toward more cost-effective, sustainable, and energy-efficient WDSs. Such technological breakthroughs probably lead to user-friendly software, which will address the challenging rehabilitation issues facing most water companies worldwide.

## ACKNOWLEDGEMENTS

The authors disclosed receipt of the following financial support for the research, authorship, and publication of this article: This work was sponsored by the Open Research Fund of the Yellow River Sediment Key Laboratory affiliated to the Ministry of Water Resources, P.R. China (Grant No. 201904).

## DATA AVAILABILITY STATEMENT

All relevant data are available from an online repository or repositories (http://dx.doi.org/10.17632/f2ydwz75nh.1).

## REFERENCES

Basupi
I.
Kapelan
Z.
2015
Evaluating flexibility in water distribution system design under future demand uncertainty
.
Journal of Infrastructure Systems
21
(
2
),
04014034
.
Bin Mahmoud
A. A.
Piratla
K. R.
2018
Comparative evaluation of resilience metrics for water distribution systems using a pressure driven demand-based reliability approach
.
Journal of Water Supply Research and Technology – AQUA
67
(
6
),
517
530
.
O.
B. J.
Marino
M. A.
2008
Optimum rehabilitation strategy of water distribution systems using the HBMO algorithm
.
Journal of Water Supply: Research and Technology – AQUA
57
(
5
),
337
350
.
Bubtiena
A. M.
El Shafei
A. H.
Jafaar
O.
2012
Review of rehabilitation strategies for water distribution pipes
.
Journal of Water Supply: Research and Technology – AQUA
61
(
1
),
23
31
.
Cheung
P. B.
Reis
L. F. R.
Formiga
K. T. M.
Chaudhry
F. H.
Ticona
W. G. C.
2003
Multiobjective evolutionary algorithms applied to the rehabilitation of a water distribution system: a comparative study
. In:
Evolutionary Multi-Criterion Optimization
(
Fonseca
C. M.
Fleming
P. J.
Zitzler
E.
Deb
K.
Thiele
L.
, eds). EMO 2003. Lecture Notes in Computer Science, vol 2632. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36970-8_47
Choi
T.
Han
J.
Koo
J.
2015
Decision method for rehabilitation priority of water distribution system using ELECTRE method
.
Desalination and Water Treatment
53
(
9
),
2369
2377
.
CWS
2020
EXNET Case at the Centre for Water Systems of Exeter University
. .
Deb
K.
Pratap
A.
Agarwal
S.
Meyarivan
T.
2002
A fast and elitist multiobjective genetic algorithm: NSGA-II
.
IEEE Transactions on Evolutionary Computation
6
(
2
),
182
197
.
Engelhardt
M. O.
Skipworth
P. J.
Savic
D. A.
Saul
A. J.
Walters
G. A.
2000
Rehabilitation strategies for water distribution networks: a literature review with a UK perspective
.
Urban Water
2
,
153
170
.
Farmani
R.
Savic
D. A.
Walters
G. A.
2004
‘EXNET’ benchmark problem for multi-objective optimization of large water systems
. In
Modelling and Control for Participatory Planning and Managing Water Systems, IFAC Workshop
,
Venice, Italy
.
Farmani
R.
Savic
D. A.
Walters
G. A.
2005
Evolutionary multi-objective optimization in water distribution network design
.
Engineering Optimization
37
(
2
),
167
183
.
Fu
G. T.
Kapelan
Z.
Kasprzyk
J. R.
Reed
P.
2013
Optimal design of water distribution systems using many-objective visual analytics
.
Journal of Water Resources Planning and Management
139
(
6
),
624
633
.
Giustolisi
O.
Walski
T. M.
2012
Demand components in water distribution network analysis
.
Journal of Water Resources Planning and Management
138
(
4
),
356
367
.
Halhal
D.
Walters
G. A.
Savic
D. A.
Ouazar
D.
1999
Scheduling of water distribution system rehabilitation using structured messy genetic algorithms
.
Evolutionary Computation
7
(
3
),
311
329
.
Kang
D.
Lansey
K.
2012
Scenario-based multistage construction of water supply infrastructure
. In
World Environmental and Water Resources Congress 2012: Crossing Boundaries
,
ASCE
,
Reston, VA
, 3265–3274
.
Laumanns
M.
Thiele
L.
Deb
K.
Zitzler
E.
2002
Combining convergence and diversity in evolutionary multiobjective optimization
.
Evolutionary Computation
10
(
3
),
263
282
.
Liu
H.
Savic
D.
Kapelan
Z.
Creaco
E.
Yuan
Y.
2017
Reliability surrogate measures for water distribution system design: comparative analysis
.
Journal of Water Resources Planning and Management
143
,
2
.
Maier
H. R.
Kapelan
Z.
Kasprzyk
J.
Kollat
J.
Matott
L. S.
Cunha
M. C.
Dandy
G. C.
Gibbs
M. S.
Keedwell
E.
Marchi
A.
Ostfeld
A.
Savic
D.
Solomatine
D. P.
Vrugt
J. A.
Zecchin
A. C.
Minsker
B. S.
Barbour
E. J.
Kuczera
G.
Pasha
F.
Castelletti
A.
Giuliani
M.
Reed
P. M.
2014
Evolutionary algorithms and other metaheuristics in water resources: current status, research challenges and future directions
.
Environmental Modelling & Software
62
,
271
299
.
Maier
H. R.
Razavi
S.
Kapelan
Z.
Matott
L. S.
Kasprzyk
J.
Tolson
B. A.
2019
Introductory overview: optimization using evolutionary algorithms and other metaheuristics
.
Environmental Modelling & Software
114
,
195
213
.
Marques
J.
Cunha
M.
Savić
D. A.
2015
Multi-objective optimization of water distribution systems based on a real options approach
.
Environmental Modelling & Software
63
,
1
13
.
McDonnell
B. E.
Salomons
E.
Uber
J.
Klise
K.
Hatchett
S.
Woo
H.
2019
OWA-EPANET Toolkit 2.2
. .
Muhammed
K.
Farmani
R.
K.
Diao
K.
Butler
D.
2017
Optimal rehabilitation of water distribution systems using a cluster-based technique
.
Journal of Water Resources Planning and Management
143
(
7
), 04017022.
Nicklow
J.
Reed
P.
Savic
D.
Dessalegne
T.
Harrell
L.
Chan-Hilton
A.
Karamouz
M.
Minsker
B.
Ostfeld
A.
Singh
A.
Zechman
E.
2010
State of the art for genetic algorithms and beyond in water resources planning and management
.
Journal of Water Resources Planning and Management
136
(
4
),
412
432
.
OECD
2006
Infrastructure to 2030: Telecom, Land Transport, Water and Electricity
.
OECD
.
Olsson
R. J.
Kapelan
Z.
Savic
D. A.
2009
Probabilistic building block identification for the optimal design and rehabilitation of water distribution systems
.
Journal of Hydroinformatics
11
(
2
),
89
105
.
T. D.
Park
N. S.
2004
Multiobjective genetic algorithms for design of water distribution networks
.
Journal of Water Resources Planning and Management – ASCE
130
(
1
),
73
82
.
D. N.
Sinske
A. N.
van Vuuren
J. H.
2010
Comparison of four reliability surrogate measures for water distribution systems design
.
Water Resources Research
46
, W05524.
Rahmani
F.
K.
Ardeshir
A.
2016
Rehabilitation of a water distribution system using sequential multiobjective optimization models
.
Journal of Water Resources Planning and Management
142
,
5
.
Reed
P. M.
D.
Herman
J. D.
Kasprzyk
J. R.
Kollat
J. B.
2013
Evolutionary multiobjective optimization in water resources: the past, present, and future
.
51
(
0
),
438
456
.
Rossman
L. A.
2000
EPANET Users Manual
.
U.S. Environmental Protection Agency
,
Washington, DC
.
Scheidegger
A.
Leitao
J. P.
Scholten
L.
2015
Statistical failure models for water distribution pipes – a review from a unified perspective
.
Water Research
83
,
237
247
.
Scholten
L.
Scheidegger
A.
Reichert
P.
Mauer
M.
Lienert
J.
2014
Strategic rehabilitation planning of piped water networks using multi-criteria decision analysis
.
Water Research
49
,
124
143
.
USEPA
1997
Drinking water infrastructure needs survey. In First Rep. to Congress, Washington, DC. Office of Water
.
USEPA
2018
Drinking water infrastructure needs survey. In Sixth Rep. to Congress, Washington, DC. Office of Water
.
Vertommen
I.
Laarhoven
K.
Thienen
P.
Agudelo-Vera
C.
Haaijer
T.
Diemel
R.
2018
Optimal design of and transition towards water distribution network blueprints
.
Proceedings
2
,
584
.
Vrugt
J. A.
Robinson
B. A.
2007
Improved evolutionary optimization from genetically adaptive multimethod search
.
Proceedings of the National Academy of Sciences
104
(
3
),
708
711
.
Wang
Q.
Guidolin
M.
Savić
D.
Kapelan
Z.
2015
Two-objective design of benchmark problems of water distribution system via MOEAs: towards the best-known approximation to the true Pareto front
.
Journal of Water Resources Planning and Management
141
(
3
), 04014060.
Wang
Q.
Savić
D. A.
Kapelan
Z.
2017
GALAXY: a new hybrid MOEA for the optimal design of water distribution systems
.
Water Resources Research
53
(
3
),
1997
2015
.
Wang
Q.
Wang
L.
Huang
W.
Wang
Z.
Liu
S.
Savić
D. A.
2019
Parameterization of NSGA-II for the optimal design of water distribution systems
.
Water
11
(
5
),
971
.
Wang
P.
Zecchin
A. C.
Maier
H. R.
Zheng
F.
Newman
J. P.
2020
Do existing multiobjective evolutionary algorithms use a sufficient number of operators? An empirical investigation for water distribution design problems
.
Water Resources Research
56
(
5
),
e2019WR026031
.
Yoo
D. G.
Kim
J. H.
Jun
H.
2010
Study of rehabilitation priority order of pipes for water distribution systems using Utopian approach
.
Journal of the Korean Society of Water and Wastewater
24
(
2
),
183
193
.
Zecchin
A. C.
Simpson
A. R.
Maier
H. R.
Marchi
A.
Nixon
J. B.
2012
Improved understanding of the searching behavior of ant colony optimization algorithms applied to the water distribution design problem
.
Water Resources Research
48
(
9
),
W09505
.
Zheng
F.
Zecchin
A. C.
Maier
H. R.
Simpson
A. R.
2016
Comparison of the searching behavior of NSGA-II, SAMODE, and Borg MOEAs applied to water distribution system design problems
.
Journal of Water Resources Planning and Management
142
(
7
),
04016017
.
Zheng
F.
Zecchin
A. C.
Newman
J. P.
Maier
H. R.
Dandy
G. C.
2017b
An adaptive convergence-trajectory controlled ant colony optimization algorithm with application to water distribution system design problems
.
IEEE Transactions on Evolutionary Computation
21
(
5
),
773
791
.