## Abstract

An issue regarding near-optimal solutions identified by evolutionary algorithms (EAs) is that their absolute deviations from the global optima are often unknown, and hence an EA's performance in handling real-world problems remains unclear. To this end, this paper investigates how close optimal solutions from simple EAs can approach the global optimal for water distribution system (WDS) design problems through an experiment with the number of decision variables ranging from 21 to 3,400. Three simple EAs are considered: the standard differential evolution, the standard genetic algorithm and the creeping genetic algorithm (CGA). The CGA consistently identifies optimal solutions with deviations lower than 50% to the global optimal, even for the WDS with 3,400 decision variables, but the performance of the other two EAs is heavily case study dependent. Results obtained build knowledge regarding these simple EAs’ ability in handling WDS design problems with different sizes. We must acknowledge that these results are conditioned on the WDSs and the parameterization strategies used, and future studies should focus on generalizing the findings obtained in this paper.

## HIGHLIGHTS

This paper explores EA's ability in finding global optima.

This paper gives guidelines for the selection of EAs based on the number of decision variables.

### Graphical Abstract

## INTRODUCTION

Evolutionary algorithms (EAs) have been widely used to deal with water resources optimization problems of various types, including model calibration, engineering design, and system operations and management (Nicklow *et al.* 2010; Maier *et al.* 2014; Jeong *et al.* 2017; Duan *et al.* 2020; Wang *et al.* 2020). Their popularities are attributed to EAs providing a global search strategy without a need for gradient or Hessian information, thereby making them ideally suited for application to real-world water resources problems, which typically involve highly complex physical models, non-linear objectives and/or constraints, and complex solution spaces (Wang *et al.* 2015a; Bozorg-Haddad *et al.* 2017; Choi *et al.* 2017; Zheng *et al.* 2017a).

Originating from a simple genetic algorithm (GA; Simpson *et al.* 1994), different types of EAs have been developed and applied to water resources problems over the past few decades. These include the simulated annealing (Loganathan *et al.* 1995), Tabu search (Lippai *et al.* 1999), harmony search (Geem *et al.* 2001), the shuffled frog leaping algorithm (Eusuff & Lansey 2003), ant colony optimization (ACO; Maier *et al.* 2003), particle swarm optimization (Suribabu & Neelakantan 2006), differential evolution algorithms (Zheng *et al.* 2015) and GA (Huang *et al.* 2020). The performance of these EAs applied to water resources optimization problems is assessed by their solution quality and search efficiency (Nicklow *et al.* 2010). However, the majority of these case studies considered have been on benchmark problems with relatively small search spaces (Kang & Lansey 2012), and thus generally lacking connection to real-world problems that are often of high complexity and dimension (Artita *et al.* 2013).

Having recognized this, there has been an increasing interest in applying EAs to handle large and real-world water resources problems. For examples, in the research area of water distribution system (WDS) design, Reca & Martínez (2006) applied GAs to a real-world irrigation design problem with 454 decision variables; Marchi *et al.* (2014) optimized the design of a WDS with 476 decision variables with the aid of EAs; and Kang & Lansey (2012) used GAs to deal with a real-world WDS with 1,274 pipes. In more recent years, Bi *et al.* (2016) and Zheng *et al.* (2017b) also considered WDS case studies with the number of decision variables more than 1,200; Johns *et al.* (2020) adapted GAs to a WDS design problem with 1,277 decision variables.

While many EAs have been successfully implemented to deal with large and real-world water resources problems as mentioned above, their applications are not without difficulties. One of the challenges is that computational overheads required by EAs to identify optimal solutions are intensive, because of EAs’ inherently stochastic nature in searching behavior, as well as high complexity associated with physical models of water resources problems (Wang *et al.* 2015a; Zhang *et al.* 2018; Chu *et al.* 2020). For example, Beh *et al.* (2017) considered a real-world water infrastructure planning problem with deep uncertainty and estimated that a total runtime for GA-based optimization was approximately 33.6 years if a typically used number of generations (10,000) was allowed. This runtime goes significantly beyond the computational resources that are typically available in practice, often at the scale of a few hours or days (Lee *et al.* 2014).

To address this challenge, one approach is to identify near-optimal solutions for real-world water resources problems that are achievable with affordable computational budgets, rather than aiming to locate global optima that require massive computational resources (Karamouz *et al.* 2014; Maier *et al.* 2014). For example, Bi *et al.* (2015) investigated EAs’ ability in identifying near-optimal solutions for a broad range of WDS design problems, where the near-optimal solutions were defined as those within 5% deviations to the current best-known solutions in the literature; Zheng *et al.* (2017b) developed an adaptive ACO method to find the best possible solutions for WDS design problems under limited computational budgets; Giacomoni & Joseph (2017) developed an EA-based algorithm to identify the near-optimal solutions for deploying green roofs and permeable pavements at the catchment scale.

However, a fundamental issue regarding the near-optimal solutions is that their absolute deviations to the global optima are often unknown for real-world water resources problems. For small problems or for numerical optimization problems, it is possible to estimate the deviation between EAs’ near-optimal solutions and their corresponding global optima, as the latter can be obtained through full enumeration or theoretical derivation. However, this approach is not applicable for large and real-world problems because it is often impossible to analytically identify global optima for such complex problems (Gibbs *et al.* 2011; Yazdi 2016). Consequently, it remains unknown how close these EAs’ solutions are to the global optima for real-world water resources problems, and near-optimal solutions are reported without an ‘absolute’ quality assessment. In the context of the practical applications, EAs’ performance in dealing with large and real-world water resources problems remains unclear to practitioners and even for algorithm developers.

To advance the knowledge within the EA community, the present study aims to understand how close optimal solutions from simple EAs can approach the global optima for water resources problems with varying sizes and complexity. To illustrate the proposed approach, the least-cost WDS design problems are considered, as they represent a typically difficult problem type in water resources. The objective of the paper is attained through a unique experiment design, which considers WDS case studies of gradually increased sizes with known global optima (details are given in the Methodology section). In addition to solution deviation to global optima, we also explore how EAs’ ability in finding feasible solutions varies as a function of optimization problems with increasing sizes. This is because this ability is also often of great interest when dealing with large and complex real-world problems (Maier *et al.* 2014). The number of decision variables considered in the present study ranges from 21 to 3,400, representing a broad range of scales in search space.

Three simple EAs are considered in this study, namely the standard differential evolution (SDE; Zheng *et al.* 2015), the standard genetic algorithm (SGA; Savic & Walters 1997) and the creeping genetic algorithm (CGA; Dandy *et al.* 1996). These three EAs are selected due to their simplicities in algorithm structure as well as their great popularity in practice (Wang *et al.* 2015b). It should be noted that the aim of this study is to investigate the performance variations of simple EAs (solution deviations to the global optima as well as the feasibility) as a function of gradually increasing problem sizes, and hence complex/hybrid EA variants are not considered. The total simulation time of the present study was 141.4 days using five Dell PCs (Inter R) at 2.9 GHz, representing the most comprehensive computational experiment thus far. It is anticipated that this study will improve our knowledge on EAs’ absolute performance in handling real-world water resources problems.

## METHODS

A schematic of the proposed methodology is given in Figure 1, with details presented in the subsequent sections.

### Definition of WDS design problems

Typically, a single-objective optimization of a WDS is the minimization of capital costs (pipes, tanks and other components) while satisfying head constraints at each node, and the problem formulation can be described as follows:

*F*is the network cost that is to be minimized;

*D*is the diameter of the pipe

_{i}*i*;

*L*is the length of the pipe

_{i}*i*;

*a, b*are the specified coefficients for the cost function;

*np*is the total number of pipes in the network;

*nj*is the total number of nodes in the network;

*G*(

*H*,

_{k}*D*) is the nodal mass balance and loop (path) energy balance equations for the whole network, which is solved by a hydraulic simulation package (EPANET2.0 in this study);

*H*is the head at the node

_{k}*k*= 1,2, … ,

*nj*; is the minimum allowable pressure head limit at each of the nodes; and

*A*is a set of commercially available pipe diameters.

### Simple EAs

As previously stated, three simple and popular EAs are considered, namely the SDE, the SGA and the CGA. The SDE has been well demonstrated to be an efficient searching algorithm for WDS design problems (Ahmadianfar *et al.* 2015), with pseudo-code as given in Supplementary Figure S1. As shown in this figure, the SDE relies on three fundamental operators, namely mutation, crossover and selection. As observed in previous studies (Zheng *et al.* 2015), the SDE's searching performance is dominated by its mutation strategy, which generates new solutions through adding the weighted difference (*F*) between two random population members to the third member, as illustrated in Supplementary Figure S1. Apart from the mutation weighted difference (*F*), the searching behavior of SDE is also influenced by population size (*N*) and crossover rate (*CR*).

The SGAs used in water community are derived from Holland (1975), where the crossover operator has been the dominant operator during the evolution, and the mutation operator is considered a second-order operator in terms of importance (Simpson *et al.* 1994). This significantly differs to the SDE as its searching behavior is mainly controlled by its mutation operator. The parameterization strategy with a high crossover probability combined with a low mutation probability has often been suggested to ensure SGAs’ satisfactory performance in dealing with various optimization problems (Nicklow *et al.* 2010). While the SGA has been well documented in the literature, we still present its pseudo-code as given in Supplementary Figure S2 in this paper to enable a clear illustration.

The GA with a creeping mutation operator (CGA) was first proposed by Dandy *et al.* (1996) for optimizing WDS design problems. Creeping mutation aims at maintaining the diversity in the population by altering a string locally to hopefully create a better string. In the CGA, each selected decision variable substring is mutated to an adjacent decision variable substring, indicating that the selected design variable is only allowed to mutate to the adjacent pipe size either up or down one size. Therefore, the solution produced after mutation is located in the neighborhood of the solution that existed before mutation. This mutation approach only slightly modifies an emerging good solution, such that there is more preservation and less disruption compared with the traditional mutation used in the SGA (Supplementary Figure S2). For the CGA used in this paper, the creeping mutation operator mutates the selected substring to an adjacent larger or smaller pipe size with an equal probability. This indicates that there is a 50% probability of the creeping mutation operator changing the selected pipe size to the next adjacent larger or smaller size, as outlined in Supplementary Figure S3.

### Development of synthetic case studies

This paper is featured by the way with which a large range of synthetic case studies with known global optima are produced, so as to enable performance assessments of EAs in dealing with gradually increasing sizes of optimization problems. Two well-known benchmark WDSs are selected as the basic networks to generate synthetic case studies, namely the New York Tunnel Problem (NYTP; 21 decision variables) and the Hanoi Problem (HP; 34 decision variables), with details given in Dandy *et al.* (1996) and Reca & Martínez (2006). This is because these two WDSs have been extensively studied in the literature and the global optima for these problems are considered to be known. As previously reported, the current best-known cost solutions for the NYTP and the HP are $38.64 million (Maier *et al.* 2003) and $6.081 million (Reca & Martínez 2006), respectively. It is noted that the current best-known solutions for these two problems are taken as their corresponding global optima in the present study to enable performance assessments.

Synthetic case studies have been developed using a method referred to as ‘*M*-NET’, which consists of *M* original WDSs connected via a reservoir source node, as illustrated in Figure 2. As can be seen from this figure, three NYTP and HP networks are connected to an identical reservoir node, producing synthetic case studies with 63 and 102 decision variables, respectively. It is observed that the *M* networks are hydraulically independent, and thus the theoretical best cost solution for the *M*-NET problem is actually *M* times of the current best-known cost solution of the single original WDS (NYTP or HP). Following the assumption that the current best solution is considered as the global optima as stated above, the global optimal solution for the *M*-NET problem can be accordingly obtained. For example, for the 3-NYTP and 3-HP in Figure 2, their global optimal solutions have costs of $115.92 million and $18.243 million, respectively.

Following the procedure as mentioned above, a large range of synthetic case studies were generated to assess EAs’ performance, with the number of the decision variables ranging from 21 to 3,400. More specifically, as shown in Figure 1, *M* ranges from 1 to 160 for the *M*-NYTP case studies, and hence, the number of decision variables ranges from 21 to 3,360. Similarly, for the *M*-HP case studies, we consider *M* ranging from 1 to 100, corresponding to 34 to 3,400 decision variables.

It should be noted that while the searching space landscapes of *M*-NET problems can be different to that of a real-world problem with the same size, such an increased size also significantly imposes the searching difficulties of EAs. As previously stated, the main merit of the proposed method in producing synthetic case studies is that their corresponding global optimal can be derived (but it is impossible to identify the global optima for a real-world WDS), which can be used to enable the absolution performance assessment.

### Algorithm parameterization

It has been well known that EAs’ performance is affected by the algorithm parameterizations used, and the best parameter set is often problem dependent (Suribabu 2010; Vasan & Simonovic 2010). This results in difficulties for parameter calibrations when dealing with complex optimization problems. A typical way to handle this is that the parameterization strategies recommended in previous studies are used, instead of comprehensive parameter calibration that requires tedious efforts and large computational budgets (Nicklow *et al.* 2010). These recommended parameterization strategies are often based on experience made by applying EAs to small case studies with the aid of a sensitivity analysis (Maier *et al.* 2014). Motivated by this, parameter values used for the three EAs considered in this paper are based on recommendations from previous studies. Furthermore, a preliminary analysis is also conducted to ensure the overall quality of the selected parameter values for each case study.

Table 1 outlines the selected parameter values for the SDE, which are obtained based on recommendations made in Zheng *et al.* (2015). In addition, a number of different SDE parameter combinations are tried, including different *CR* values of 0.5, 0.6, 0.65, 0.7, 0.75, 0.8 and 0.85, and different *F* values of 0.1, 0.2, 0.3, 0.4, 0.5, 0.6 and 0.7. The parameter combinations that exhibit the best performance are selected and outlined in Table 1. The population size and the maximum allowable number of evaluations for each case study were determined based on a preliminary analysis. For the SGA and CGA, the population size and maximum number of evaluations are identical to the SDE, to enable a fair performance comparison. Following the recommendations in Savic & Walters (1997), the mutation probability used for each case study is 1/*np*, where *np* is the number of decision variables. The crossover probability is set to a value of 0.9 for each case study.

Case study . | Population size . | The number of maximum allowable evaluations (10^{6})
. | CR
. | F
. |
---|---|---|---|---|

1-NYTP | 100 | 25 | 0.6 | 0.5 |

2-NYTP | 150 | 37.5 | 0.6 | 0.5 |

5-NYTP | 200 | 50 | 0.6 | 0.4 |

10-NYTP | 300 | 75 | 0.6 | 0.3 |

15-NYTP | 400 | 100 | 0.5 | 0.08 |

20-NYTP | 500 | 125 | 0.5 | 0.08 |

30-NYTP | 600 | 150 | 0.5 | 0.08 |

40-NYTP | 800 | 200 | 0.5 | 0.08 |

50-NYTP | 1,000 | 250 | 0.5 | 0.08 |

60-NYTP | 1,200 | 300 | 0.5 | 0.08 |

70-NYTP | 1,500 | 375 | 0.5 | 0.08 |

80-NYTP | 2,000 | 500 | 0.5 | 0.08 |

90-NYTP | 2,500 | 625 | 0.5 | 0.08 |

100-NYTP | 3,000 | 750 | 0.5 | 0.08 |

120-NYTP | 3,500 | 875 | 0.5 | 0.08 |

140-NYTP | 4,000 | 1,000 | 0.5 | 0.08 |

160-NYTP | 4,500 | 1,125 | 0.5 | 0.08 |

1-HP | 100 | 25 | 0.8 | 0.7 |

2-HP | 200 | 50 | 0.8 | 0.5 |

5-HP | 300 | 75 | 0.8 | 0.4 |

10-HP | 500 | 125 | 0.8 | 0.3 |

15-HP | 700 | 175 | 0.65 | 0.1 |

20-HP | 800 | 200 | 0.65 | 0.1 |

30-HP | 1,000 | 250 | 0.65 | 0.1 |

40-HP | 1,500 | 375 | 0.65 | 0.1 |

50-HP | 2,000 | 500 | 0.65 | 0.1 |

60-HP | 2,500 | 625 | 0.65 | 0.1 |

70-HP | 3,000 | 750 | 0.65 | 0.1 |

80-HP | 3,500 | 875 | 0.65 | 0.1 |

90-HP | 4,000 | 1,000 | 0.65 | 0.1 |

100-HP | 4,500 | 1,125 | 0.65 | 0.1 |

Case study . | Population size . | The number of maximum allowable evaluations (10^{6})
. | CR
. | F
. |
---|---|---|---|---|

1-NYTP | 100 | 25 | 0.6 | 0.5 |

2-NYTP | 150 | 37.5 | 0.6 | 0.5 |

5-NYTP | 200 | 50 | 0.6 | 0.4 |

10-NYTP | 300 | 75 | 0.6 | 0.3 |

15-NYTP | 400 | 100 | 0.5 | 0.08 |

20-NYTP | 500 | 125 | 0.5 | 0.08 |

30-NYTP | 600 | 150 | 0.5 | 0.08 |

40-NYTP | 800 | 200 | 0.5 | 0.08 |

50-NYTP | 1,000 | 250 | 0.5 | 0.08 |

60-NYTP | 1,200 | 300 | 0.5 | 0.08 |

70-NYTP | 1,500 | 375 | 0.5 | 0.08 |

80-NYTP | 2,000 | 500 | 0.5 | 0.08 |

90-NYTP | 2,500 | 625 | 0.5 | 0.08 |

100-NYTP | 3,000 | 750 | 0.5 | 0.08 |

120-NYTP | 3,500 | 875 | 0.5 | 0.08 |

140-NYTP | 4,000 | 1,000 | 0.5 | 0.08 |

160-NYTP | 4,500 | 1,125 | 0.5 | 0.08 |

1-HP | 100 | 25 | 0.8 | 0.7 |

2-HP | 200 | 50 | 0.8 | 0.5 |

5-HP | 300 | 75 | 0.8 | 0.4 |

10-HP | 500 | 125 | 0.8 | 0.3 |

15-HP | 700 | 175 | 0.65 | 0.1 |

20-HP | 800 | 200 | 0.65 | 0.1 |

30-HP | 1,000 | 250 | 0.65 | 0.1 |

40-HP | 1,500 | 375 | 0.65 | 0.1 |

50-HP | 2,000 | 500 | 0.65 | 0.1 |

60-HP | 2,500 | 625 | 0.65 | 0.1 |

70-HP | 3,000 | 750 | 0.65 | 0.1 |

80-HP | 3,500 | 875 | 0.65 | 0.1 |

90-HP | 4,000 | 1,000 | 0.65 | 0.1 |

100-HP | 4,500 | 1,125 | 0.65 | 0.1 |

*Note*: *CP*, crossover rate; *F*, mutation weighting factor.

As with previous studies (Deb & Tiwari 2008), pressure constraints are handled by adding penalties to the infeasible solutions (when nodal pressure values are lower than the specified minimum allowable values). For each case study, a total of 20 runs with different random number seeds are performed, and the averaged results are used to enable performance assessments, representing a statistically rigorous analysis. All the three EAs are developed in C language, and the EPANET 2 network solver is used for hydraulic simulation. The entire experiment is performed on five computers (Dell PC (Inter R) at 2.9 GHz), with a total of 141.4 days simulation time (the total number of network evaluations is 8.3 × 10^{10}).

### EA performance assessment

*M*-NET problem is

*M*times the currently known best solution of its corresponding original WDS problem. Therefore, the relative performance of each EA algorithm can be assessed by using percentage deviation from the global optima. Considering 20 runs for each EA applied to each case study, 20 estimates of percentage deviations can be obtained, and its average is used to enable the performance assessment, which can be described as follows:where Dev is the averaged deviation between the optimal solutions identified by EAs and the global optimal for a given case study;

*T*is the total number of different runs (

*T*= 20 in this paper);

*F*is the optimal (minimal) solution obtained by the EA in the

_{i}*i*th run; and

*FG*is the global optima for the given case study. Obviously, a smaller value of Dev indicates a better performance of EAs.

To assess EAs’ ability in finding feasible solutions for case studies with different scales, we take the number of evaluations with which the first feasible solution (NEFFS) was found as an indicator. In a similar way to the solution quality assessment, the mean of the NEFFSs across 20 runs is used to enable performance analysis. It should be noted that the metric of NEFFSs is conditioned on the random sampling method that is used to initialize the EA's searching.

## RESULTS AND DISCUSSION

*M*-NYTP case studies

Figure 3 summarizes the performance of the three EAs applied to the *M*-NYTP case studies, with details given in Supplementary Table S1. Generally, the performance of the EAs deteriorates as the number of decision variables increases, which is expected as EAs are often challenged to identify optimal solutions for optimization problems with large search spaces (Maier *et al.* 2014). As illustrated in Figure 3, for the 1-NYTP with 21 decision variables, the average deviations of the SDE, SGA and CGA are 0, 0.8 and 2.8%, respectively. However, when the number of decision variables reaches up to 3,360, the corresponding average deviations become 191.9, 92.8 and 46.4%, as detailed in Supplementary Table S1.

Interestingly, three different performance-changing phases can be observed for the three simple EAs applied to *M*-NYTP case studies. For the first phase with (the number of decision variables is smaller than 420), the performance of the three EAs become worse with a moderate rate as a result of increasing scales of the optimization problems. This is followed by the second phase with 20 < (the number of decision variables is between 420 and 1,260) that the three EAs exhibit overall stable performance in terms of the solution quality, regardless of some variations. Subsequently, all the EAs consistently show a rapid decrease of performance in terms of the deviations from global optima, which is especially the case for the SDE.

In comparing the relative performance among the three EAs, the SDE exhibits the overall best performance for the case studies with . This is proven by the fact that the SDE is able to identify optimal solutions with averaged deviations below 1% for case studies with *M* < 10 as shown in Supplementary Table S1 (i.e. the number of decision variables is less than 210), and is capable of locating optimal solutions with averaged deviations approximately 18% for case studies with 10 < *M*60 (i.e. the number of decision variables is between 210 and 1,260). As observed in Figure 3, for the case studies with *M*60, the SGA and CGA show overall similar performance in terms of the solution quality.

It is interesting to note that the SDE's performance became worse than its counterparts (SGA and CGA) when *M* > 60 (i.e. the number of decision variables is greater than 1,260), with performance difference being more prominent for larger optimization problems. In contrast, while the CGA also shows decreased performance for optimization problems with *M* > 60, its deterioration rate is significantly lower than the SDE and SGA. This indicates that GAs with creeping mutation are effective at maintaining their searching quality when dealing with optimization problems of various sizes. In terms of performance variations across 20 runs, despite some variations, all three EAs exhibit satisfactory performance as their solution quality is not significantly affected by the starting random number seeds of the optimization runs (see Figure 3), but the CGA also exhibits relatively better performance, especially for the case studies with *M* > 60.

The NEFFS was found for each EA applied to each *M*-NYTP as shown in Figure 4. As a general pattern, the number of WDS evaluations required for the three EAs to find feasible regions increases as the number of decision variables becomes larger. Interestingly, the increasing patterns are quite similar for the three different EAs: the number of evaluations increases with a relatively slow rate when *M*60 (the number of decision variables is no more than 1,260), followed by a relatively rapid rate afterwards.

When comparing the relative performance among the three EAs in their efficiency to find feasible solutions, SGA and CGA exhibit similar performance, and the performance of SDE is slightly worse. More specifically, the three EAs have a similar ability to efficiently find feasible solutions when the number of decision variable is no more than 1,260 (*M* = 60). As the number of decision variables further increases, the SDE's performance deteriorates more rapidly than for the SGA or CGA. Furthermore, SDE's performance variability in finding feasible solutions becomes significant when dealing with large-scale problems. For example, when the number of decision variables is 3,360 (*M* = 160), the maximum number of evaluations required by the SDE to find the first feasible solution among 20 runs is 4.6 × 10^{5}, which is significantly higher than the minimum number of evaluations (3 × 10^{5}). This indicates that the SDE's performance is appreciably variable when handling such a large optimization problem.

*M*-HP case studies

The performance of the three EAs applied to the *M*-HP case studies are presented in Figure 5 and Supplementary Table S2. As can be seen, EAs’ performance when dealing with *M*-HP case studies is surprisingly different to those based on *M*-NYTP problems in Figure 3. These differences include (i) the SGA deteriorates extremely rapidly for the *M*-HP problems when *M* becomes larger (it cannot find feasible solutions when *M* > 60), in contrast to SGA's moderate deterioration rate when handling *M*-NYTP case studies; and (ii) the SDE exhibits good performance (smaller than 10% averaged deviations to the global optima) across all optimization problems with different numbers of decision variables, in contrast to SDE's rapidly decreasing performance for *M*-NYTP when *M* > 60. Such performance differences may be attributed to the significant variation between fitness-landscape properties associated with these two different WDS problems. This also highlights that an EA's performance is highly affected by the characteristics of the search spaces associated with the optimization problems.

While the three EAs show different abilities in approaching the global optimal for the *M*-HP problems, the deviations of near-optimal solutions identified by these three EAs are overall lower than 40% to their corresponding global optimal. This implies that EAs are able to find near-optimal solutions with relatively shorter distance to global optimal in solution space for the *M*-HP problems relative to *M*-NYTP problems. Such a difference may be associated with the problem properties, such as the length of the pipes, the sizes of the pipe diameters that can be selected, and the unit cost of the pipes.

Similar to the exhibited performance when applying to the *M*-NYTP case studies, the CGA also shows satisfactory performance for the *M*-HP problems, with a trajectory being: the performance of the CGA is slightly decreased when *M* increases up to 10, followed by a relatively stable status for *M* between 10 and 50, and then deteriorates in an increased rate for *M* > 50. For the largest problem with 3,400 decision variables (*M* = 100), the CGA is still able to find optimal solutions with approximately 25% deviations to the global optimal. This implies that the CGA is an effective algorithm in finding optimal solutions for the WDS design problems, and surprisingly, such effectiveness is achieved by using only a simple creeping mutation operator.

In terms of the number of evaluations required to find first feasible solutions, the CGA and SDE perform in a very similar way as shown in Figure 6, and both are significantly better than the SGA. Combining the results in Figure 4, it can be deduced that the simple GAs are challenged to deal with optimization problems with highly constrained search spaces (i.e. the HP problem has a small proportion of feasible regions as demonstrated in Bi *et al.* (2015)).

### Guideline of the choice of EAs when dealing with WDS design problems

Based on the findings from the large number of case studies, we have developed guidelines to assist the choice of suitable EAs for WDS design optimization problems with different sizes. It must be acknowledged that this guideline is dependent on the parameter values selected based on the recommendations made in previous studies. However, we believe it is still meaningful as it is rare for practitioners to extensively calibrate algorithm parameters when dealing with real-world problems.

As shown in Figure 7, for optimization problems with relatively small problems (the number of decision variables is lower than 200), the SGA is recommended as the appropriate optimization algorithm. Such a choice is based on its acceptable performance when handling relatively small WDS design problems as observed in this study, and more importantly its simpler algorithm structure relative to its counterparts (Wang *et al.* 2015a, 2015b). The SDE can be a suitable alternative when optimization problems have the number of decision variables between 200 and 500, as it consistently shows better performance than SGA and CGA for all case studies with the number of decision variables being within this specified range. If WDS design problems with a very large number of decision variables are optimized, the CGA is the best option due to its great robustness as observed in the present study.

## CONCLUSIONS

EAs have been widely used in the domain of water resources. However, there still exists a fundamental issue on how close optimal solutions found by an EA can approach the global optima for water resources problems with varying sizes and complexity. This paper aims to improve the understanding the performances of EAs’ solutions in approaching the global optima and provide guidance for the selection of EAs in practical applications. To achieve these goals, we evaluated three EAs in this paper, namely the SDE, the SGA and the CGA, to quantify the deviation of their optimal solutions from the global optimum as well as their ability in finding feasible regions. We investigated how the performance of the three EAs can change for WDS design problems of various sizes. This is achieved by synthetic case studies modified from two benchmark networks, consisting of different numbers of decision variables ranging from 21 to 3,400.

It is worth noting again that these results are conditioned on the parameterization strategies used in this paper, which are based on recommendations from previous studies. However, practical applicability of our findings is still meaningful as a comprehensive parameter calibration can be tedious and time consuming especially for large-scale optimization problems. Hence, in practical optimization for large WDSs, recommended parameter values from previous studies are often used, as addressed in this study. Results from this study greatly enhance the existing understanding of the performance of the three EAs on how close their optimal solutions can be to the global optima, as well as the ability for the EAs to efficiently find feasible solutions. Conditioned on the selected case studies as well as the algorithm parameterization strategies, the following observations and implications can be summarized:

- (1)
The main contribution of this study is that we have offered preliminary knowledge on how close the simple EAs’ optimal solutions can approach the global optima for WDS design problems of varying sizes. While different EAs exhibit different performance, the results can also be meaningful as an overall indicator of EAs’ ability in dealing with WDS design problems with different sizes. Given the fact that near-optimal solutions are often desirable within practical applications, such knowledge can be very useful as we may not need to use advanced and complex optimization techniques if the deviations of the optimal solutions identified by the simple EAs to the global optima, as outlined in the present study, are considered to be acceptable by practitioners.

- (2)
As a general observation, the performances of three EAs are similar and also satisfactory for WDS design problems with a relatively small number of the decision variables (if say smaller than 200 decision variables). For such relatively small-sized problems, the deviations from the global optima are overall lower than 10% as observed in this study.

- (3)
For relatively larger optimization problems (say greater than 500 decision variables), the EAs’ performances can be highly variable across individual optimization problems, which are inherently associated with the underlying properties of the search spaces. For example, the SDE is likely to have greater ability in navigating a highly constrained searching space relative to the SGA, as observed in the present research. This highlights the great importance of building prior knowledge on the landscape properties before the choice of the suitable EAs.

- (4)
The CGA exhibits satisfactory performance across different problem types and scales as it can find the optimal solutions for optimization problems ranging from 21 to 3,400 decision variables with deviations overall lower than 50%, which is significantly lower than SGA. While the SDE exhibits very good performance for the

*M*-HP problems, its performance turns out to be unsatisfactory when dealing with relatively large*M*-NYTP case studies. - (5)
Based on the results of the performance of EAs in finding feasible solutions, it seems that the CGA and SDE possess a great ability relative to the SGA in locating feasible regions, which is especially the case for large and highly constrained optimization problems. This implies that the CGA and SDE are better alternatives if feasible solutions are the primary objective for a given optimization problem.

Limitations of the present study include: (i) it may be difficult to generalize the findings based on the two WDS design problems (NYTP and HP) with decision variables being pipes only, and hence future study should consider more complex WDS design problems with tanks, pumps and valves involved; and (ii) the proposed way in which the number of variables is increased can be fundamentally different from the real-world networks which are more interlinked and hydraulically dependent, and hence the findings observed in this study need to be validated using real-world WDSs (although it is difficult to derive the global optima for such problems).

## DATA AVAILABILITY STATEMENT

All relevant data are included in the paper or its Supplementary Information.