## Abstract

The differential evolution (DE) algorithm has been demonstrated to be the most powerful evolutionary algorithm (EA) to optimally design water distribution systems (WDSs), but issues such as slow convergence speed, limited exploratory ability, and parameter adjustment remain when used for large-scale WDS optimization. This paper proposes a novel self-adaptation and sorting selection-based differential evolutionary (SA-SSDE) algorithm that can solve large-scale WDS optimization problems more efficiently while having the greater ability to explore global optimal solutions. The following two unique features enable the better performance of the proposed SA-SSDE algorithm: (1) the DE/current-to-*p*best/n mutation and sorting selection operators are used to speed up the convergence and thus improve the optimization efficiency; (2) the parameter adaptation strategy in JADE (an adaptive differential evolution algorithm proposed by Zhang & Sanderson 2009) is introduced and modified to cater for WDS optimization, and it is capable of dynamically adapting the control parameters (i.e., *F* and CR values) to the fitness landscapes characteristic of larger-scale WDS optimization problems, allowing for greater exploratory ability. The proposed SA-SSDE algorithm found new best solutions of $7.068 million, €1.9205 million, and $30.852 million for three well-known large networks (ZJ_{164}, Balerma_{454}, and Rural_{476}), having the convergence speed of 1.02, 1.92, and 5.99 times faster than the classic DE, respectively. Investigations into the searching behavior and the control parameter evolution during optimization are carried out, resulting in a better understanding of why the proposed SA-SSDE algorithm outperforms the classic DE, as well as the guidance for developing more advanced EAs.

## HIGHLIGHTS

The DE/current-to-

*p*best/n mutation and sorting selection operators were proposed to encourage exploitative searching.A new parameter adaptation strategy was developed for WDS optimization, enabling the greater ability in exploring global optimums.

New best solutions of $7.068, €1.9205, and $30.852 million for ZJ

_{164}, Balerma_{454}, and Rural_{476}were found, with the convergence speed of 1.02, 1.92, and 5.99 times faster than the classic DE, respectively.

### Graphical Abstract

## INTRODUCTION

Water distribution systems (WDSs) are among the most expensive urban infrastructures owing to the large capital expenditure required for system construction (Al-Khomairi *et al.* 2020; Albarakati *et al.* 2021; Diao 2021; Lyu *et al.* 2021; Liu & Kang 2022). This inspires intensive studies into the optimal design of WDSs in order to minimize construction costs (Olsson 2020). The optimal design of WDS (WDS optimization) can be considered as a process that seeks a pipe diameter combination with the lowest network cost while meeting demand and pressure requirements (Yin *et al.* 2021). This process, however, is challenged by: (i) the huge search space encountered typically in practical engineering and (ii) the highly nonlinear constraints to be assessed, which necessitate the solution of a computationally expensive network simulation model (Wang *et al.* 2020; Zhang *et al.* 2022).

Many traditional optimization approaches have been developed for WDS optimization, such as linear and nonlinear programming (Alperovits & Shamir 1977; Lansey & Mays 1989). However, they are prone to being trapped by local minima due to the high-dimensional and discrete solution space of WDS optimization problems. As a consequence, population-based evolutionary algorithms (EAs) have gained popularity for WDS optimization. This is owing to their greater ability to explore broadly the entire solution space and the ease with which they can handle discrete pipe diameters, which traditional methods find difficult to handle. The GA (Simpson *et al.* 1994; Dandy *et al.* 1996; Savic & Walters 1997), particle swarm optimization (Suribabu & Neelakantan 2006), ant cony optimization (Maier *et al.* 2003), and the differential evolution (DE) algorithm (Suribabu 2010) are examples of representative EAs. According to comparison tests (Marchi *et al.* 2014), the DE algorithm outperforms other EAs in terms of finding optimal solutions for a wide variety of WDSs. However, the substantial computational burden due to slow convergence remains an issue when using DE for large-scale WDS optimizations, as well as other EAs (Maier *et al.* 2014; Mala-Jetmarova *et al.* 2018).

Certain preconditioning or heuristic approaches (Zheng *et al.* 2011; Kang & Lansey 2012; Bi *et al.* 2015; Liu *et al.* 2020) have been developed to obtain good initial solutions to improve the convergence speed of EAs. However, the approach proposed by Zheng *et al.* (2011) involves an extension of the Dijkstra graph algorithm to identify the shortest-distance tree of looped WDSs with multisource, which can be challenging to implement for practitioners with little expertise. In the studies of Kang & Lansey (2012), Bi *et al.* (2015), and Liu *et al.* (2020), specific programs must be designed to incorporate domain knowledge in order to find good initial solutions, making practitioners reluctant to employ these methods in their design work. Therefore, it would be more sensible to develop a computationally efficient and user-friendly EA for WDS optimization, given that the working principle of EA is theoretically much simpler and more likely to become a full-automatic optimization tool.

Besides, regarding the computational efficiency of EAs, Bi *et al.* (2016) have proved that while good starting solutions produced via preconditioning approaches enable more rapid identification of near-optimal solutions, the searching mechanism dominates the algorithm performance. More specifically, the searching mechanism that encourages exploitative search and allows for the rapid reduction of population diversity is the key to effectively identifying optimal solutions for large-scale WDS optimization problems. Therefore, modifying EAs' searching mechanism to improve the exploitative searching ability appears to be the soundest strategy to boost the computational efficiency of EAs in large-scale WDS optimization.

In addition to the low optimization efficiency, another issue that exists in EA applications is parameter adjustment, which is problem-dependent and has a significant impact on the algorithm performance. Taking DE-based WDS optimization as an example, while various studies have been undertaken and attempted to give general guidelines for selecting the control parameters *F* (the differential weight adopted in the mutation operator) and CR (the crossover probability adopted in the crossover operator), no agreement has been reached. Specifically, Zheng *et al.* (2015) suggested that a small *F* combined with a moderate CR within [0.3, 0.6] is preferred for efficiently solving large-scale WDSs, whereas Marchi *et al.* (2014) suggested that *F* = 0.3 with CR = 0.1 are the best parameter configuration for a real-size network (Rural with 476 pipes). Furthermore, Zheng *et al.* (2011) used *F* = 0.3 and CR = 0.8 to optimally design the Balerma network (BN) with 454 pipes and the ZJ network with 164 pipes. Although the aforementioned studies have made significant contributions to appropriately selecting *F* and CR values for WDS optimization, their inconsistent options on *F* and CR values highlight the fact that selecting appropriate *F* and CR values can be difficult for even seasoned experts, let alone practitioners with little related experience.

To address the parameter adjustment issue, Zheng *et al.* (2013) proposed a self-adaptive DE (SADE) algorithm that eliminates the need to manually adjust the *F* and CR values. Moreover, based on the fact that individuals in DE will eventually converge to an identical one, Zheng *et al.* (2013) proposed a convergence criterion to automatically terminate algorithm proceeds, which further increases the utility of SADE in practice. For large-scale WDS optimization, however, it appears that the SADE algorithm proposed by Zheng *et al.* (2013) does not perform as well as other EAs or even the classic DE. Specifically, SADE's best solution for the BN_{454} case is €1.983 million, which is worse than the €1.923 million found by NLP-DE (Zheng *et al.* 2011), the €1.940 million found by HD-DDS (Tolson *et al.* 2009), the €1.9460 million found by FDE (Moosavian & Lence 2019), and the €1.982 million found by the classic DE (Zheng *et al.* 2011). Thus, further research is warranted to develop a more powerful self- adaptation DE for large-scale WDS optimization.

The goal of this paper is to propose a novel self-adaptation and sorting selection operator-based differential evolutionary (SA-SSDE) algorithm that can solve large-scale WDSs more efficiently, while also having a greater ability to explore global optimal solutions. This is achieved by introducing greedier mutation and selection operators that favor exploitation and a novel parameter adaptive strategy that can enhance exploration. The ability of an EA to broadly explore the whole search space is referred to as exploration, while the ability to intensively exploit the promising regions within the search process is referred to as exploitation (Maier *et al.* 2014). Excessive exploration causes slow convergence, whereas overexploitation induces premature convergence to inferior solutions. Therefore, a good balance between exploration and exploitation is important to ensure robust algorithm performance.

The following are contributions that fill gaps in the current literature and main findings of this study: (1) the DE/current-to-*p*best/n mutation and sorting selection operators are proposed for the first time for WDS optimization; (2) the parameter adaptation strategy pioneered by Zhang & Sanderson (2009) in JADE is firstly introduced and modified to cater for WDS optimization; (3) new best solutions with the cost of $7.068, €1.9205, and $30.852 million of three well-known large networks (ZJ_{164}, Balerma_{454}, and Rural_{476}) are found with the proposed SA-SSDE algorithm; and (4) besides comparisons with other EAs on end-of-run performance, the searching behavior and the control parameter evolution (i.e., *F* and CR) during optimization are explored, which enhance knowledge regarding why one algorithm outperforms another, thereby providing a new perspective for developing more advanced EAs.

The rest of this paper is organized as follows. First, the formulation of the WDS optimization problem is presented, followed by a detailed description of the proposed SA-SSDE algorithm. Then, a brief description of four well-known WDS case studies, namely Hanoi network (HAN), ZJ, BN, and rural network (RN), is presented, which are used to validate the algorithm's efficacy. Following that, the proposed SA-SSDE algorithm is thoroughly investigated in the result and discussion section, including parameter sensitivity analysis, performance comparisons with other EAs, searching behavior comparisons between different mutation and selection operators, and the visualization of control parameter evolution. Finally, a summary and conclusions are drawn.

## FORMULATION OF THE WDS OPTIMIZATION PROBLEM

*et al.*2015), having no effect on the algorithm performance validation and (2) the same consideration is taken for most other studies (Zheng

*et al.*2011; Marchi

*et al.*2014; Bi

*et al.*2016; Liu

*et al.*2020). Therefore, the objective function and constraints can be expressed as follows:

*f*(

**D**) is the network cost to be minimized;

**D**is the decision variable (i.e., a candidate pipe diameter combination), and Np represents the number of pipes in the network;

*C*(

*d*)·

_{i}*l*represents the cost of pipe

_{i}*i*with the diameter and length of

*d*and

_{i}*l*, respectively;

_{i}**h**(

**D**) is the nodal head vector identified based on

**D**, and

*h*(

_{j}**D**) represents the pressure head for node

*j*;

**A**is the set of optional pipe diameters, and

*h*

_{min}is the minimum nodal head that must be satisfied for a specific WDS optimization problem. The integer encoding strategy is used in this study because of its simplicity and broad application in the context of WDS optimization (Wang

*et al.*2015). Furthermore, the network is solved using EPANET (Rossman 2000), which allows the identification of constraint violation degrees based on Equation (4). When calculating Equation (4), the degrees of constraint violation are set to zero for nodes with pressure heads greater than

*h*

_{min}. This is because the feasibility of a solution is solely determined by the nodes whose pressure head does not meet the requirement, and has nothing to do to with nodes with pressures higher than

*h*

_{min}.

## THE PROPOSED SA-SSDE ALGORITHM

The evolutionary strategy of the proposed SA-SSDE algorithm is similar to that of the classic DE. Specifically, the algorithm enters a circle of evolutionary operations: mutation, crossover, and selection, following a random creation of the initial population. The proposed SA-SSDE, on the other hand, differs from classic DE in three aspects: (1) the DE/current-to-*p*best/1 mutation with the archive is used to replace the DE/rand/1 mutation strategy; (2) the sorting selection operator is developed to replace the tournament selection operators in classic DE; and (3) the parameter adaptation strategy of JADE is introduced and modified to accommodate WDS optimization. The three modifications will be elaborated subsequently, and the convergence criterion provided by Zheng *et al.* (2013) is used to automatically terminate the algorithm proceeds.

### DE/current-to-*p*best/1 mutation with an archive operator

*p*best/1 mutation with the archive operator was first proposed by Zhang & Sanderson (2009), as illustrated in Equation (5).where is randomly selected as one of the top 100

*p*% individuals in the current population with

*p*∈ (0,1]. is randomly chosen from the current population and an archive of parent solutions that failed in the selection operator.

*F*is the mutation factor that is applied at the individual level and regenerated at each generation. The size of

_{i}*p*affects the greed degree of the mutation operator, with a lower

*p*resulting in faster convergence to promising areas but a larger likelihood of becoming trapped by local optimal solutions, and vice versa. Therefore,

*p*∈ [0.05, 0.2] (top 5–20% best individuals) was suggested by Zhang & Sanderson (2009) to obtain a good tradeoff between convergence and robustness. The traditional mutation operator (e.g., DE/rand/l) is presented in Equation (6), so that the DE/current-to-

*p*best/1 mutation with the archive operator can be compared for better understanding.where , , and are three different vectors chosen at random from the current population. By comparing Equations (5) and (6), it is found that the DE/current-to-

*p*best/1 mutation with archive operator features the usage of the best solution information () and the optional external archive (i.e., ), leading to a broader searching region that is biased toward promising progress directions. In contrast, the traditional DE/rand/1 mutation operator searches a relatively small region that is not biased in any direction (Zhang & Sanderson 2009).

### The sorting selection operator

The selection operator determines which of the target vectors or trial vectors survive to the next generation. The tournament selection operator, which performs a pair-wise comparison between the trial vector and the corresponding target vector, has been used in classic DE and nearly all DE variations. Such a pair-wise comparison strategy results in a one-to-one spawning relationship and a longer life span of inferior individuals, which thus benefits the maintenance of population diversity. However, this benefit can sometimes turn into a drawback since it might lead to a significantly higher computational burden due to the resultant longer generations required to achieve convergence (Noman & Iba 2008). This is especially true for problems that require a longer simulation time for objective function evaluation, such as large-scale WDS optimization. In fact, as revealed by Bi *et al.* (2016), the searching mechanism that allows population diversity to be rapidly reduced (i.e., favors exploitative searching) is better suited for large-scale WDS optimization.

To this end, the sorting selection operator is developed to allow as many fitter trial vectors as possible to survive into the next generation, which therefore encourages the rapid reduction of population diversity. The proposed sorting selection procedure is straightforward and comprises the three phases listed subsequently. First, all individuals in trial and target vectors are classified as feasible or infeasible solutions. Second, individuals in the feasible solution group are sorted in ascending order based on objective function values, whereas individuals in the infeasible solution group are sorted in ascending order based on constraint violations. Third, selection can be accomplished by giving priority to choosing feasible solutions with smaller objective function values, followed by infeasible solutions with smaller constraint violations until the number of selected individuals reaches the pre-specified population size. The following pseudo-code can be used to sort a population with feasible and infeasible solutions.

**Pseudo-code for sorting population of feasible and infeasible solutions**

Input: all individuals in trial vectors and target vectors (defined as Y), with the population size of 2NP; |

Output: the sorted population new Y; |

1:For i =1: 2NP |

2: If (Cons(i) = =0) |

3: feasibleY = [feasibleY;Y] |

4: else |

5: infeasibleY = [infeasibleY;Y] |

6: end |

7: end |

8: Sorting feasibleY to build newfeasibleY based on their objective function values |

9: Sorting infeasibleY to build newinfeasibleY based on their constraint violations; |

10: newY = newfeasibleY+ newinfeasibleY |

Input: all individuals in trial vectors and target vectors (defined as Y), with the population size of 2NP; |

Output: the sorted population new Y; |

1:For i =1: 2NP |

2: If (Cons(i) = =0) |

3: feasibleY = [feasibleY;Y] |

4: else |

5: infeasibleY = [infeasibleY;Y] |

6: end |

7: end |

8: Sorting feasibleY to build newfeasibleY based on their objective function values |

9: Sorting infeasibleY to build newinfeasibleY based on their constraint violations; |

10: newY = newfeasibleY+ newinfeasibleY |

### Parameter adaptation strategy

The primary idea behind the parameter adaption strategy (PAS) is to record *F* and CR values that can generate better trial vectors in the current generation and then utilize them to generate new *F* and CR values in the next generation. The PAS in JADE, pioneered by Zhang & Sanderson (2009), has been demonstrated to work well with the DE/current-to-*p*best/1 mutation operator and is adopted in this study, but with the following two modifications to cater for WDS optimization.

*F*within (0.5, 1) and CR

_{i}*within (0, 1), as shown in Equation (7). Second, a novel PAS proposed by Zhan*

_{i}*et al.*(2019) is employed to replace the traditional one, as illustrated in Equation (8). Limiting

*F*generation in (0.5, 1) can prevent premature convergence and improve global exploratory ability. The novel PAS is introduced to speed up the process of updating individual parameters and provide more successful

_{i}*F*and CR values.where

*f*(

*x*) is the function value (i.e., network cost) of the target vector, and

_{k}*f*(

*u*) is the function value of the corresponding trial vector. In summary, five parameters that affect the algorithm performance need to be specified or initialized in the PAS, namely

_{k}*μ*,

_{F}*μ*

_{CR},

*σ*,

_{F}*σ*

_{CR}, and

*c*. To be more specific,

*μ*and

_{F}*μ*

_{CR}(mean value) determine the size of

*F*and CR

_{i}*to be generated using Equation (7), and their values are updated using Equation (8) in each generation based on feedback from evolutionary search.*

_{i}*S*and

_{F}*S*

_{CR}are the sets consisting of

*F*and CR

_{i}*that successfully generate better trial vectors. The other three parameters, namely*

_{i}*σ*,

_{F}*σ*

_{CR}(standard deviation) and

*c*, stay the same throughout the evolutionary process, with

*σ*and

_{F}*σ*

_{CR}determining the spread range of

*F*and CR

_{i}*, respectively, and*

_{i}*c*defining the updating speed of

*μ*and

_{F}*μ*

_{CR}. For better understanding purposes, the pseudo-code of the proposed SA-SSDE algorithm is presented below.

**Pseudo-code of the proposed SA-SSDE algorithm**

1: Begin |

2: Initializing μ_{F}, μ_{CR}, σ_{F}, σ_{CR}, c and p |

3: Generating a random initial population |

4: S = ∅; _{F}S = ∅; _{CR}A = ∅ |

5: For Generation = 1 to G |

6: Generating F = randc(_{i}μ, σ_{F}_{F}), CR = randc(_{i}μ, σ_{CR}_{CR}); |

7: Executing mutation and crossover operation based on F and _{i}CR _{i} |

8: Executing selection operation (individuals failed in selection used to update the A) |

9: Fori = 1 to NP |

10: If f(U) ≤ _{i}f(X) & _{i}U is feasible (_{i}U is the trial vector and X is the target vector) |

11: F_{i} → S; _{F}CR_{i} → S_{CR} |

12: End for |

13: Updating μ_{F} and μ_{CR} based on the renewed S and S_{F} using Eq. (8); _{CR} |

14: Break For if the convergence criterion is meet |

15: End For |

1: Begin |

2: Initializing μ_{F}, μ_{CR}, σ_{F}, σ_{CR}, c and p |

3: Generating a random initial population |

4: S = ∅; _{F}S = ∅; _{CR}A = ∅ |

5: For Generation = 1 to G |

6: Generating F = randc(_{i}μ, σ_{F}_{F}), CR = randc(_{i}μ, σ_{CR}_{CR}); |

7: Executing mutation and crossover operation based on F and _{i}CR _{i} |

8: Executing selection operation (individuals failed in selection used to update the A) |

9: Fori = 1 to NP |

10: If f(U) ≤ _{i}f(X) & _{i}U is feasible (_{i}U is the trial vector and X is the target vector) |

11: F_{i} → S; _{F}CR_{i} → S_{CR} |

12: End for |

13: Updating μ_{F} and μ_{CR} based on the renewed S and S_{F} using Eq. (8); _{CR} |

14: Break For if the convergence criterion is meet |

15: End For |

## CASE STUDIES

As shown in Table 1, four WDS case studies with increased network complexity have been used to test the efficacy of the proposed SA-SSDE algorithm. Detailed information for the HAN and BN, including the commercially available diameters and the corresponding unit pipe costs, is available from the website of the Centre for Water Systems at the University of Exeter (Wang *et al.* 2015). For details on the ZheJia (ZJ) network and the RN, see Zheng *et al.* (2011) and Marchi *et al.* (2014), respectively.

WDS case study . | Number of decision variables . | Number of available pipe diameters . | Search space size . |
---|---|---|---|

HAN | 34 | 6 | 2.865 × 10^{26} |

ZJ | 164 | 14 | 9.2257 × 10^{187} |

BN | 454 | 10 | 1 × 10^{454} |

RN | 476 | 15 | 6.61 × 10^{559} |

WDS case study . | Number of decision variables . | Number of available pipe diameters . | Search space size . |
---|---|---|---|

HAN | 34 | 6 | 2.865 × 10^{26} |

ZJ | 164 | 14 | 9.2257 × 10^{187} |

BN | 454 | 10 | 1 × 10^{454} |

RN | 476 | 15 | 6.61 × 10^{559} |

These four WDSs were chosen for the following two reasons. First, they are all from publicly available literature and based on EPANET simulation, and thus their fundamental information is readily available, allowing the exact replication of optimization results as well as a comparison with other EAs. Second, all four WDSs have either a rugged solution space or an unknown global optimum, which are representative of difficult optimization problems faced in real-world situations and hence useful in evaluating the algorithm's global searching ability.

## RESULT AND DISCUSSION

### Parameter sensitivity analysis

As mentioned previously, six parameters (*μ _{F}*,

*μ*

_{CR},

*σ*,

_{F}*σ*

_{CR},

*p*, and

*c*) that affect algorithm performance need to be initialized. Based on the four WDS cases mentioned in the preceding section, parameter sensitivity tests were performed and show that four of them (

*μ*,

_{F}*μ*

_{CR},

*c*, and

*p*) have an insignificant impact on algorithm performance, which agrees with the earlier work of Zhang & Sanderson (2009). Surprisingly, it was observed that

*σ*= 0.01 and

_{F}*σ*

_{CR}= 0.01 are better suited for WDS optimization problems. Table 2 presents the statistical results of the proposed SA-SSDE algorithm with various

*σ*and

_{F}*σ*

_{CR}values applied to four WDS case studies. Other parameters were set as

*μ*and

_{F}*μ*

_{CR}= 0.7,

*c*and

*p*= 0.2 based on the recommendation of Zhang & Sanderson (2009). A population size of 300 was used for all case studies, since increasing the population size only slightly improved solution quality while significantly increasing computational costs.

Case study . | The value of σ and _{F}σ_{CR}
. | Number of trials . | Best solution found . | Trials achieving the best solution (%) . | Average solution found^{a}
. | Average evaluations to find the first occurrence of the best solution . | Average evaluations to achieve final convergence . |
---|---|---|---|---|---|---|---|

HAN | 0.01 | 100 | 6.081 | 97 | 6.088 | 45,105 | 49,926 |

0.05 | 100 | 6.081 | 95 | 6.095 | 50,040 | 54,910 | |

0.1 | 100 | 6.081 | 93 | 6.098 | 54,348 | 59,637 | |

ZJ | 0.01 | 10 | 7.068 | 10 | 7.083 | 405,670 | 409,670 |

0.05 | 10 | 7.081 | 0 | 7.104 | 420,710 | 428,250 | |

0.1 | 10 | 7.095 | 0 | 7.138 | 478,250 | 484,080 | |

RN | 0.01 | 10 | 30.852 | 10 | 31.136 | 817,380 | 819,375 |

0.05 | 10 | 30.898 | 0 | 31.555 | 926,910 | 928,020 | |

0.1 | 10 | 30.921 | 0 | 31.734 | 1,113,570 | 1,114,440 | |

BN | 0.01 | 10 | 1.9205 | 10 | 1.924 | 787,365 | 789,705 |

0.05 | 10 | 1.9224 | 0 | 1.937 | 898,605 | 901,380 | |

0.1 | 10 | 1.9230 | 0 | 1.947 | 1,118,025 | 1,121,250 |

Case study . | The value of σ and _{F}σ_{CR}
. | Number of trials . | Best solution found . | Trials achieving the best solution (%) . | Average solution found^{a}
. | Average evaluations to find the first occurrence of the best solution . | Average evaluations to achieve final convergence . |
---|---|---|---|---|---|---|---|

HAN | 0.01 | 100 | 6.081 | 97 | 6.088 | 45,105 | 49,926 |

0.05 | 100 | 6.081 | 95 | 6.095 | 50,040 | 54,910 | |

0.1 | 100 | 6.081 | 93 | 6.098 | 54,348 | 59,637 | |

ZJ | 0.01 | 10 | 7.068 | 10 | 7.083 | 405,670 | 409,670 |

0.05 | 10 | 7.081 | 0 | 7.104 | 420,710 | 428,250 | |

0.1 | 10 | 7.095 | 0 | 7.138 | 478,250 | 484,080 | |

RN | 0.01 | 10 | 30.852 | 10 | 31.136 | 817,380 | 819,375 |

0.05 | 10 | 30.898 | 0 | 31.555 | 926,910 | 928,020 | |

0.1 | 10 | 30.921 | 0 | 31.734 | 1,113,570 | 1,114,440 | |

BN | 0.01 | 10 | 1.9205 | 10 | 1.924 | 787,365 | 789,705 |

0.05 | 10 | 1.9224 | 0 | 1.937 | 898,605 | 901,380 | |

0.1 | 10 | 1.9230 | 0 | 1.947 | 1,118,025 | 1,121,250 |

^{a}The cost unit for the HAN, ZJ and RN case studies is $ million, and the cost unit for the BN case study is € million.

Results in Table 2 reveal an interesting finding, and consistently for four WDS optimizations, the smaller *σ _{F}* and

*σ*

_{CR}(e.g.,

*σ*and

_{F}*σ*

_{CR}= 0.01) provide an improved algorithm performance in both solution quality and computational efficiency, particularly for three large-scale WDSs. Specifically, for the BN case, the algorithm with

*σ*and

_{F}*σ*

_{CR}= 0.01 produced an optimal solution of $1.9205 million, which is better than $1.9224 and $1.9230 million found by

*σ*and

_{F}*σ*

_{CR}= 0.05 and 0.1. Moreover,

*σ*and

_{F}*σ*

_{CR}= 0.01 enable faster convergence using only 789,705 evaluations averaged over 10 runs, nearly 30% fewer than the 1,118,025 evaluations required by

*σ*and

_{F}*σ*

_{CR}= 0.1. Similar observations can be made for the remaining three case studies, demonstrating that

*σ*and

_{F}*σ*

_{CR}= 0.01 are better suited for WDS optimization. This finding differs from SaDE (Qin

*et al.*2009), JADE (Zhang & Sanderson 2009), MDE_pBX (Islam

*et al.*2012), and ADDE (Zhan

*et al.*2019), in which the

*σ*and

_{F}*σ*

_{CR}are fixed as 0.1 to solve various numerical benchmarks, thereby providing improved knowledge on developing more powerful PAS to solve real-world engineering problems.

The improved algorithm performance associated with *σ _{F}* and

*σ*

_{CR}= 0.01 can be explained by the fact that smaller

*σ*and

_{F}*σ*

_{CR}enable

*F*and CR to be generated in a narrower but more suitable region than larger

*σ*and

_{F}*σ*

_{CR}. As a result of generating a greater number of suitable

*F*and CR, a larger number of better trial vectors can be produced, thereby leading to improved algorithm performance. It is worth noting that utilizing too small

*F*and CR values (such as 0.001) can result in

*F*and CR values being generated in an excessively limited range, leading to poor parameter adaptation ability and thereby poor algorithm performance.

*σ*) with

*μ*= 0.5. It is found that

*σ*= 0.01 enables approximately 90% of random numbers to be generated in [0.45, 0.55], and around 10% of random numbers in other ranges, which is appropriate to ensure the satisfactory algorithm performance while yielding good parameter adaptation ability. In contrast,

*σ*= 0.1 generates only about 34% of the random numbers in the range [0.45, 0.55] and about 66% of the random numbers in other ranges, which offer greater parameter adaptation ability but is insufficient for good algorithm performance. The opposite is the case for

*σ*= 0.001, which generates about 99% of the random numbers in the range [0.45, 0.55], but only 1% in other ranges. This emphasizes the importance of developing a suitable parameter adaptation strategy for large-scale WDS optimization, as ineffective population evolution movements and a waste of computational effort are common outcomes of impropriate

*F*and CR values.

### Performance comparison with other EAs

Performance comparisons with other EAs are undertaken in this section to verify the efficacy of the proposed SA-SSDE algorithm. Table 3 summarizes the optimization results of the proposed SA-SSDE algorithm and other EAs applied to four WDS case studies, where optimal results are ranked based on the best solution found, with the higher rank indicating more competence in exploring global optimums.

Case study . | Algorithm . | Number of trials . | Best solution found . | Trials achieving the best solution (%) . | Average solution . | Average evaluations to find the first occurrence of the best solution . | Maximum allowable evaluations or evaluations for convergence . |
---|---|---|---|---|---|---|---|

HAN | NLP-DE | 100 | 6.081 | 98 | 6.081 | 42,782 | 80,000 |

SA-SSDE | 100 | 6.081 | 97 | 6.088 | 45,105 | 49,926 | |

FDE | 100 | 6.081 | 97 | 6.088 | 2,065 | 100,000 | |

DE^{a} | 100 | 6.081 | 95 | 6.089 | 87,220 | 200,000 | |

SADE | 50 | 6.081 | 84 | 6.090 | 60,532 | 100,000 | |

DE^{b} | 300 | 6.081 | 80 | NA | 48,724 | 50,000 | |

ZJ | SA-SSDE | 10 | 7.068 | 10 | 7.083 | 405,670 | 409,670 |

NLP-DE | 10 | 7.082 | 0 | 7.093 | 400,853 | 2,000,000 | |

DE^{c} | 10 | 7.112 | 0 | 7.136 | 820,657 | 2,000,000 | |

RN | SA-SSDE | 10 | 30.852 | 10 | 31.136 | 817,380 | 819,375 |

DE^{d} | 10 | 30.906 | 0 | 31.323 | 5,712,809 | 5,790,400 | |

DE^{e} | 10 | 31.221 | 0 | 31.887 | – | 1,000,000 | |

BN | SA-SSDE | 10 | 1.9205 | 10 | 1.924 | 787,365 | 789,705 |

TS-NSGA | 10 | 1.9214 | 0 | 1.925 | 10,640,000 | 20,000,000 | |

NLP-DE | 10 | 1.9230 | 0 | 1.927 | 1,427,850 | 2,000,000 | |

HDP-GA | 10 | 1.9410 | 0 | 1.954 | – | 1,300,000 | |

FDE | 10 | 1.9460 | 0 | 1.980 | 495,607 | 500,000 | |

DE^{f} | 10 | 1.9820 | 0 | 1.989 | 9,210,143 | 10,000,000 | |

SADE | 10 | 1.9830 | 0 | 1.995 | 1,200,000 | 1,300,000 | |

DE^{g} | 10 | 1.9980 | 0 | 2.031 | 2,300,000 | 2,400,000 |

Case study . | Algorithm . | Number of trials . | Best solution found . | Trials achieving the best solution (%) . | Average solution . | Average evaluations to find the first occurrence of the best solution . | Maximum allowable evaluations or evaluations for convergence . |
---|---|---|---|---|---|---|---|

HAN | NLP-DE | 100 | 6.081 | 98 | 6.081 | 42,782 | 80,000 |

SA-SSDE | 100 | 6.081 | 97 | 6.088 | 45,105 | 49,926 | |

FDE | 100 | 6.081 | 97 | 6.088 | 2,065 | 100,000 | |

DE^{a} | 100 | 6.081 | 95 | 6.089 | 87,220 | 200,000 | |

SADE | 50 | 6.081 | 84 | 6.090 | 60,532 | 100,000 | |

DE^{b} | 300 | 6.081 | 80 | NA | 48,724 | 50,000 | |

ZJ | SA-SSDE | 10 | 7.068 | 10 | 7.083 | 405,670 | 409,670 |

NLP-DE | 10 | 7.082 | 0 | 7.093 | 400,853 | 2,000,000 | |

DE^{c} | 10 | 7.112 | 0 | 7.136 | 820,657 | 2,000,000 | |

RN | SA-SSDE | 10 | 30.852 | 10 | 31.136 | 817,380 | 819,375 |

DE^{d} | 10 | 30.906 | 0 | 31.323 | 5,712,809 | 5,790,400 | |

DE^{e} | 10 | 31.221 | 0 | 31.887 | – | 1,000,000 | |

BN | SA-SSDE | 10 | 1.9205 | 10 | 1.924 | 787,365 | 789,705 |

TS-NSGA | 10 | 1.9214 | 0 | 1.925 | 10,640,000 | 20,000,000 | |

NLP-DE | 10 | 1.9230 | 0 | 1.927 | 1,427,850 | 2,000,000 | |

HDP-GA | 10 | 1.9410 | 0 | 1.954 | – | 1,300,000 | |

FDE | 10 | 1.9460 | 0 | 1.980 | 495,607 | 500,000 | |

DE^{f} | 10 | 1.9820 | 0 | 1.989 | 9,210,143 | 10,000,000 | |

SADE | 10 | 1.9830 | 0 | 1.995 | 1,200,000 | 1,300,000 | |

DE^{g} | 10 | 1.9980 | 0 | 2.031 | 2,300,000 | 2,400,000 |

The cost unit for the HAN, ZJ and RN case studies is $ million, and the cost unit for the BN case study is € million.

NLP-DE, nonlinear programming with differential evolution (Zheng *et al.* 2011); FDE, fittest individual referenced differential evolution (Moosavian & Lence 2019); SADE, self-adaptive differential evolution (Zheng *et al.* 2013); TS-NSGA, two-stage optimization with NSGA (Cisty *et al.* 2016); HDP + GA, head loss-based design preconditioner with GA (Liu *et al.* 2020).

^{a}The classic DE with *N* = 100, *F* = 0.7, and CR = 0.8 (Zheng *et al.* 2011).

^{b}The classic DE with *N* = 100, *F* = 0.6–0.9, and CR = 0.3–0.6 (Suribabu 2010).

^{c}The classic DE with *N* = 500, *F* = 0.3, and CR = 0.8 (Zheng *et al.* 2011).

^{d}The classic DE with *N* = 500, *F* = 0.3, and CR = 0.5 (This work).

^{e}The classic DE with *N* = 50, *F* = 0.3, and CR = 0.1 (Marchi *et al.* 2014).

^{f}The classic DE with *N* = 500, *F* = 0.3, and CR = 0.8 (Zheng *et al.* 2011).

^{g}The classic DE with *N* = 500, *F* = 0.3, and CR = 0.5 (Zheng *et al.* 2013).

According to the results in Table 3, the proposed SA-SSDE algorithm outperforms other EAs in finding the global optimum for three large-scale WDSs (i.e., ZJ_{164}, BN_{454}, and RN_{476}). Specifically, the proposed SA-SSDE algorithm found new best solutions for ZJ, RN, and BN with costs of $7.068, $30.852, and €1.9205 million, which are better than the current best solution of $7.082, $31.221, and €1.9214 million reported by Zheng *et al.* (2011), Marchi *et al.* (2014), and Cisty *et al.* (2016), respectively. Furthermore, as shown in Table 3, the average solutions obtained by the proposed SA-SSDE method for ZJ_{164}, BN_{454}, and Rural_{476} cases are also better than those of other EAs, demonstrating the robustness of the proposed SA-SSDE algorithm. For the HAN case, the proposed SA-SSDE performs slightly worse than the NLP-DE. This is due to the HAN case's small solution space, for which the proposed SA-SSDE algorithm achieves convergence in a few generations and the *F* and CR values have not yet been self-adapted.

In terms of computational efficiency, the proposed SA-SSDE algorithm also performs better than the majority of other EAs for ZJ, RN, and BN cases. For example, for the BN case, the TS-NSGA (Cisty *et al.* 2016) method used 10,640,000 evaluations to find the current best solution of €1.9214 million, whereas the proposed SA-SSDE algorithm required just 787,365 evaluations to find the new best solution of €1.9205 million. More importantly, in contrast to the pre-specified 20,000,000 evaluations for TS-NSGA to terminate the algorithm proceeds, the proposed SA-SSDA algorithm uses only 819,375 evaluations to do that when convergence criteria are met, implying greater computational efficiency and application convenience.

It is worth noting that the advantage of the proposed SA-SSDE algorithm rises with network size increases, notably in terms of computational efficiency. Specifically, the proposed SA-SSDE algorithm found the optimal solution of $6.081 million for the HAN_{34} case, with a similar computational overhead to the standard DE, as shown in Table 3. In contrast, the proposed SA-SSDE algorithm found the new best optimal solutions for three larger cases with significantly less computational overhead than classic DE. More specifically, the proposed SA-SSDE algorithm only requires 405,670; 787,365; and 817,380 average evaluations to find the optimal solutions for ZJ, BN, and RN cases, whereas the traditional DE requires 820,657; 2,300,000; and 5,712,809 average evaluations, indicating that the proposed algorithm converges 1.02, 1.92, and 5.99 times faster than the classic DE for three large networks.

To summarize, the previously mentioned comparisons demonstrate the superiority of the proposed algorithm for large-scale WDS optimization. This can be attributed to (1) the adaptation of DE/current-to-*p*best/*n* mutation and the sorting selection operators, which encourages exploitative searching and thus improves optimization efficiency, and (2) the improved parameter self-adaptation strategy, which enables a dynamical balance between exploration and exploitation according to the fitness landscape characteristic of large-scale WDS optimization problems, resulting in enhanced exploring ability. Detailed explanations will be given in the following sections.

### Searching behavior comparison

This section compares the searching behavior between the traditional and the proposed mutation and selection operators, namely the DE/rand/1 mutation with the tournament selection operators used in classic DE and the DE/current-to-pbest/*n* mutation with the sorting selection operators used in the proposed SA-SSDE algorithm. Such a comparison aims to gain a deep understanding of how mutation and crossover operators affect the algorithm performance.

*d*

_{mean}) convergence metric is used to characterize the search behavior, with the rapid decrease indicating exploitative searching is encouraged and the slow decline indicating explorative searching is emphasized (Bi

*et al.*2015). Figure 2 illustrates the statistical results averaged across 10 different trials for each CR ∈ {0.1, 0.5, 0.9} paired with each

*F*∈ {0.1(black), 0.3(red), 0.5(blue), 0.7(green), 0.9(orange)}. Because consistent findings were obtained for other three larger networks, just the BN case results are provided here.

By comparing the decrease speed of *d*_{mean}(G) presented in the right and left panels in Figure 2, it is clearly seen that the proposed mutation and selection operators result in a notably faster decrease of *d*_{mean}(G) than traditional operators, regardless of which pairs of *F* and CR are used. The rapid decrease of *d*_{mean}(G) is most clearly shown in subfigure (b) for CR = 0.1, where the proposed mutation and selection operators require no more than 250 generations to reach convergence, regardless of the *F* values chosen. The classic DE with CR = 0.1, on the other hand, exhibits an extremely slow convergence speed even with *F* = 0.1, as seen in subfigure (a) in Figure 1. As a result, it can be concluded that the proposed mutation and selection operators favor exploitative searching as opposed to exploratory searching emphasized by traditional mutation and selection operators in classic DE. This is to be expected because the proposed mutation and selection operators are more greedy in terms of producing and selecting fitter trial vectors, resulting in faster convergence of *d*_{mean}(G), and thus exploitative searching behavior.

More importantly, the left panels of Figure 2 show that the classic DE with *F* ≥ 0.5 cannot reach convergence in 10,000 generations. This suggests that, under an appropriate computational budget, classical DE can only employ a lower value of *F* (*F* ≤ 0.3) for large-scale WDS optimization. This phenomenon has also been noted by Zheng *et al.* (2015) and Bi *et al.* (2016), resulting in a lack of exploratory ability of the classic DE in exploring global optimums for large WDSs (e.g., ZJ, RN, and BN), as seen from the results in Table 3. In contrast, due to the employment of greedy mutation and selection operators, the proposed SA-SSDE algorithm can converge fast even with extremely high *F*, as reflected by the subfigures (b), (d), and (f) in Figure 2. This enables the greater ability of the proposed SA-SSDE algorithm to explore global optimums while allowing for faster convergence.

The previous searching behavior comparisons illustrate why the proposed SA-SSDE algorithm outperforms the classic DE in optimization efficiency, guiding the design of more advanced EAs by manipulating its searching behavior with modified mutation and selection operators.

### Evolution of control parameters *μ*_{F} and *μ*_{CR}

_{F}

*μ*and

_{F}*μ*

_{CR}during the optimization process for four WDSs in the left panels, as well as the best solution found at each generation in the right panels. The continued decrease of the best solution found, as well as the better final solution located using less generation over traditional DE for three large case studies, demonstrates the greater exploring ability of the proposed algorithm.

As can be seen from left panels in Figure 3, except for the optimization of HAN, which needs less than 150 generations to attain convergence due to its relatively smaller solution space, three changing phases of *μ _{F}* and

*μ*

_{CR}are clearly observed for the optimization of three large networks (i.e., ZJ, BN, and RN). In particular, the value of

*μ*declines rapidly in contrast to the rapid rise of the

_{F}*μ*

_{CR}during phase I. This was followed by a continual increase in

*μ*and a consistent decrease in

_{F}*μ*

_{CR}during phase II. Finally, there is a significant increase in

*μ*

_{CR}during phase III, as well as a moderate increase of

*μ*.

_{F}*μ*and

_{F}*μ*

_{CR}demonstrates a dynamical balance between exploitative and exploratory searching, with lower

*μ*and higher

_{F}*μ*

_{CR}favoring exploitative searching in the early stages, whereas higher

*μ*and lower

_{F}*μ*

_{CR}support exploratory searching in the latter stages. This phenomenon can be explained by examining the fitness landscape characteristic of WDS optimization problems. According to Bi

*et al.*(2016), the fitness landscape of larger-scale WDS optimization problems typically has a big-bowl macrostructure, which consists of a complicated and rugged microstructure at the bottom but a reasonably smooth solution space above, as shown in Figure 4. As a result, exploitative search is preferred in phase I, assisting in the fast population movement to the promising region (i.e., the bottom of the big bowl). This is reflected by the fact that, as shown in the right panels of Figure 3, the best solution found in phase I decreases quickly.

However, given the rugged solution space at the bottom of the big bowl, exploratory searching associated with increasing *μ _{F}* will aid the search to escape from local optimal solutions during phase II. This is illustrated by the slow improvement rate of the best solution found in this phase, as shown in the right panels of Figure 3. In phase III, stronger exploratory searching is necessary to further polish optimal solutions identified in phase II, explaining why

*μ*and

_{F}*μ*

_{CR}abruptly increased. Based on the previous analysis, it is clear that the proposed PAS is capable of dynamically adapting appropriate

*μ*and

_{F}*μ*

_{CR}to the fitness landscape characteristics of larger-size WDS optimization problems, resulting in a dynamical balance between exploitation and exploration. This explains why the proposed SA-SSDE algorithm is superior at exploring global optimums while also being more efficient.

## SUMMARY AND CONCLUSION

This paper presents a novel SA-SSDE algorithm that aims to address issues regarding the slow convergence, reduced exploratory ability, and parameter adjustment encountered with the classic DE for large-scale WDS optimization. The DE/current-to-*p*best/n mutation and sorting selection operators, as well as an enhanced parameter adaptation strategy, are employed in the proposed SA-SSDE algorithm to achieve this aim. The efficacy of the proposed SA-SSDE algorithm was validated using four well-known WDSs, including three large well-known networks (ZJ_{164}, BN_{454}, and RN_{476}), with studies into searching behavior and control parameter evolution providing the following conclusions:

- (1)
The proposed SA-SSDE algorithm outperforms existing EAs in terms of exploring global optimums for large-scale WDS optimization problems, as evidenced by the discovery of new best solutions for ZJ

_{164}, BN_{454}, and RN_{476}with costs of $7.068 million, €1.9205 million, and $30.852 million, respectively. Nevertheless, it should be emphasized that because all EAs are stochastic, the uncertainty in solution quality might be as high as 3%. This uncertainty can be reduced by increasing population size and running the proposed SA-SSDE algorithm with a different initial population. - (2)
The proposed SA-SSDE algorithm has a significantly faster convergence speed than the classic DE, and this advantage grows with network size. This is reflected by the fact that the proposed SA-SSDE algorithm found optimal solutions 1.02, 1.92, and 5.99 times faster than the classic DE for ZJ

_{164}, BN_{454}, and RN_{559}cases with increased network size. Note that rather than the DE/current-to-*p*best/n mutation operator, the sorting selection operator is the key to speeding up DE's convergence. The DE/current-to-*p*best/n mutation operator works better with PAS than the traditional DE/rand/1 mutation operator, which is why this study used it. - (3)
The proposed DE/current-to-pbest/n mutation and sorting selection operators allow for rapid reduction of population diversity even with a large

*F*, encouraging exploitative searching. This explains why the proposed SA-SSDE algorithm surpasses the classic DE in terms of computational efficiency, providing the groundwork for developing more efficient EAs by modifying the mutation and tournament selection operators. - (4)
The proposed parameter self-adaptation strategy can dynamically adapt appropriate

*F*and larger CR values to the WDS optimization problems with a big-bowl function structure, with small*F*and larger CR generated early to encourage exploitative searching and large*F*and small (mediate) CR generated later to support exploration searching. The resultant dynamical balance between exploitation and exploration leads to the superior ability of the proposed SA-SSDE algorithm to explore global optimums. However, because the proposed SA-SSDE algorithm is designed specifically for WDS optimization problems with a large bowl function structure and extremes tightly grouped at the bottom, its performance on other problem types may suffer according to ‘no free lunch theorem’. More case studies are needed to evaluate the proposed algorithm's performance in future work. - (5)
The six parameters in PAS were subjected to parameter sensitivity tests, with the findings indicating that

*σ*and_{F}*σ*_{CR}= 0.01 are more suited for WDS optimization. This finding differs from existing studies that use*σ*and_{F}*σ*_{CR}= 0.1 for various numerical benchmarks, allowing researchers to develop more powerful PAS to solve real-world engineering problems. Besides, when compared to other well-known PASs like those in the jDE (Brest*et al.*2006) and the SADE (Zheng*et al.*2013), the PAS framework proposed by Zhang & Sanderson (2009) in JADE is more suited for WDS optimization and hence merits greater attention from WDS researchers. - (6)
Results in the case study demonstrate the merit of the proposed SA-SSDE algorithm in least-cost single-objective WDS optimization, and future research will extend it to multi-objective WDS optimization problems. In addition, the proposed SA-SSDE algorithm can also be utilized to solve other water resources management problems, such as burst localization (Zhou

*et al.*2019) and model calibration (Chen*et al.*2022).

## ACKNOWLEDGEMENTS

This work is supported by the Key R&D projects in Yunnan Province (202003AC100001), the Key R&D plan of Yunnan Province (202103AC10017), and the National Natural Science Foundation of China (51608424). The authors thank to Dr Qi Wang for providing the WDS simulation code which greatly facilitates algorithm performance testing. We are also thankful to the reviewers for their constructive suggestions which greatly improved the quality of this paper.

## DATA AVAILABILITY STATEMENT

All relevant data are included in the paper or its Supplementary Information. Pipe diameter combinations for the new best solutions of ZJ, RN and BN cases are given in supplementary file, as well as comparison between the optimal results of €1.9205 million and €1.9230 million for BIN case, aiming to help readers better understand the optimal results.

## CONFLICT OF INTEREST

The authors declare there is no conflict.