## Abstract

The differential evolution (DE) algorithm is considered the most powerful evolutionary algorithm (EA) for the optimal design of water distribution systems (WDSs). However, when dealing with large-scale WDS optimization, issues such as premature convergence become a concern. This paper presents an auto-enhanced population diversity and ranking selection-based differential evolutionary (AEPD-RSDE) algorithm for the optimal design of WDSs, which is the first work that incorporates an AEPD strategy to avoid the premature convergence issue and enhance the exploration ability of DE applied to WDS optimization. Besides, the proposed algorithm includes a ranking selection strategy that replaces the tournament selection operator to enhance convergence speed. Three well-known WDSs, i.e., the New York Tunnels (NYT), the Hanoi network (HAN), and the Balerma irrigation network (BIN), were used to validate the proposed algorithm. Results indicate the proposed algorithm is able to find the current best solution, with a success rate of 100% for the NYT and HAN cases and lower average cost solution of €1.921 million for the BIN case relative to other EAs. Instead of solely focusing on ultimate performance comparison, search behavior analyses are conducted between different mutation and selection operators, offering deep insight to guide the development of more advanced EAs.

## HIGHLIGHTS

This paper introduces an auto-enhanced population diversity strategy to overcome premature convergence and enhance the exploration ability of differential evolution (DE) in the optimization of water distribution systems (WDS).

The paper compares the searching behavior of mutation and selection operators and provides an explanation for why the proposed algorithm is more effective than traditional DE.

## INTRODUCTION

Water distribution systems (WDSs) require a significant amount of capital for their construction, making them one of the most expensive types of urban infrastructure to build (Al-Khomairi *et al.* 2020; Albarakati *et al.* 2021; Liu & Kang 2022; Xiong & Wang 2022). Therefore, researchers have conducted extensive studies on optimizing WDS design to minimize construction costs (Bi *et al.* 2015). This optimization process involves finding the combination of pipe diameters that results in the lowest network cost while meeting pressure and demand requirements (Yin *et al.* 2021; Wang *et al.* 2022; Yilmaz *et al.* 2022). However, this task is challenging due to the vast search space encountered in practical engineering and the highly nonlinear constraints that need to be evaluated (Maier *et al.* 2014; Wang *et al.* 2020; Zhang *et al.* 2022).

Traditional optimization methods, including linear and nonlinear programming (NLP), have been employed for WDS optimization (Alperovits & Shamir 1977; Lansey & Mays 1989; Mala-Jetmarova *et al.* 2018). However, the high-dimensional and discrete nature of WDS optimization problems can cause these methods to become trapped in local minima (Tolson *et al.* 2009). As a result, population-based evolutionary algorithms (EAs) have become increasingly popular for WDS optimization due to their ability to explore the solution space more broadly and handle discrete pipe diameters with ease, which traditional methods struggle to do. Some representative EAs include the genetic algorithm (GA) (Simpson *et al.* 1994; Dandy *et al.* 1996; Savic & Walters 1997), particle swarm optimization (Suribabu & Neelakantan 2006), ant colony optimization (Ursem 2002; Maier *et al.* 2003), and differential evolution (DE) algorithm (Suribabu 2010). Comparative analysis has demonstrated that the DE algorithm is superior to other EAs in finding optimal solutions for various WDSs (Marchi *et al.* 2014; Zheng *et al.* 2015; Moosavian & Lence 2019). Nevertheless, when applying DE to large-scale WDS optimizations, the issue of premature convergence persists and can lead to DE's inability to explore global optimal solutions (Qin *et al.* 2009; Zhang & Sanderson 2009).

Consider the optimal design of the Balerma irrigation network (BIN) as an example. The DE algorithm yielded the best solution of €1.983 million (Zheng *et al.* 2011), which was notably inferior compared to the solutions obtained using the HDP-GA (€1.941 million) (Liu *et al.* 2020), NLP-DE (€1.923 million) (Zheng *et al.* 2011), and TS-NSGA (€1.9214 million) (Cisty *et al.* 2016) algorithms. However, although these hybrid algorithms have shown great performance, it is worth noting that they require additional expertise or mathematical skills to narrow down the solution space or generate an initial solution to guide the EA's search. As a result, these could be challenging for practitioners to use in engineering practice. For instance, the HDP-GA algorithm involves identifying a solution based on the least head loss for each pipe in the supply path, followed by refining the solution using the GA. Practitioners need to develop a specific program to iteratively adjust pipe diameters when implementing the HDP method. Similarly, the NLP-DE approach utilizes graph theory and NLP algorithms to generate an initial solution and identify the approximate region of the optimal solution, necessitating graph theory and NLP expertise. Additionally, the TS-NSGA method involves a two-stage optimization process, where sub-optimal solutions are obtained in the first stage to narrow down the search space in the second stage. This method requires implementing NSGA-II twice with varying parameters and expertise in narrowing down the search space.

This paper presents a novel optimization algorithm called the auto-enhanced population diversity and ranking selection-based differential evolutionary (AEPD-RSDE) algorithm for the optimal design of WDSs. Unlike the above hybrid algorithms, AEPD-RSDE overcomes the issue of premature convergence and improves the exploration of global optimal solutions without requiring additional expertise or mathematical skills, making it more practical for real-world applications. The algorithm's core idea is to automatically enhance population diversity when convergence and stagnation are detected at the dimensional level. In addition, to accelerate convergence and reduce computational overhead, the algorithm includes a ranking selection strategy that replaces the tournament selection operator. This strategy is designed to select the most superior individuals in the trial and target vectors for the next generation, rather than using pair-wise comparison as in the tournament selection operator.

The design concept of the proposed AEPD-RSDE algorithm is to incorporate a greedy selection strategy to facilitate rapid convergence and utilize the AEPD strategy to enhance the exploration of the solution space. This concept is derived from the latest developments in the field of DE research. Specifically, in DE research, common approaches to enhance convergence speed involve modifying mutation, crossover, and selection strategies. For example, Das *et al.* (2009) introduced two topological neighborhood models into the DE mutation operator. Mallipeddi & Suganthan (2010) presented EPSDE, a DE variant that utilizes an ensemble of mutation and crossover strategies with control parameters. Dorronsoro & Bouvry (2011) developed HdDE, a heterogeneous distributed algorithm with two islands employing DE/rand/1/bin and GPBX-*α* mutation strategies. Tian *et al.* (2017) proposed a novel DE algorithm that combines a diversity-based selection strategy with a combined mutation strategy to improve search efficiency. In the AEPD-RSDE algorithm, the DE/current-to-pbest/1 mutation operator proposed by Zhang & Sanderson (2009), and the ranking selection operators proposed by Du *et al.* (2022) are adopted to enable fast convergence.

Another crucial aspect of DE research focuses on preserving population diversity, which is essential for preventing premature convergence and facilitating effective exploration of the solution space. This direction can be categorized into three approaches. The first approach involves adjusting the values of parameters *F* and Cr to increase the variance of the trial vector population, thereby promoting population diversity. Adaptive DE algorithms such as SaDE (Qin & Suganthan 2005), jDE (Brest *et al.* 2006), and JADE (Zhang & Sanderson 2009) fall into this category. The second approach addresses population diversity by perturbing population elements through the re-initialization of specific individuals (Mendes & Mohais 2005), specifically designed for dynamic optimization problems. Another strategy is selective destruction (Maresky *et al.* 1995), where a certain proportion of genes that have converged are retained for the next convergence phase, while the remaining genes are reinitialized. The third approach involves structured populations and controlled migration to preserve population diversity. For example, Blackwell & Branke (2004) proposed the DynDE algorithm, which utilizes an exclusion strategy to ensure that each population remains on different peaks. In the proposed algorithm, the AEPD strategy, initially proposed by Yang *et al.* (2015), is introduced and modified to restart the research when convergence or stagnation is detected.

This study makes two significant contributions to the optimal design of WDS. Firstly, it introduces the novel application of AEPD to DE-based optimal design of WDSs, which is validated for the first time using three well-known benchmark networks. Secondly, the study compares different mutation and selection strategies, not only focusing on ultimate performance but also on algorithm search behavior. This provides valuable insights into the impact of varied mutation and selection strategies on algorithm search behavior, guiding the development of more advanced EAs in the future.

The remaining sections of this paper are structured as follows. Firstly, the optimal design formulation for WDS is presented, followed by a description of the proposed AEPD-RSDE algorithm. Then, a brief overview of three WDS case studies is presented. The results and conclusions are presented in the following section, which includes a comparison of the algorithm's performance with that of other EAs, an illustration of the effectiveness of the AEPD strategy, and a comparison of the searching behavior between different mutation and selection strategies. Finally, the paper concludes with a summary and conclusions.

## FORMULATION OF THE OPTIMAL DESIGN OF WDS

*et al.*2011; Marchi

*et al.*2014; Bi

*et al.*2016; Liu

*et al.*2020). Second, this simplification allows the optimization problem to remain an intractable and non-deterministic polynomial hard problem, which has no effect on the validation of the algorithm's performance (Kang & Lansey 2012; Wang

*et al.*2015). Therefore, the objective function and constraints for this optimization problem can be written as follows:

*f*(

**D**) is the objective of the optimization problem that minimizes the network cost;

**D**represents a combination of candidate pipe diameters. The cost of each pipe

*i*with diameter

*d*and length

_{i}*l*is represented by

_{i}*C*(

*d*)·

_{i}*l*. The nodal head vector,

_{i}**H**(

**D**), is identified based on the decision variable

**D**. The set of optional pipe diameters is denoted by

**A**and

*h*

_{min}is the minimum nodal head that must be satisfied for a specific WDS optimization problem. Besides, the real-value coding scheme is adopted in this study due to its documented effectiveness in previous studies (Wang

*et al.*2015). In addition, the network is solved using EPANET software (Rossman 2000), and the degree of constraint violation is identified using Equation (4). In this equation, nodes with pressure heads greater than

*h*

_{min}are assumed to have zero value of constraint violation. It is worth noting that the calculation of constraint violation allows for the comparison of two infeasible solutions, enabling the selection of the better one based on a smaller degree of constraint violation. Consequently, this process plays a crucial role in achieving the evolution of the population.

## THE PROPOSED AEPD-RSDE ALGORITHM

The proposed AEPD-RSDE algorithm has a similar evolutionary scheme to the traditional DE algorithm, which includes mutation, crossover, and selection. However, the proposed algorithm has three main improvements over the traditional DE algorithm: (1) it uses the DE/current-to-*p*best/1 mutation with an archive operator instead of the DE/rand/1 mutation; (2) it uses a ranking selection strategy instead of a tournament selection; and (3) it introduces and modifies the AEPD strategy to accommodate the WDS optimization problem. These three improvements are elaborated as follows.

### The DE/current-to-*p*best/1 mutation with an archive operator

*p*best/1 mutation operator was firstly proposed by Zhang & Sanderson (2009) by incorporating the information of

*p*best solution, such that the converge speed can be accelerated. However, this may cause premature convergence, so they used an archive of historical individuals that failed in selection to improve population diversity. The following describes the DE/current-to-

*p*best/1 mutation with an archive operator.where is the mutation vector; , and are randomly chosen from

**P**, with being one of the top 100

*p*% individuals with

*p*∈ (0,1]; is randomly chosen from

**P**∪

**A**.

**P**represents the current population, while

**A**denotes the set of archived inferior solutions that failed in the selection operator. For further information on the formulation and specifics of

**A,**please refer to the work of Zhang & Sanderson (2009).

### The ranking selection operator

Selection is a crucial operator that enables population evolution by selecting better individuals after applying mutation and crossover operators. DE and most DE variants employ the tournament selection operator, which compares the trial vector and the target vector pair-wise. Therefore, this strategy is beneficial in maintaining population diversity as it extends the life span of inferior individuals (Noman & Iba 2008; Islam *et al.* 2012). However, this strategy might also slow down the convergence and significantly increase the computation cost, especially for problems with long simulation times, such as large-scale WDS optimization. In fact, Bi *et al.* (2016) showed that a searching mechanism that reduces population diversity quickly and favors exploitative searching is better for large-scale WDS optimization.

To this end, the ranking selection operator is designed to allow as many superior individuals as possible in the trial vector and the target vector to survive to the next generation instead of pair-wise comparison. The proposed ranking selection strategy is intuitively straightforward and consists of three steps. First, we classify the individuals in the trial and target vectors as either feasible or infeasible solutions. Second, we sort the feasible solutions in ascending order of their objective function values, and the infeasible solutions in ascending order of their constraint violations. Third, we select the individuals by giving preference to the feasible solutions with the lowest objective function values, and then to the infeasible solutions with the lowest constraint violations, until we reach the desired population size. The pseudo code shows how to sort a population with both feasible and infeasible solutions.

**Pseudo code for sorting population with both feasible and infeasible solutions**

Input: The combination of trial and target vectors (defined as X), with population size 2NP; |

Output: The ranking population (defined as Y); |

1:For i= 1: 2NP |

2:If (Cons(i) = =0) |

3:feasibleY = [feasibleX;X] |

4:else |

5:infeasibleY = [infeasibleX;X] |

6:end |

7: end |

8: Sort the feasibleY to build the new feasibleY based on their objective function values; |

9: Sort the infeasibleY to build the newinfeasibleY based on their constraint violations; |

10: Y = [newfeasibleY; newinfeasibleY] |

Input: The combination of trial and target vectors (defined as X), with population size 2NP; |

Output: The ranking population (defined as Y); |

1:For i= 1: 2NP |

2:If (Cons(i) = =0) |

3:feasibleY = [feasibleX;X] |

4:else |

5:infeasibleY = [infeasibleX;X] |

6:end |

7: end |

8: Sort the feasibleY to build the new feasibleY based on their objective function values; |

9: Sort the infeasibleY to build the newinfeasibleY based on their constraint violations; |

10: Y = [newfeasibleY; newinfeasibleY] |

### The AEPD strategy

The AEPD strategy is a mechanism that automatically diversifies a population at the dimensional level when it converges or stagnates, based on the population distribution measurement. Yang *et al.* (2015) first proposed the AEPD strategy for continuous variable optimization problems, which requires some modifications to adapt to WDS optimization that deals with real number variables. The modified AEPD strategy is applied after the selection operator, as described below.

**Step 1: Identification of Population Convergence or Stagnation**

*std*is used to measure population diversity in the

_{j,G}*j*th dimension, with the smaller value indicating a lower population diversity, and zero value representing complete convergence. For WDS optimization with real number variables,

*std*= 0 indicates that all individuals converge to an identical one. Therefore, a flag () can be used to denote whether the population has converged in the

_{j,G}*j*th dimension at

*G*th generation.

*std*≠ 0, but the distribution of population does not change, leading to an inability to generate better offspring. Stagnation can be identified by monitoring the variation in the value of

_{j,G}*m*and

_{j,G}*std*across successive generations. Specifically, the value of the stagnation counter (denoted as ) for the

_{j,G}*j*th dimension at the

*G*th is calculated as follows:where . is used as the flag that indicates whether there has been a stagnation in the

*j*th dimension at the

*G*th generation.

Yang *et al.* (2015) proposed that UN, an integer value, could be equivalent to the total population size. If *λ _{j,G}* ≥

*UN*, the value of is set to 1, indicating that the population reaches stagnation in the

*j*th dimension as its distribution remains unchanged for several consecutive generations.

**Step 2: Enhancement of Population Diversity**

*j*th dimension at the

*G*th generation, immediately re-diversifying the population can disrupt the evolutionary process, even in dimensions that were not selected for diversification. To address this issue (Yang

*et al.*2015), the following mechanism is adopted:where the variable

*z*serves as a flag to determine whether the population needs to undergo re-diversification in the

_{G}*G*th generation. If

*z*= 1, the random re-initialization method is used to re-diversify individuals, except the best one. Equation (12) shows that the population may undergo re-diversification with a low probability (

_{G}*c*= 10

^{−3}), by incorporating a uniformly distributed random number (

*randu*), in addition to complete convergence or stagnation. This approach, which was suggested by Yang

*et al.*(2015), addresses the issue of requiring a large number of function evaluations to flag all dimensions for re-diversification in high-dimensional problems. In addition, it is important to note that the method of randomly re-initializing populations used in this study for WDS optimization differs from the approach proposed by Yang

*et al.*(2015) for continuous variable optimization problems. This is because WDS optimization problems employ a real-value coding scheme, and the variables involved in these problems typically vary within a narrow range, making the specialized approach used for continuous variable optimization unnecessary.

## CASE STUDIES

The AEPD-RSDE algorithm proposed in this study was evaluated using three case studies, which are listed in Table 1. The necessary information about the networks, including the available diameters and the corresponding unit pipe costs, can be found on the Centre for Water Systems website at the University of Exeter (Wang *et al.* 2015). These three cases have been most widely utilized in the previous research to assess the effectiveness of different EAs. Thus, using these cases facilitates comparisons between the proposed AEPD-RSDE algorithm and other EAs.

WDS case study . | Number of decision variables . | Number of available pipe diameters . | Search space size . |
---|---|---|---|

NYT | 21 | 16 | 1.934 × 10^{25} |

HAN | 34 | 6 | 2.865 × 10^{26} |

BIN | 454 | 10 | 1 × 10^{454} |

WDS case study . | Number of decision variables . | Number of available pipe diameters . | Search space size . |
---|---|---|---|

NYT | 21 | 16 | 1.934 × 10^{25} |

HAN | 34 | 6 | 2.865 × 10^{26} |

BIN | 454 | 10 | 1 × 10^{454} |

## RESULT AND DISCUSSION

### Algorithm performance comparison and discussion

Table 2 provides a summary of the results obtained from applying the proposed AEPD-RSDE algorithm and other EAs to three different case studies. The table allows for a comparison of the performance of different EAs on networks of varying sizes. The results in Table 2 are ranked based on the percentage of trials that resulted in the best solution found (fifth column). Note that in the case of the NYT and HAN networks, it appears that different algorithms achieved the same average cost solutions. This can be attributed to the rounding-up of calculations when the solutions are very close. In such situations, the ranking of algorithms is determined based on the percentage of the best solution found.

Case study . | Algorithm . | Number of trials . | Best solution found . | Percent with the best solution found . | Average cost solution . | Average evaluations to find the first occurrence of the final solution . | Maximum allowable evaluations or evaluations for convergence . |
---|---|---|---|---|---|---|---|

NYT | AEPD-RSDE | 100 | 38.64 | 100 | 38.64 | 17,864 | 20,000 |

NLP-DE | 100 | 38.64 | 99 | 38.64 | 10,631 | 20,000 | |

SADE | 50 | 38.64 | 92 | 38.64 | 6,598 | 9,227 | |

GHEST | 60 | 38.64 | 92 | 38.64 | 11,464 | — | |

HD-DDS | 50 | 38.64 | 86 | 38.65 | 47,000 | 50,000 | |

Suribabu DE | 300 | 38.64 | 71 | NA | 5,492 | 10,000 | |

MMAS-ACO | 20 | 38.64 | 60 | 38.84 | 30,700 | 50,000 | |

PSO | 30 | 38.64 | 33 | 38.93 | NA | 80,000 | |

HAN | AEPD-RSDE | 100 | 6.081 | 100 | 6.081 | 49,012 | 200,000 |

NLP-DE | 100 | 6.081 | 98 | 6.081 | 42,782 | 80,000 | |

SA-SSDE | 100 | 6.081 | 97 | 6.081 | 45,105 | 49,926 | |

DE^{a} | 100 | 6.081 | 95 | 6.089 | 87,220 | 200,000 | |

SADE | 50 | 6.081 | 84 | 6.090 | 60,532 | 100,000 | |

DE^{b} | 300 | 6.081 | 80 | NA | 48,724 | 50,000 | |

GHEST | 60 | 6.081 | 64 | NA | 43,149 | — | |

PSO variant | 2000 | 6.081 | 5 | 6.310 | NA | 500,000 | |

BIN | AEPD-RSDE | 10 | 1.9205 | 10 | 1.921 | 2,073,600 | 2,500,000 |

SA-SSDE | 10 | 1.9205 | 10 | 1.924 | 787,365 | 789,705 | |

TS-NSGA | 10 | 1.9214 | 0 | 1.925 | 10,640,000 | 20,000,000 | |

NLP-DE | 10 | 1.923 | 0 | 1.927 | 1,427,850 | 2,000,000 | |

HDP-GA | 10 | 1.941 | 0 | 1.954 | — | 1,300,000 | |

DE^{c} | 10 | 1.982 | 0 | 1.989 | 9,210,143 | 10,000,000 | |

SADE | 10 | 1.983 | 0 | 1.995 | 1,200,000 | 1,300,000 | |

DE^{d} | 10 | 1.998 | 0 | 2.031 | 2,300,000 | 2,400,000 |

Case study . | Algorithm . | Number of trials . | Best solution found . | Percent with the best solution found . | Average cost solution . | Average evaluations to find the first occurrence of the final solution . | Maximum allowable evaluations or evaluations for convergence . |
---|---|---|---|---|---|---|---|

NYT | AEPD-RSDE | 100 | 38.64 | 100 | 38.64 | 17,864 | 20,000 |

NLP-DE | 100 | 38.64 | 99 | 38.64 | 10,631 | 20,000 | |

SADE | 50 | 38.64 | 92 | 38.64 | 6,598 | 9,227 | |

GHEST | 60 | 38.64 | 92 | 38.64 | 11,464 | — | |

HD-DDS | 50 | 38.64 | 86 | 38.65 | 47,000 | 50,000 | |

Suribabu DE | 300 | 38.64 | 71 | NA | 5,492 | 10,000 | |

MMAS-ACO | 20 | 38.64 | 60 | 38.84 | 30,700 | 50,000 | |

PSO | 30 | 38.64 | 33 | 38.93 | NA | 80,000 | |

HAN | AEPD-RSDE | 100 | 6.081 | 100 | 6.081 | 49,012 | 200,000 |

NLP-DE | 100 | 6.081 | 98 | 6.081 | 42,782 | 80,000 | |

SA-SSDE | 100 | 6.081 | 97 | 6.081 | 45,105 | 49,926 | |

DE^{a} | 100 | 6.081 | 95 | 6.089 | 87,220 | 200,000 | |

SADE | 50 | 6.081 | 84 | 6.090 | 60,532 | 100,000 | |

DE^{b} | 300 | 6.081 | 80 | NA | 48,724 | 50,000 | |

GHEST | 60 | 6.081 | 64 | NA | 43,149 | — | |

PSO variant | 2000 | 6.081 | 5 | 6.310 | NA | 500,000 | |

BIN | AEPD-RSDE | 10 | 1.9205 | 10 | 1.921 | 2,073,600 | 2,500,000 |

SA-SSDE | 10 | 1.9205 | 10 | 1.924 | 787,365 | 789,705 | |

TS-NSGA | 10 | 1.9214 | 0 | 1.925 | 10,640,000 | 20,000,000 | |

NLP-DE | 10 | 1.923 | 0 | 1.927 | 1,427,850 | 2,000,000 | |

HDP-GA | 10 | 1.941 | 0 | 1.954 | — | 1,300,000 | |

DE^{c} | 10 | 1.982 | 0 | 1.989 | 9,210,143 | 10,000,000 | |

SADE | 10 | 1.983 | 0 | 1.995 | 1,200,000 | 1,300,000 | |

DE^{d} | 10 | 1.998 | 0 | 2.031 | 2,300,000 | 2,400,000 |

a = the classic DE with N=100, F=0.7 and CR=0.8 (Zheng *et al*. 2011);

b = the classic DE with N=100, F=0.6∼0.9 and CR=0.3∼0.6 (Suribabu 2010);

c = the classic DE with N=500, F=0.3 and CR=0.8 (Zheng *et al*. 2011);

d = the classic DE with N=500, F=0.3 and CR=0.5 (Zheng *et al*. 2013);

Besides, for the AEPD-RSDE algorithm, the following tuned-parameter settings were used: NP = 100, *F* = 0.9, and Cr = 0.6 for the NYT case; NP = 300, *F* = 0.7, and Cr = 0.7 for the HAN case study; and NP = 500, *F* = 0.6, and Cr = 0.2 for the BIN case, where NP represents the number of population size. For parameter settings in different WDS optimization problems, it is advisable to select *F* values ranging from 0.5 to 0.7 and Cr values ranging from 0.1 to 0.3 for large-size networks. For small-size networks, it is recommended to choose *F* values ranging from 0.7 to 0.9 and Cr values ranging from 0.5 to 0.7.

Based on the results shown in Table 2, it can be observed that the proposed AEPD-RSDE algorithm performs better than other EAs in terms of exploring the global optimum for the three case studies. This is evidenced by the fact that the proposed algorithm achieved a 100% success rate in finding the current best solution for the NTY and HAN cases, which is higher than other EAs. Moreover, the proposed AEPD-RSDE algorithm obtained the current best solution of €1.9205 million, which is better than the solutions of €1.9214 and €1.923 million reported by Cisty *et al.* (2016) and Zheng *et al.* (2011) using the TS-NSGA and NLP-DE algorithms, respectively. Additionally, the proposed AEPD-RSDE algorithm outperformed the SA-SSDE (Du *et al.* 2022) in terms of the average cost solution, with €1.921 million being smaller than €1.924 million. These results highlight the merit of the proposed AEPD-RSDE algorithm in exploring the global optimal solution for larger-size WDSs, which are representative of those encountered when solving practical engineering problems. This superior performance can be attributed to the adoption of the AEPD strategy, which will be illustrated in the next section.

In terms of computational efficiency, comparisons across the three case studies demonstrate the advantage of the proposed AEPD-RSDE algorithm in solving large-size networks. For instance, in the BIN case with 454 variables, the proposed AEPD-RSDE algorithm identified the optimal solution using only 2,073,600 average evaluations, which is significantly less than the 10,640,000 evaluations required by TS-NSGA and the 9,210,143 evaluations for traditional DE^{f}. Although NLP-DE takes fewer evaluations to find the optimal solution, it involves the use of graph theory and NLP algorithms to identify a good initial solution to guide DE's search. Additionally, while SADE requires only 1,200,000 evaluations to identify the optimal solution, the identified solution with a cost of €1.983 million is considerably worse than the solution of €1.9205 million found by the proposed AEPD-RSDE algorithm. The great computational benefit of the SA-SSDE algorithm can be attributed to its use of the parameter self-adaptation strategy, while the proposed algorithm uses a fixed parameter setting in addition to the AEPD strategy.

Despite the computational efficiency of the BIN case, the proposed AEPD-RSDE algorithm requires more evaluations for the NYT cases than most other EAs. Specifically, the proposed AEPD-RSDE algorithm found the final optimal solution for the NYT case using 17,864 evaluations, which is significantly greater than the 5,492 and 6,598 evaluations taken by traditional DE and traditional SADE, respectively. This can be explained by the fact that the NYT case has a small and smooth solution space, whose optimal solution can be easily identified without needing the proposed AEPD strategy. Nevertheless, it is important to note that the solving time for small-size networks is typically quite small, and the increased evaluations associated with optimization will not result in a heavy computational burden, thus not impeding the practical application of the proposed algorithm.

### Illustration of the effectiveness of the AEPD strategy

*D*

_{mean}value and the optimal solution achieved during the evolutionary search, with the proposed AEPD-RSDE algorithm applied to three case studies. The

*D*

_{mean}is a convergence measure first proposed by Zecchin

*et al.*(2012) that calculates the mean of the total pair-wise Hamming distance between all solutions within the population, as depicted in the following equation.

The term 2/*N*(*N* − 1) represents the total number of pairs of candidate solutions. The *D*_{mean} metric provides a quantitative measure of solution dispersion across the search domain, where larger and lower values correspond to periods of high exploration (widespread search within the decision space) and exploitation (focused search within the decision space), respectively. The variation of mean search distance with each generation indicates the convergence behavior. A *D*_{mean} value of 0 signifies that all individuals converge to the same solution, leading to a loss of ability to explore better solutions.

As shown in the left panel of Figure 1, the overall value of *D*_{mean} decreases as the number of generation increases, suggesting a convergence of individuals through the process of evolution. Furthermore, the red dots indicate that the proposed AEPD strategy automatically enhances population diversity when the *D*_{mean} value reaches zero. This feature allows the evolutionary search to restart and explore better solutions. Specifically, for the NYT case, one auto-diversify action was performed, resulting in an improvement from $38.81 million to $38.64 million (see Figure 1(b)). For the HAN and BIN cases, two and three auto-diversity actions were observed, respectively, to achieve the final optimal solution (see Figure 1(c)–1(f)). Additionally, as shown by the blue dot in Figure 1(e), an auto-diversify action is seen prior to full convergence for the BIN case. This random action aids in restarting the search, avoids the local optimum, and is especially beneficial while exploring the global optimum of large-size WDS with high problem dimensions.

The above illustration is based on just one trial, and to provide a better understanding of the proposed algorithm, Table 3 summarizes the auto-diversity activities across all trials. The results show that for the NYT and HAN cases, the vast majority (88 out of 100 and 85 out of 100 trials, respectively) did not require any auto-diversity actions to achieve the current best solution. However, for the BIN case, 6 out of 10 trials required more than 5 times of auto-diversity actions to identify the final optimal solution. This is expected as the BIN case has a substantially larger solution space compared to the NYT and HAN cases, and hence more auto-diversity actions are required. The results indicate that the proposed AEPD approach, when sufficient computational resources are available, can consistently diversify the population and explore enhanced solutions, particularly for large-scale WDS optimization problems. This explains the excellent performance of the proposed algorithm in exploring the global optimal solution.

Case study . | Auto-diversity actions . | |||||||
---|---|---|---|---|---|---|---|---|

0 time . | 1 time . | 2 time . | 3 time . | 4 time . | 5 time . | 6 time . | 7 time . | |

NYT | 88 | 6 | 2 | 0 | 2 | 1 | 1 | 0 |

HAN | 85 | 6 | 5 | 2 | 1 | 0 | 1 | 0 |

BIN | 0 | 0 | 0 | 1 | 1 | 2 | 2 | 4 |

Case study . | Auto-diversity actions . | |||||||
---|---|---|---|---|---|---|---|---|

0 time . | 1 time . | 2 time . | 3 time . | 4 time . | 5 time . | 6 time . | 7 time . | |

NYT | 88 | 6 | 2 | 0 | 2 | 1 | 1 | 0 |

HAN | 85 | 6 | 5 | 2 | 1 | 0 | 1 | 0 |

BIN | 0 | 0 | 0 | 1 | 1 | 2 | 2 | 4 |

An interesting observation from the BIN case is that fewer generations are required to achieve convergence after auto-diversity action. Figure 1(e) shows that the search required around 2,000 generations to achieve the first convergence, but after the first auto-diversity action, it only took less than 1,000 generations to converge again. This phenomenon can be attributed to the fact that the search after auto-diversity is guided by the optimal solution previously identified, which leads to faster convergence and, subsequently, fewer evolutions required. This partly explains why the proposed AEPD-RSDE algorithm efficiently explores the global optimum.

### Searching behavior comparison between different mutation and selection strategies

In Figure 2, the black, blue, red, and purple lines represent the results of using the DE/rand/1 mutation with tournament selection (traditional DE), the DE/current-to-*p*best/1 mutation with tournament selection operators, the DE/rand/1 mutation with ranking selection operators, and the DE/current-to-*p*best/1 mutation with ranking selection (proposed algorithm), respectively. All results in Figure 2 are based on the same parameter setting with a population size of 500, *F* = 0.6, and Cr = 0.2. We did not consider the AEPD strategy to focus on the performance difference across different combinations of mutation and selection operators.

Figure 2(a) illustrates that the proposed algorithm has the most rapid decrease speed in *D*_{mean} values (purple line), while the traditional DE exhibits the slowest decrease (black line). This finding explains why the proposed AEPD-RSDE algorithm can converge faster than the traditional DE, resulting in computational efficiency. Moreover, the DE/current-to-*p*best/1 mutation operator (blue line) shows a relatively faster decline than the DE/rand/1 mutation operator (black line). This outcome is not surprising as the DE/current-to-*p*best/1 mutation operator is a greedy operator that favors the quicker reduction of population, leading to a faster decrease of *D*_{mean}. However, the ranking selection strategy proves to be more effective in reducing *D*_{mean} than the DE/current-to-*p*best/1 mutation operator, as the red line declines faster than the blue line in Figure 2(a). This observation suggests that the proposed ranking selection strategy, rather than the DE/current-to-*p*best/1 mutation operator, is the primary driver that enables the rapid decrease of *D*_{mean}, making the proposed AEPD-RSDE algorithm a faster-convergent one.

Analysis of both Figure 2(a) and 2(b) indicates that the algorithm's ability to explore better optimal solutions is positively correlated with its convergence speed. Specifically, the more rapid the increase in *D*_{mean} value, the better the algorithm is at finding optimal solutions. This finding aligns with Bi *et al.*’s (2016) discovery that a searching mechanism favoring exploitation (i.e., a rapid reduction in population diversity) enables the identification of near-optimal solutions more quickly. For instance, as seen in Figure 2(b), the proposed algorithm identified the solution of €1.923 million and achieved convergence in approximately 2,000 generations, whereas the traditional DE failed to converge even after 5,000 generations, and its identified solution of €2.997 million is significantly worse than €1.923 million.

It is important to note that the performance of the algorithm is highly dependent on the parameter settings (i.e., the values of *F* and Cr), and the values of *F* = 0.6 and Cr = 0.2 used in the above tests may not be the best parameter settings for the traditional DE. Zheng *et al.* (2011, 2013) found that the traditional DE with *F* = 0.3 and Cr = 0.8 achieved the best performance, identifying the solution of €1.982 million using approximately 18,420 generations. However, this performance is still significantly worse than that of the proposed algorithm, which identified the solution of €1.923 million in only 2,000 generations. Therefore, it can be concluded that combining the DE/current-to-pbest/1 mutation operator with the proposed ranking selection strategy can be a better alternative to the traditional DE for optimal design of large-size networks. This finding is valuable as it provides new insights into designing advanced EAs by modifying the selection operator.

## SUMMARY AND CONCLUSION

The previous studies have reported that DE is a powerful algorithm for optimizing WDSs, but it can encounter problems with premature convergence, particularly for large networks. To address this issue, this paper proposes the AEPD-RSDE algorithm, which incorporates an AEPD strategy to restart the search when population convergence is detected. Additionally, a ranking selection strategy is developed to facilitate faster convergence, leading to improved computational efficiency. The proposed algorithm is evaluated on three well-known networks, and the results indicate that:

- (1)
The proposed AEPD-RSDE algorithm outperforms other EAs in exploring the global optimum, thanks to its AEPD strategy that consistently diversifies the population for improved solutions, especially for large WDS optimization problems. Its reliability in locating high-quality solutions is evident in achieving a 100% success rate in finding the best solution for the NYT and HAN cases. In addition, for the BIN case, the AEPD-RSDE algorithm obtained an average cost solution with a cost of €1.921 million, surpassing the €1.924 million solution reported by Du

*et al.*(2022) using SA-SSDE. - (2)
Comparing mutation and selection strategies, the proposed ranking selection strategy drives the rapid convergence of the AEPD-RSDE algorithm, making it more computationally efficient than traditional DE. The DE/current-to-

*p*best/1 mutation operator with the ranking selection strategy could be better suited for large network optimization than DE/rand/1 mutation with tournament selection, as evidenced by the former finding the €1.923 million solution in 2,000 generations, compared to the latter's 18,420 generations to find the €1.982 million solution. These findings inform the development of advanced EAs by providing a better understanding of operator influences.

Although the AEPD-RSDE method has demonstrated remarkable performance in solving large-scale WDS optimization problems using the BIN network, further investigations are necessary to fully assess its effectiveness in optimizing real-life WDSs with thousands of nodes and pipes. Additionally, in order to provide a more comprehensive demonstration of the advantages of the proposed algorithm, it is crucial to conduct further comparisons with other metaheuristic methods and explore the potential benefits of combining the proposed strategies with other existing metaheuristic approaches.

## ACKNOWLEDGEMENTS

This work was funded by the National Natural Science Foundation of China (52260011) and the Yunnan Key Research and Development Program (202203AC100004).

## DATA AVAILABILITY STATEMENT

All relevant data are included in the paper or its Supplementary Information.

## CONFLICT OF INTEREST

The authors declare there is no conflict.

Applications of Evolutionary Computing, Evoworkshops: Evobio, Evocomnet, Evohot, Evoiasp, Evomusart, & Evostoc, Coimbra, Portugal, April DBLP, pp. 489–500