To maintain a sufficient chlorine residual in water distribution systems (WDSs), chlorine dosage needs to be regulated. The majority of previous studies that aimed to optimize chlorine dosage in WDSs considered single-species water quality (WQ) models featuring chlorine decay with simple reaction kinetics. Recent efforts have proposed using multi-species water quality (MS-WQ) models to account for chlorine interactions with various chemical and microbiological species, thus providing a comprehensive and accurate evaluation of the WQ within WDSs. Nevertheless, the key challenge of implementing MS-WQ models within optimization frameworks is their high computational cost and poor scalability for larger WDSs. Furthermore, previous optimization studies generally relied on evolutionary algorithms (EAs), which require conducting a significant number of WQ simulations. Bayesian optimization (BO) has been recently suggested as an efficient alternative to EAs for the optimization of computationally expensive functions. This study aims to present a systematic comparison between BO and other widely used EAs for the optimization of MS-WQ in WDSs. A case study featuring a real-life, midsized benchmark WDS was implemented to comprehensively evaluate BO against two widely used EAs: the genetic algorithm (GA) and particle swarm optimization (PSO). The results revealed that BO is notably more computationally efficient and less sensitive to changes in the constraints than the EAs.

  • This study presents a systematic comparison between the performance of BO and two widely used EAs, namely the genetic algorithm (GA) and particle swarm optimization (PSO), for the optimization of MS-WQ in WDSs. The results revealed that BO is notably more efficient than GA and PSO, as it requires significantly fewer evaluations of computationally expensive MS-WQ simulation models.

Controlling drinking water quality (WQ) is essential for sustainable development, as it safeguards public health by preventing waterborne diseases, protects aquatic ecosystems, and supports economic activities reliant on clean water. It also promotes social equity by ensuring universal access to safe water, a fundamental human right. Efficient WQ control reduces the environmental footprint of water treatment processes by conserving resources and minimizing greenhouse gas emissions. Additionally, maintaining WQ aids in adapting to climate change impacts, enhancing community resilience.

The deterioration of the WQ in water distribution systems (WDSs) is a key challenge facing the provision of safe drinking water to the public. After the water leaves water treatment plants, the WQ gradually deteriorates through a set of physical, chemical, and microbiological processes that take place as the water is transported through the WDS (Xin et al. 2019; Monteiro et al. 2020). To prevent recontamination of the treated drinking water by pathogenic species, a sufficient chlorine dose must be applied to ensure a minimum concentration of disinfectant exists everywhere in the WDS. On the other hand, excessive chlorine dosage leads to the formation of hazardous disinfection byproducts (DBPs) (Maheshwari et al. 2020; Zhang & Lu 2021), in addition to causing unpleasant taste and odor in the drinking water (Quintiliani et al. 2018; Maheshwari et al. 2020). Therefore, chlorine injection into the WDS needs to be carefully regulated to overcome the decay of chlorine by side reactions within the bulk flow and at the pipe walls, while simultaneously reducing the formation of DBPs (Nouiri 2017; WHO 2017; Roth & Cornwell 2018; Onyutha & Kwio-Tamale 2022).

Numerous computational frameworks have been introduced in the literature to optimize chlorine dosage in WDSs with various objectives, including minimizing the total injected chlorine mass, maximizing chlorine residuals, and minimizing the formation of DBPs in the WDS. Of the various optimization methods proposed in the literature, evolutionary algorithms (EAs), such as the genetic algorithm (GA) and particle swarm optimization (PSO), remain the most commonly implemented (Kanakoudis & Tsitsifli 2017; Mala-Jetmarova et al. 2017). For instance, Ostfeld & Salomons (2006) applied GA to refine the scheduling of pumping units while also optimizing the design and operation of booster chlorination stations, aiming to reduce both the costs associated with pumping and those related to the design and operation of chlorine boosters. Ohar & Ostfeld (2014) devised a framework based on GA to determine the necessary chlorination dosage at booster stations, ensuring water is delivered with satisfactory residual chlorine and DBP levels. This framework also aims to reduce the total expenses associated with the placement, construction, and operation of these booster stations. Ayvaz & Kentel (2014) created a combined optimization method that integrates GA with linear programming to identify the optimal network of booster stations for a WDS. This approach aims to minimize both the total dosage of chlorine injections and the number of booster stations, ensuring that the residual chlorine concentrations stay within the targeted limits. Munavalli & Kumar (2003) employed GA to evaluate the optimal chlorine dosage at the sources, adhering to the established minimum and maximum chlorine concentration limits at all monitoring points. Kang & Lansey (2010) applied GA to identify the best strategies for valve operations and scheduling booster disinfection to either reduce the amount of chlorine introduced at the sources or lower the high chlorine concentrations at withdrawal locations, all while ensuring that minimum chlorine levels and pressures are maintained across the system. Goyal & Patel (2018) utilized PSO to determine the best placement for booster stations, aiming to reduce the overall rate of chlorine injection. Wang et al. (2010) introduced a combined PSO-GA framework to determine the optimal placement of booster chlorination stations.

Beyond GA and PSO, other optimization methods rooted in evolutionary optimization have also been suggested, such as the immune algorithm (IA), ant colony optimization (ACO), and differential evolution (DE). Chu et al. (2008) explored the use of the IA in conjunction with EPANET to optimize the scheduling of WQ sources, aiming to reduce the total daily amount of chlorine added. Gokce & Ayvaz (2014) utilized an EPANET-DE framework to refine chlorine source scheduling, aiming to enhance the proportion of total safe water supplied to a designated area of the network, while concurrently reducing the overall mass of chlorine injected into the WDS.

Although widely implemented in previous literature, evolutionary optimization methods come at a significant computational cost, which can be attributed to the fact that they require conducting copious evaluations of the underlying objective function. Bayesian optimization (BO) has recently gained significant popularity as an efficient alternative to EAs for solving various optimization problems, such as the optimization of machine learning hyperparameters (Gelbart et al. 2014; Wu et al. 2017). BO builds, and sequentially updates, a probabilistic surrogate model of the objective function by sampling the underlying numerical model. The key to BO's efficiency is that it relies on an explicit acquisition function to balance exploration and exploitation of the function domain, leading to fast convergence. A recent study suggested that BO requires significantly fewer objective function evaluations than EAs in the case of optimizing single-species chlorine concentrations in a small-scale WDS (Moeini et al. 2023b). However, BO displayed higher sensitivity to the choice of optimization parameters than GA (Moeini et al. 2023b).

The majority of previous WQ optimization studies relied on single-species WQ models to describe the transport and decay of chlorine in WDSs (e.g., EPANET). Such models often feature simple chlorine decay kinetics (e.g., first-order decay) that fail to account for the complex reactions between chlorine and other chemical and microbiological species in the WDS. Additionally, single-species WQ models are incapable of accounting for the formation of DBPs. Alternatively, multi-species water quality (MS-WQ) models can account for the complex interactions between chlorine and other species as well as the formation of DBPs. Thus, MS-WQ models can provide a comprehensive evaluation of the WQ in WDSs, which is crucial for reliable optimization of chlorine dosage. Nevertheless, MS-WQ models are computationally expensive to evaluate as they involve the numerical solution of a complex system of coupled partial differential equations. The significant computational expense required to assess such black-box numerical models hinders their application for real-time WQ regulation, especially when implemented within evolutionary optimization routines.

A recent study introduced a computational framework for applying BO to optimize MS-WQ in WDSs (Moeini et al. 2023a). Although BO showed robust performance in optimizing chlorine dosage in WDSs, the results revealed that different BO methods produce widely variable convergence speeds, implying the sensitivity of BO to the choice of the surrogate modeling approach and the acquisition function (Moeini et al. 2023a). Despite the robust performance of BO, further investigation into how its performance compares to that of popular evolutionary optimization approaches for MS-WQ optimization in real-life WDSs is required.

The overarching objective of this study is to conduct a systematic investigation into the performance of BO against two of the most popular EAs, namely GA and PSO, for the optimization of MS-WQ in WDSs. To that end, three simulation–optimization frameworks were developed to optimize chlorine dosage while minimizing the formation of DBPs. A case study featuring a real-life, midsized WDS was implemented to comprehensively evaluate each of the three optimization methods. Each optimization algorithm was coupled with an EPANET-MSX model to conduct MS-WQ simulations for a number of different scenarios. First, the simulation–optimization frameworks were used to conduct a comprehensive sensitivity analysis on each optimization method (GA, PSO, and BO) independently. This sensitivity analysis aims to clarify how different optimization parameters influence each approach and to establish the combination of parameters that provides the best performance for each optimization method. Second, the simulation–optimization frameworks were used to compare the best performance among the three optimization methods in terms of convergence speed, final objective function value, and number of objective function evaluations.

Optimization framework

Figure 1 shows an overview of the simulation–optimization framework developed in this study. The framework allows the user to select any of the three optimization algorithms, namely GA, PSO, and BO, to optimize MS-WQ in WDSs. The Python-based framework couples each of the optimization algorithms with the EPANET-MSX function library toolkit, which is used to conduct hydraulic and WQ simulations in order to simulate chlorine concentrations at all WDS junctions for any given chlorine dosage scenario. Below, a detailed description of the optimization model formulation is provided, followed by an explanation of the processes involved in each of the three optimization algorithms.
Figure 1. Chlorine dosage scheduling simulation–optimization framework.

Optimization model

The optimization process aims to minimize the total cost of chlorination while fulfilling specific WQ criteria at all nodes within the WDS. Chlorination cost includes two components, namely the capital cost of the chlorination system design (CSD) and the operational cost of chlorine dose injection (CDI), which for a single chlorine source can be described as (Ostfeld & Salomons 2006; Abokifa et al. 2019; Moeini et al. 2023a):
$$CDI = c \sum_{i=1}^{N} q_i\, t_i \qquad (1)$$

$$CSD = DRV \cdot \alpha\, q_{max}^{\beta_1}\, V^{\beta_2} \qquad (2)$$

where $q_i$ is the chlorine mass injection rate during injection event '$i$'; $N$ is the number of injection events in 1 day; $t_i$ is the duration of injection event '$i$' in minutes; $c$ is the chlorine injection cost per unit mass of chlorine; $DRV$ is the daily return value coefficient (day$^{-1}$), a function of the annual interest rate and the chlorination system design lifetime (years); $q_{max}$ is the maximum injection rate; $V$ is the total injected mass of chlorine; and $\alpha$, $\beta_1$, and $\beta_2$ are empirical capital cost coefficients. In this regard, $\alpha$ scales the effect of the maximum disinfectant injection rate on the capital cost, capturing the influence of increasing the injection rate capacity of the chlorine source ($q_{max}$) on the overall cost of the system design ($CSD$). Moreover, $\beta_1$ represents the correlation between $CSD$ and $q_{max}$: the higher the value of $\beta_1$, the greater the increase in the capital cost per unit increase in the maximum injection rate. Finally, $\beta_2$ scales the effect of the total amount of disinfectant (chlorine) injected over time ($V$) on the capital cost, accounting for the impact of the total chlorine dosage injected under different injection rates for each case study.
To satisfy WQ standards of multiple species (e.g., chlorine, DBPs, etc.), the abovementioned optimization is constrained by minimum and maximum thresholds for the concentration of each of the simulated species in EPANET-MSX at all junctions. The constraints are formulated as a penalty function (PEN) to account for violations of the upper and lower concentration bounds:
$$PEN = \sum_{s}\sum_{j}\sum_{t}\left[ P_{min}\, \max\left(0,\; C_{min}^{s} - c_{j,t}^{s}\right) + P_{max}\, \max\left(0,\; c_{j,t}^{s} - C_{max}^{s}\right) \right] \qquad (3)$$

where $P_{min}$ and $P_{max}$ are the penalty cost coefficients for violating the minimum and maximum constraints for each species, respectively; $C_{min}^{s}$ and $C_{max}^{s}$ are the minimum and maximum allowable concentrations of species '$s$'; and $c_{j,t}^{s}$ is the residual concentration at junction '$j$' during time step '$t$' [mg/L] for species '$s$'. The latter is obtained by running a WQ simulation in EPANET-MSX for each species using different injection scenarios for the chlorine source.
Finally, the objective function is the summation of capital cost, operational cost, and penalty value:
$$F = CSD + CDI + PEN \qquad (4)$$

The decision variables of the optimization frameworks are the chlorine injection rates, indicated by $q_i$ in Equation (1). Index '$i$' represents a given injection event within a periodic 24-h operation schedule. In this study, each injection rate is assumed to last for 4 h, leading to a total of six decision variables (i.e., $N = 6$).
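To make the formulation concrete, the sketch below shows how Equations (1)–(4) can be evaluated in Python for a candidate schedule of six injection rates. The function `simulate_species_concentrations` is a hypothetical stand-in for the EPANET-MSX toolkit call, and all numerical values shown are illustrative assumptions rather than the calibrated case-study parameters.

```python
import numpy as np

def simulate_species_concentrations(q):
    """Hypothetical stand-in for an EPANET-MSX run: given six 4-h injection
    rates, return concentrations with shape (n_species, n_junctions, n_steps).
    Random values are used here purely so the sketch executes."""
    rng = np.random.default_rng(0)
    return rng.uniform(0.0, 1.0, size=(2, 113, 96))  # e.g., chlorine and THMs

def objective(q, c_unit=1.0, alpha=1.0, beta1=1.0, beta2=1.0, drv=5e-4,
              c_min=np.array([0.2, 0.0]),   # assumed per-species lower bounds
              c_max=np.array([4.0, 0.08]),  # assumed per-species upper bounds
              p_min=0.1, p_max=0.1):
    """Evaluate F = CDI + CSD + PEN (Equations (1)-(4)) for schedule q."""
    q = np.asarray(q, dtype=float)
    t = np.full(q.size, 4 * 60.0)            # each injection event lasts 4 h (240 min)
    cdi = c_unit * np.sum(q * t)             # operational cost, Equation (1)
    v = np.sum(q * t)                        # total injected chlorine mass
    csd = drv * alpha * np.max(q)**beta1 * v**beta2  # capital cost, Equation (2)
    conc = simulate_species_concentrations(q)
    low = np.maximum(0.0, c_min[:, None, None] - conc)   # minimum-bound violations
    high = np.maximum(0.0, conc - c_max[:, None, None])  # maximum-bound violations
    pen = p_min * np.sum(low) + p_max * np.sum(high)     # penalty, Equation (3)
    return cdi + csd + pen                   # Equation (4)

evaluate_objective = objective               # one-argument use: evaluate_objective(q)
print(evaluate_objective([10.0, 12.0, 8.0, 15.0, 9.0, 11.0]))
```

The algorithm sketches in the following sections call this `evaluate_objective` wrapper.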

WQ model

Optimizing chlorine source scheduling in WDSs requires the development of a WQ model to simulate the transport and reaction of disinfectant residuals and DBPs through the WDS. In this study, the multi-species extension of EPANET (EPANET-MSX) is implemented for conducting the MS-WQ simulations. The main inputs to the WQ simulations are (1) the layout of the WDS, including information about network components such as pipes, nodes, pumps, tanks, boosters, etc., (2) water demand information, and (3) the chlorine injection rates ($q_i$) at the chlorine source station.

The WQ model simulates both chlorine decay and the formation of DBPs (e.g., trihalomethanes (THMs)). The kinetic model used in this study can be described as (Ohar & Ostfeld 2014):

$$\frac{dC_{Cl}}{dt} = -k\, C_{Cl}\, C_{R}, \qquad \frac{dC_{R}}{dt} = -y_{R}\, k\, C_{Cl}\, C_{R}, \qquad \frac{dC_{THM}}{dt} = y_{THM}\, k\, C_{Cl}\, C_{R} \qquad (5)$$

where $C_{Cl}$, $C_{R}$, and $C_{THM}$ are the concentrations of chlorine, a fictitious reactant, and DBPs (e.g., THMs), respectively [mg/L]; $k$ is the reaction rate constant [L(mg·h)$^{-1}$]; and $y_{R}$ and $y_{THM}$ are the ratios between the stoichiometric coefficients of the fictitious reactant and THMs to that of chlorine in mg/mg.
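As a worked illustration of Equation (5), the following sketch integrates the three coupled rate equations for a single parcel of water using SciPy; the rate constant, stoichiometric ratios, and initial concentrations are assumed placeholder values, not the calibrated case-study kinetics.

```python
import numpy as np
from scipy.integrate import solve_ivp

k = 0.5       # reaction rate constant [L/(mg·h)] (assumed for illustration)
y_r = 1.0     # stoichiometric ratio, fictitious reactant to chlorine [mg/mg]
y_thm = 0.03  # stoichiometric ratio, THMs to chlorine [mg/mg]

def kinetics(t, c):
    """Right-hand side of Equation (5): chlorine, reactant, and THM rates."""
    c_cl, c_r, c_thm = c
    rate = k * c_cl * c_r
    return [-rate, -y_r * rate, y_thm * rate]

# Initial concentrations [mg/L]: 1.0 chlorine, 0.5 fictitious reactant, 0 THMs.
sol = solve_ivp(kinetics, (0.0, 24.0), [1.0, 0.5, 0.0],
                t_eval=np.linspace(0.0, 24.0, 25))
print(f"after 24 h: chlorine = {sol.y[0, -1]:.3f} mg/L, "
      f"THMs = {sol.y[2, -1]:.4f} mg/L")
```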

Optimization algorithms

In this section, a brief overview of each of the three optimization approaches (GA, PSO, and BO) is provided, together with an introduction to the different parameters used in the comparison and the sensitivity analyses.

Genetic algorithm

GA is a search heuristic inspired by the theory of natural evolution. The algorithm mimics the process of natural selection, where the fittest individuals are selected for reproduction to produce the offspring of the next generation. GAs are commonly used to find high-quality solutions to difficult problems with large, complex search spaces where traditional optimization methods struggle (Asthana 2000; Abdel-Basset et al. 2018). The process begins by randomly generating a population and evaluating the fitness of each individual (i.e., solution). This is followed by selection, in which the fittest individuals are chosen for reproduction (i.e., the better the fitness, the higher the chance of being selected). Reproduction of parents from the previous generation to form the individuals of the following generation involves two key processes, namely crossover and mutation. With a given crossover probability, the characteristics of the parents may cross over to form the new offspring (children); if no crossover is performed, the offspring are exact copies of the parents (Whitley 1994; Kumar et al. 2020). Additionally, with a given mutation probability, the new offspring may have slightly different characteristics from their parents. The steps of selection, crossover, and mutation are repeated until the GA is terminated when a stopping condition is satisfied, e.g., when no significant enhancement in the fitness of the fittest individual is achieved over a certain number of generations (Ahn & Ramakrishna 2003; Pröll & Werthner 2005; Yang 2008).
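A minimal sketch of this loop using PyGAD, the GA library adopted in this study, is given below. The fitness function negates the objective because PyGAD maximizes fitness; the injection-rate bounds are assumed for illustration, the three-argument fitness signature assumes a recent PyGAD release (>= 2.20), and `evaluate_objective` refers to the objective sketch above.

```python
import pygad

def fitness(ga_instance, solution, solution_idx):
    # PyGAD maximizes fitness, so the minimization objective is negated.
    return -evaluate_objective(solution)

ga = pygad.GA(
    num_generations=500,            # cap used in the sensitivity analysis
    num_parents_mating=10,
    sol_per_pop=40,                 # population size found best below
    num_genes=6,                    # six 4-h injection rates
    gene_space={'low': 0.0, 'high': 100.0},  # assumed injection-rate bounds
    mutation_probability=0.035,     # best-performing mutation rate
    crossover_probability=0.9,      # best-performing crossover rate
    fitness_func=fitness,
)
ga.run()
solution, solution_fitness, _ = ga.best_solution()
```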

Particle swarm optimization

PSO is an evolutionary computation technique inspired by the social behavior of bird flocking and fish schooling. It is a population-based optimization tool, meaning it works by simulating a population moving through the problem space, with each member of the population representing a potential solution to the optimization problem (Kennedy & Eberhart 1995). PSO is widely used for solving complex and multidimensional optimization problems due to its simplicity, ease of implementation, and ability to quickly converge to a good solution. In PSO, each potential solution is called a 'particle'. Each particle has a position, represented by coordinates in the problem space, and a velocity, which determines the direction and speed with which it moves across the problem space (Poli et al. 2007). Similar to GAs, PSO uses a fitness function to evaluate how good a solution each particle represents (Venter 2003).

PSO starts with a group (swarm) of random particles. Then the algorithm iterates through several steps to update the velocities and positions of the particles according to simple mathematical formulae based on their personal best positions and the global best among them. For each particle, the velocity update is influenced by the particle's current velocity, the distance from its personal best, and the distance from the global best. The position is then updated by adding the new velocity to the current position. The velocity update equation incorporates three components: (1) the inertia component, which contributes to the particle's momentum, (2) the cognitive component, which represents the particle's memory of its own best position, and (3) the social component, representing the swarm's knowledge of the best position found by any particle. The velocity and position for all particles are repeatedly updated, together with their personal bests and the global best, until a predefined stopping criterion is met, such as a maximum number of iterations or a satisfactory fitness level (Venter 2003; Shi & Eberhart 2004; Cuturi & Fukumizu 2007).
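The corresponding sketch with PySwarms, the PSO library used in this study, follows. PySwarms evaluates the whole swarm at once, so the wrapper maps `evaluate_objective` over the particle positions; the bounds are assumed, and the `c1`, `c2`, and `w` options correspond to the cognitive, social, and inertia parameters discussed above (values taken from within the tested ranges).

```python
import numpy as np
import pyswarms as ps

def swarm_objective(positions):
    # PySwarms passes the swarm as an array of shape (n_particles, n_dims).
    return np.array([evaluate_objective(p) for p in positions])

bounds = (np.zeros(6), np.full(6, 100.0))     # assumed injection-rate bounds
options = {'c1': 2.25, 'c2': 1.5, 'w': 0.8}   # cognitive, social, inertia
optimizer = ps.single.GlobalBestPSO(n_particles=50, dimensions=6,
                                    options=options, bounds=bounds)
best_cost, best_position = optimizer.optimize(swarm_objective, iters=500)
```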

Bayesian optimization

BO is a strategy for optimizing objective functions that are expensive to evaluate. It is particularly useful for the global optimization of black-box functions, where the form of the function is unknown and no derivative information is available. This makes BO ideal for tasks such as hyperparameter tuning in machine learning models, where evaluations can be time-consuming and costly (Seeger 2004; Candelieri et al. 2018; Frazier 2018a). BO is built on two key concepts: the surrogate model and the acquisition function. The surrogate model is a probabilistic model used to approximate the objective function; it provides a belief about the behavior of the objective function based on the data observed so far. Gaussian processes (GPs) are commonly used as surrogate models in BO due to their ability to estimate the uncertainty in predictions, which helps in exploring the objective function efficiently. The acquisition function is used to decide where to sample next. It balances exploration of the search space (trying areas with high uncertainty) against exploitation (focusing on areas near known good results) (Seeger 2004; Wilson & Adams 2013; Shahriari et al. 2016; Frazier 2018b).

BO starts by using a set of initial observations to train the surrogate model, which attempts to capture the underlying pattern of the objective function, including where it might find high or low values. The trained surrogate model is then used to optimize the acquisition function to find the next point to sample, balancing the exploration of unvisited regions of the search space against the exploitation of regions that are likely to offer an improvement over the current best observation. The new observation is then added to the dataset, and the surrogate model is updated to incorporate the new information, becoming more accurate. This process of optimizing the acquisition function, sampling the objective function, and updating the surrogate model is repeated until a stopping criterion is met, such as a maximum number of iterations or a satisfactory level of optimization (Wu et al. 2017; Frazier 2018a; Candelieri et al. 2018).
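A minimal sketch with pyGPGO, the BO library used in this study, is shown below, assuming pyGPGO's current API. pyGPGO maximizes, so the objective is negated; the bounds are assumed, and the Matérn kernel with the upper confidence bound acquisition matches the best-performing combination identified later in this study.

```python
from pyGPGO.covfunc import matern32
from pyGPGO.acquisition import Acquisition
from pyGPGO.surrogates.GaussianProcess import GaussianProcess
from pyGPGO.GPGO import GPGO

def negated_objective(q1, q2, q3, q4, q5, q6):
    # pyGPGO maximizes, so the minimization objective is negated.
    return -evaluate_objective([q1, q2, q3, q4, q5, q6])

cov = matern32()                   # Matérn covariance kernel
gp = GaussianProcess(cov)          # GP surrogate of the objective
acq = Acquisition(mode='UCB')      # upper confidence bound acquisition
params = {f'q{i}': ('cont', [0.0, 100.0]) for i in range(1, 7)}  # assumed bounds
bo = GPGO(gp, acq, negated_objective, params)
bo.run(max_iter=500, init_evals=150)  # 150 initial points per the analysis below
best_params, best_value = bo.getResult()
```

Swapping the covariance class (e.g., `squaredExponential`, `gammaExponential`, `rationalQuadratic`) and the acquisition `mode` reproduces the AC-COV combinations tested in the sensitivity analysis.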

Water distribution system

The 'ZJ' benchmark WDS (Figure 2) is implemented herein to conduct the sensitivity analyses on each optimization method (i.e., GA, PSO, and BO), and to compare their performance under different scenarios. The network comprises one reservoir (source), 113 nodes, and 164 pipes. This benchmark network is based on the Zhi Jiang water distribution system in an eastern province of China and was originally developed by Zheng et al. (2011). The 'ZJ' WDS has been used in previous literature as a case study for WDS optimization applications (Zheng et al. 2011; Bi et al. 2015).
Figure 2. ZJ benchmark water distribution network.

Chlorine is assumed to be injected at a single source location, junction 114 in Figure 2. A simulation duration of 168 h was implemented with a hydraulic time step of 15 min. Chlorine concentrations during the last 24 h were considered in the calculation of constraint violations to ensure that cyclical conditions had been established (Ohar & Ostfeld 2014). The empirical parameters in the objective function ($\alpha$, $\beta_1$, $\beta_2$, and the unit chlorine cost $c$) were taken consistent with the literature (Ohar & Ostfeld 2014). The penalty coefficients for constraint violations ($P_{min}$ and $P_{max}$) were both set at 0.1. The minimum and maximum allowable chlorine concentrations throughout the WDS are 0.2 and 4 mg/L, respectively, and the maximum allowable concentration of THMs was set at 0.08 mg/L based on WQ regulatory standards (US EPA 1998; Abokifa et al. 2019).
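For reference, the case-study settings above can be gathered into a single configuration block; a sketch, in which the 0.2 mg/L minimum chlorine residual is a commonly adopted regulatory floor assumed here:

```python
# Case-study water quality settings (US EPA 1998; Abokifa et al. 2019).
WQ_CONFIG = {
    'chlorine_min_mg_L': 0.2,    # assumed minimum residual chlorine
    'chlorine_max_mg_L': 4.0,    # US EPA maximum residual disinfectant level
    'thm_max_mg_L': 0.08,        # Stage 1 DBPR total THM limit
    'penalty_min': 0.1,          # P_min, minimum-bound violation coefficient
    'penalty_max': 0.1,          # P_max, maximum-bound violation coefficient
    'duration_h': 168,           # simulation horizon; last 24 h checked
    'hydraulic_step_min': 15,    # hydraulic time step in minutes
    'source_junction': 114,      # single chlorine injection location
}
```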

Sensitivity analysis of optimization parameters

The developed simulation–optimization framework was implemented to conduct a systematic sensitivity analysis of each of the three optimization methods, to understand the influence of different optimization parameters and to select the best set of parameters for each technique. Table 1 lists the ranges of parameters used for each method. For GA, the influence of three parameters was investigated, namely the population size (i.e., the number of individual solutions in each generation), the mutation probability, and the crossover probability. For all scenarios, the maximum number of generations was fixed at 500. The sensitivity of PSO to four important parameters, namely the swarm size, cognitive parameter, social parameter, and inertia parameter, was tested. For all scenarios, the total number of iterations was fixed at 500. Finally, we investigated the sensitivity of BO to the choice of the surrogate function and the acquisition function, in addition to two other important parameters that control the performance of BO, namely the covariance kernel length scale and the number of initial points. Previous work suggested that conducting a systematic analysis of these parameters improves the performance of BO in WDS optimization applications. The total number of iterations was limited to 500, consistent with previous studies (Moeini et al. 2023a).

Table 1. Parameters considered for each optimization method in the sensitivity analysis

Method  Parameter                       Range/function
GA      Population size                 10, 20, 30, 40, 50, 100, 150, 200
GA      Mutation probability            0.02, 0.025, 0.03, 0.035
GA      Crossover probability           0.6, 0.7, 0.8, 0.9
PSO     Swarm size                      10, 20, 30, 40, 50, 100, 150, 200
PSO     Cognitive parameter             1.5, 1.75, 2, 2.25
PSO     Social parameter                1.5, 1.75, 2, 2.25
PSO     Inertia parameter               0.6, 0.7, 0.8, 0.9
BO      Kernel length scale             [1, 10e+10]
BO      Covariance kernel               Matérn, squared exponential, gamma exponential, rational quadratic
BO      Number of initial evaluations   [10, 250]
BO      Acquisition function            Expected improvement, probability of improvement, entropy search, upper confidence bound

The variation in the number of scenarios across the three methods (GA, PSO, and BO) is attributed to each algorithm's distinct characteristics and sensitivity requirements. GA and PSO typically require more scenarios to adequately explore the parameter space due to their stochastic nature and reliance on population-based search strategies. On the other hand, BO features a more distinct and limited set of surrogate and acquisition functions due to its sequential nature and use of probabilistic models to guide the search, which is a potential advantage over GA and PSO. The numbers of scenarios and parameter ranges were selected based on previously published work to ensure that each algorithm's strengths are leveraged effectively, leading to a more comprehensive and accurate sensitivity analysis for the optimization of chlorine dosage in WDSs. The PyGAD (pygad), PySwarms (pyswarms), and pyGPGO (pygpgo) Python libraries were used to implement GA, PSO, and BO, respectively. All simulations were executed on a desktop computer with an Intel(R) Core(TM) i7-10700 CPU @ 2.90 GHz and 16 GB of RAM.

Genetic algorithm

Figure 3 shows the performance of GA using different population sizes, represented by (a) the convergence profile and (b) the best objective function value. The figure shows that population sizes of 40 and 50 result in the best performance, while further increasing the population size yields worse final objective function values. Furthermore, in all scenarios, no improvement in the objective function was observed after the first 50 generations, and no significant variability in the convergence speed was observed among the different population sizes. Overall, the results showed that a population size of 40 provides the best performance, converging within 20 generations to the best objective function value.
Figure 3. Performance of GA using different population sizes, represented by (a) the convergence profile and (b) the best objective function value.
Figure 4 shows the performance of GA using different mutation rates, represented by (a) the convergence profile and (b) the best objective function value. The figure shows that changing the mutation rate affects both the convergence speed and the best objective function value achieved by GA. However, no direct correlation between the mutation probability and the performance of GA was observed: Figure 4(a) shows that a mutation rate of 0.025 results in faster convergence compared to 0.02 and 0.03, but slower convergence than a mutation rate of 0.035. Additionally, Figure 4(b) shows that a mutation rate of 0.025 converges to a better objective value compared to 0.02 and 0.03, but a worse one than 0.035. Based on these results, the mutation rate was selected as 0.035, which provides both the fastest convergence (i.e., 180 generations) and the lowest final objective value compared to the other mutation rates.
Figure 4. Performance of GA using different mutation rates, represented by (a) the convergence profile and (b) the best objective function value.
Figure 5 shows the performance of GA using different crossover probabilities represented by (a) the convergence profile, and (b) the best objective function value. Overall, GA showed significantly less sensitivity toward crossover probability compared to the mutation probability. Figure 5(a) shows that all convergence profiles are comparable, indicating that the crossover rate has no significant effect on the convergence speed of GA. Furthermore, Figure 5(b) shows that the final objective values obtained using each of the crossover rates are also comparable. Overall, a crossover probability of 0.9 resulted in the best objective value and was hence selected for the remainder of the analysis in this study.
Figure 5. Performance of GA using different crossover probabilities, represented by (a) the convergence profile and (b) the best objective function value.

Particle swarm optimization

Figure 6(a) shows the convergence profiles achieved using different swarm sizes. Overall, no significant enhancement in the objective function is observed after 160 iterations. The figure shows that increasing the swarm size leads to faster convergence, which is attributable to the higher number of evaluations in each iteration. Figure 6(b) shows the best objective values obtained using each swarm size. Although increasing the swarm size results in faster convergence, it does not improve the quality of the optimization solution. The figure suggests that using 50 particles in each swarm results in the lowest objective function, but converges more slowly than 200 particles (i.e., within 138 iterations for a swarm size of 50 compared to 10 iterations for a swarm size of 200). Thus, a swarm size of 50 appears to provide the best performance for PSO.
Figure 6. Performance of PSO using different swarm sizes, represented by (a) the convergence profile and (b) the best objective function value.
Figure 7 shows the convergence profiles and the best objective function values achieved using different values of the cognitive parameter for PSO. Figure 7(a) suggests that varying the cognitive parameter results in very little change in convergence speed. Although increasing the cognitive parameter from 1.5 to 2 results in slower convergence, a value of 2.25 leads to faster convergence than a value of 2. Furthermore, comparing cognitive values of 1.5 and 2.25 shows that the latter not only results in faster convergence (i.e., within 15 iterations) but also achieves the lowest objective function value. Figure 7(b) also indicates that there is no specific trend in the best objective function values achieved using different values of the cognitive parameter. Overall, a cognitive value of 2.25 yields the best performance for PSO and was hence selected for the remaining analysis in this study.
Figure 7. Performance of PSO using different values of the cognitive parameter, represented by (a) the convergence profile and (b) the best objective function value.
Figure 8. Performance of PSO using different values of the social parameter, represented by (a) the convergence profile and (b) the best objective function value.

The social parameter is another crucial parameter that affects the performance of PSO. Figure 8(a) and 8(b) show the convergence profiles and the best objective values obtained using different values of the social parameter. Figure 8(a) shows that decreasing the social parameter leads to faster convergence, with PSO converging within only 10 iterations using a social value of 1.5. However, Figure 8(b) suggests that, although a value of 1.5 results in the lowest objective function value, there is no clear pattern of achieving better objective values using higher or lower social values. Overall, a social value of 1.5 appeared to achieve the best performance for PSO.

Lastly, the sensitivity of PSO to the inertia parameter is examined. Figure 9(a) presents the convergence profiles, while Figure 9(b) shows the best objective values obtained over a range of inertia values. The figures indicate that increasing the inertia value from 0.6 to 0.8 has no significant effect on the performance of PSO, resulting in comparable convergence profiles and objective function values. However, a value of 0.9 leads to slower convergence (within 18 iterations, compared to 10 iterations for the other values) but a lower objective value than the other inertia values. These findings suggest that PSO is less sensitive to the choice of the inertia parameter than to the other parameters.
Figure 9. Performance of PSO using different values of the inertia parameter, represented by (a) the convergence profile and (b) the best objective function value.

Bayesian optimization

In this section, the results of the sensitivity analysis of the covariance kernel length scale, the number of initial points, the acquisition function, and the surrogate function are presented.

Covariance kernel length scale

The length scale parameter of the covariance kernel is a critical parameter that determines the relationship between surrogate function values within the domain of the objective function, and it has a significant impact on the performance of BO. In this section, the length scale parameter is investigated using the cross-validation root mean squared error (CV-RMSE) between the objective function values and the mean values predicted by the surrogate model. For the Gaussian process regression (GPR) surrogate model implemented in this study, four different covariance kernels were tested in the sensitivity analysis, namely Matérn (MA), squared exponential (SE), gamma exponential (GE), and rational quadratic (RQ).

Figure 10 illustrates that the accuracy of the surrogate model is greatly influenced by the value of the length scale parameter for all four covariance kernel functions. Although the optimal values differ for each kernel function, they all exhibit comparable behavior. The surrogate function has lower accuracy for both very low and very high values of length scale parameters, compared to mid-range values. The optimal length scale values are 75,000, 7,500, 5,000,000, and 25,000 for MA, SE, GE, and RQ, respectively. These results highlight the importance of conducting sensitivity analysis on the length scale parameter to improve the accuracy of the surrogate function, which in turn affects the performance of BO.
Figure 10. Cross-validation RMSE of the GPR model vs. the length scale of each of the tested BO covariance kernels: (a) Matérn, (b) squared exponential, (c) gamma exponential, and (d) rational quadratic.
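A sketch of this length-scale screening is given below, using scikit-learn's GaussianProcessRegressor with a fixed-length-scale Matérn kernel as an illustrative stand-in for the pyGPGO surrogate; both the substitution and the random stand-in data are assumptions made purely so the sketch runs.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern
from sklearn.model_selection import cross_val_score

# X: sampled injection schedules; y: objective values from MS-WQ runs.
# Random data stands in for the real EPANET-MSX evaluations here.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 100.0, size=(150, 6))
y = rng.uniform(200.0, 400.0, size=150)

for ls in [1e2, 1e3, 1e4, 1e5, 1e6]:
    kernel = Matern(length_scale=ls, length_scale_bounds='fixed', nu=2.5)
    gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
    rmse = -cross_val_score(gpr, X, y, cv=5,
                            scoring='neg_root_mean_squared_error').mean()
    print(f'length scale {ls:.0e}: CV-RMSE = {rmse:.2f}')
```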

Number of initial points

Another crucial parameter that significantly affects the performance of BO is the number of initial points. While increasing the number of initial evaluations can improve the performance of BO, it also increases the computational cost of the optimization. In this section, we investigate the effect of using different numbers of initial points on the convergence of BO. For this purpose, we evaluate how accurately the surrogate function fits the objective function using the cross-validation root mean squared error (CV-RMSE) and the coefficient of determination (R2) between the objective function values and the mean values predicted by the surrogate model. These evaluations are applied to the four covariance kernels: MA, SE, GE, and RQ.

Figure 11 shows that increasing the number of initial evaluations leads to a decrease in CV-RMSE and an increase in R2 values, indicating an improvement in the accuracy of the surrogate function for all four covariance kernel functions. However, the results showed that most of the improvement in accuracy is realized within the first 150 initial points. Therefore, using 150 initial evaluations is a reasonable choice to optimize the computational cost of BO. Additionally, Figure 11 demonstrates that the results obtained using different covariance functions are comparable, suggesting a lower sensitivity of BO to kernel functions compared to the length scale parameter. Therefore, tuning the length scale parameter can result in similar performance when using different covariance kernel functions.
Figure 11. Cross-validation RMSE and R2 of the GPR model vs. the number of initial evaluations used to fit the GPR model using each of the tested BO covariance kernels: (a) Matérn, (b) squared exponential, (c) gamma exponential, and (d) rational quadratic.

Acquisition-covariance combination

To conduct a sensitivity analysis on the two main components of BO, namely the acquisition (AC) and covariance (COV) functions, 16 different AC-COV combinations were tested by coupling 4 different acquisition functions, namely expected improvement (EI), probability of improvement (PI), entropy search (ES), and upper confidence bound (UCB), with 4 covariance kernels (MA, SE, GE, and RQ).

Figure 12(a) shows that different AC-COV combinations converge to widely different objective function values. Most of this variability can be attributed to the choice of the acquisition function rather than the covariance kernel. Generally, EI and PI acquisitions showed the most significant variability with different covariance kernels, while ES and UCB acquisitions showed a more stable performance. The results also showed that UCB consistently achieved lower objective function values than EI, PI, and ES when combined with all four covariance kernels. Figure 12(b) shows the convergence profile of the four best-performing BO methods (i.e., the best AC-COV combinations). The figure confirms that the significant variability in BO's performance is attributed to the choice of the acquisition function. The figure also shows that the UCB-MA combination shows the fastest convergence speed, followed by PI-MA and EI-GE. Taken together, the results of the sensitivity analysis showed that the UCB acquisition function, in conjunction with the MA kernel, provides the best BO performance in terms of both the convergence speed and the final objective function value.
Figure 12. (a) Comparison of the best objective function value achieved by different AC-COV combinations and (b) convergence profiles of the four best-performing AC-COV combinations.

Comparison of three optimization algorithms

After conducting a comprehensive sensitivity analysis for each of the three optimization approaches independently, the performance of the three methods is compared using the best set of parameters obtained for each method.

Convergence profile

Figure 13(a) depicts the convergence profile of each of the three optimization approaches. It is worth noting that in Figure 13(a), the x-axis represents the number of evaluations, where each evaluation involves running a full MS-WQ simulation of the WDS to evaluate the objective function. The figure shows that BO converges after only 43 evaluations of the objective function, while neither GA nor PSO converges within 100 evaluations. In fact, GA and PSO do not achieve convergence until after 324 and 933 evaluations, respectively, requiring numerous evaluations of the computationally expensive MS-WQ simulation model compared to BO.
Figure 13. Comparison between the performance of BO, genetic algorithm (GA), and PSO: (a) the convergence profile and (b) the best objective function value.

Figure 13(b) compares the objective function values achieved by each of the three methods after fully converging to their respective final solutions. Although BO required significantly fewer evaluations of the objective function (43), it achieved the lowest final objective function value of 212.19 $/day. By comparison, GA reached 232.25 $/day after 343 evaluations, and PSO reached 218.7 $/day after 933 evaluations. BO therefore not only converged faster (i.e., using significantly fewer evaluations) but also converged to a lower objective function value than both GA and PSO. GA converged faster than PSO; however, PSO achieved a lower objective function value than GA.

In addition to the number of iterations required by each optimization method to converge to a stable solution, significant differences were also observed in the computation time required by each method. On average, GA required approximately 134.37 s per iteration, substantially longer than PSO and BO. PSO, with an average iteration time of 64.89 s, was considerably faster than GA, but still slower than BO, which demonstrated the highest efficiency in terms of computation time, requiring only 2.06 s per iteration on average. Taken together, these results indicate that BO is significantly more computationally efficient than EAs, especially for the optimization of expensive MS-WQ models. These results demonstrate that BO offers a more efficient and reliable alternative to EAs for scenarios where time and resource constraints are critical (e.g., real-time control of extremely large WDSs). Further enhancements in the convergence speed of BO can also be achieved by exploring other surrogate models. While GPs are standard, other data-driven approaches such as Random Forests or Bayesian Neural Networks can offer significant reductions in the computational cost of training and evaluating the surrogate model, especially for cases where the objective function is high-dimensional.

Sensitivity to constraints

In many cases, violations of the minimum constraint should receive a greater penalty than violations of the maximum constraint. This is especially true for chlorine, where a significant drop in concentration leaves the water vulnerable to microbial contamination, which could have significant public health impacts. Here, we compared the sensitivity of each algorithm to the choice of the penalty cost coefficients ($P_{min}$ and $P_{max}$ in Equation (3)). This was achieved by setting the penalty for violating the minimum chlorine concentration to 0.5 instead of 0.1, while keeping the penalty for violating the maximum concentration unchanged (0.1).

Overall, increasing the penalty for violating the minimum constraint increased the objective function values achieved by the three algorithms from 232, 218, and 212 $/day to 382, 375, and 372 $/day for GA, PSO, and BO, respectively. Although BO still outperformed GA and PSO, the margin was smaller, with BO's final objective value only about 10 $/day lower than GA's and 3 $/day lower than PSO's. These results highlight the sensitivity of each algorithm to the choice of the constraint violation parameters. However, BO consistently achieved the lowest objective function values, demonstrating its robustness and superior performance in optimizing MS-WQ in WDSs. Moreover, the results showed that while all algorithms are affected by the penalty cost coefficients, BO's performance is less sensitive to variability in the optimization constraints than that of GA and PSO.

This study aims to contribute to the development of efficient and effective strategies for maintaining WQ in WDSs while minimizing the overall cost of chlorine dosing. The fundamental task of the research study involves comparing BO-based frameworks against two of the most commonly used evolutionary-based frameworks for WQ management in drinking WDSs. Specifically, this study aims to conduct a systematic comparison between the performance of BO against GA and PSO for the optimization of chlorine dosage scheduling in WDSs.

The results show that GA and PSO display lower sensitivity toward different optimization parameters compared to BO. However, the convergence of BO requires significantly fewer evaluations of the objective function than both GA and PSO. The results reveal that the performance of BO is mainly dependent on the choice of the acquisition function, with little variability observed between the performance of different covariance kernels. The results also highlight the importance of carefully selecting the length scale of the covariance kernel and the number of initial evaluations used to fit the GPR model to ensure the best performance is achieved.

Overall, the presented study can help water utilities better manage the WQ in their drinking WDSs, which is crucial for preserving environmental integrity, economic stability, and social equity.

Funding support from the National Science Foundation under Grant 2015603 is gratefully acknowledged.

All relevant data are included in the paper or its Supplementary Information.

The authors declare there is no conflict.

Abdel-Basset, M., Abdel-Fatah, L. & Sangaiah, A. K. (2018) Metaheuristic Algorithms: A Comprehensive Review. Elsevier Inc.
Abokifa, A. A., Maheshwari, A., Gudi, R. D. & Biswas, P. (2019) Influence of dead-end sections of drinking water distribution networks on optimization of booster chlorination systems, J. Water Resour. Plann. Manage., 145, 4019053.
Ahn, C. W. & Ramakrishna, R. S. (2003) Elitism-based compact genetic algorithms, IEEE Trans. Evol. Comput., 7, 367–385.
Asthana, R. G. S. (2000) Evolutionary Algorithms and Neural Networks.
Ayvaz, M. T. & Kentel, E. (2014) Identification of the best booster station network for a water distribution system, J. Water Resour. Plann. Manage., 141, 04014076.
Candelieri, A., Perego, R. & Archetti, F. (2018) Bayesian optimization of pump operations in water distribution systems, J. Global Optim., 71, 213–235.
Chu, C. W., Der Lin, M. & Tsai, K. T. (2008) Optimal scheduling of booster chlorination with immune algorithm, Third International Conference on Convergence and Hybrid Information Technology, Busan, Korea (South), pp. 1226–1232. doi:10.1109/ICCIT.2008.411.
Cuturi, M. & Fukumizu, K. (2007) Kernels on structured objects through nested histograms, Adv. Neural Inf. Process. Syst., 19, 329–336.
Frazier, P. I. (2018a) A tutorial on Bayesian optimization, arXiv, 1–22.
Frazier, P. I. (2018b) Bayesian optimization, INFORMS TutORials in Operations Research, pp. 9–11.
Gelbart, M. A., Snoek, J. & Adams, R. P. (2014) Bayesian optimization with unknown constraints, Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence (UAI 2014), AUAI Press, pp. 250–259.
Gokce, S. & Ayvaz, T. (2014) A simulation-optimization model for optimal estimation of the locations and chlorine injection rates of the booster stations in water distribution networks, International Conference on Hydroinformatics, CUNY Academic Works, New York, NY, USA.
Kennedy, J. & Eberhart, R. (1995) Particle swarm optimisation, Stud. Comput. Intell., 927, 5–13.
Kumar, M., Husain, M., Upreti, N. & Gupta, D. (2020) Genetic algorithm: Review and application, SSRN Electron. J., 2, 451–454.
Mala-Jetmarova, H., Sultanova, N. & Savic, D. (2017) Lost in optimisation of water distribution systems? A literature review of system operation, Environ. Modell. Software, 93, 209–254.
Moeini, M., Sela, L., Taha, A. F. & Abokifa, A. A. (2023a) Bayesian optimization of booster disinfection scheduling in water distribution networks, Water Res., 242, 120117.
Moeini, M., Sela, L., Taha, A. F. & Abokifa, A. A. (2023b) Optimization techniques for chlorine dosage scheduling in water distribution networks: A comparative analysis, World Environmental and Water Resources Congress 2023, pp. 987–998.
Monteiro, L., Carneiro, J. & Covas, D. I. C. (2020) Modelling chlorine wall decay in a full-scale water supply system, Urban Water J., 17, 754–762.
Munavalli, G. R. & Kumar, M. S. M. (2003) Optimal scheduling of multiple chlorine sources in water distribution systems, J. Water Resour. Plann. Manage., 129, 493–504.
Onyutha, C. & Kwio-Tamale, J. C. (2022) Modelling chlorine residuals in drinking water: A review, Int. J. Environ. Sci. Technol., 19, 11613–11630. https://doi.org/10.1007/s13762-022-03924-3.
Pröll, B. & Werthner, H. (2005) Lecture Notes in Computer Science: Preface.
Quintiliani, C., Di Cristo, C. & Leopardi, A. (2018) Vulnerability assessment to trihalomethane exposure in water distribution systems, Water (Switzerland), 10, 1–15.
Roth, D. K. & Cornwell, D. A. (2018) DBP impacts from increased chlorine residual requirements, J. Am. Water Works Assoc., 110, 13–28.
Seeger, M. (2004) Gaussian processes for machine learning, Int. J. Neural Syst., 14, 69–106.
Shahriari, B., Swersky, K., Wang, Z., Adams, R. P. & De Freitas, N. (2016) Taking the human out of the loop: A review of Bayesian optimization, Proc. IEEE, 104, 148–175.
US EPA (1998) Stage 1 Disinfectants and Disinfection Byproducts Rule (Stage 1 DBPR): Laboratory Quick Reference Guide, 2.
Venter, D. (2003) Particle swarm optimization, Soc. Transit., 34, 197–205.
Wang, H., Guo, W., Xu, J. & Gu, H. (2010) A hybrid PSO for optimizing locations of booster chlorination stations in water distribution systems, 2010 International Conference on Intelligent Computation Technology and Automation (ICICTA 2010), Vol. 1, Changsha, China: IEEE, pp. 126–129.
Wang, D., Tan, D. & Liu, L. (2018) Particle swarm optimization algorithm: An overview, Soft Comput., 22, 387–408. https://doi.org/10.1007/s00500-016-2474-6.
Whitley, D. (1994) A genetic algorithm tutorial, Stat. Comput., 4, 65–85.
WHO (2017) Guidelines for Drinking-Water Quality: First Addendum to the Fourth Edition. Geneva: World Health Organization.
Wilson, A. G. & Adams, R. P. (2013) Gaussian process kernels for pattern discovery and extrapolation, 30th International Conference on Machine Learning (ICML 2013), 28, pp. 2104–2112.
Wu, J., Poloczek, M., Wilson, A. G. & Frazier, P. I. (2017) Bayesian optimization with gradients, Advances in Neural Information Processing Systems 30 (NIPS 2017), pp. 5268–5279.
Yang, S. (2008) Genetic algorithms with memory- and elitism-based immigrants in dynamic environments, Evolutionary Computation, 16(3), 385–416. doi:10.1162/evco.2008.16.3.385.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).