Verified static and dynamic models of an operational works were used alongside Monte-Carlo conditions and the non-dominated sorting genetic algorithm II (NSGAII) to optimise operational regimes. Static models were found to be more suitable for whole water treatment works optimisation modelling and offered the additional advantage of reduced computational burden. Static models were shown to predict solutions of comparable cost when applied to optimisation problems while being faster to simulate than dynamic models.

## ACRONYMS

- CAPEX: capital expenditure
- CSTR: continuously stirred tank reactor
- DAF: dissolved air flotation
- EX: extent
- IE: ɛ-indicator
- GD: generational distance
- GA: genetic algorithm
- HBC: hopper bottomed clarifier
- NN: non-dominated number
- NSGA: non-dominated sorting genetic algorithm
- OPEX: operating expenditure
- RGF: rapid gravity filter
- SM: S-metric
- SC: spacing
- SS: suspended solids
- THM: trihalomethane
- TOC: total organic carbon
- TOTEX: total expenditure
- TN: true number
- UN: unique non-dominated number
- WTW: water treatment works

## INTRODUCTION

The demand for improved water quality is resulting in treatment becoming more rigorous, energy intensive and costly (Plappally & Lienhard 2013). This increase in treatment costs can be illustrated by the specific real costs of energy and chemicals increasing at Oslo's Water Treatment Works (WTW) by approximately 250% between 2000 and 2009 (Venkatesh & Brattebo 2011). Lowering the costs of establishing and operating water works is therefore necessary to help ensure sustainable provision of good quality drinking water in the future. Optimisation of water treatment strives to achieve the water quality demanded while also minimising capital, operational or life costs. This process is essential to ensure that water suppliers remain economical.

To compare different water treatment solutions over their entire lifespan it is necessary to evaluate total expenditure (TOTEX). Annual TOTEX can be estimated by summing the annual operational expenditure (OPEX) and the annualised capital expenditure (CAPEX), the latter based on assumed asset lifespans and interest rates. The CAPEX and OPEX of different treatment methods can be estimated using empirical relationships based on previous projects (Gumerman *et al.* 1979; McGivney & Kawamura 2008; Sharma *et al.* 2013). These estimated costs are traditionally specified by treated volumes independent of quality, with construction considerations such as tank volumes and pump specifications not considered. These relationships can be of use when planning costs, assessing budgets, evaluating options and seeking funding and design services but they have a degree of uncertainty of approximately 30% (Sharma *et al.* 2013). Detailed costing of WTWs is not possible until detailed specifications and designs have been completed. It was not possible to optimise WTWs in terms of TOTEX here due to a lack of appropriate costing formulas which could consider the influence of design on operating performance.
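The annualisation step can be illustrated with a minimal sketch. The text does not state which annualisation formula was intended, so the standard capital recovery factor is assumed here, and the function names and figures are hypothetical.

```python
def capital_recovery_factor(interest_rate: float, lifespan_years: int) -> float:
    """Fraction of an upfront cost repaid each year over the asset lifespan.

    Assumption: the standard capital recovery factor is used; the source
    only says CAPEX is annualised from asset lifespans and interest rates.
    """
    if lifespan_years <= 0:
        raise ValueError("lifespan_years must be positive")
    if interest_rate == 0:
        # With no interest the cost is simply spread evenly over the lifespan.
        return 1.0 / lifespan_years
    growth = (1.0 + interest_rate) ** lifespan_years
    return interest_rate * growth / (growth - 1.0)


def annual_totex(annual_opex: float, capex: float,
                 interest_rate: float, lifespan_years: int) -> float:
    """Annual TOTEX = annual OPEX + annualised CAPEX."""
    return annual_opex + capex * capital_recovery_factor(interest_rate, lifespan_years)


if __name__ == "__main__":
    # Hypothetical figures for illustration only.
    print(round(annual_totex(annual_opex=150_000, capex=2_000_000,
                             interest_rate=0.05, lifespan_years=20)))
```

Because the empirical cost relationships carry an uncertainty of roughly 30%, any value produced this way would be a planning estimate rather than a detailed costing.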

Optimising water treatment is complex as it involves multiple, non-linear relationships between solution parameters that are often constrained and multiple objectives that are often conflicting. It is also important that the varying operating conditions of WTWs (for example, raw water turbidity or temperature) are represented accurately. These challenges can be met using numerical models (which allow the impact of process modifications on final water quality to be predicted); Monte-Carlo methods (which allow the influence of variability to be assessed); and genetic algorithms (GAs) (which have proved effective at solving non-linear problems). In this work, for the first time, operating regimes identified by GAs from performance criteria assessed by static and dynamic WTW models were compared. This work is also novel in the application of whole works optimisation techniques to case study data from an operational works. The models used were calibrated and verified against observed performance and both solids removal and disinfection performance criteria were assessed.

## METHODS

### Site description

### Computational WTW models

The clarification (DAF and HBC), filtration and disinfection processes were all modelled statically and dynamically for comparative purposes. The coagulation and granular activated carbon (GAC) processes, which were only modelled statically, were included so that the influence of varying organic matter concentrations on the solids removal and disinfection models could be assessed.

In the dynamic model, the HBCs were modelled using a similar method to that presented in Head *et al.* (1997). The clarifier was modelled as a series of continuously stirred tank reactors (CSTRs) which may contain a sludge blanket that varies in size and composition depending on the velocity and solids concentration of the water passing through it. In the static model, on the assumptions that the blanket concentration and height remain constant and that flow through the clarifier is plug flow, the removal of solids was modelled as an exponential decay equation. These differences meant that the dynamic model, unlike the static model, would be able to represent the influence of sludge blanket condition, including blanket loss, more accurately for changeable conditions.

Flow through the DAF tank was modelled as plug flow in the static model by an exponential decay equation with the rate of decay dependent on the attachment efficiency of bubbles onto suspended solids (SS) (Edzwald 2006). In the static model, the attachment was assumed to occur only in the initial contact zone. In the dynamic model, mixing was applied using a representative number of CSTRs and the entire tank modelled as a contact zone. The dynamic model would have provided more stable clarified turbidity than the static model due to the degree of mixing that would have been modelled.

The removal of solids by filtration was modelled in the static model using the Bohart & Adams (1920) model. In the dynamic model, the input SS concentration and the superficial velocity were taken as running means over a filtration run. This acted to dampen the response of the output turbidity to fluctuating water quality. Backwashes could also be triggered by head loss or filtered turbidity exceeding maximum limits in the dynamic model. Clean bed head loss was estimated on the assumption of Darcy flow (using the Kozeny–Carman equation) and head loss due to solids accumulation was calculated using a relationship from Adin & Rebhun (1977). The static model did not require head loss to be calculated as unscheduled backwashes were not modelled.

Chlorine decay within the static model was calculated using a first order exponential decay curve. In the dynamic model, a representative number of CSTRs, identified from the contact tank hydraulic efficiency, was used, again allowing a degree of mixing to be represented. An overview of the mechanisms used to model the works is shown in Table 1.
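The difference between the plug flow and CSTR-in-series representations of first order decay can be sketched as follows. This is a steady-state comparison only and is not the Simulink implementation; the function names, decay rate and residence time are hypothetical.

```python
import math


def plug_flow_residual(c0: float, k: float, t: float) -> float:
    """Residual concentration after first order decay under ideal plug flow."""
    return c0 * math.exp(-k * t)


def cstr_series_residual(c0: float, k: float, t: float, n_tanks: int) -> float:
    """Steady-state residual after n equal CSTRs in series.

    t is the total residence time, split equally between the tanks.
    """
    if n_tanks < 1:
        raise ValueError("n_tanks must be at least 1")
    return c0 / (1.0 + k * t / n_tanks) ** n_tanks


if __name__ == "__main__":
    c0, k, t = 1.5, 0.02, 30.0  # mg/l, 1/min, min (illustrative values)
    print("plug flow:", round(plug_flow_residual(c0, k, t), 4))
    for n in (1, 5, 50):
        print(n, "CSTRs:", round(cstr_series_residual(c0, k, t, n), 4))
```

As the number of tanks increases, the CSTR result approaches the plug flow value, which is why the number of CSTRs acts as a measure of the degree of mixing.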

| Process | Parameter | Both models | Dynamic model | Static model |
|---|---|---|---|---|
| General | Water density | Empirical relationship with temperature (Civan 2007) | | |
| General | Dynamic viscosity | Empirical relationship with temperature (Kestin et al. 1978) | | |
| General | Degree of mixing | | Approximation to plug flow proportional to number of continuously stirred tank reactors (CSTRs) in series | Plug flow |
| General | Suspended solids (SS) | SS (mg/l) : turbidity (NTU) ratio 2:1 (WRc 2002; Binnie et al. 2006) | | |
| General | SS removal efficiency parameters | Empirical relationships with reservoir turbidity (Swan 2015) | | |
| Coagulation by ferric or aluminium-based coagulants | SS | Stoichiometric analysis based on assumption that metal ions in coagulants form metal hydroxides which precipitate out of solution (Binnie et al. 2006) | | |
| Coagulation by ferric or aluminium-based coagulants | TOC | TOC adsorption onto coagulant surface using a Langmuir isotherm (Edwards 1997). Dosing model to attain target clarified TOC concentration | | |
| Coagulation by ferric or aluminium-based coagulants | pH | Carbonate chemistry (Stumm & Morgan 1970; Snoeyink & Jenkins 1980) similar to method described in Najm (2001) | | |
| HBC | SS | | Removal by varying density floc blanket (Head et al. 1997) | Exponential decay |
| DAF | SS | Attachment efficiency of flocs onto air bubbles (Edzwald 2006) | Attachment occurs throughout mixed tank (WRc 2002) | Attachment occurs only in contact zone under plug flow (Edzwald 2006) |
| RGF | SS | Adsorption of SS onto filter media (Bohart & Adams 1920; Saatci & Oulman 1980). Filter ripening represented by empirical attachment coefficient (WRc 2002) | Input SS and superficial velocity are taken as running means over a filtration run. Backwashes triggered by duration, head loss or filtered turbidity exceeding set values | Historic conditions have no influence. Backwashes scheduled only |
| RGF | Head loss | | Clean bed head loss assumes Darcy flow (using the Kozeny–Carman equation). Influence of solids accumulation (Adin & Rebhun 1977) | Not calculated |
| GAC | TOC | Typical reduction of 25% of clarified TOC due to filtration and GAC adsorption (Brown et al. 2011) | | |
| Chlorination | Residual free Cl_{2} | Instantaneous demand assumed to be met between dosing and water reaching contact tank. The bulk decay of chlorine in the contact tank is modelled using first order decay rate. An empirical decay rate parameter relationship with initial dose, temperature, TOC and bromide concentration based on Brown (2009) | CSTRs represent degree of mixing occurring | Plug flow assumed |
| Chlorination | Contact time | t_{10}, the time taken for 10% of the concentration of a tracer chemical to be detected at the outlet of the tank after being added at the inlet (Teixeira & Siqueira 2008) | | |
| Chlorination | Trihalomethanes (THM) | Formation of THMs proportional to free chlorine consumption (Clark & Sivaganesan 1998; Hua 2000; Brown et al. 2010) | | |
| Discharge | | Empirical relationship between time since last RGF backwash and treated volumes (Swan 2015) | | |

The models were programmed using Simulink, an extension of MATLAB that provides an interactive graphical environment for modelling time varying systems. Process models were built as modules that were then grouped together to represent the whole WTW. For further details of the models applied, see Swan (2015) and Swan *et al.* (2016).

The models were calibrated using a combination of data collected every 15 minutes by the eScada system and manual monthly measurements during 2011. The models were then verified using data from the first nine months of 2012. Separate calibration and verification data were used so that the models were not replicating conditions previously observed. A data set for the entirety of 2012 was not used due to incomplete data sets for some of the parameters required. Observed *coagulant doses* and a dosing algorithm were used with the process models in separate simulations. The algorithm calculated the required dose to ensure the *clarified total organic carbon (TOC)* did not exceed a specified concentration using Edwards’ (1997) model, which is based on the Langmuir equation.

The root mean square errors (RMSEs) of the models were found to be approximately ±0.3 NTU for *clarified turbidity*; ±0.05 NTU for *filtered turbidity*; ±0.15 mg/l for *residual free chlorine*; and ±5 μg/l for *trihalomethane formation*. This degree of accuracy was acceptable as it was comparable to the tolerances which were allowed between automated and manual readings taken at the observed WTW (±0.25 NTU for *clarified turbidity*; ±0.1 NTU for *filtered turbidity*; and ±0.1 mg/l for *residual free chlorine*).
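For reference, the RMSE statistic quoted above can be computed as in the following minimal sketch. The function name and the sample readings are hypothetical and are not data from the works.

```python
import math
from typing import Sequence


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between paired observed and predicted values."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    # Hypothetical clarified turbidity readings (NTU), for illustration only.
    observed = [0.62, 0.71, 0.80, 0.55]
    predicted = [0.50, 0.90, 0.75, 0.60]
    print(round(rmse(observed, predicted), 3))
```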

The dynamic models were found to be more accurate than the static models. When observed time series input data were applied to the models, the RMSEs of the dynamic model were found to be at least 5% lower for the solids removal models (*HBC* and *DAF clarified* and *rapid gravity filtered turbidity*) and between 1% and 3% lower for the disinfection models (*residual chlorine concentration, CT* and *THM formation*). The mean filtered turbidity and THM formation were also found to be underpredicted by the models. This was taken into consideration in the analysis of the optimisation results. Further details of the accuracy of the models are provided elsewhere (Swan *et al.* 2016).

Correlations between water quality parameters and abstraction rates were not represented. No correlations between abstraction rate and raw turbidity or temperature were found to exist. Possible relationships between TOC or bromide concentration and UV_{254} absorption were not assessed due to a lack of sufficient data. These relationships have been shown to exist elsewhere by Clark *et al.* (2011) and could have been present. Although the lack of representation of correlations between water quality parameters is a potential limitation of the Monte-Carlo approach, the accuracy of the model in predicting failure likelihood was not found to decrease substantially when it was applied. *Coagulant doses* were calculated using a method based on the Edwards (1997) algorithm, dependent on reservoir organics concentration and composition identified stochastically (see Swan *et al.* (2016) for further details).

The likelihood that one or more of the target criteria, given in Table 2, were not achieved at any moment was used as the performance parameter P(failure). The observed P(failure) for 2012 was approximately 0.3. When historical time series input data were applied to the models, P(failure) was predicted to within ±0.15. Applying Monte-Carlo conditions resulted in the error in predicted P(failure) increasing to ±0.20.
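The P(failure) parameter described above can be sketched as the fraction of simulated time steps in which at least one Table 2 criterion is missed. The `Sample` class, field names and example values below are hypothetical; only the four thresholds are taken from Table 2.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Sample:
    """Predicted water quality at one time step (hypothetical structure)."""
    clarified_turbidity_ntu: float
    filtered_turbidity_ntu: float
    ct_mg_min_per_l: float
    thm_ug_per_l: float


def meets_targets(s: Sample) -> bool:
    """True only when every success criterion in Table 2 is satisfied."""
    return (s.clarified_turbidity_ntu < 1.0
            and s.filtered_turbidity_ntu < 0.1
            and s.ct_mg_min_per_l > 60.0
            and s.thm_ug_per_l < 25.0)


def failure_likelihood(samples: Sequence[Sample]) -> float:
    """Fraction of time steps in which one or more criteria were not achieved."""
    if not samples:
        raise ValueError("at least one sample is required")
    failures = sum(1 for s in samples if not meets_targets(s))
    return failures / len(samples)


if __name__ == "__main__":
    series = [
        Sample(0.6, 0.05, 75.0, 18.0),
        Sample(1.2, 0.05, 75.0, 18.0),  # clarified turbidity too high
        Sample(0.7, 0.08, 90.0, 20.0),
        Sample(0.5, 0.04, 80.0, 15.0),
    ]
    print(failure_likelihood(series))
```

Counting a time step as failed when any single criterion is missed follows the definition in the text; weighting criteria differently would be a separate design choice.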

| Parameter | Success criteria |
|---|---|
| Blended clarified turbidity | <1 NTU |
| Filtered turbidity | <0.1 NTU |
| CT | >60 mg·min/l |
| THM | <25 μg/l |

### Operating cost and failure likelihood GA optimisation

A multi-objective optimisation problem was set to minimise the *operating cost* and *failure likelihood* of a WTW. The operating regimes were constrained, as shown in Table 3. The performance of solutions was evaluated over a simulated year with stochastically varying conditions for each generation. Water quality and abstraction rates were sampled independently each simulated day from characteristic probability distributions (see Figures 2–9).

| Parameter | Range | Increments |
|---|---|---|
| Proportion of water treated by DAF stream | 0% to 100% | 1% |
| Target clarified TOC concentration (mg/l) | 1 to 5 | 0.1 |
| DAF compressor pressure (kPa) | 300 to 700 | 10 |
| Filtration run duration (hrs) | 24 to 96 | 1 |
| Contact tank inlet chlorine concentration (mg/l) | 1 to 6 | 0.1 |
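One way to enforce the ranges and increments in Table 3 on a real-coded candidate solution is to clip each value to its range and round it to the nearest permitted increment. The sketch below assumes this approach; the text does not describe how the precision limit was implemented, and the dictionary keys and function names are hypothetical.

```python
from typing import Dict, Tuple

# (lower bound, upper bound, increment) for each decision variable in Table 3.
BOUNDS: Dict[str, Tuple[float, float, float]] = {
    "daf_proportion_pct": (0.0, 100.0, 1.0),
    "target_clarified_toc_mg_l": (1.0, 5.0, 0.1),
    "daf_compressor_pressure_kpa": (300.0, 700.0, 10.0),
    "filtration_run_duration_h": (24.0, 96.0, 1.0),
    "inlet_chlorine_mg_l": (1.0, 6.0, 0.1),
}


def snap(value: float, lower: float, upper: float, step: float) -> float:
    """Clip value to [lower, upper] and round to the nearest increment."""
    clipped = min(max(value, lower), upper)
    n_steps = round((clipped - lower) / step)
    # Final rounding removes floating point noise such as 2.5000000000000004.
    return round(lower + n_steps * step, 10)


def snap_regime(regime: Dict[str, float]) -> Dict[str, float]:
    """Apply the Table 3 constraints to every decision variable."""
    return {name: snap(regime[name], *BOUNDS[name]) for name in BOUNDS}


if __name__ == "__main__":
    candidate = {
        "daf_proportion_pct": 54.6,
        "target_clarified_toc_mg_l": 2.5349,
        "daf_compressor_pressure_kpa": 720.0,
        "filtration_run_duration_h": 47.6,
        "inlet_chlorine_mg_l": 1.62,
    }
    print(snap_regime(candidate))
```

Limiting precision in this way shrinks the search space to a finite grid, which is one reason it can reduce the number of distinct solutions that need to be simulated.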

The design of the works in terms of the numbers of clarification and filtration units, and the volume of the contact tank were the same as observed at the operational site (see Table 4).

| Parameter | Value |
|---|---|
| HBC units | 10 |
| DAF units | 7 |
| RGF units | 8 |
| Contact tank volume | 2,400 m^{3} |

The cost terms are defined as:

- *£*_{total} = total comparative cost (£);
- *£*_{coagulant} = cost of coagulant (£);
- *£*_{DAF} = cost of DAF clarification (£);
- *£*_{backwash} = cost of filter backwashing (£);
- *£*_{sludge} = cost of sludge disposal (£);
- *£*_{Cl2} = cost of chlorination (£);
- *£*_{SBS} = cost of sodium bisulphite (£); and
- *£*_{lime} = cost of lime (£).

Evolutionary algorithms (EAs) have repeatedly proved to be flexible and powerful tools for solving a plethora of water resource problems (Nicklow *et al.* 2010). Over the past 20 to 25 years, research in this field has focused on developing and testing new EAs and applying them to new problems (Maier *et al.* 2014). It has been found that certain EAs work better for certain problems than others but our understanding of why is limited (Maier *et al.* 2014). The choice of an appropriate method and associated parameters is dependent on achieving the best balance between *exploiting* the fittest solutions found so far and *exploring* the unknown. This work contributes towards increasing our understanding of applying GAs (a type of EA) to a real-world context along with the complexities this entailed. A GA was applied alongside a moderately computationally intensive simulation and with uncertainty in operating conditions represented by Monte-Carlo methods (also computationally demanding). To improve the efficiency of the process it was attempted to calibrate the GA's internal parameters and to limit the precision of the solutions.

The optimisation of the multi-objective problem was carried out using an NSGAII method (Deb *et al.* 2002). Real-value coded NSGAII has previously been shown to exhibit good diversity preservation in comparison with some other GAs (Pareto archived evolution strategy (PAES) (Knowles & Corne 1999), strength Pareto evolutionary algorithm (SPEA) (Zitzler & Thiele 1998) and binary coded NSGAII) and to be able to identify Pareto fronts in both constrained and non-constrained problems (Deb *et al.* 2002; Laumanns *et al.* 2002). NSGAII was also found to give the best overall performance in comparison to five other state-of-the-art multi-objective EAs when applied to 12 benchmark problems by Wang *et al.* (2015). Some papers have shown that other GAs (usually created by the paper's authors) can outperform NSGAII using a range of benchmark test problems and performance parameters. These other GAs include: FastPGA (Eskandari *et al.* 2007); EMOPSO (Toscano-Pulido *et al.* 2007); MOCell, OMOPSO and AbYSS (Nebro *et al.* 2008); SMPSO (Durillo *et al.* 2010); and SMPSO, ɛMOEA and EMOACO-I (Mortazavi-Naeini *et al.* 2015). Despite NSGAII being outperformed in these cases, it continues to be used as a well-established benchmark for newly developed methods in computationally intensive problems. This is due to its common usage, established performance and availability of code (Mortazavi-Naeini *et al.* 2015). It is possible that another GA could have been more efficient in identifying near-optimal solutions to the problem posed but NSGAII was deemed a suitable algorithm for proof of concept that GAs could be used to optimise WTW operation and design.
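To give a sense of what a cross-over distribution index (η_{c}) controls in a real-coded GA, the sketch below shows simulated binary cross-over (SBX), the operator commonly paired with real-coded NSGAII. It is an assumption that SBX was the operator used here, since the text does not name it. The unbounded form is shown, so handling of variable bounds is omitted, and the function names are hypothetical.

```python
import random
from typing import Tuple


def sbx_spread_factor(u: float, eta_c: float) -> float:
    """Spread factor beta for a uniform random number u in [0, 1)."""
    if u <= 0.5:
        return (2.0 * u) ** (1.0 / (eta_c + 1.0))
    return (1.0 / (2.0 * (1.0 - u))) ** (1.0 / (eta_c + 1.0))


def sbx_pair(p1: float, p2: float, eta_c: float,
             rng: random.Random) -> Tuple[float, float]:
    """Create two children from two parent values (unbounded SBX)."""
    beta = sbx_spread_factor(rng.random(), eta_c)
    c1 = 0.5 * ((1.0 + beta) * p1 + (1.0 - beta) * p2)
    c2 = 0.5 * ((1.0 - beta) * p1 + (1.0 + beta) * p2)
    return c1, c2


if __name__ == "__main__":
    rng = random.Random(42)
    for eta_c in (10, 30):
        children = [sbx_pair(2.0, 4.0, eta_c, rng) for _ in range(3)]
        print(eta_c, [(round(a, 3), round(b, 3)) for a, b in children])
```

A larger index keeps the spread factor closer to 1, so children stay nearer their parents (more exploitation), while a smaller index allows children further from their parents (more exploration).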

To identify suitable internal parameters for the NSGAII algorithm, preliminary optimisations were carried out over an arbitrary 12-hour period using a control set of parameters (Table 5) and alternative runs where individual parameters were adjusted. The values selected for the preliminary trial were based on values used in previous literature (Nazemi *et al.* 2006; Sarkar & Modak 2006; Tang *et al.* 2006; Jain *et al.* 2007; Sharifi 2009). A complete cross-comparison between the parameters was not completed due to the prohibitive computational demands of achieving this. The final generation of solutions identified by the GAs was used to assess the effectiveness of the optimisations. Comparisons of solutions generated from multi-objective problems should evaluate: (i) distance of the obtained Pareto front from the true Pareto front; (ii) uniformity of distribution of solutions in the Pareto front; and (iii) the extent of the obtained Pareto front to ensure that a wide range of objective values is covered (Zitzler *et al.* 2000). As no single metric completely measures algorithm performance, eight metrics, as suggested by Mala-Jetmarova *et al.* (2015), were used to measure the quality of the solutions identified and their similarity and proximity to the true Pareto front. An overall score was calculated for each optimisation with uniform weighting for each metric. Non-uniform weighting, as applied in Mala-Jetmarova *et al.* (2015), was not used as it adds unnecessary subjectivity.

| Operating cost optimisation | Metric | Control | η_{c}=30 | η_{c}=10 | η_{m}=30 | η_{m}=10 | P_{c}=0.9 | P_{c}=0.5 | P_{m}=0.15 | P_{m}=0.05 | pop=50 | pop=10 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Dynamic model | NN | 100% | 100% | 97% | 54% | 97% | 80% | 97% | 100% | 90% | 92% | 100% |
| Dynamic model | UN | 34% | 27% | 24% | 34% | 30% | 44% | 17% | 17% | 67% | 14% | 13% |
| Dynamic model | TN | 0% | 0% | 83% | 0% | 0% | 50% | 0% | 0% | 90% | 0% | 0% |
| Dynamic model | GD* | 24.5 | 7.5 | 0.2 | 3.3 | 24.4 | 4.7 | 15.1 | 44.8 | 0 | 44.4 | 1.1 |
| Dynamic model | IE | 2.0 | 2.1 | 1.1 | 2.1 | 2.0 | 1.3 | 2.2 | 1.4 | 2.0 | 1.7 | 4.2 |
| Dynamic model | SM* | £156 | £154 | £150 | £150 | £151 | £152 | £152 | £152 | £151 | £155 | £154 |
| Dynamic model | EX | 140% | 100% | 91% | 89% | 140% | 99% | 107% | 139% | 86% | 124% | 0% |
| Dynamic model | SC* | 7.0 | 1.6 | 0.3 | 0.1 | 7.4 | 1.2 | 20.5 | 7.0 | 0.0 | 5.7 | NaN |
| Dynamic model | Score | 66% | 68% | 83% | 64% | 65% | 78% | 53% | 61% | 85% | 57% | 36% |
| Static model | NN | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% | 100% |
| Static model | UN | 77% | 64% | 93% | 70% | 73% | 17% | 77% | 77% | 77% | 32% | 90% |
| Static model | TN | 0% | 0% | 10% | 0% | 27% | 0% | 27% | 57% | 20% | 2% | 0% |
| Static model | GD* | 47.0 | 48.6 | 86.4 | 115.3 | 27.1 | 71.2 | 3.0 | 10.2 | 30.7 | 37.4 | 2.3 |
| Static model | IE | 1.7 | 1.2 | 1.1 | 1.7 | 1.6 | 1.7 | 1.1 | 1.0 | 1.7 | 2.0 | 4.2 |
| Static model | SM* | £141 | £147 | £146 | £145 | £143 | £151 | £142 | £140 | £147 | £142 | £140 |
| Static model | EX | 80% | 122% | 81% | 187% | 91% | 166% | 83% | 100% | 116% | 80% | 65% |
| Static model | SC* | 2.5 | 10.5 | 2.7 | 19.8 | 1.0 | 30.6 | 9.7 | 16.1 | 10.5 | 6.4 | 5.3 |
| Static model | Score | 69% | 72% | 71% | 68% | 77% | 59% | 77% | 80% | 76% | 63% | 63% |
| Mean of scores ± standard deviation | | 68±2% | 70±3% | 77±8% | 66±3% | 71±9% | 69±13% | 65±17% | 70±14% | 80±6% | 60±4% | 50±19% |

Values marked * are ×10^{3}. The performance metrics used and the scoring method applied are defined as follows:

**Known Pareto front (PF_{known}):** final Pareto front returned at termination, for the particular parameter setting combination.

**True Pareto front (PF_{true}):** best possible Pareto front (often not known for complex problems). Formed here out of all of the solutions identified using all the parameter setting combinations.

**Non-dominated number (NN):** the percentage of non-dominated solutions in PF_{known}.

**Unique non-dominated number (UN):** percentage of unique non-dominated solutions in PF_{known}.

**True number (TN):** percentage of solutions in PF_{known}, which are members of PF_{true}.

**Generational distance (GD):** measure of how close PF_{known} is to the PF_{true}. Calculated as RMSE of Euclidean distance between all solutions in PF_{known} and the nearest solution in PF_{true}. GD = 0 indicates that PF_{known} = PF_{true}.

**ɛ-indicator (IE):** ‘the smallest distance that an approximation set (PF_{known}) must be translated in order to completely dominate a reference set (PF_{true})’ (Kollat *et al.* 2008). Factor by which PF_{known} is worse than PF_{true} with respect to all objectives. The minimum factor such that any objective vector in PF_{known} is dominated by at least one objective vector in PF_{true} (Zitzler *et al.* 2003). The IE metric takes values greater than or equal to 1. A result of IE = 1 indicates that PF_{known} = PF_{true}.

**S-Metric (SM):** the area covered by the PF_{known} from the worst possible solution specified.

**Extent (EX):** ratio of Euclidean distance between the objective function values of two outer solutions in PF_{known} to Euclidean distance between the objective function values of two outer solutions in PF_{true} (expressed as percentage).

**Spacing (SC):** represents the spread of solutions in PF_{known}. It is calculated using Equation (2), whose terms are *ɛ*_{i}, the Euclidean distance between the i^{th} solution and its closest neighbour in PF_{known}, and the mean of all *ɛ*_{i}.

Score calculated as mean value of all metric scores where GD scored 0% for the maximum value and 100% for a value of zero; IE scored 0% for the maximum value and 100% for a value of 1; SM scored 0% for a value of 0 and 100% for a maximum value (P(failure) = 1, operating cost = £200,000) and SC scored 0% for the maximum value and 100% for a value of 0.
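Two of the metrics defined above can be sketched in code. Generational distance follows the definition given in the text (RMSE of nearest-neighbour distances to PF_{true}). For spacing, Equation (2) is not reproduced in this copy, so the sketch assumes the commonly used form: the sample standard deviation of the nearest-neighbour distances ɛ_{i} about their mean. It is also assumed that objective values have already been scaled to comparable ranges; the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two objective vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def generational_distance(pf_known: List[Point], pf_true: List[Point]) -> float:
    """RMSE of the distance from each known solution to its nearest true solution."""
    nearest = [min(euclidean(s, t) for t in pf_true) for s in pf_known]
    return math.sqrt(sum(d * d for d in nearest) / len(nearest))


def spacing(pf_known: List[Point]) -> float:
    """Spread of nearest-neighbour distances (assumed form of Equation (2))."""
    n = len(pf_known)
    if n < 2:
        # A single solution has no neighbour, so the metric is undefined.
        return math.nan
    eps = [
        min(euclidean(s, other) for j, other in enumerate(pf_known) if j != i)
        for i, s in enumerate(pf_known)
    ]
    mean_eps = sum(eps) / n
    return math.sqrt(sum((mean_eps - e) ** 2 for e in eps) / (n - 1))


if __name__ == "__main__":
    true_front = [(0.0, 3.0), (1.0, 2.0), (2.0, 1.0), (3.0, 0.0)]
    known_front = [(0.5, 3.0), (1.0, 2.5), (3.0, 0.5)]
    print(round(generational_distance(known_front, true_front), 3))
    print(round(spacing(known_front), 3))
```

A spacing of zero indicates evenly distributed solutions, and a generational distance of zero indicates that every known solution lies on the true front.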

Through examination of the sensitivity analysis results, no clearly optimal set of parameters was identified but conclusions were drawn regarding some of the parameters (see Table 5). A *mutation probability (P_{m})* of 0.05 was found to improve the meta score of the optimisations substantially. The optimisations' performance score proved to be relatively insensitive to *mutation distribution index (η_{m})*. Based on the results of the sensitivity analysis, the GA internal parameters finally applied are shown in Table 6. The suitability of using a hundred generations was assessed by examining the influence of simulating an additional hundred generations on the performance of the GA (see Results and Discussion sections). A *cross-over distribution index (η_{c})* of 30 was also applied based on the performance of another optimisation process which was carried out at the same time (see Swan (2015)). Although individually tailored NSGAII parameters for each optimisation may have increased efficiency, consistent values were used so that the influence of model type on the process could be assessed more clearly.

| | η_{c} | η_{m} | P_{c} | P_{m} | pop | Generations |
|---|---|---|---|---|---|---|
| Control | 20 | 20 | 0.7 | 0.1 | 30 | n/a |
| Final | 10 | 20 | 0.7 | 0.05 | 30 | 100 |

*pop* = population; *P*_{c} = probability of cross-over; *η*_{c} = cross-over distribution index; *P*_{m} = probability of mutation; *η*_{m} = mutation distribution index.

To make the search for near-optimal solutions more thorough, and to reduce the influence of possible premature convergence, the optimisation was carried out three times using different initial random seeding. The loss of Pareto solutions, a known deficiency of the NSGAII process, was addressed through the compilation of a secondary population of all parent solutions identified through each optimisation. A non-dominated sorting algorithm was then applied to these solutions to compile a new super Pareto set as previously applied by Wang *et al.* (2015) to identify best-known Pareto fronts to benchmarking problems.
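The compilation of a super Pareto set from several runs can be sketched as a non-dominated filter applied to the merged solutions. This is a simple pairwise filter rather than the full non-dominated sorting used inside NSGAII, both objectives are assumed to be minimised, and the function names and example values are hypothetical.

```python
from typing import List, Sequence, Tuple

Objectives = Tuple[float, float]  # (operating cost, failure likelihood)


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every objective and better in at least one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated(solutions: List[Objectives]) -> List[Objectives]:
    """Return the unique solutions that no other solution dominates."""
    unique = list(dict.fromkeys(solutions))  # drop duplicates, keep order
    return [s for s in unique if not any(dominates(o, s) for o in unique)]


def super_pareto_set(*runs: List[Objectives]) -> List[Objectives]:
    """Merge the solutions from several optimisation runs and filter them."""
    merged = [s for run in runs for s in run]
    return non_dominated(merged)


if __name__ == "__main__":
    # Hypothetical (cost in thousand pounds, P(failure)) pairs from three seeds.
    run_a = [(100.0, 0.30), (120.0, 0.20)]
    run_b = [(110.0, 0.25), (130.0, 0.25)]
    run_c = [(100.0, 0.30), (90.0, 0.40)]
    print(sorted(super_pareto_set(run_a, run_b, run_c)))
```

Keeping every parent solution from every run before filtering means that a Pareto solution discarded in a later generation can still appear in the final set.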

The University of Birmingham's BlueBEAR high performance computing (HPC) cluster was used to complete the optimisations. Optimisations were carried out using multiple 48-hour sessions on a single core of a 64-bit 2.2 GHz Intel Sandy Bridge E5-2660 worker with 32 GB of memory. Simulating and evaluating a generation of solutions (up to 60 solutions) took approximately 1 hour using the dynamic model. The static model, in comparison, took approximately 20 minutes. The time spent evaluating solutions using the NSGAII algorithm was insignificant in comparison to the time spent simulating WTW performance.
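The timings above imply a rough run time for a single 100-generation optimisation, which the following sketch estimates. It ignores queueing, restart overheads and the time spent in the GA itself, and the function name is hypothetical.

```python
import math
from typing import Tuple


def sessions_needed(generations: int, minutes_per_generation: float,
                    session_hours: float = 48.0) -> Tuple[float, int]:
    """Return (total simulation hours, number of sessions required)."""
    total_hours = generations * minutes_per_generation / 60.0
    return total_hours, math.ceil(total_hours / session_hours)


if __name__ == "__main__":
    for label, minutes in (("dynamic", 60.0), ("static", 20.0)):
        hours, sessions = sessions_needed(100, minutes)
        print(f"{label}: about {hours:.0f} h, {sessions} session(s) per seed")
```

On these figures a dynamic-model run needs about 100 hours (three 48-hour sessions) against about 33 hours (one session) for the static model, and the totals triple when three random seeds are used.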

## RESULTS

### Degree of optimisation achieved

The degree of optimisation achieved by the GA was assessed by observing the variance of four optimisation metrics. These metrics assessed how the objective functions, non-dominated fraction and convergence of the solution population varied generationally. Greater optimisation was assumed if these metrics were found to stabilise, indicating that the solution set was not evolving significantly towards fitter solutions. To give greater confidence in the degree of optimisation achieved after an initial hundred generations, an additional hundred generations were simulated for comparison. Based on visual assessment of the optimisation metrics (*convergence metric, mean cost function, mean failure likelihood* and *proportion of Pareto solutions*), no improvement in optimisation results was observed for either optimisation problem when the number of generations was increased from 100 to 200.
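Stabilisation was judged visually in this work. As a numerical complement, a simple check could compare the mean of a metric over the most recent generations with its mean over the preceding generations. The sketch below is illustrative only: the window length, tolerance and function name are assumptions rather than values used in the study.

```python
from statistics import mean
from typing import Sequence


def has_stabilised(history: Sequence[float], window: int = 20, tol: float = 0.01) -> bool:
    """Return True when the mean of the last `window` generations differs from
    the mean of the preceding `window` generations by at most `tol` (relative)."""
    if len(history) < 2 * window:
        return False  # not enough generations to compare two full windows
    recent = mean(history[-window:])
    previous = mean(history[-2 * window:-window])
    scale = max(abs(previous), 1e-12)  # guard against division by zero
    return abs(recent - previous) / scale <= tol


# Hypothetical per-generation values of one metric (e.g. mean cost function).
flat = [50.0] * 40
improving = [100.0 - g for g in range(40)]

flat_stable = has_stabilised(flat)
improving_stable = has_stabilised(improving)
```

The same check could be applied separately to each of the four metrics, with stabilisation assumed only when all of them pass.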

The application of dynamic or static models was not found to consistently identify more optimistic or conservative solutions to the optimisation problem. The relative costs of the solutions depended on their failure likelihoods. An overview of the optimal values identified in comparison to the currently applied values is given in Table 7.

| Parameter | Currently applied | Static model optimal value | Dynamic model optimal value |
|---|---|---|---|
| Operating regime optimisation | | | |
| Water treated by DAF stream | 55% | 55% to 100% | 85% to 100% |
| Target clarified TOC concentration | 2.5 mg/l (estimated) | 4.6 mg/l to 5.0 mg/l | 4.8 mg/l to 5.0 mg/l |
| DAF compressor pressure | 400 kPa | 400 kPa to 550 kPa | 510 kPa to 700 kPa |
| Filtration run duration | 48 hrs | 96 hrs | 89 hrs to 96 hrs |
| Contact tank inlet free chlorine concentration | 1.6 mg/l | 1.3 mg/l to 1.5 mg/l | 1.3 mg/l to 1.8 mg/l |

### Coagulation

*Target clarified TOC concentrations* of between 4 and 5 mg/l were identified as being optimal using both models (approximately double the concentration currently predicted at the operational site) regardless of the solutions' reliabilities. The *target TOC concentrations* predicted using both models were similar, with their Pareto optimal solutions both having mean values of 4.9 mg/l and standard deviations of 0.2 mg/l. The higher *target clarified TOC concentrations* resulted in lower *coagulant doses* and subsequently reduced: (i) coagulant; (ii) pH/alkalinity adjusting chemical; and (iii) sludge disposal costs. Lower solids loading of the clarification and filtration stages was also achieved. The findings suggest that the historically greater use of coagulant at the site was inefficient and potentially necessary only due to known mixing issues at the site. Higher *TOC concentrations* would, however, likely result in increased *THM formation* (which was seen to be underpredicted by the model) and biological growth in the distribution system.

### Filtration

The *filtration run lengths* identified in the operational cost optimisation were found to be in the region of the maximum value of 96 hours (Figure 12), with low standard deviations and negligible correlation with failure likelihood (dynamic model 95.3 ± 2.7 hours, static model 95.1 ± 3.7 hours). The models therefore predicted that *filtration run durations* could be increased significantly beyond their existing operational duration of 48 hours, without increasing the failure likelihood of the works substantially. As solutions identified using the static model predicted these extended durations, frequent unscheduled backwashes were not required to achieve this performance and therefore disruption to operational routine was predicted to be minimal.

### Chlorination

The *inlet free chlorine concentration* identified as optimal reduced as the *failure likelihood* increased. This relationship was comparable for both models. Solutions with *failure likelihoods* less than 40% were found to require more than 1 mg/l of *free chlorine*, and the maximum dose identified using the dynamic model was 1.8 mg/l in comparison to 1.5 mg/l using the static model. These results indicate that, for the observed operating conditions, the existing inlet concentration of 1.6 mg/l is appropriate to provide the required degree of disinfection cost effectively without frequently exceeding the final water *THM concentration* limit set.

## DISCUSSION

The failure likelihood of the solutions was unconstrained and most Pareto solutions identified had *failure likelihoods* greater than 50%. As reliable solutions are of greater interest, the use of some mechanism to limit the *failure likelihood* could have resulted in more efficient use of computational resources, although premature convergence would have been a concern. Not constraining the *failure likelihood* of solutions also resulted in the near-optimal solutions identified by the static and dynamic models being difficult to compare, as they inhabit different regions of the search space. The use of constrained or pseudo-constrained acceptable *failure likelihoods*, as carried out by Gupta & Shrivastava (2006, 2008, 2010), would have allowed easier comparison of solutions identified using the different models.
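One way in which a failure likelihood limit could be introduced into the comparison of solutions is the constrained-domination rule commonly paired with NSGAII, sketched below. This is an illustration of a general technique and is not necessarily the approach of the authors cited above. The 5% limit, function names and example values are assumptions.

```python
from typing import Tuple

# Assumed objective layout: (cost, failure_likelihood), both minimised.
Solution = Tuple[float, float]


def violation(s: Solution, max_failure: float) -> float:
    """Amount by which the failure likelihood exceeds the allowed limit."""
    return max(0.0, s[1] - max_failure)


def constrained_dominates(a: Solution, b: Solution, max_failure: float = 0.05) -> bool:
    """Constrained-domination: feasible beats infeasible, smaller violation
    beats larger violation, and feasible pairs use ordinary Pareto dominance."""
    va, vb = violation(a, max_failure), violation(b, max_failure)
    if va == 0.0 and vb > 0.0:
        return True
    if va > 0.0 and vb == 0.0:
        return False
    if va > 0.0 and vb > 0.0:
        return va < vb
    no_worse = a[0] <= b[0] and a[1] <= b[1]
    better = a[0] < b[0] or a[1] < b[1]
    return no_worse and better


# Hypothetical solutions: a cheap but unreliable regime and a costlier reliable one.
cheap_unreliable = (80.0, 0.60)
costly_reliable = (120.0, 0.04)
reliable_wins = constrained_dominates(costly_reliable, cheap_unreliable)
```

Under this rule the search is steered towards the reliable region of the search space, although the concern regarding premature convergence noted above would still apply.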

Constraining the precision of solutions (using the increments allowable in Table 3) and simulating only unique solutions each generation improved the efficiency of the search process. Solutions identified in previous generations did, however, require their failure likelihood to be reassessed each generation. This was necessary because of the variance in conditions between runs (found to result in approximately a 5% variance in *failure likelihood*). This continual assessment of failure likelihood did have the advantage that, over multiple generations, the solution population was assessed against an increasingly diverse set of conditions, resulting in a more robust population evolving. If the sampling of the conditions was increased so that the variance in performance of the model was negligible between runs, then it could be possible that only newly identified solutions would need their failure likelihood evaluated. For computationally demanding models this could improve the reliability of results (as a greater combination of conditions could be assessed) and possibly reduce the computational demand (as individual solutions would only be assessed once). Further research is required to examine the potential of this.
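The precision constraint, the removal of duplicate solutions and the Monte-Carlo estimate of failure likelihood can be sketched as follows. The increments, lower bounds and the toy failure rule are hypothetical stand-ins (the increments of Table 3 and the WTW simulation are not reproduced here), so the sketch shows the mechanism only.

```python
import random
from typing import Callable, List, Sequence, Tuple

Candidate = Tuple[float, ...]


def snap(values: Sequence[float], increments: Sequence[float],
         lower: Sequence[float]) -> Candidate:
    """Round each parameter to the nearest allowable increment above its lower bound."""
    return tuple(
        round(lo + round((v - lo) / inc) * inc, 10)
        for v, inc, lo in zip(values, increments, lower)
    )


def unique_candidates(population: Sequence[Sequence[float]],
                      increments: Sequence[float],
                      lower: Sequence[float]) -> List[Candidate]:
    """Snap all candidates and drop duplicates, keeping first-seen order."""
    return list(dict.fromkeys(snap(p, increments, lower) for p in population))


def failure_likelihood(candidate: Candidate,
                       fails: Callable[[Candidate, float], bool],
                       n_conditions: int, rng: random.Random) -> float:
    """Fraction of randomly sampled conditions under which the candidate fails."""
    failures = sum(fails(candidate, rng.random()) for _ in range(n_conditions))
    return failures / n_conditions


def toy_fails(candidate: Candidate, condition: float) -> bool:
    """Hypothetical failure rule standing in for a WTW simulation."""
    return condition < max(0.0, 1.0 - candidate[1] / 2.0)


# Hypothetical candidates: (filtration run duration in hours, inlet chlorine in mg/l).
population = [(95.7, 1.42), (96.2, 1.38), (48.0, 1.61)]
unique = unique_candidates(population, increments=(1.0, 0.1), lower=(24.0, 0.5))

# Re-estimating with a different seed mimics the reassessment in each generation.
estimates = [failure_likelihood(unique[0], toy_fails, 200, random.Random(seed))
             for seed in (1, 2, 3)]
```

Here the first two candidates collapse to the same constrained solution, so only two simulations would be needed. The spread of `estimates` across seeds corresponds to the between-run variance discussed above, which shrinks as `n_conditions` is increased.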

Although the GA process identified contact tank *inlet free chlorine concentrations* similar to those applied in reality, in future it would be more useful to optimise contact tank *outlet concentrations.* This is because in practice residual free chlorine concentration is closely controlled by feedback control systems. The influence of coagulant dosing on the consumption/cost of chlorination could then be optimised and the formation of disinfection by-products could be predicted more accurately.

A relatively high *target clarified TOC concentration* (approximately 5 mg/l) was identified as being optimal due to the lower doses of coagulant required. Although this was predicted not to result in excessive free chlorine consumption or disinfection by-product formation, application of this operating regime may not be suitable, as insufficient destabilisation of colloids or excessive organic growth in the distribution network could result. Longer duration filtration runs were also identified as being preferable. This agrees with the observed performance, where excessive *head loss* or *turbidity* breakthrough was rarely observed at the WTW. As the identified optimal *filtration duration* (96 hours) was considerably outside the calibration conditions observed, limited confidence should be placed in this estimate, but it is believed that the application of longer filtration runs would have been more efficient at the examined site.

The recommendations from this research have not been applied to the WTW from which the case study data were taken. Attempting to apply the amendments to the operating regime suggested by the optimisations through pilot plant or full-scale investigations would be informative future research.

## CONCLUSIONS

Static models were found to have similar accuracy to dynamic models and their use alongside GAs predicted similar solutions to an operational optimisation problem. The application of dynamic or static models was not found to consistently identify more economical or costlier solutions. The use of static models reduced the computational requirements of carrying out optimisations (the optimisations using the dynamic models were found to take five times the computational resources of the static models), allowing a greater number of operating conditions to be considered and/or generations to be simulated. Static models also had no requirement for the sampling frequency of operating condition parameters to be defined. Based on these findings, it is concluded that future whole WTW modelling optimisation studies should favour the use of static models.

The constraining of the precision of solution parameter values and simulation of only unique solutions was identified as a method of increasing the optimisation efficiency. Increasing the number of stochastic conditions which are simulated so that the variance in performance between runs using alternative seeds is insignificant could allow unique solutions to only require a single evaluation for all generations. This method should be considered for future Monte-Carlo optimisation studies. Future comparisons of failure/cost optimisations using different model types should also consider limiting the failure likelihood to allow easier comparison of results.

In comparison to the observed operating conditions at the WTW from which the case study came, the following predictions were made by the optimisations to comply with the performance goals specified more than 95% of the time:

- It should be possible to reduce the coagulant dose applied while still achieving sufficient treatment. This reduction in coagulant dosing could only be made if sufficient mixing was achieved at the site and the influence on distribution network organic growth was assessed to be tolerable.

- *Filtration run durations* could be increased significantly beyond their existing value of 48 hours.

Finally, effective future CAPEX and TOTEX optimisation work will benefit greatly if costing formulas for WTWs which can be linked to predicted performance are developed.

## ACKNOWLEDGEMENTS

This research was made possible by funding provided by the University of Birmingham's Postgraduate Teaching Assistantship programme.