## Abstract

This paper considers the problem of robust optimization, and presents the technique called Robust Optimization and Probabilistic Analysis of Robustness (ROPAR). It has been developed for finding robust optimum solutions of a particular class in model-based multi-objective optimization (MOO) problems (i.e. when the objective function is not known analytically), where some of the parameters or inputs to this model are assumed to be uncertain. A Monte Carlo simulation framework is used. It can be straightforwardly implemented in a distributed computing environment which allows the results to be obtained relatively fast. The technique is exemplified in the two case studies: (a) a benchmark problem commonly used to test MOO algorithms (a version of the ZDT1 function), and (b) a design problem of a simple storm drainage system, where the uncertainty is associated with design rainfall events. It is shown that the design found by ROPAR can adequately cope with these uncertainties. The approach can be useful for assisting in a wide range of risk-based decisions.

## INTRODUCTION

The main focus of this research is on robust multi-objective optimization (RMOO) and the introduction of a new algorithm, with the ultimate aim of testing and using it in water-related problems.

### Previous work

#### Robust multi-objective optimization in water-related systems

The water sector started to use mathematical optimization algorithms in the 1960s (Karmeli *et al.* 1968; Schaake & Lai 1969). Nicklow *et al.* (2009) devoted a paper to review this topic, particularly focusing on the use of genetic algorithms. However, although for many years in typical problem settings the issue of uncertainty was not considered in multi-objective optimization (MOO) of water systems, in the last 10–15 years this aspect has started to be given attention. This type of optimization is often (albeit not always) referred to as robust optimization. The notion of robustness is treated in different studies in various ways and is determined by the particular needs of a case study or the authors' preferences.

Literature sources present various definitions of robustness (Jin & Sendhoff 2003; Gunawan & Azarm 2005; Savic 2005; Deb & Gupta 2006; Beyer & Sendhoff 2007; Gaspar-Cunha & Covas 2008; Petrone *et al.* 2011; Erfani & Utyuzhnikov 2012). For the purpose of this study, the following definition is adopted, based on the sources just cited: *Robust optimization of multiple objectives is an optimization method modified in such a way that it generates a solution or a set of solutions that have minimum (or limited) variability of the objective functions when some elements or parameters of the modelled real system vary due to their uncertainty*.

Some of the first works applying RMOO methods were in the field of water distribution systems. In these works (Babayan *et al.* 2004; Kapelan *et al.* 2005, 2006; Savic 2005, 2006; Odan *et al.* 2015), the robustness of the identified solutions was achieved by including a measure of robustness as one of the objective functions, which depends on the type of problem being solved. One example of such a specific objective was used by Kapelan *et al.* (2006): measuring the probability that at the same time the heads of all the nodes in the network comply with at least the minimum head required for each node. In contrast, in the approach proposed in this paper, the robustness is reached not by requiring an objective function measuring the robustness depending on the problem being solved, but by using criteria of general applicability.

Other research in the water area has dealt with the optimum design of urban drainage networks and wastewater systems. All these works use the smoothing of the objective function as the method to accommodate uncertainty. (This process of smoothing is explained in more detail in the next section.) Andino-Santizo (2012) used two methods to find the robust optimum design of a drainage network considering uncertainty in the population growth. Kang & Lansey (2012), Kebede (2014), Vojinovic *et al.* (2014), and Yazdi *et al.* (2014) used a robust optimum design for water, storm- and wastewater infrastructures considering as uncertain the following parameters, respectively: amount of water; climate change, urbanization, population growth, and pipe aging; rainfall; and Manning roughness. Galindo-Calderon *et al.* (2015) found the robust optimum type and location of the Best Management Practices for a storm drainage network.

Yet another approach of RMOO used for water-related systems follows the notion of deep uncertainty. Walker *et al.* (2013) classified the uncertainty in two categories: first, uncertainties that can be statistically analysed; and second, uncertainties that cannot be statistically analysed due to the unforeseen future (named deep uncertainty). Beh *et al.* (2015) solved the problem of the optimal sequencing of additional water supply sources over a 40-year planning horizon. Mortazavi-Naeini *et al.* (2015) optimized planning and operation of bulk water systems foreseeing conditions of extreme drought. Watson & Kasprzyk (2017) optimized water allocation from multiple market-based supply instruments. In our research, uncertainty is assumed to be such that it allows for probabilistic description and analysis.

In the reviewed literature, the problem formulations used do indeed take uncertainty into account, but it can be seen that uncertainty is in a way ‘hidden’ and cannot be directly estimated. The problem of robustness is formulated by adding a specific metric of robustness depending on the considered problem type, and therefore varies in different case studies; besides, the process of probabilistic uncertainty propagation to the final solutions is typically not explicit either. This prompted the development of a new algorithm for robust optimization with a general applicability. The novelty of this study also relates to addressing features not fully considered by the approaches previously mentioned. First, it is a method where the robustness is not defined by a metric of robustness depending on the type of problem being optimized. Second, this method does not change an objective function, e.g. by its smoothing, employed earlier. Third, the method characterizes the uncertainty explicitly, using probability density functions.

#### Methods of robust multi-objective optimization

MOO is a mature technology, and RMOO uses MOO as the basic method. In MOO, when the objective function is not known analytically, the most popular approach is to use a randomized search, and the most widely used group of methods is evolutionary (genetic) optimization (Deb *et al.* 2002, 2005, 2009; Kollat & Reed 2006; Kukkonen & Deb 2006; Vrugt *et al.* 2009; Tiwari *et al.* 2011; Hadka & Reed 2013). For more details and additional literature, the reader is referred to the reviews by Reed *et al.* (2013) and Maier *et al.* (2014).

Some of the published literature on MOO techniques which take uncertainty into account (so which could be attributed to a group of robust optimization methods) was analysed, and a possible categorization of these techniques is presented below.

- A.
*Methods minimizing the mean of the objective function*(Zitzler & Kunzli 2004; Kapelan*et al.*2005; Limbourg 2005; Savic 2005; Deb & Gupta 2006; Kuzmin 2009; Andino-Santizo 2012; Kebede 2014; Vojinovic*et al.*2014). This method minimizes a smoothed version of an objective functionnamed*f**f*, which uses the average of the objective function values in the proximity of a given point. This is done instead of directly minimizing the original objective function (i.e._{μ}).*f* - B.
*Methods minimizing the mean–variance of the objective function*(Jin & Sendhoff 2003; Fieldsend & Everson 2005; Gaspar-Cunha & Covas 2008; Kang & Lansey 2012; Zeferino*et al.*2012). This approach is similar to the previous one, but besides calculating the objective function averagesin the proximity of the point in question, the standard deviation*f*_{μ}*f*is also estimated, to end up with their combination_{σ}*f*, which is optimized instead of_{μσ}.*f* - C.
*Methods using an additional objective function related to robustness*(Babayan*et al.*2004; Kapelan*et al.*2006; Erfani & Utyuzhnikov 2012; Zeferino*et al.*2012; Toloh 2014; Marchi*et al.*2016; Roach*et al.*2016). In this approach, a vector of objective functions*f*which represents some measure of the solution's robustness is added to the original vector of objective functions_{r}. This measure of robustness*f**f*could be the above-mentioned_{r}*f*or_{μ}*f*or their combination_{σ},*f*or another way of measuring the robustness of the system being optimized._{μσ} - D.
*Methods using additional constraints related to robustness*(Gunawan & Azarm 2005; Deb & Gupta 2006). In this approach, a vector of constraints is added to the original inequality constraints of the problem, which sets a boundaryto a given measure of robustness.*B* - E.
*Method based on comparing the cumulative distribution functions*(Petrone*et al.*2011). This approach aims at reducing the difference between the CDF characterizing uncertainty of a real solution and that related to an ‘ideal’ design. For real-life problems when a single CDF is generated as a result of a Monte Carlo simulation, and a substantial number of these have to be generated during optimization, implementation of this approach could be computationally prohibitive.

It can be seen that all the mentioned approaches aggregate uncertainty into the objective functions or constraints and then solve a standard MOO problem resulting in a single Pareto front. This is indeed a valid approach, however, it should be noted that the impact of uncertainty is embedded in this Pareto front, so uncertainty is in a way hidden, ‘encoded’ in the solutions which are expected to be robust. The uncertainty is encoded because the resulting Pareto front combines the minimization of the objective functions and the robustness in one single piece of information. The problem is that it is not straightforward to ‘decode’ the level of robustness from the identified solutions and to analyse how Pareto-optimal sets depend on actual realizations of uncertain variables. Decoding this information requires information to be stored for each solution in the Pareto front regarding the relation between the solution and the uncertain parameters. If explicit decoding and presentation of uncertainty are not performed explicitly, this makes it difficult to fully grasp the impact of the uncertainty by the decision makers and to select one of the solutions from the Pareto front.

The presented problem was the main motivation to develop an approach and algorithm entitled ‘Robust Optimization and Probabilistic Analysis of Robustness’ (ROPAR); in its initial form, it was presented by Solomatine (2012) in a short paper available on the web. In this approach, the uncertainty is not encoded into a single Pareto front, but multiple Pareto fronts are explicitly generated, each corresponding to sampled values of the uncertain variable(s). The multiple Pareto fronts make it possible to analyse the statistical characteristics of resulting uncertainty propagated to solutions, enriching the information available to decision makers. That initial approach by Solomatine (2012) is extended in this paper, to make it capable of finding robust optimum solutions.

### Problem statement

**, which is an abstraction (either mathematical or computational) of a real system intended to be optimized. The behaviour of the system model is characterized by the value taken by input uncertain variable**

*f***, and decision variables**

*u***. Then a RMOO problem can be formulated as subject to: where**

*x***is the vector of the**

*f**n*objective functions (

*f*

_{1},

*f*

_{2}, … ,

*f*

_{n}),

*g*is the

_{j}*j-th*inequality constraint,

*h*is the

_{j}*j-th*equality constraint,

*k*is the number of inequality constraints,

*q*is the number of equality constraints,

**is the vector of the uncertain input variables (e.g. rain intensity),**

*u*

*u**and*

^{min}

*u**are the vectors of the minimum and maximum values of uncertain input variables, respectively,*

^{max}**is the vector of the decision variables (e.g. pipe diameter),**

*x*

*x**and*

^{min}

*x**are the vectors of the minimum and maximum values of the decision variables, respectively, and*

^{max}*f*∈ ℝ.

_{i}, g_{j}, h_{j}, x, u**in the space of objectives (criteria) (see the example in Figure 3(a)) which can be defined as where**

*F*

*f**is the*

_{q}*q*th tuple representing the evaluations of the

*n*objective functions in the Pareto front,

**is all the evaluations of the objective functions forming the Pareto front, and**

*F**ns*is the number of tuples comprising the Pareto front.

To summarize, a solution is a configuration or a design that is defined by a specific combination of the values (vector) of the decision variable ** x**. The aim is to find a robust optimum solution

**which ensures that the objective function**

*x***values remain near the optimal value regardless of the uncertainty that could affect the parameters**

*f***.**

*u*### Robustness metrics

To measure the robustness of the solutions, the two most common metrics for assessing the robustness found in the literature are used (Beyer & Sendhoff 2007). These two metrics of robustness are exemplified with the following function. Assume that the goal is to minimize the objective function *f*(** x**). Consider that its robust counterpart is the function

*F*(

**).**

*x*### Objective of the study

The main objective of this study is to propose and test the new technique for RMOO, which would make it possible for decision makers to carry out probabilistic analysis of uncertainty propagation from uncertain factors to solutions, and ultimately to select robust optimum solution(s).

Two case studies are considered: (a) a benchmark problem commonly used to test MOO algorithms (a version of the ZDT1 function), and (b) a design problem of a simple storm drainage system, where uncertainty is associated with design rainfall events.

In this study, we concentrate on presenting the method, testing it in two case studies, and comparing the results to those achieved with deterministic methods. Comparison to other techniques also aimed at finding robust solutions is left for another publication (Marquez-Calvo & Solomatine in preparation-a).

## METHODOLOGY

The methodology has three parts:

deterministic optimization with multiple objectives;

robust optimization with multiple objectives;

comparing deterministic and robust solutions.

These three parts are presented in the subsequent sections.

### Deterministic optimization with multiple objectives

To assess if, indeed, ROPAR effectively finds robust solutions, the deterministic MOO is carried out first. These deterministic optimum solutions are used as a baseline to compare them with the solutions found using the robust optimization approach, using the robustness metrics presented above. For this study, two MOO algorithms are used: NSGA-II (Deb *et al.* 2002) and its advanced version, AMGA2 (Tiwari *et al.* 2011). It is acknowledged that NSGA-II is not the most efficient optimizer nowadays, however, it is very widely used and for that reason, it is employed here as well (its descendant, AMGA2, has been shown to be faster). Furthermore, NSGA-II is used for the storm drainage case and AMGA2 is used for the benchmark function case.

The problem solved using the MOO is the one defined in the section ‘Problem statement’. However, given that the problem is solved deterministically, there is no need to sample the input uncertain variable ** u** because it is considered to be fixed at a specific value. In this paper, for the ‘fixed

**’ we use the mean value of ‘uncertain**

*u***’.**

*u*### Robust optimization with multiple objectives

ROPAR is the approach used to find solutions that comply with the definition presented in the Introduction. This approach is based on Monte Carlo sampling of the input uncertain parameter ** u**, and allows the robustness of solutions found by MOO algorithms to be probabilistically analysed.

Most methods of robust optimization employ one of the known MOO algorithms as the core method, and we do this as well. As for deterministic optimization, NSGA-II (Deb *et al.* 2002) and AMGA2 are used as the core optimizers in robust optimization as well.

Before presenting the ROPAR algorithm, it is worthwhile mentioning some facts. Steps 1–7 were first presented by Solomatine (2012). The improved current version has added steps 8–10; these steps aim at finding the robust solutions. In particular, Step 9 defines two criteria (minimize ‘expected value’, and minimize ‘worst case’) to search for those solutions with the least variability (in accordance with the definition of robust optimization and the robustness metrics). This extended current version is still named ROPAR.

ROPAR has three main parts. The first part is sampling and generating the Pareto fronts. In this part, a value of an uncertain parameter is sampled, and for each of the samples the system model is instantiated with the sampled parameter and with the rest of the (certain) parameters. This model is used in optimization, resulting in a Pareto front. The number of Pareto fronts is equal to the number of samples generated. The second part of ROPAR is a probabilistic analysis of the Pareto fronts. The third part of ROPAR is finding the single or more robust solution(s).

#### Algorithm ROPAR

(Sampling and generating the Pareto fronts)

1. for

*r*= 1 to np doBegin

2. Sample

**u**_{r}3. Find the Pareto-optimal set

by solving the deterministic MOO problem for the sampled**F**_{r}.**u**_{r}

End.

(Analysing the Pareto fronts)

4. Pick one objective function

*f*, say_{k}*f*_{2}. Choose a certain level*L*_{2}for*f*_{2}(possibly represented by some narrow interval for*f*_{2}) and form the solution set*S*with the values of*f*_{2}in this interval, taking solutions from all the Pareto setswhere**F**_{r}*r*= 1, … ,*np*.5. Pick another objective function, say

*f*_{1}.6. Build the empirical distribution of the values of

*f*_{1}corresponding to the members of set*S*. This empirical distribution can be approximated by a probability density function characterising the uncertainty of*f*_{1}for the solutions corresponding to the chosen level*L*_{2}of*f*_{2}.7. (optional). Repeat steps 4, 5, and 6 running through a number of various levels

*L*_{2}, until finding a desired level of*f*_{2}, for example the one with a minimum variance of*f*_{1}.(Finding the robust solution)

8. Once a level

*L*_{2}is chosen, create the set*Q*containing optimum solutions with a value of*f*_{2}approximately equal to*L*_{2}.*Q*should include one solution from everywhere**F**_{r}*f*_{2}≈*L*_{2}(the closest one). Optionally*Q*could include not one but several solutions from everywhere**F**_{r}*f*_{2}≈*L*_{2}.9. The most robust solution

is found by solving one of the two single-objective optimization problems:**x**_{i}10. (optional). Repeat steps 8 and 9 running through a number of various levels

*L*_{2}, as many times as the decision maker needs.

Note. In an optional Step 7, the goal is to scan the robustness of the solutions for several values of *f*_{2}. It is just an initial scan because the robustness of the solutions at different values of *f*_{2} is quantitatively defined after carrying out the steps 8, 9, and 10.

### Comparing deterministic and robust solutions

To compare the deterministic solution with the robust solutions, the criteria used to determine the robustness in ROPAR are used to measure the robustness of the deterministic solution – these are specified in Step 9 of ROPAR, one for the expected value, and the other one – for the worst case.

The process to carry out here is similar to the process known as post-optimization robustness analysis presented in Paton *et al.* (2014). The robust performance of the solutions from the deterministic Pareto front is evaluated considering all the random samples of rainfall.

## TEST CASES

Two cases are considered: a benchmark function and a storm drainage network. The cases for this study were deliberately chosen to be simple, since they serve an illustrative purpose to introduce the algorithm. Both of them are optimized deterministically and robustly.

### Optimization of a benchmark function

*et al.*2000) is employed. This benchmark function is used to test MOO algorithms. The original formulation of ZDT1 is

*f*

_{2}now is Here

*randomfactor*is a random variable with a normal distribution with

*μ*= 1 and

*σ*= 0.05 (

*randomfactor*∼

*N*(1,0.0025)), which is sampled in the interval [0.8384,1.1789]. A sample within this interval has 99.2% probability of occurrence. The number of samples for this set of experiments is set to 1,000.

The MOO used is AMGA2 (Tiwari *et al.* 2011), which is set to run until 10,000 function evaluations of ZDT1 are made.

### Optimization of a storm drainage network

**is the vector (**

*D**d*

_{1},

*d*

_{2}, … ,

*d*

_{11}) representing the diameter of every pipe in the network,

**is the vector (**

*C*_{D}*c*

_{1},

*c*

_{2}, … ,

*c*

_{11}) representing the cost per length unit of every pipe depending on its diameter,

*C*is the fixed cost of the project,

_{F}**is the vector (**

*L**l*

_{1},

*l*

_{2}, … ,

*l*

_{11}) representing the length of every pipe, and

*P*is the amount of precipitation in the basin. In this paper, we are considering the pipe diameters as the only decision variables, and this is, of course, a simplification, since in reality the other variables have to be considered, such as for example, pipe slopes which have an impact on the excavation costs.

The volume of flood water is determined by running the SWMM modelling software (Rossman 2010). In the remainder of this paper, cost and construction cost are used interchangeably. The details of how the objective functions are calculated are provided in the supplementary data (available with the online version of this paper).

For this case, the only parameter with uncertainty is the rainfall, specifically, the intensity of the rainfall is considered to be random (named *i _{random}*). It is assumed that

*i*follows a normal distribution with a standard deviation of 7% of the mean (i.e.

_{random}*i*∼

_{random}*i**

_{base}*N*(1, 0.0049)), where

*i*is the intensity of the design rainfall corresponding to the geographical location of the network, considering a return period of 20 years. (The design return period for storm drainage typically varies from 2 to 25 years (Akan & Houghtalen 2003), and for this case 20 years was chosen.)

_{base}Two other parameters of the rainfall, duration and pattern, are considered to be fixed, regardless of the intensity (which is, of course, a simplification). With respect to the duration, in the conducted experiments the duration of the rainfall event is taken to be 3 hours. With respect to the pattern of the rainfall, a synthetic hyetograph for design storm was used (Butler & Davies 2004; Haestad Methods & Durrans 2007). The hyetograph resembles the one used by the Chicago method (Keifer & Chu 1957). Despite this type of rainfall profile tending to overestimate peak flows (Marsalek & Watt 1984; Alfieri *et al.* 2008), here it is used because the intention is to design a robust network able to cope with severe events (Fortunato *et al.* 2014).

Information describing the main aspects of the network is as follows. Every pipe has the same data: length of 100 m; slope of 0.5%; Manning's roughness coefficient of 0.02. The areas (in ha) of the subcatchments (from upstream to downstream): 5.617, 4.441, 4.441, 4.441, 4.089, 4.089, 6.489, 6.489, 3.245, 3.245, and 3.389. The mean value of rainfall in 3 hours is 360 mm. The system is configured to consider that in the event of flooding, the excess volume is stored atop the junction, in a ponded fashion, and is reintroduced into the system as capacity permits. An SWMM INP file with the values of other parameters can be found in the supplementary data (available with the online version of this paper).

*n*is the number of samples,

*z*is the abscissa of the normal curve that cuts off an area

*α*at the tails (1 –

*α*equals the desired confidence level, which was set to be 99.90%),

*e*is the aimed level of precision (was set to be 5%), and

*p*is the estimated variability of population (was set to 50%, for details see Israel 1992).

Based on Equation (18), *n* is estimated to be 949 which was for the sake of simplicity rounded to 1,000, so that the rainfall intensity *i _{random}* was sampled 1,000 times. As mentioned before,

*i*is the result of the multiplication of

_{random}*i*with a random number obtained from the normal distribution

_{base}*N*(1, 0.0049). The 1,000 samples fell into the interval [0.7738, 1.2505].

The large sample size (1,000) allows the four objectives to be achieved: first, to reach a high confidence level; second, to have a high level of precision; third, to consider the maximum heterogeneity in the population; fourth, to capture those extreme values that have the potential to be the most troublesome. On the other hand, if a high confidence level or a high level of precision is not needed, or the population does not have a high heterogeneity, then the number of samples will be accordingly lower. This number can be also made lower if the computational load is prohibitively high.

In the case of ROPAR, for each of the sampled rainfall intensities the storm drainage network is optimized using MOO NSGA-II (Deb *et al.* 2002), resulting in 1,000 Pareto fronts, one front for each rainfall intensity. In every optimization, 10,000 function evaluations (i.e. runs of the SWMM model needed to estimate flood volume) are used.

## RESULTS AND DISCUSSION

### Case 1: The benchmark function

#### Optimizing the benchmark function using a deterministic approach

As a result of optimizing the benchmark function ZDT1 without uncertainty (i.e. when *randomfactor* is fixed at 1), the Pareto front in Figure 2(a) is generated.

#### Optimizing the benchmark function using ROPAR

As a result of steps 1, 2 and 3 of the ROPAR algorithm, 1,000 Pareto fronts are generated – see Figures 2(b) and 2(c). In steps 4–7 of ROPAR, these Pareto fronts are analysed to determine which solutions are most robust within the boundaries imposed by the decision makers.

Assume that the decision maker is interested in the values of the objective function 2 in the interval 0.2 ≤ *f*_{2} ≤ 0.6 (see Step 7 of ROPAR). To illustrate the analysis of the Pareto fronts, two values of the objective function 2, say 0.2 and 0.6, are picked. Their corresponding PDFs are shown in Figures 2(b) and 2(c), respectively. Comparing these two PDFs, one can say that in the considered interval, the solutions with the values of *f*_{2} ≈ 0.6 are the ones with the minimum variability of *f*_{1}.

Steps 8–10 are not carried out in this case due to the fact that the optimization algorithm finds the ideal (almost exact) solutions for every realization of this random benchmark problem. These solutions are basically the same and there is no reason to use steps 8–10 of the algorithm which has the purpose of finding the most robust solution among them. The ideal solution for this benchmark problem is when *x _{i}* = 0 for

*i*= 2, 3, … , 11. However, in the experiment with the storm drainage system the solutions are different, and for that reason steps 8–10 are necessary.

#### Comparing the solutions of the deterministic and robust approaches

As pointed out previously, after carrying out deterministic optimization of ZDT1, there is no more additional information to help to decide which solution (among the 50 solutions forming the Pareto front) to choose. However, if uncertainty in the benchmark function is assumed, and the robust optimization approach is followed, one would have more information enabling one of the solutions to be selected. As shown in Figures 2(b) and 2(c), the robustness of the solution has a direct relation to the value of the objective function 2: the higher value of the objective function 2, the more robust the solution. Given the fact that it was assumed that the decision makers are interested in the solutions within the interval 0.2 ≤ *f*_{2} ≤ 0.6, the most robust solutions are those corresponding to *f*_{2} ≈ 0.6.

### Case 2: The storm drainage network

The problem of constructing (or rehabilitating) a storm drainage network (i.e. the determination of the combination of pipe diameters) is considered, with the two objective functions to minimize: construction cost and flood volume.

#### Finding the optimum design without uncertainty (deterministic approach)

Assuming fixed rainfall, the optimum deterministic designs of the storm drainage network are found and shown in the Pareto front in Figure 3(a). It can be seen that the cost of the solutions is in the range from 1,000 to 4,000 tmu (thousands of monetary units) and flooding volume of these solutions ranges from 0 to 24 mlw (millions of litres of water). Using the Pareto set aids, the decision maker in the final choice of the single solution for implementation. For example, if a decision maker considers two possible costs, 1,500 and 3,000 tmu, the corresponding flooding volumes given by the optimal solutions (networks) would be 11.2 and 1.7 mlw, correspondingly. However, the single Pareto set does not give the possibility to say anything about the robustness of these solutions against rainfall uncertainty.

#### Finding the optimum design considering uncertainty (the robust approach)

In the ROPAR algorithm, the objective function *f*_{2} is associated with the construction cost (Step 4). Applying ROPAR steps 1, 2, and 3, a set of 1,000 Pareto fronts is generated (Figure 3(b)). In steps 4, 5, and 6, the variability of the flooding volume is analysed for the two construction costs: 1,500 and 3,000 tmu – see Figures 3(b) and 3(c), respectively. As in the case of deterministic optimization, it is easy to see (and it is in a way obvious) that the lower investment leads to more flooding.

However, it is now possible to also see more, e.g. to detect a certain pattern: the lower the network costs, the bigger the variability in the network performance. Despite the relatively small uncertainty of the rainfall that is considered (the standard deviation is equal to 7% of the mean), it is interesting to see the differences in the network performance when less and less is invested in the construction of the network.

Let us assume that at Step 7 of the ROPAR algorithm, the decision maker chooses to build a network with a cost of 3,000 tmu. The decision could be based on the fact that although the 3,000-tmu solution is 100% more expensive than the 1,500-tmu solution, the latter leads to more flooding, but also is 350% more dispersed (measuring dispersion as the range, i.e. the width of the base of the PDF). The range of the 1,500-tmu solution is 14 mlw; the range of the 3,000-tmu solution is 4 mlw; and the ratio of 14 mlw to 4 mlw is 3.5, or 350%. In other words, the 3,000-tmu network has only 29% of the variability of the flooding volume of the 1,500-tmu network, or it can be said that in these terms it is 3.5 times more robust against uncertainty in rainfall. If robustness is seen as an important factor, then a decision maker may find this to be a good additional justification to build a more expensive network – in this case, a 100% more expensive design increases robustness by approximately 3.5 times, and of course leads to less flooding.

Yet another piece of information that can be deduced from these plots is the relation of maximum flooding volume (MFV) to cost under various scenarios. For example, MFV for the network costing 3,000 tmu is 4 mlw, and MFV for the network of 1,500 tmu is 19 mlw, which is almost 5 times higher.

It should be noted that an increase in robustness for more expensive designs from the engineering point of view is more or less expected: if one invests more in larger pipes, the number of solutions leading to flood is reduced. As investment reaches approx. 4,750 tmu, the pipes would be able to accommodate any amount of storm water and no rainfall would lead to flooding, so there is no uncertainty, and robustness is maximum. In other applications, such a relationship may be much less obvious. In any case, ROPAR makes it possible to identify the sets of solutions with different values of robustness and the corresponding ranges of objective functions, enabling probabilistic analysis.

Steps 8 and 9 of ROPAR identify the robust solution(s), ultimately selecting the most robust of the possible solutions, given the user-defined cost. In this example, for each of the costs 1,500 and 3,000 tmu, there are 1,000 solutions with either less than or equal cost of 1,500 and 3,000 tmu, respectively, that are selected in accordance with Step 8 of the ROPAR algorithm. Next, as specified in Step 9 of ROPAR, there are two criteria to select the most robust solution. Both criteria are applied to exemplify their use.

To apply *criterion 1* of Step 9 (minimizing expected value), for every solution the average flood volume (AFV) across the 1,000 rainfall samples is calculated, and the solution corresponding to the minimum AFV is picked. Figures 4(a) and 4(b), corresponding to costs of 1,500 and 3,000, respectively, show the solutions with minimum AFV marked with a red X.

The procedure described in the previous paragraph is also used with *criterion 2* of Step 9 of ROPAR (minimizing worst case), and here the solution with the minimum MFV is picked. Figures 4(a) and 4(b), corresponding to costs of 1,500 and 3,000, respectively, present the solutions with minimum MFV marked with a red +.

#### Comparing solutions found by deterministic and robust approaches

To see a more general relationship among the 1,001 solutions (i.e. 1,000 from the robust optimization plus 1 from the deterministic optimization), in Figures 4(a) and 4(b) every solution is plotted in the intersection of its AFV and its MFV. The robust solution with respect to criterion 1 is the one with the smallest AFV, and the robust solution w.r.t. criterion 2 is the one with the smallest MFV. In Figure 4(b), these two solutions are very close to each other; they have almost the same performance, that is, the solution found by criterion 1 could be used as the solution of criterion 2 and vice versa. It should be noted that the proximity of solutions corresponding to criteria 1 and 2 is characteristic only to the considered simple case study, but for more complex cases the results based on using criteria 1 and 2 could be very different (albeit these two criteria obviously correlate).

Furthermore, from Figure 4(b) a Pareto front can be recognized, and it has only two solutions. In Figure 4(b), the red X and the red + form a ‘blurred asterisk’ because they are pointing to two different solutions (compare with Figure 4(a), where a ‘sharp asterisk’ is seen because the X and the + are pointing to the same solution). For the decision maker, only these two solutions are worth considering because they dominate all the rest of the solutions. However, for other cases, the Pareto front might have more solutions, and in those cases, all the solutions in the Pareto front should be presented to the decision maker to let him/her decide which solution to select for implementation.

In terms of offering more options to the decision maker, several cost levels can be used. For example, Figures 4(c) and 4(d) show seven robust solutions corresponding to the costs 1,500, 1,750, 2,000, 2,250, 2,500, 2,750, and 3,000 tmu.

Additionally, from Figure 4 it is also possible to see that robust solutions have better performance than the deterministic solutions in terms of the values of AFV and MFV. Lower values for AFV and MFV mean lower variability, and hence higher robustness. This interpretation is in accordance with the definition of robustness used in this study.

Table 1 shows the robust and deterministic solutions with a cost of 3,000 tmu. In the table, the diameters from top to bottom correspond to the pipes in the network from upstream to downstream, respectively.

Deterministic | Minimum AFV | Minimum MFV |
---|---|---|

0.81 | 0.76 | 0.79 |

1.14 | 1.15 | 1.09 |

1.59 | 1.37 | 1.36 |

1.64 | 1.69 | 1.61 |

1.64 | 1.78 | 1.76 |

1.89 | 1.97 | 1.94 |

1.89 | 2.10 | 2.06 |

2.20 | 2.19 | 2.10 |

2.28 | 2.24 | 2.34 |

2.28 | 2.24 | 2.34 |

2.28 | 2.28 | 2.34 |

Deterministic | Minimum AFV | Minimum MFV |
---|---|---|

0.81 | 0.76 | 0.79 |

1.14 | 1.15 | 1.09 |

1.59 | 1.37 | 1.36 |

1.64 | 1.69 | 1.61 |

1.64 | 1.78 | 1.76 |

1.89 | 1.97 | 1.94 |

1.89 | 2.10 | 2.06 |

2.20 | 2.19 | 2.10 |

2.28 | 2.24 | 2.34 |

2.28 | 2.24 | 2.34 |

2.28 | 2.28 | 2.34 |

## CONCLUSIONS

In this paper, the problem of multi-objective robust optimization taking uncertainty into account is formulated. ROPAR is the algorithm presented and tested to solve this kind of problem. It analyses the robustness of the solutions found and selects the most robust solution(s).

Additionally, a deterministic MOO is compared with a ROPAR MOO. Then the robustness of the deterministic solutions is compared with the robustness of the ROPAR solutions. In every case tested, ROPAR found a more robust solution.

In contrast to the methods already known, ROPAR allows for explicit handling of uncertainty and analysing how uncertainty is propagated to the Pareto-optimal solutions, and allows for the corresponding probabilistic analysis. This makes it possible to identify the sets of solutions with different values of robustness and to select the robust solutions according to the preferences of a decision maker.

The novelty of the presented approach is in two aspects. First, the uncertainty of parameters or inputs is explicitly propagated to the solutions, so that the dispersion of the solutions can be analysed visually and analytically. Second, the new method has general applicability since it uses a quite general definition of robustness, and does not reformulate the objective functions (e.g. by their smoothing). This method can be applied to the optimization of a wide variety of problems, where the parameters are uncertain and can be represented by a PDF. Use of ROPAR for other problems will be demonstrated in the forthcoming papers by Marquez-Calvo & Solomatine (in preparation-a) and Marquez-Calvo & Solomatine (in preparation-b).

One of the limitations of this study is that ROPAR is applied to problems with one source of uncertainty and two objective functions; however, its design allows this algorithm to be employed in problems with more sources of uncertainty and more objective functions. In Marquez-Calvo *et al.* (submitted), a problem related with water quality in a water distribution network is solved using ROPAR considering 24 sources of uncertainty. In Marquez-Calvo & Solomatine (in preparation-b), the procedure to apply ROPAR to multiple-objective optimization problems is presented, and the robust design of a storm drainage system is identified, taking into account three objective functions and three sources of uncertainty.

Further research will be aimed at developing and testing the visual tools to enable problem representation and analysis in multiple dimensions, and testing various ‘trade-off’ approaches that make it possible to deal with the computational complexity of ROPAR.

## ACKNOWLEDGEMENTS

The first author thanks the Mexican Government for its support in funding his PhD programme through the CONACYT programme. Some of the research ideas and components were developed in the framework of the grant No. 17-77-30006 of the Russian Science Foundation, and the Hydroinformatics research fund of IHE Delft. The authors are also grateful for the support of IHE Delft in supplying the computational facilities, and to Dr. Gerald Corzo (IHE Delft) for helping with a number of IT-related aspects of carrying out this research.

## REFERENCES

*Proceedings of the 2005 IEEE Congress on Evolutionary Computation, 2005*

*Proceedings of Evolutionary Multi-Criterion Optimization. Second International Conference, EMO 2003*. Springer, Faro, Portugal, pp. 237–251. ISBN 9783540018698

*Proceedings of Evolutionary and Deterministic Methods for Design, Optimization and Control*. Springer, Capua, Italia. Eurogen 2011

*Proceedings of Parallel Problem Solving from Nature – PPSN VIII: 8th International Conference*. Springer, Birmingham, UK, pp. 832–842