## Abstract

Uncertainty analysis is important for wastewater treatment plant (WWTP) model applications. An important aspect of uncertainty analysis is the identification and proper quantification of sources of uncertainty. In this contribution, a methodology to identify an ensemble of behavioural model representations (combinations of input data, model structure and parameter values) is presented and evaluated. The outcome is a multivariate conditional distribution of input data that is used for generating samples of likely inputs (such as Monte Carlo input samples) to perform WWTP model uncertainty analysis. This article presents an approach to verify uncertainty distributions of input data (otherwise often assumed) by using historical observations and actual plant data.

## INTRODUCTION

In this article, a set of dynamic wastewater treatment plant (WWTP) model input data is first identified and then used to run simulations, producing confidence intervals giving an estimate of the prediction uncertainties of further model applications. With prior knowledge of the input data and model parameter values to a dynamic WWTP model of the Henriksdal WWTP in Stockholm, Sweden, behavioural input data sets (BIDS) are identified and form model representations (i.e. combinations of input data, model structure and parameter values) that are: (1) consistent with measured data and (2) useful for scenario analyses by simulation. The concept of rejecting non-behavioural model representations has been adopted from the Generalized Likelihood Uncertainty Estimation (GLUE) methodology introduced by Beven & Binley (1992).

Dynamic WWTP models are today tools that are commonly used by both researchers and engineers. For most practical applications, an early goal in the modelling procedure is calibration; to make simulation results match a set of measurements of target variables. A typical WWTP model contains several potential target variables, which should be considered simultaneously during model calibration (Hauduc *et al.* 2015). There are guidelines for selecting what variables to consider as well as acceptable uncertainty ranges for these, which among other things depend on the type of application (Rieger *et al.* 2012). *Ranges* are provided since a perfect match is not achievable; a set of model outputs **y** will never exactly equal the corresponding measurements, **, because of uncertainties in model structure, ***m*, in applied parameter values, **θ**, in model input data, **I**, as well as experimental errors. An overview of the many uncertainties related to WWTP modelling is found in Belia *et al.* (2009).

Therefore, it is necessary to accept a model representation that is a sufficiently good approximation of reality, but it should then be recognized that there are several other possible representations leading (at least almost) to equally good results (Beven & Freer 2001). This is especially true if the assumptions of the calibration procedure are modified, e.g. if calibration data is added/removed or if an alternative objective function is used. In other words, the uncertainties are lumped into the calibration results and then transformed into prediction uncertainty during model scenario analyses, i.e. simulations of a certain change/deviation from the normal operating point of the plant. A non-trivial question is then to what extent one can trust and apply the results of the scenario analyses.

The objective of this paper is to define and validate a method for quantifying the WWTP prediction uncertainty using a probabilistic/Monte Carlo sampling framework. The method, which is inspired by GLUE (Beven & Binley 1992), is based on grading and selecting behaviourally good model representations from a larger set, which is obtained from Monte Carlo simulations with input data sampled from a prior uncertainty distribution. In the literature review of Belia *et al.* (2013), it is found that the Monte Carlo method where the uncertainty of input parameters is propagated through the model without any feed-back is the most commonly used method of uncertainty analysis when evaluating WWTP designs. Examples of applications with this method are found in Benedetti *et al.* (2013) and Sin *et al.* (2009). Monte Carlo methods like the one presented in this study, where the prior uncertainty is updated by evaluating and comparing the simulations with measured data from the process, are, however, not as commonly applied, especially not for uncertainty analysis of full-scale WWTP models. Previously Sin *et al.* (2008) used a Monte Carlo-based calibration approach focusing on a limited number of kinetic activated sludge model (ASM) parameters and on aeration input data, and Mannina *et al.* (2012) used GLUE and varied 29 kinetic and stoichiometric ASM2d parameters to calibrate and assess the uncertainty of a full-scale WWTP.

Although it is possible to include an infinite number and various types of parameters in the uncertainty analysis the authors have, in this paper, chosen to focus on a limited amount of input data only. The choice was based on two criteria. (1) Identifiability analysis of WWTP models has indicated that only a fraction of WWTP kinetic and stoichiometric parameters are identifiable given typical data/information collected from full-scale plant operations (Brun *et al.* 2002; Vanrolleghem *et al.* 2003; Sin *et al.* 2008). (2) Previous studies on sensitivity analysis (Sin *et al.* 2011), as well as several calibration protocols (Hulsbeek *et al.* 2002; Melcer *et al.* 2003; Vanrolleghem *et al.* 2003), agree that the most significant parameters affecting uncertainty in WWTP plant models are the influent fractions and calibration of the solids balance.

Therefore, in the below case study application, the authors have selected the influent load, influent fractionation, influent temperature and parameters affecting the solids balance (and sludge retention time, SRT) as main input sources of uncertainty. These variables are measured, which facilitates the quantification of the prior uncertainty ranges. The influent temperature was moreover chosen as an uncertain calibration parameter, because it explicitly affects the kinetic parameters (growth rate, decay rate, etc.) of the underlying microbiological processes (ASM models). The outcome from the method is a set of behaviourally good input data, BIDS, which are conditioned on a historical (calibration) data set that can be used in model-based applications, such as scenario analysis of plant operation.

## MATERIAL AND METHODS

### WWTP and problem statement

Henriksdal WWTP (approximately 800,000 population equivalent) in Stockholm, Sweden, includes primary treatment, an activated sludge system, tertiary treatment in sand filters and anaerobic digestion for sludge stabilization and biogas production, see Figure 1. The plant is currently being operated at its maximum allowed sludge concentration in the bioreactors (mixed liquor suspended solids, MLSS), approximately 2,500 g m^{−3}.

With one out of seven biological treatment lines (14%) being out of operation due to reconstruction works, the nitrate recirculation (*Q*_{Rec}) pump capacity and the number of available aerators at the studied WWTP have been reduced by 14%. Unavoidably, however, the influent wastewater flow rate and load to the plant increase all the time because of the population increase in Stockholm and densification of the wastewater catchment area. Still, the effluent requirement, 10 gTN m^{−3} as a yearly average for nitrogen, remains. To evaluate if the nitrogen removal process can be enhanced, either by speeding up the remaining recirculation pumps or by intensifying the aeration, the two scenarios were simulated and evaluated using a dynamic model of the WWTP as will be described in the following text.

### WWTP model under study

The Benchmark Simulation Model no. 2 (BSM2) (Gernaey *et al.* 2014) was a close to reality basis for modelling the plant. Computational speed, however, is an issue for Monte Carlo methods, and this was the main reason for excluding the anaerobic digester (AD) process model of BSM2 from this study. At the WWTP, and in the model, reject water from the AD is mixed with raw wastewater before the influent sampling point. A few minor additional modifications to the BSM2 layout were needed to adjust the model to the Henriksdal WWTP. (1) The total bioreactor volume is distributed into two smaller (Z1, Z8) and six equally sized zones (Z2–7), which is motivated by historical tracer tests and the placement of mixers (Z2–4) and controllable aeration (Z4–7) zones. (2) Dissolved ferrous sulphate (FeSO_{4}) is added to the primary clarifier and to the sand filter inlets and is modelled by assuming an instant production of precipitates (inert suspended solids). (3) Inert suspended solids generation from the influent and iron dosage is modelled using the two particulate dummy state variables of BSM2. (4) Rapid two-media sand filter units are used to remove P and total suspended solids (TSS) from the secondary clarifier effluent and is modelled as an ideal thickener. In the model, the filter flush water is recirculated to the bioreactor, while in reality there is a possibility to recirculate that water to the inlet of the primary clarifier. However, the settings of the valves regulating the destiny of the flush water are not recorded at Henriksdal WWTP, thus adding to the model structure uncertainty.

### Adjustment of site-specific conceptual model parameters

The specific inorganic suspended solids (ISS) production was assigned a value of 2.5 gISS gFe^{−1} (ATV 2000). Then, by analysing 5 years of delivery data of the FeSO_{4} product (measured weight, iron content) and measured data on sludge leaving the plant with trucks (weight, iron content, ash content), it could be concluded that, to close the mass balance, the influent wastewater then contains approximately 30 gISS m^{−3}. This is conditional on the modelling assumption that ISS is not formed by other processes and motivates the use of *f*_{VSS_TSS} = 0.9 as the fixed parameter value for fractionating influent TSS to volatile suspended solids (VSS).

The removal efficiency of the conceptual primary clarifier model was modified to reduce approximately 60% of the particulate material by increasing the efficiency correction factor from 0.65 to 0.73. The removal of Fe^{2+} precipitates in the primary settler was reduced by decreasing the specific settleability factor from 1 to 0.75. This was motivated by measurements of total iron in the thickened waste activated sludge (WAS) flow, indicating the fraction of total added iron that ends up in the biological treatment due to the simultaneous precipitation.

The removal efficiency of the simplified sand filter model was set to 90% and together with the secondary settler model (using default BSM2 parameter values) the TSS in the effluent was on average 4 mgTSS l^{−1}, which is similar to measured values. Finally, the removal efficiency of the WAS thickener was set to 90% based on grab samples of TSS in the reject water and experience of the operational staff.

### Measurement data and acceptable uncertainty ranges

The first step in the process of identifying BIDS is to decide which model outputs *j* to consider and the acceptable uncertainty ranges **ɛ** = {*ɛ _{j}*}, see Figure 2. These were adopted from the ‘good modelling practice’ (GMP) protocol (Rieger

*et al.*2012) and the modelling application type ‘assessing the plant capacity for nitrogen removal’.

Measured process data selected from a period of 200 days (May–November 2016) was selected. Weekly grab samples of MLSS and the measured WAS production from weekly suspended solids (SS) grab samples and online flow measurements of the WAS were used to update the prior uncertainty with information related to the plant sludge production and SRT. Online measurements of NH_{4} and NO_{3} concentrations from the secondary clarifier effluent contributed information on the plant nitrogen removal capacity. and data were validated in the sense that the sets used for calibration originated from periods when duplicate sensor readings, providing similar results, were available.

The measured mean values of the target variables are shown later in Table 1 as are the applied acceptable uncertainty ranges **ɛ**. The acceptable uncertainty range for both NO_{3}-N and NH_{4}-N was 0.9 mg l^{−1}. As a criterion, the authors thus decided to accept all input data sets leading to almost complete nitrification, here defined as an average NH_{4}-N concentration in the aeration tank <1.8 mgN l^{−1}. For the WAS mass loading rate, a higher range (±10%) than recommended (±5%) was used. It is believed that 5% is a too narrow range since, in the applied model, uncertainties related to the TSS wasted with the filter flush water, as well as WAS being recirculated to the plant with reject water, are simulated as uncertainty in **y**_{WAS}.

Model input data I with prior uncertainty δ | Measurement data | ||||||
---|---|---|---|---|---|---|---|

Mean | [±%] | Mean | Range | ||||

m | Unit | S(I) _{m} | δ _{m} | j | Unit | ɛ _{j} | |

TN | kg d^{−1} | 11,600 | 20 | MLSS | g m^{−3} | 2,300 | 230 |

NH_{4}:TN | gN (gN)^{−1} | 0.74 | 5 | WAS | kg d^{−1} | 22,900 | 2,290 |

COD:TN | gCOD (gN)^{−1} | 10.5 | 20 | NH_{4} | gN m^{−3} | 0.9 | 0.9 |

TSS:COD | gSS (gCOD)^{−1} | 0.7 | 20 | NO_{3} | gN m^{−3} | 6.2 | 0.9 |

X_{I}:X_{COD} | gCOD (gCOD)^{−1} | 0.18 | 20 | ||||

Temp(t) | °C | 21^{a} | 10 | ||||

Fe^{2+}(t) | gFe m^{−3} | 18^{a} | 20 | ||||

Q_{WAS}(t) | m^{−3}d^{−1} | 3,260^{a} | 20 |

Model input data I with prior uncertainty δ | Measurement data | ||||||
---|---|---|---|---|---|---|---|

Mean | [±%] | Mean | Range | ||||

m | Unit | S(I) _{m} | δ _{m} | j | Unit | ɛ _{j} | |

TN | kg d^{−1} | 11,600 | 20 | MLSS | g m^{−3} | 2,300 | 230 |

NH_{4}:TN | gN (gN)^{−1} | 0.74 | 5 | WAS | kg d^{−1} | 22,900 | 2,290 |

COD:TN | gCOD (gN)^{−1} | 10.5 | 20 | NH_{4} | gN m^{−3} | 0.9 | 0.9 |

TSS:COD | gSS (gCOD)^{−1} | 0.7 | 20 | NO_{3} | gN m^{−3} | 6.2 | 0.9 |

X_{I}:X_{COD} | gCOD (gCOD)^{−1} | 0.18 | 20 | ||||

Temp(t) | °C | 21^{a} | 10 | ||||

Fe^{2+}(t) | gFe m^{−3} | 18^{a} | 20 | ||||

Q_{WAS}(t) | m^{−3}d^{−1} | 3,260^{a} | 20 |

^{a}Average value for the calibration data period.

### Prior input uncertainty

During the identification phase with access to measured data the BSM framework with parameter values **θ** was considered well established and kept fixed. Besides site-specific conceptual parameters, all ASM1 and settling parameters of the model were kept constant at their default values. The total uncertainty was projected on part of the model input data by the inclusion of eight uncertainty factors in the vector **δ** = {*δ _{m}*}, assumed to be independent and uniformly distributed (

*U*) with min/max ranges and referred to as the multivariate prior uncertainty distribution

*P*.

The selection of input data to consider uncertain was concentrated on the influent wastewater loads and fractions, which are fed to the used influent wastewater generator model (Gernaey *et al.* 2011) including uncertainty in particulate inert COD (*X*_{I}) content (determining the degradability of influent organic matter) and the wastewater temperature, see Box 1 in Figure 2. The influent temperature was chosen as an uncertain calibration parameter, because it explicitly affects the kinetic parameters (growth rate, decay rate, etc.) of the underlying microbiological processes (ASM models). By considering also the operational model inputs *Q*_{WAS} and iron dosage Fe_{Dos} as uncertain, the uncertainties in the settling tank model and the overall sludge balance were modelled, which affect key variables, such as actual sludge wastage from the plant and SRT (Ekama 2010).

Online input data sets considered to be known (hence not uncertain) and directly used as inputs to the model include the influent, return activated sludge (RAS) and nitrate recirculation (REC) flow rates and three sensor data readings of dissolved oxygen concentrations in bioreactors Z5,6,7. This choice was deliberately made to keep the model calibration complexity manageable and focus the discussion on analysis of the presented methodology.

Ranges (min/max) of *P*(**δ**) were derived from online monitoring and influent sampling results as well as from expert reasoning (Sin *et al.* 2009), see Table 1. *i**=* 1,…,*N* latin hypercube sampling of **δ** from *P* were used to generate realizations of model Monte Carlo input data **I**, see Figure 2. For example, the yearly average TN load was estimated to be 11,600 kgN d^{−1} and to have an uncertainty of ±20% (uniform distribution), thus *δ*_{TN} ∼ *U*(0.8,1.2), and the *i*th modelled influent TN load profile was generated by multiplying the *i*th sample of *δ*_{TN} with 11,600 and the estimated yearly and daily load patterns.

The dry weather daily load pattern (*DayVar*(*t*)) was obtained by processing online measurements of influent flow rate and NH_{4}-N concentrations (results not shown). The yearly pattern was derived as follows (see also Figure 3, left). Eighteen years of weekly flow proportional TN influent load data to the WWTP was arranged and scaled with the yearly average to produce 18 yearly load variation profiles, each having a mean value of one. The black line shows the average variation as a function of time, which was filtered and used as input data (*SeasVar*(*t*)) to the simulations. The red region is delimited by the 5% and 95% percentiles of the 18 yearly patterns. In Figure 3 (right), the result of the anticipated min/max prior uncertainties for the TN load (dashed lines) are shown together with measurements from three years (green line).

The *i*th NH_{4}, COD, TSS and X_{I} load profiles were then generated by multiplying the *i*th TN load with the *i*th samples of corresponding uncertainty parameters for wastewater composition: *δ*_{NH4:TN}, *δ*_{COD:TN}, *δ*_{TSS:COD} and *δ*_{XI:XCOD}. Thus, all pollutants were assumed to have the same diurnal and seasonal variation. Three uncertain input data sets were derived from measured time series data (influent temperature, Temp_{Inf}(*t*), *Q*_{WAS}(*t*) and ferrous iron dosage Fe_{Dos}(*t*)) by directly multiplying these with their corresponding uncertainty factor.

### Selection and grading of BIDS

Having run *N* Monte Carlo simulations and after comparing the model results with measurements, realizations that differ more than the pre-defined uncertainty ranges, *ɛ _{j}*, for any target variable,

*j*, are rejected. The subsets of

**δ**yielding results within the ranges are identified as behavioural (BIDS) and denoted

**δ**. An essential part of the presented methodology is the rejection (not only assigning a low likelihood) of non-behavioural models. This is because density-dependent sampling methods, such as Markov chain Monte Carlo, are not proposed here, and for the method to be applicable in practice, the number of ensemble simulations needed to be run for the prediction phase must be limited.

^{B}An input data set, *i*, was rejected if the resulting mean error (ME) of *any of the targets j* were outside the selected uncertainty ranges, e.g. if , where *S*(·) denotes the mean function. Then an accepted behavioural input data set must provide reasonably good results considering *all* target variables and thus, a bad prediction of, for example, MLSS *cannot* be compensated for by good WAS, NH_{4} and NO_{3} predictions. The choice of using the mean value as a statistic to distinguish between behavioural and non-behavioural model representations was based on the time averaging (month/year) proposed for the application type in the GMP protocol (Rieger *et al.* 2012). Also, according to the experience of the authors, a common and pragmatic first step during manual WWTP calibration is to compare results from steady state simulations with averaged results; the applied rejection methodology here is a way to automate this procedure.

Following the GLUE methodology each behavioural model representation is conditioned on past data and given a likelihood weight*, L*, according to the modeller's confidence in that particular model representation as a useful predictor of the future. The likelihoods are then normalized to unity and used to compute the likelihood-weighted cumulative density function (cdf) for any model prediction. From the cdf the percentiles of the simulations with **δ ^{B}** are derived. These uncertainty limits reflect what the model can say about the system response after conditioning on the past calibration data.

*et al.*2015) especially, as in the current study, when there are multiple target variables. The subjective choice of the likelihood measure in this paper was based on a product of four (one for each target variable

*j*) factors with exponential functions of the mean squared errors (MSE). For the

*r*th behavioural input data set

**δ**

^{B}^{(r)},

*r*= 1,…,

*R*, it reads

The uncertainty range, *ɛ _{j}*, is used to model the rate at which the likelihood decreases for an increase in MSEs (Lindblom

*et al.*2011). To exemplify: if a behavioural input data set

**δ**

^{B}^{(r)}yields MSEs of exactly the uncertainty ranges, the

*r*th likelihood value becomes exp(−1·4) = 0.018. The selection of likelihood function and statistic (resolution of data) is not obvious and an important topic for further research.

## RESULTS AND DISCUSSION

Figure 4 (left) shows histograms and a two-dimensional projection for two out of the eight uncertain input parameters (*δ*_{TN} and *δ*_{COD:TN}). *N* = 2,000 input data sets from the uniform prior uncertainty distribution are shown using blue markers. With the assumptions made, 9% (*R* = 179) of those were identified as BIDS forming, in combination with the model structure and default parameter values, plausible validated model representations, which were used further for scenario analyses. The BIDS are shown using red markers and for these two parameters there is a correlation. For high TN loads (*δ*_{TN} > 1.1) there are, for example, no BIDS with a high influent COD:TN ratio (*δ*_{COD:TN} > 1.1). This is because, at least partly, the influent COD load would then be too high and rejected by the WAS criterion (leading to extensive sludge production). A future sensitivity analysis (using Sobol indices or partial correlation coefficients) will be used to complement the selection and analysis of uncertain parameters.

In Figure 4 (right) the likelihood-weighted influent TN load is shown together with the initial prior uncertainty ranges (dashed lines) and measured data. For TN, as well as the other model inputs loads (NH_{4}, COD, TSS, 7-day biochemical oxygen demand), the anticipated prior uncertainty distribution results in simulated uncertainty ranges that cover the measurements. Moreover, after updating the distribution with process measurements the simulated ranges narrow.

Figure 5 shows in turquoise part of the process data, **̃y**, used to identify and grade the BIDS as well as the resulting uncertainty ranges. The WAS load was operated manually during this period and varies a lot because of limitations in the downstream sludge system. This results in varying MLSS concentrations which are well predicted by the simulations with BIDS. Considering the nitrogen target variables, the simulated uncertainty range of NH_{4} does not deviate significantly from the prior uncertainty. This is due to the fact that temperature was high during the calibration period and that the dissolved oxygen (DO) concentration inputs were considered fixed. Then nitrification was almost completely independent of the prior uncertainty. For NO_{3} on the other hand, the uncertainty is reduced significantly after identification and simulation with the BIDS.

### Scenario simulations

To test and validate the uncertainty analysis method, the identified 179 BIDS were used to simulate the two scenarios as outlined in the ‘WWTP and problem statement’ section. The initialization settings were dry weather conditions, an influent temperature of 14 °C and a constant DO setpoint in the last aerobic zone (Z7) of 0.5 gO_{2} m^{−3}. The DO setpoints in Z5–6 were adjusted (maximum 4 gO_{2} m^{−3}) by an effluent ammonia-based aeration controller (ABAC) with setpoint 1.6 gN m^{−3}. *Q*_{Rec} was kept constant and *Q*_{RAS} controlled to 50% of *Q*_{Inf} (Figure 6(a)). A constraint for the scenarios was that the MLSS concentrations of the bioreactors were not allowed to increase above the average MLSS of the calibration period.

The loading and temperature input data series to the scenarios were obtained by replacing **δ** with the behavioural uncertainty factors **δ ^{B}** in the equations of Box 1 in Figure 2 (running only the BIDS).

The uncertainty factors for operational input data needed to be translated to uncertainty factors for the setpoints of the controllers. Dosage of Fe^{2+} is modelled by a feed-forward controller proportional to the influent flow rate. For each of the 179 BIDS, the setpoint value (gFe m^{−3}) is set as the average of the corresponding Fe^{2+} consumption during calibration. Thus, the BIDS that had a high/low chemical sludge production during calibration will have a similarly high/low production during the prediction. Regarding the uncertainty in *Q*_{WAS} it was noted that each set of the 179 BIDS defining the loads, temperature, Fe^{2+} dosage and *Q*_{WAS} was associated with a simulated MLSS concentration profile. Since the average MLSS of the calibration period was assumed to be the maximum allowed, the simulated average MLSS of each BIDS was used as the setpoint value for *Q*_{WAS} in the scenarios. Thus, for example, the food to microorganisms ratios of the BIDS were similar in the base scenarios.

For each scenario, the results of two individual simulations, yielding different answers to the stated problem, are shown in Figure 6 using blue and red colouring. By doing so the importance of uncertainty analysis while drawing conclusions from dynamic WWTP model simulations is highlighted. According to the herein presented methodology, although most steps of the ‘good modelling practice’ are followed, by using and simulating only *one* calibrated model representation (combination of input data, model structure and parameter values), the resulting conclusion drawn could be either the ‘red’ one *or* the ‘blue’ one (or a third one). Recall that both simulations are based on realistic interpretations of the available input data (defined by the prior distribution) and are judged to be sufficiently consistent with the measured process data (defined by the likelihood function).

#### Scenario 1: Tuning of the engines of the remaining recirculation pumps, by increasing Q_{Rec} 26% so that Q_{Rec}/Q_{Inf} increases from 3.5 to 4.2 (Figure 6(a))

Figure 6(b)–6(d) shows the simulated uncertainty limits of effluent NO_{3} with the realizations of two individual BIDS (coloured lines and dots) included for illustrative purpose. The differences in average effluent NO_{3} concentrations after and before the increase, ΔNO_{3}, were studied. As shown in the histogram (Figure 6(c)), although most results indicate enhanced denitrification, there are also BIDS yielding deteriorated performance (ΔNO_{3} > 0). Moreover, as shown in Figure 6(d), BIDS with similar likelihood values might lead to a decrease *or* an increase of the effluent NO_{3} concentration. Among the simulations yielding deteriorated denitrification, the NO_{3} in the anoxic zones is already high with the lower pumping rate, leading to the fact that the increased recirculation only means more oxygen being transported to the anoxic zone.

#### Scenario 2: Increase in influent pollutant loads by increasing δ_{TN} and δ_{Fe} 20% (all others, e.g. the COD and TSS loads, increase as well since generated from the behavioural subset of wastewater composition parameters (Table 1))

To maintain the MLSS despite the load increase at t = 360 d, the WAS flow rate increases in all simulations. The DO concentrations in the aerobic zones (Figure 7, left) are increased by the ABAC controller to maintain the NH_{4} in the effluent at 1.6 gN m^{−3} although the aerobic SRT is decreased (Figure 7, right). By considering the likelihood-weighted 75/25% and 95/5% prediction limits, the results state that the DO will probably increase to above 2 gO_{2} m^{−3} and possible saturate at 4 gO_{2} m^{−3} while the plant can continue to nitrify (effluent ammonia <4 gNH_{4}-N m^{−3}).

However, note that among the simulations with BIDS there are those indicating that nitrification cannot be expected to be complete following the load increase, see the 100/0% percentile. This highlights the fact that the result of a simulation depends on the assumptions made during calibration. The blue and red lines, for example, represent simulations with two of the 179 BIDS that slightly over- and underpredicted the measured MLSS concentration during calibration. The blue simulation is a result of a high anticipated value of the true WAS flow rate and thus a low SRT during calibration. To produce results within the acceptable uncertainty ranges for NH_{4} and NO_{3} (required as having been identified as BIDS) this set of input data is conditional on a high value of the influent temperature. The opposite holds for the red simulation (low Q_{WAS}, high SRT, low temperature). Obviously, the assumptions done for the ‘blue’ simulation during calibration involve a higher sensitivity to the increased load since the NH_{4} setpoint cannot be maintained.

An added value with applying the presented methodology, compared to more traditional methods based on, for example, expert ranges, is that the results are derived with actual and case-specific data and that the uncertainty ranges therefore will consider and be dependent on the amount and quality of accessible data. Another benefit is that the method is easy to use in practice. Once the BIDS have been identified they can be used for simulating the uncertainty of basically any scenario. Although the total uncertainty might not be the concern of, for example, a process engineer, it is also usually valuable to run at least a few alternative simulations while studying a problem. By sampling, for example, 5–10 input data sets from the BIDS and running the model several times, he/she will get much more insight compared to running one simulation only.

It should finally be noted that the results are dependent on several assumptions (choice of parameters to consider uncertain, the prior distribution, target variables, uncertainty ranges and likelihood function, etc.). The predicted uncertainty for scenarios that differ significantly from the calibration data period might therefore not be well described.

## CONCLUSIONS

Monte Carlo methods are powerful methods that offer a range of applications for WWTP process analysis, optimization and more. An important requirement of these methods is definition of an input uncertainty domain, which directly affects the outcome and quality of the Monte Carlo analysis. The authors have presented a ‘proof of concept’ evaluation of a methodology to generate multivariate samples from the input uncertainty domain, which are conditioned on historical plant data and observations. The results show that the applied methodology is promising and useful to verify assumptions related to input data for Monte Carlo analysis.

The authors recognize that many assumptions are made in this study to develop and evaluate the proposed framework, yet they also believe it provides a pragmatic and useful method to improve credibility of Monte Carlo simulations. The latter require random samples of input data, which are often defined by expert knowledge without a systematic way to verify the validity of these ‘expert’ ranges. Rather than falsification or to inspire further mechanistic model development, the long-term goal of demonstrating the applicability of the methodology is to enhance the application of already available WWTP simulators combined with powerful Monte Carlo methods.