A parameter estimation framework was used to evaluate the ability of observed data from a full-scale nitrification–denitrification bioreactor to reduce the uncertainty associated with the bio-kinetic and stoichiometric parameters of an activated sludge model (ASM). Samples collected over a period of 150 days from the effluent as well as from the reactor tanks were used. A hybrid genetic algorithm and Bayesian inference were used to perform deterministic and probabilistic parameter estimations, respectively. The main goal was to assess the ability of the data to yield reliable parameter estimates for a modified version of the ASM. The modified ASM includes methylotrophic processes, which play the main role in methanol-fed denitrification. Sensitivity analysis was also used to explain the ability of the data to provide information about each of the parameters. The results showed that the uncertainty in the estimates of the most sensitive parameters (including growth rate, decay rate, and yield coefficients) decreased with respect to the prior information.

## INTRODUCTION

The activated sludge process is one of the most widely applied systems for the removal of natural organic matter and nutrients in municipal wastewater treatment plants (WWTPs) (Gernaey *et al.* 2004). The International Water Association's activated sludge model number 1 (ASM1) was one of the first modeling concepts to explicitly account for the interactions of multiple constituents and functional groups of microorganisms (Henze *et al.* 1987). International Water Association task groups later developed more complex versions of the ASM1: ASM2, ASM2d, and ASM3 (Henze *et al.* 2000). Other models that have been developed include the Barker and Dold model (Barker & Dold 1997), the TUDP model (Meijer *et al.* 2001), and the EAWAG Bio-P module for ASM3 (Rieger *et al.* 2001). A comprehensive review of the different ASM models can be found in Gernaey *et al.* (2004).

The ASMs provide a systematic way to represent the interaction between state variables, process rates, and stoichiometric coefficients in bioreactions, making them flexible tools for incorporating various processes in both full-scale and pilot systems (Comas *et al.* 2008; Sun *et al.* 2009; Fang *et al.* 2010; Fall *et al.* 2011; Busch *et al.* 2013). Modified versions of the ASMs have been developed to address different levels of complexity in representing processes, emerging contaminants, particulate matter forms, and functional groups of microorganisms (Ekama & Wentzel 2004; Stare *et al.* 2006; Bolong *et al.* 2009; Sun *et al.* 2009; Hao *et al.* 2011; Yang *et al.* 2013).

One of the challenges faced in using any ASM is the difficulty of calibrating a large number of bio-kinetic and stoichiometric parameters (Smets *et al.* 2003). Modeling can be performed without calibration and just by using parameter values obtained from the literature based on independent batches or pilot investigations (Henze *et al.* 2000; Cox 2004; Hauduc *et al.* 2011; Rahman *et al.* 2016a, 2016b). However, the complexities of biological reaction systems, variations in raw wastewater characteristics, design configurations, and different experimental or operating conditions can make it difficult to find suitable parameter values from the literature for a specific plant. Furthermore, wide ranges of values have been reported for ASM parameters in the literature, which makes choosing a single parameter set representing the condition in a specific reactor difficult. Cox (2004) reviewed a number of articles and created a database containing different ASM parameter values. As an example, 31 different values have been reported for the maximum heterotrophic growth rate in 22 references, ranging between 0.51 and 11 day^{−1}.

In many cases, an estimation of the ASM parameters through calibration using measured data provides more realistic values for existing biological treatment systems or design of a new system compared to determining the parameter values from the literature (Sharifi *et al.* 2014). Manual and automatic calibration methods have been widely used to estimate ASM parameters using measured data (Meijer *et al.* 2001; Fall *et al.* 2011). These methods are typically based on minimizing an objective function defined as the misfit between model predictions and observed data (Vanrolleghem *et al.* 1999; Machado *et al.* 2009; Kim *et al.* 2010; Keskitalo & Leiviskä 2012). Keskitalo & Leiviskä (2012) applied an evolutionary optimizer to minimize the sum of squared errors between modeled and measured data obtained from a municipal WWTP and a pulp mill WWTP. Kim *et al.* (2010) used the response surface method as a surrogate for the mechanistic ASM to estimate model parameters in an effort to reduce computational costs. Although deterministic (maximum likelihood) parameter estimation can lead to more realistic values for a particular plant than the use of default values from the literature, there remain uncertainties associated with the estimated parameters due to measurement error, model structural error, the errors associated with the representativeness of sampling, influent's flowrate and composition, and environmental factors such as temperature (Sharifi *et al.* 2014). The effect of these uncertainties on the deterministically estimated parameters needs to be quantified to assess the reliability of ASM models (Neumann & Gujer 2008; Hauduc *et al.* 2010; Mannina *et al.* 2012; Sharifi *et al.* 2014). Uncertainty analysis can help in determining the appropriate safety factors to control the capital investment and effluent quality (Bixio *et al.* 2002; Flores-Alsina *et al.* 2008; Busch *et al.* 2013). 

A common method for uncertainty analysis is Monte Carlo simulation, in which random parameter sets are generated within a preconceived range for each parameter to find confidence intervals for the modeled effluent. Bixio *et al.* (2002) applied Monte Carlo simulations to avoid large safety factors in the design stage of a conventional WWTP, which resulted in a 21% reduction in sizing and a 43% reduction in capital investment, equivalent to 1.2 million euros. Flores-Alsina *et al.* (2008) evaluated different control strategies with input uncertainty using Monte Carlo simulations. Sin *et al.* (2009, 2011) performed Monte Carlo simulations by considering uncertainties associated with ASM1's parameters, partitioning coefficients (e.g., chemical oxygen demand (COD) content of particulate constituents), and hydraulic parameters to obtain the confidence intervals for model predictions. Alikhani *et al.* (2015) applied an autoregressive moving-average model to generate random realizations of influent flowrates and composition for a biological nutrient removal system, aiming to evaluate the influent uncertainty.

Different approaches have been used in different areas of water resources to perform backward uncertainty propagation. These approaches can be classified into those based on multiple linear regression (Hill & Tiedeman 2007; Foglia *et al.* 2009), global methods such as the generalized likelihood uncertainty estimator (Beven & Freer 2001; Mannina *et al.* 2012; Sathyamoorthy *et al.* 2014), and Bayesian inference (Dotto *et al.* 2011; Albrecht 2013; Sharifi *et al.* 2014; Alikhani *et al.* 2016a). Albrecht (2013) compared four different parameter estimation techniques for a collection of reaction models with artificially added noise to experimental data and concluded that for highly non-linear models, the Markov Chain Monte Carlo (MCMC) algorithm based on Bayesian inference has better accuracy than local sensitivity-based approaches. However, the computational cost of MCMC simulations is significantly higher than local sensitivity-based methods, as they require a relatively larger number of model runs. For a discussion of the number of runs used in our work please see ‘Parameter estimation using MCMC’ section. The Bayesian approach provides a joint probability distribution representing the degree of confidence regarding parameters after the application of observed data (referred to as the posterior distribution). The posterior range is the parameter's degree of credibility dictated by the observed measured points through the Bayesian inference (Kaplan 1997). In the Bayesian approach, prior knowledge about parameter values, based on the literature or independent laboratory experiments, can be incorporated into estimations via prior distributions (Sharifi *et al.* 2014). The magnitude of reduction in the spread of the posterior distribution of parameters relative to their prior distributions can be used as an indicator for the amount of information gained regarding each parameter. 
The posterior joint probability distribution of parameters obtained from Bayesian inference can then be used in Monte Carlo simulations of biological treatment systems (Sin *et al.* 2009, 2011; Mannina *et al.* 2012).

Only a small number of studies have focused on the parameter estimation of the nitrification and denitrification (NDN) processes using real data from full-scale bioreactors (Andreottola *et al.* 1997; Koch *et al.* 2001; Choubert *et al.* 2009; Kaelin *et al.* 2009; Mannina *et al.* 2012). The main contribution of the present study is to systematically quantify the information content of long-term field data from a methanol-fed NDN system in the context of estimating the bio-kinetic and stoichiometric parameters of the modified ASM. The objectives of this study can be summarized as follows: (1) finding the level of confidence regarding parameter values and model predictions by applying Bayesian inference to the long-term wastewater characteristic data, (2) explaining the ability of the observed data to inform about each parameter through sensitivity analysis and evaluation of parameter correlation, and (3) evaluating the effect of sampling intervals on the information content of the observed data with respect to model parameters.

## MATERIALS AND METHODS

### Modeling biological treatment processes

In Equation (1), *V*, *C*, and *Q* indicate the volume, concentration, and flow, respectively, and the flow fraction factor determines how much of the inflow and return flow enters each stage in a step-feed system; *H* is the Heaviside (unit step) function; and *R* is the reaction rate. The remaining quantities are the stoichiometric coefficient, the mass transfer rate, the saturated concentration, and the external mass flowrate. The indices are as follows: *i* indicates a constituent, *k* represents a tank, *r* indicates a return flow, ‘in’ denotes conditions in the influent, and *l* indicates a process (or reaction). Moreover, *m* is the total number of tanks, and the total number of reactions is defined analogously. The inter-tank flowrate into tank *k* arises either from sequential stages or through feed-back or bypass (a positive value means flow into the stage).

In Equation (1), the term on the left side is the rate of change in the total mass of constituent *i* in tank *k*; the first term on the right-hand side is the mass inflow due to return flow; the second term is the mass inflow of the influent; the third and fourth terms represent inner-connection flow between tank *k* and the other tanks; the fifth term is the production or disappearance of constituents due to reactions; the sixth term is the effect of rate-limited mass transfer (e.g., aeration); and the last term is the direct addition of the constituents (e.g., the addition of a carbon source for denitrification). To obtain solid particle concentrations in the return flow, a dynamic clarifier model (Takács *et al.* 1991) or a quasi steady-state approximation (i.e., performing a mass balance while ignoring the solid storage changes in the clarifier) can be applied.
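As a reading aid, a mass balance consistent with the term-by-term description above can be sketched as follows. The symbols not named in the text are assumptions introduced only for illustration: $f_{r,k}$ and $f_{\mathrm{in},k}$ for the flow fraction factors, $Q_{k',k}$ for the flowrate from tank $k'$ to tank $k$, $\nu_{i,l}$ for the stoichiometric coefficient, $k_{L,i}$ for the mass transfer rate, $C_{s,i}$ for the saturated concentration, $\dot{M}_{i,k}$ for the external mass flowrate, and $n$ for the total number of reactions. The published equation may differ in notation or detail.

```latex
\frac{d\left(V_k C_{i,k}\right)}{dt} =
      f_{r,k}\, Q_r\, C_{i,r}
    + f_{\mathrm{in},k}\, Q_{\mathrm{in}}\, C_{i,\mathrm{in}}
    + \sum_{k'=1}^{m} H\!\left(Q_{k',k}\right) Q_{k',k}\, C_{i,k'}
    + \sum_{k'=1}^{m} H\!\left(-Q_{k',k}\right) Q_{k',k}\, C_{i,k}
    + V_k \sum_{l=1}^{n} \nu_{i,l}\, R_{l,k}
    + V_k\, k_{L,i} \left(C_{s,i} - C_{i,k}\right)
    + \dot{M}_{i,k}
```

In this sketch, a positive $Q_{k',k}$ carries the concentration of the upstream tank $k'$ into tank $k$ (third term), while a negative $Q_{k',k}$ removes mass at the concentration of tank $k$ itself (fourth term), which is the role the Heaviside function plays in the description.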

### Bayesian parameter estimation

*et al.* 2014): where the state variables vector contains the concentrations of all the constituents in the model. The length of this vector is equal to the number of constituents times the number of CSTR tanks. The external forcing input represents the influent flowrate, influent concentrations, return activated sludge (RAS) flowrate, waste activated sludge (WAS) flowrate, external chemical loading rate, and the temperature. The remaining argument is the vector of the model's parameters. Function *f* in Equation (2) represents the bioreactor model structure, which is given by Equation (1).

*g* represents the error structure function (e.g., for a log-normal error structure, *g* is the natural logarithm function, while for a Gaussian error structure, *g* is the identity function), and the error term is a random vector containing measurement, structural, and external forcing errors, which are assumed to collectively follow a multivariate normal distribution.

The prior term represents the prior knowledge about the parameters and error structure that can be obtained from literature reviews, experts' experience, or independently performed experimental results. The denominator is a normalizing factor (Lu *et al.* 2014a).

In Equation (5), the parameter vector contains all the parameters that are intended to be estimated using the observed data set. It includes the bio-kinetic and stoichiometric parameters, the elements of the partitioning coefficient, and the elements of the variance-covariance matrix for the random error term. Its length is therefore the sum of the total number of unknown ASM model parameters in Equation (2), the total number of unknown partitioning coefficients in Equation (3), and the total number of elements in the variance-covariance matrix in Equation (7). Applying Bayesian inference updates our degree of confidence about each unknown parameter whenever a new evidence point (observed data) is applied. In fact, each measured data point contains a potential level of information that can be determined by applying the Bayesian inference.
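For orientation, the general form of Bayes' theorem that the description of a prior term and a normalizing denominator refers to can be written as below. The symbols $\Theta$ for the vector of estimated parameters and $\tilde{Y}$ for the observed data set are assumed here for illustration and are not necessarily those of the original Equation (5).

```latex
p\left(\Theta \mid \tilde{Y}\right) =
  \frac{p\left(\tilde{Y} \mid \Theta\right)\, p\left(\Theta\right)}
       {\int p\left(\tilde{Y} \mid \Theta\right)\, p\left(\Theta\right)\, d\Theta}
```

Here $p(\tilde{Y} \mid \Theta)$ is the likelihood, $p(\Theta)$ is the prior, and the integral in the denominator normalizes the posterior so that it integrates to one.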

*et al.* 2009): where the observed data are the observed concentrations of each constituent *i*, taken over the number of different observed constituents; *j* indicates the time and location of the measurement, taken over the total number of samples of observed constituent *i*. The mapping function *g* in the likelihood function is the transformation depending on the error distribution, and its derivative also enters the likelihood. For example, taking *g* to be the identity function implies a normally distributed and additive error structure, while taking *g* to be the logarithm function results in a log-normally distributed and multiplicative error structure.

The likelihood also depends on the modeled concentration of constituent *i* at time/location *j* and on the standard deviation of observed constituent *i*.
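The role of the mapping function *g* and its derivative can be illustrated with a minimal Python sketch of a Gaussian log-likelihood on transformed values. The function name `log_likelihood`, its arguments, and the example numbers are hypothetical, and the sketch assumes independent errors with a single standard deviation; it is not the code used in this study.

```python
import math


def log_likelihood(observed, modeled, sigma, error="lognormal"):
    """Log-likelihood assuming g(observed) = g(modeled) + eps, eps ~ N(0, sigma^2)."""
    if error == "lognormal":
        g = math.log                  # multiplicative, log-normal error
        dg = lambda y: 1.0 / y        # derivative of ln(y)
    elif error == "normal":
        g = lambda y: y               # additive, Gaussian error
        dg = lambda y: 1.0            # derivative of the identity
    else:
        raise ValueError("error must be 'lognormal' or 'normal'")

    total = 0.0
    for y_obs, y_mod in zip(observed, modeled):
        residual = g(y_obs) - g(y_mod)
        total += (
            -0.5 * math.log(2.0 * math.pi * sigma ** 2)
            - residual ** 2 / (2.0 * sigma ** 2)
            + math.log(abs(dg(y_obs)))  # change-of-variables (Jacobian) term
        )
    return total


# Hypothetical concentrations (mg/L): a perfect fit scores higher than a misfit.
perfect = log_likelihood([2.0, 4.0], [2.0, 4.0], sigma=0.5)
misfit = log_likelihood([2.0, 4.0], [3.0, 5.0], sigma=0.5)
```

The Jacobian term is what makes likelihoods computed with different choices of *g* comparable on the scale of the untransformed observations.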

### MCMC sampling and numerical solution scheme

Because of the large number of parameters and the heterogeneity of the observed data, Equation (5) cannot be solved analytically. Therefore, an MCMC approach (Gamerman & Lopes 2006; Meyn & Tweedie 2012) was used to generate random samples according to the posterior distribution. In this study, the Metropolis–Hastings (M–H) algorithm (Metropolis *et al.* 1953) was used to obtain a sequence of random samples from the posterior probability distribution. In the deterministic approach, the parameter values that maximize the likelihood function in Equation (7) were obtained by applying the hybrid genetic algorithm (GA).

To evaluate the convergence of the MCMC algorithm, a fake (dummy) parameter with no influence on the model prediction was defined, and its posterior distribution was monitored during the Markov chain sampling. Because the data carry no information about this parameter, its posterior and prior distributions are expected to be identical after a sufficient number of MCMC samples. Discarding the initial portion of the Markov chain samples, so that potentially unrepresentative samples do not affect the construction of the posterior distribution, is referred to as burn-in (Meyn & Tweedie 2012). The fake parameter was also used to estimate the number of burn-in samples.
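The fake-parameter check can be illustrated with a minimal random-walk Metropolis sampler (the symmetric-proposal special case of M–H). Everything in this sketch is hypothetical: the toy data, the priors, the step size, and the names `log_prior`, `log_like`, and `metropolis`. It is not the sampler used in this study.

```python
import math
import random


def log_prior(theta, dummy):
    # Assumed priors for illustration: theta ~ N(0, 10^2), dummy ~ N(0, 1).
    return -0.5 * (theta / 10.0) ** 2 - 0.5 * dummy ** 2


def log_like(theta, data, sigma=1.0):
    # The dummy parameter deliberately does not appear in the likelihood.
    return sum(-0.5 * ((y - theta) / sigma) ** 2 for y in data)


def metropolis(data, n_samples, step=0.8, seed=0):
    rng = random.Random(seed)
    theta, dummy = 0.0, 0.0
    log_post = log_prior(theta, dummy) + log_like(theta, data)
    chain = []
    for _ in range(n_samples):
        theta_new = theta + rng.gauss(0.0, step)
        dummy_new = dummy + rng.gauss(0.0, step)
        log_post_new = log_prior(theta_new, dummy_new) + log_like(theta_new, data)
        # Symmetric proposal: accept with probability min(1, posterior ratio).
        if math.log(1.0 - rng.random()) < log_post_new - log_post:
            theta, dummy, log_post = theta_new, dummy_new, log_post_new
        chain.append((theta, dummy))
    return chain


data = [1.8, 2.1, 2.4, 1.9, 2.2]          # toy observations
chain = metropolis(data, n_samples=20000)
kept = chain[2000:]                        # discard burn-in samples
theta_mean = sum(t for t, _ in kept) / len(kept)
dummy_mean = sum(d for _, d in kept) / len(kept)
dummy_var = sum((d - dummy_mean) ** 2 for _, d in kept) / len(kept)
```

If the chain has converged, the sampled mean and variance of the dummy parameter should be close to those of its prior (0 and 1 in this toy setup), while the informed parameter moves toward the value supported by the data.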

To calculate the likelihood function of Equation (6) for each set of Markov chain samples, the system of ODEs comprising Equation (2) must be solved. In addition to the non-linearity of these ODEs, the wide range of biochemical reaction time scales, from seconds (e.g., for the oxygen transfer rate) to days (e.g., for microbial growth rates), results in a stiff system of ODEs (Lindberg & Carlsson 1996). The stiffness of the system and the daily fluctuations in external forcing (influent, RAS, WAS, loadings, and temperature changes) require small computational time-steps, which in turn can result in a long simulation runtime. This is a particular challenge because the MCMC algorithm requires a large number of model simulations to obtain the final solution. To overcome this problem, an adaptive time-step backward differentiation formula (BDF) algorithm was developed to solve the system of ODEs. A detailed explanation of the proposed adaptive BDF algorithm for solving the system of ODEs comprising the ASM is presented in Alikhani *et al.* (2016b, 2016c).
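The benefit of an implicit, adaptive scheme for stiff problems can be illustrated on a scalar test equation. The sketch below uses the first-order BDF (backward Euler) with step-doubling error control, which is a deliberate simplification chosen for this example; the algorithm actually used is described in the cited references and may differ. The function names, tolerance, and test problem (a fast relaxation with rate 1,000 d^{−1}) are assumptions.

```python
def backward_euler_step(y, h, a, b):
    """One BDF1 step for dy/dt = a*y + b, solving y_new = y + h*(a*y_new + b)."""
    return (y + h * b) / (1.0 - h * a)


def integrate_adaptive(y0, t_end, a, b, h0=1e-4, tol=1e-4):
    """Integrate from t = 0 to t_end, adapting the step by step doubling."""
    t, y, h = 0.0, y0, h0
    n_steps = 0
    while t < t_end:
        h = min(h, t_end - t)
        full = backward_euler_step(y, h, a, b)
        half = backward_euler_step(y, h / 2.0, a, b)
        half = backward_euler_step(half, h / 2.0, a, b)
        if abs(full - half) <= tol:
            # Accept the more accurate two-half-step result and try a larger step.
            t, y = t + h, half
            n_steps += 1
            h *= 2.0
        else:
            # Reject the step and retry with a smaller one.
            h *= 0.5
    return y, n_steps


# Fast relaxation toward y = 1 with rate 1000 per day, simulated for 100 days.
y_final, n_steps = integrate_adaptive(0.0, 100.0, a=-1000.0, b=1000.0)
```

For this test problem an explicit Euler scheme is stable only for steps below 2/1000 d, so it would need at least 50,000 steps over 100 days, whereas the implicit scheme can lengthen its step freely once the fast transient has decayed.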

The total suspended solids (TSS) and volatile suspended solids (VSS) values in the effluent of the NDN system during the simulation period averaged 3.8 and 2.6 mg L^{−1}, respectively, indicating that the settling performance of the clarifier was very close to ideal. Therefore, a quasi steady-state approximation was assumed to obtain the solid particle concentrations in the return flow in Equation (1). In addition, the program developed for this study allows any reaction rate expression and stoichiometric constant in the biological reaction network to be specified by the user as a function of the concentrations of components and/or model parameters.

### Global sensitivity analysis

Local and global sensitivities can be used to rank the model parameters in terms of the significance of their impact on important state variables (Saltelli *et al.* 2008; Kalyanaraman *et al.* 2015). Sensitivity analysis can also be applied to explain parameters' identifiability (Weijers *et al.* 1996; Brun *et al.* 2002; Sin *et al.* 2005). A large sensitivity value of model output with respect to each parameter is a necessary but not sufficient condition for the identifiability of the parameter due to the fact that the model output can change more significantly as a result of changing that parameter (Freni *et al.* 2009). Another factor that can influence the identifiability of parameters is parameter correlation (Eberly & Casella 2003; Ngo *et al.* 2015). Several approaches for estimating local and global sensitivity have been used in the literature (Saltelli *et al.* 2008; Zhan *et al.* 2013).

The global sensitivities in this study were calculated by sampling the parameter sets provided by the MCMC analysis, that is, from the posterior distribution obtained from the Bayesian inference application, and performing a local sensitivity analysis at each sampled set. The expected values of the sensitivities can then be estimated algebraically.

The local sensitivities are weighted by the reciprocal of the observation error standard deviation obtained by applying Equation (7). The resulting measure is an indicator of the overall sensitivity of the model output with respect to each parameter.
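The procedure of averaging local sensitivities over sampled parameter sets can be sketched as follows. The finite-difference scheme, the use of the mean absolute value as the aggregate, the toy linear model, and all names (`local_sensitivity`, `global_sensitivity`, `toy_model`) are assumptions made for illustration; the exact aggregation used in this study is not reproduced here.

```python
def local_sensitivity(model, params, index, sigma, rel_step=1e-4):
    """Central-difference sensitivity of each model output to params[index],
    weighted by the reciprocal of the observation error standard deviation."""
    h = rel_step * abs(params[index]) or rel_step
    up = list(params)
    down = list(params)
    up[index] += h
    down[index] -= h
    return [(u - d) / (2.0 * h * sigma) for u, d in zip(model(up), model(down))]


def global_sensitivity(model, samples, index, sigma):
    """Average over parameter samples of the mean absolute local sensitivity."""
    total = 0.0
    for params in samples:
        values = local_sensitivity(model, params, index, sigma)
        total += sum(abs(v) for v in values) / len(values)
    return total / len(samples)


def toy_model(params):
    # Hypothetical stand-in for the bioreactor model: output = p0 * t + p1.
    return [params[0] * t + params[1] for t in (1.0, 2.0, 3.0)]


# Stand-in for parameter sets drawn from an MCMC chain.
samples = [[1.0, 0.5], [1.2, 0.4], [0.9, 0.6]]
s_slope = global_sensitivity(toy_model, samples, index=0, sigma=0.5)
s_offset = global_sensitivity(toy_model, samples, index=1, sigma=0.5)
```

In this toy case the output responds twice as strongly, on average, to the first parameter as to the second, so the data would be expected to constrain the first parameter more tightly.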

## MODEL APPLICATION

### Biological nitrogen removal process configuration

The described parameter estimation framework was applied to estimate the bio-kinetic, stoichiometric, and aeration parameters in the NDN phase of the Blue Plains Advanced WWTP in Washington, DC, USA. The system can be described as a post-denitrification system with methanol added as an external carbon source. The NDN stage consists of 12 parallel reactors, each having an active volume of 17,500 m^{3}. Each reactor is divided into eight tanks in series, including two large and six smaller compartments (Figure 1). Nitrification occurs mainly in tanks 1A, 1B, and 2. Tank 3A is not aerated and serves as the deoxygenation stage. Methanol is added at tank 3B, and denitrification occurs in tanks 3B, 4, and 5A under anoxic conditions. The last tank is aerated again to strip nitrogen gas, which can otherwise hinder the settling of activated sludge in the clarifier. Fine bubble diffusers in the aerated tanks and hyperboloid mixers in the un-aerated tanks are used to create completely mixed conditions.

### Observed data

Ammonia (NH_{3}), nitrite and nitrate (NO_{x}), dissolved oxygen, and methanol concentration data were collected from different sampling locations (Figure 1) from February to June 2010. Three sets of measured data were used in this study:

- External forcing data: the flowrate and characteristics of the influent, WAS, and RAS, together with the temperature (Figure 2), which were used as given inputs to the model (the external forcing input in Equation (2)).
- Effluent data: daily-average flowrate and characteristics of the effluent over the 150 days of simulation, used as observed data.
- Spatial profile data: measured concentrations of different constituents at different sampling locations in the bioreactors, used as observed data. These data were gathered on a weekly to bi-weekly basis as grab samples from points 1 to 9 in tanks 3A through 5A (Figure 1) to take into account the effect of spatial heterogeneity as a source of uncertainty in measured constituents.

Approximately 7,000 measured data points taken from different times and locations were used to calculate the likelihood functions. In the Blue Plains advanced WWTP, the influent to the denitrification stage is the effluent of the BOD removal phase (the phase prior to NDN), and thus contains high levels of ammonia and low sCOD (roughly 15 mg N L^{−1} of ammonia and 25 mg L^{−1} of sCOD). Inflows typically varied between 113,000 and 133,000 m^{3} d^{−1} per reactor depending on the influent wastewater flowrate and the number of reactors in service. A higher inflow fluctuation occurred in the period prior to early April (Figure 2) because the Blue Plains WWTP receives combined domestic wastewater and stormwater runoff and thus faces high flowrate fluctuations, particularly in early spring. Methanol is applied at a rate of 45 liters per 1,000 m^{3} of the influent and serves as the carbon source for methylotrophic denitrification. The return flowrate (RAS) ratio to the influent flowrate varied from 0.5 to 1.3.

### Reaction network

In the United States, methanol is often used as an external substrate for heterotrophic denitrification because of its availability and cost (Mokhayeri *et al.* 2008). When the methanol concentration is adequate, a new functional group of specialist heterotrophic microorganisms, collectively referred to as methylotrophs, is able to thrive (Lu *et al.* 2014b). Methylotrophs only utilize methanol as an electron donor in anoxic conditions and have different growth, yield, and decay rates than generalist heterotrophic organisms that use natural organic matter (Lee & Welander 1996; Hallin *et al.* 2006; Baytshtok *et al.* 2008; Dold *et al.* 2008; Mustakhimov *et al.* 2013; Alikhani *et al.* in press). Therefore, a modified version of ASM1 (referred to as M-ASM1) has been developed to take into account the anoxic growth and decay of methylotrophs plus the aerobic growth of heterotrophs on methanol (Alikhani *et al.* 2014). The M-ASM1 reaction table is shown in Table 1 along with the process descriptions, rate expressions (*R*), and stoichiometric coefficients, which are introduced in Equation (1). In this modification, processes 2, 4, and 7 are added to ASM1. Process 2 was included because the unused methanol in the anoxic zone can be utilized by heterotrophs in the aerobic zone. There are also two new constituents, methanol (*S_{m}*) and methylotrophic biomass (*X_{B,M}*), added to the original 13 constituents of ASM1. Reaction rates 1 to 5 were modified by considering ammonia as a limiting factor to all the biomass growth rates using Monod kinetics (Hauduc *et al.* 2010). Processes 6 to 7 were modified by implementing adjustable decay rates under anoxic and aerobic conditions.

| J | Process | Reaction | Rate |
|---|---|---|---|
| 1 | Aerobic growth of heterotrophs | | |
| 2 | Aerobic growth of heterotrophs on methanol | | |
| 3 | Anoxic growth of heterotrophs | | |
| 4 | Anoxic growth of methylotrophs | | |
| 5 | Aerobic growth of autotrophs | | |
| 6 | Decay of heterotrophs | | |
| 7 | Decay of methylotrophs | | |
| 8 | Decay of autotrophs | | |
| 9 | Ammonification of soluble organic nitrogen | | |
| 10 | Hydrolysis of entrapped organics | | |
| 11 | Hydrolysis of entrapped organic nitrogen | | |


### Parameter estimation using MCMC

All model parameters, including the 95% prior ranges used to construct the prior distributions and the temperature correction factor ‘θ’, are shown in Table 2. The prior distribution of each parameter is assumed to be log-normal, as suggested by Cox (2004). The given ranges were obtained from various sources (Bullock *et al.* 1996; Lee & Welander 1996; Cox 2004; Baytshtok *et al.* 2008; Dold *et al.* 2008; Hauduc *et al.* 2011; Alikhani *et al.* in press). Although the partitioning coefficients (the components of the partitioning vector in Equation (3)) can contribute to the uncertainty, the main focus of this research was on estimating the bio-kinetic and stoichiometric parameters, so the partitioning coefficients were assumed to be known with an acceptable level of certainty.
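One way to turn a 95% prior range into a log-normal prior is sketched below. It assumes that the low and high values are the 2.5th and 97.5th percentiles of the distribution, which is an interpretation adopted for this example rather than a detail stated in the text; the function name `lognormal_from_95_range` is hypothetical.

```python
import math


def lognormal_from_95_range(low, high):
    """Return (mu, sigma) of ln(x) such that low and high bound the central 95%."""
    if low <= 0.0 or high <= low:
        raise ValueError("requires 0 < low < high")
    z = 1.959964  # 97.5th percentile of the standard normal distribution
    mu = 0.5 * (math.log(low) + math.log(high))
    sigma = (math.log(high) - math.log(low)) / (2.0 * z)
    return mu, sigma


# Example: a prior range of 2 to 10 per day for a maximum specific growth rate.
mu, sigma = lognormal_from_95_range(2.0, 10.0)
median = math.exp(mu)  # geometric mean of the two bounds
```

Under this interpretation the prior median is the geometric mean of the bounds. A range whose lower bound is zero cannot be mapped this way, because the logarithm of zero is undefined, so such a case would need separate handling.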

| Type | Symbol | Parameter | Unit | Factor θ^{a} | Prior range: low | Prior range: high | Posterior: expected value | Posterior: standard deviation |
|---|---|---|---|---|---|---|---|---|
| Heterotrophic kinetics | μ_{H} | Maximum specific growth rate of heterotrophs | d^{−1} | 1.072 | 2 | 10 | 1.742 | 0.158 |
| | K_{S} | Substrate half saturation for heterotrophs | g COD m^{−3} | 1.03 | 1 | 10 | 8.065 | 1.274 |
| | K_{M,H} | Methanol half saturation for heterotrophs | g COD m^{−3} | 1 | 0.1 | 0.5 | 0.254 | 0.046 |
| | K_{O,H} | O_{2} half saturation for heterotrophs | g O_{2} m^{−3} | 1 | 0.02 | 0.1 | 0.030 | 0.005 |
| | K_{NO,H} | NOx half saturation for heterotrophs | g N m^{−3} | 1 | 0.01 | 0.1 | 0.067 | 0.012 |
| | η_{g} | Anoxic growth reduction for heterotrophs | – | 1 | 0.4 | 0.8 | 0.582 | 0.043 |
| | b_{H} | Aerobic decay rate coefficient for heterotrophs | d^{−1} | 1 | 0.4 | 0.7 | 0.641 | 0.055 |
| | K_{NH} | NHx half saturation for heterotrophs/methylotrophs | g N m^{−3} | 1 | 0 | 0.03 | 0.014 | 0.004 |
| Methylotrophic kinetics | μ_{M} | Maximum specific growth rate of methylotrophs | d^{−1} | 1.09 | 0.8 | 2 | 0.720 | 0.053 |
| | K_{M,M} | Methanol half saturation coefficient | g COD m^{−3} | 1 | 0.1 | 1 | 0.119 | 0.058 |
| | K_{O,M} | O_{2} half saturation for methylotrophs | g O_{2} m^{−3} | 1 | 0.01 | 0.1 | 0.033 | 0.010 |
| | K_{NO,M} | NOx half saturation for methylotrophs | g N m^{−3} | 1 | 0.01 | 1 | 0.629 | 0.171 |
| | b_{M} | Aerobic decay rate coefficient for methylotrophs | d^{−1} | 1.03 | 0.04 | 0.1 | 0.098 | 0.008 |
| Autotrophic kinetics | μ_{A} | Maximum specific growth rate of autotrophs | d^{−1} | 1.072 | 0.7 | 1.2 | 1.059 | 0.058 |
| | K_{NH,A} | Ammonia half saturation for autotrophs | g N m^{−3} | 1 | 0.5 | 1 | 0.952 | 0.049 |
| | K_{NO,A} | NOx half saturation for autotrophs | g N m^{−3} | 1 | 0.01 | 0.2 | 0.020 | 0.010 |
| | K_{O,A} | Oxygen half saturation for autotrophs | g O_{2} m^{−3} | 1 | 0.1 | 0.5 | 0.238 | 0.080 |
| | b_{A} | Aerobic decay rate coefficient for autotrophs | d^{−1} | 1.03 | 0.15 | 0.25 | 0.185 | 0.009 |
| Conversion kinetics | η_{h} | Anoxic growth reduction for decay | – | 1 | 0.3 | 0.8 | 0.736 | 0.065 |
| | K_{h} | Hydrolysis rate coefficient | d^{−1} | 1.03 | 1 | 3 | 1.088 | 0.172 |
| | K_{X} | Hydrolysis half saturation coefficient | – | 1 | 0.01 | 0.2 | 0.095 | 0.028 |
| | K_{a} | Ammonification rate coefficient | d^{−1} | 1.03 | 0.01 | 0.1 | 0.057 | 0.021 |
| Stoichiometries | Y_{H} | Aerobic yield of heterotrophs on substrate | – | 1 | 0.6 | 0.7 | 0.611 | 0.011 |
| | Y_{HM} | Aerobic yield of heterotrophs on methanol | – | 1 | 0.4 | 0.4 | – | – |
| | Y_{A} | Autotroph yield | – | 1 | 0.24 | 0.24 | – | – |
| | Y_{M} | Methylotroph yield | – | 1 | 0.3 | 0.5 | 0.462 | 0.011 |
| | f_{P} | Endogenous fraction (death-regeneration) | – | 1 | 0.08 | 0.08 | – | – |
| Partitioning coefficients | i_{XB} | Nitrogen fraction in biomass | g N g COD^{−1} | 1 | 0.05 | 0.1 | 0.058 | 0.005 |
| | i_{VSS,B} | COD/VSS ratio of biomass | g COD g VSS^{−1} | 1 | 1.42 | 1.42 | – | – |
| | i_{VSS,i} | COD/VSS ratio of X_{i} | g COD g VSS^{−1} | 1 | 1.5 | 2 | – | – |
| | i_{VSS,s} | COD/VSS ratio of X_{s} | g COD g VSS^{−1} | 1 | 1.8 | 1.8 | – | – |
| | i_{VSS,P} | COD/VSS ratio of X_{p} | g COD g VSS^{−1} | 1 | 1.42 | 1.42 | – | – |
| | i_{MeOH} | COD ratio of methanol | g COD g^{−1} | 1 | 1.5 | 1.5 | – | – |
| Mass transfer | k_{L,O2} | Oxygen mass transfer rate | d^{−1} | 1 | 50 | 250 | 190.446 | 7.780 |


^{a}Temperature-dependent parameters: P(T) = P_{20}θ^{(T−20)}.
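The temperature correction in the footnote can be illustrated with a short Python sketch. The function name and the numerical inputs below are assumptions chosen for illustration only and are not results of this study.

```python
def temperature_corrected(p20: float, theta: float, temp_c: float) -> float:
    """Return P(T) = P20 * theta**(T - 20) for a temperature T in deg C."""
    return p20 * theta ** (temp_c - 20.0)


# Illustrative inputs (assumed values, not taken from the study's results).
rate_at_15c = temperature_corrected(p20=1.0, theta=1.072, temp_c=15.0)
rate_at_20c = temperature_corrected(p20=1.0, theta=1.072, temp_c=20.0)
```

With θ > 1 the corrected value falls below P_{20} for temperatures under 20 °C and rises above it for warmer conditions; with θ = 1 the parameter is independent of temperature.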

Using the M–H algorithm, 500,000 samples were drawn in 10 chains from the posterior distribution of the parameters. To reduce the number of burn-in samples and thereby expedite convergence, each MCMC chain was started from the deterministic parameter estimates obtained with a hybrid GA (500 generations and a population of 50). The posterior probability of each parameter was evaluated by the Bayesian inference introduced in Equation (5), assuming a log-normal error structure. To ensure that the retained samples represent the stationary distribution, 100,000 of the sampled parameter sets were discarded as the burn-in period (Brooks *et al.* 2011).
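The sampling procedure described above can be sketched for a toy problem as follows. This is a minimal random-walk M–H sampler under stated assumptions, not the implementation used in this study: the one-parameter decay model, the synthetic data, the fixed error standard deviation, the proposal step, and the starting value `k_start` (a stand-in for a GA estimate) are all hypothetical.

```python
import math
import random


def log_posterior(k, times, observed, sigma, bounds):
    """Log posterior (up to a constant) for a toy decay model y = exp(-k t).

    Uniform prior within `bounds`; log-normal observation error with a
    fixed log-scale standard deviation `sigma`.
    """
    lower, upper = bounds
    if not lower <= k <= upper:
        return -math.inf  # zero prior probability outside the prior range
    sse = 0.0
    for t, y in zip(times, observed):
        modeled = math.exp(-k * t)
        sse += (math.log(y) - math.log(modeled)) ** 2
    return -sse / (2.0 * sigma ** 2)


def metropolis_hastings(log_post, start, step, n_samples, burn_in, rng):
    """Random-walk M-H sampler; returns the samples kept after burn-in."""
    current, current_lp = start, log_post(start)
    chain = []
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_post(proposal)
        # Symmetric proposal, so the acceptance ratio is the posterior ratio.
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain[burn_in:]


rng = random.Random(0)
true_k, sigma = 0.5, 0.1  # hypothetical toy values
times = [0.1 * i for i in range(1, 51)]
observed = [math.exp(-true_k * t) * math.exp(rng.gauss(0.0, sigma)) for t in times]

k_start = 0.6  # stand-in for a deterministic (GA) estimate
samples = metropolis_hastings(
    lambda k: log_posterior(k, times, observed, sigma, bounds=(0.1, 2.0)),
    start=k_start, step=0.01, n_samples=5000, burn_in=1000, rng=rng,
)
posterior_mean = sum(samples) / len(samples)
```

Starting the chain close to a good deterministic estimate shortens the transient phase, so fewer samples need to be discarded as burn-in.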

## RESULTS AND DISCUSSION

### Bayesian credible interval of parameters

*b _{H}* (0.5–0.97 d^{−1}) is consistent with the deterministically estimated value of 0.926 d^{−1} by Keskitalo & Leiviskä (2012) using the ASM1 model. The estimated 95% range for *b _{A}* (0.17–0.21 d^{−1}) is larger than the value of 0.12 d^{−1} obtained by Keskitalo & Leiviskä (2012) but consistent with the value of 0.17 d^{−1} suggested by Fall *et al.* (2011) and the value of 0.18 d^{−1} suggested by Mannina *et al.* (2012). The estimated range for *K _{S}* (4.53–11.22 g COD m^{−3}) is smaller than the value of 14.48 g COD m^{−3} found by Keskitalo & Leiviskä (2012). The ranges of *K _{NH,A}* (0.8–1.06 g N m^{−3}) and *K _{NH}* (0.01–0.03 g N m^{−3}) are both smaller than the value of 1.41 g N m^{−3} provided by Mannina *et al.* (2012). The inconsistency between the *K _{NH}* ranges obtained in this study and those of previous studies is due to the consideration of two different parameters for ammonia half saturation for autotrophs and heterotrophs, which was not done in ASM1. It is worth noting that the presented approach does not capture the possible temporal changes in the values of the parameters that may occur as a result of aging or seasonal effects. However, the presence of such temporal variations in the values of model parameters is implicitly embedded within the posterior interval. In order to explicitly capture these temporal changes, the processes leading to them should be incorporated into the model.

*et al.* in press):

$$I_i = H_i^{\mathrm{posterior}} - H_i^{\mathrm{prior}}$$

where *I _{i}* is the information content gained for parameter *i* (referred to as entropy improvement), the first term on the right-hand side is the entropy of the posterior distribution, and the second term on the right-hand side is the entropy of the prior distribution of parameter *i*. A positive value of *I _{i}* indicates a degree of information gain for parameter *i*. The higher the value of *I _{i}*, the more information is obtained by applying the observed data in the Bayesian inference.

| Type | Symbol | Prior entropy | Posterior entropy (high frequency obs. data) | *I* (high frequency obs. data) | Posterior entropy (low frequency obs. data) | *I* (low frequency obs. data) |
|---|---|---|---|---|---|---|
| Heterotrophic kinetics | μ _{H} | −0.87 | 0.07 | 0.94 | −0.30 | 0.58 |
| | K _{S} | −0.87 | −0.71 | 0.16 | −1.14 | −0.27 |
| | K _{M,H} | 0.43 | 0.60 | 0.17 | 0.52 | 0.09 |
| | K _{O,H} | 1.13 | 1.52 | 0.40 | 1.10 | −0.03 |
| | K _{NO,H} | 1.12 | 1.09 | −0.03 | 0.95 | −0.17 |
| | η_{g} | 0.39 | 0.63 | 0.24 | 0.53 | 0.13 |
| | b _{H} | 0.51 | 0.51 | 0.00 | 0.44 | −0.07 |
| | K _{NH} | 1.64 | 1.64 | 0.00 | 0.83 | −0.80 |
| Methylotrophic kinetics | μ _{M} | −0.08 | 0.43 | 0.51 | 0.07 | 0.15 |
| | K _{M,M} | 0.13 | 0.24 | 0.12 | 0.04 | −0.08 |
| | K _{O,M} | 1.12 | 1.15 | 0.03 | 0.89 | −0.23 |
| | K _{NO,M} | 0.16 | 0.07 | −0.10 | −0.62 | −0.78 |
| | b _{M} | 1.22 | 1.39 | 0.16 | 1.36 | 0.14 |
| Autotrophic kinetics | μ _{A} | 0.29 | 0.54 | 0.25 | 0.44 | 0.15 |
| | K _{NH,A} | 0.30 | 0.55 | 0.26 | 0.47 | 0.18 |
| | K _{NO,A} | 0.85 | 1.23 | 0.38 | 1.00 | 0.15 |
| | K _{O,A} | 0.43 | 0.55 | 0.12 | 0.37 | −0.06 |
| | b _{A} | 0.99 | 1.28 | 0.29 | 1.15 | 0.16 |
| Conversion kinetics | η_{h} | 0.30 | 0.43 | 0.12 | 0.35 | 0.04 |
| | K _{h} | −0.29 | 0.07 | 0.37 | −0.23 | 0.07 |
| | K _{X} | 0.85 | 0.81 | −0.04 | 0.65 | −0.20 |
| | K _{a} | 1.12 | 0.82 | −0.30 | 0.79 | −0.33 |
| Stoichiometries | Y _{H} | 0.99 | 1.23 | 0.24 | 1.12 | 0.13 |
| | Y _{M} | 0.69 | 1.25 | 0.56 | 1.08 | 0.39 |
| | i _{XB} | 1.30 | 1.52 | 0.22 | 1.44 | 0.15 |
| Standard deviations | σ_{0} | −0.09 | 1.22 | 1.31 | 0.90 | 0.98 |
| | σ_{1} | −0.09 | 0.56 | 0.65 | 0.67 | 0.76 |
| | σ_{2} | −0.09 | 0.39 | 0.47 | 0.46 | 0.55 |
| | σ_{3} | −0.09 | 1.44 | 1.53 | 1.48 | 1.56 |
| | σ_{4} | −0.09 | 0.46 | 0.55 | 0.36 | 0.45 |
| | σ_{5} | −0.09 | 0.58 | 0.67 | 0.53 | 0.62 |
| | σ_{6} | −0.09 | 0.32 | 0.40 | 0.37 | 0.46 |
| | σ_{7} | −0.09 | 1.60 | 1.69 | 1.37 | 1.46 |
| Mass transfer | k _{L,O2} | −2.27 | −1.68 | 0.59 | −2.04 | 0.22 |
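The entropy improvement reported in the table is the difference between the posterior and prior entropies; for *μ _{H}* with high frequency data, for example, 0.07 − (−0.87) = 0.94. The Python sketch below shows one way such values could be estimated from posterior samples. It assumes, as an interpretation that is not stated in the text, that "entropy" denotes the integral of p·log10(p), so that a narrower distribution yields a larger value, and that the prior is uniform. The function names, the number of bins, and the sample values are hypothetical.

```python
import math


def uniform_entropy(lower, upper):
    """Assumed sign convention: integral of p*log10(p) for a uniform prior."""
    return -math.log10(upper - lower)


def histogram_entropy(samples, lower, upper, n_bins=20):
    """Histogram estimate of the integral of p*log10(p) over [lower, upper].

    Samples are assumed to lie inside the prior range [lower, upper].
    """
    width = (upper - lower) / n_bins
    counts = [0] * n_bins
    for x in samples:
        index = min(int((x - lower) / width), n_bins - 1)
        counts[index] += 1
    total = len(samples)
    entropy = 0.0
    for count in counts:
        if count:
            density = count / (total * width)
            entropy += density * math.log10(density) * width
    return entropy


def entropy_improvement(samples, lower, upper, n_bins=20):
    """I = (posterior entropy) - (prior entropy)."""
    posterior_entropy = histogram_entropy(samples, lower, upper, n_bins)
    return posterior_entropy - uniform_entropy(lower, upper)


# Hypothetical posterior samples spread evenly over [0.40, 0.45]
# inside a uniform prior on [0.3, 0.5].
posterior = [0.40 + 0.05 * (i + 0.5) / 1000 for i in range(1000)]
gain = entropy_improvement(posterior, lower=0.3, upper=0.5)
```

In this toy case the posterior is four times narrower than the prior, so the gain equals log10(4), or about 0.60. A posterior that is no narrower than the prior gives a gain near zero.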

For the first scenario (labeled as High Frequency (daily) Obs. Data), the observed data yielded narrow posterior credible intervals for a large number of the parameters, which can be interpreted as a reduction in parameter uncertainty. For most of the parameters presenting higher sensitivity (Figure 6), including *μ _{H}*, *K _{L,O2}*, *μ _{M}*, *Y _{M}*, *K _{M,M}*, *K _{O,H}*, and *b _{A}*, the uncertainty was reduced by more than 60% in comparison with their prior range. Among the kinetic parameters, large reductions in uncertainty were obtained for the maximum specific growth rates of heterotrophs and methylotrophs, *μ _{H}* and *μ _{M}*, respectively, and the decay rate for autotrophs, *b _{A}*. Among the yield coefficients, both the methylotroph and heterotroph yield coefficients, *Y _{M}* and *Y _{H}*, showed high levels of uncertainty reduction. Most of the half saturation coefficients have low sensitivity in comparison with the growth, decay, and yield parameters, as shown in the ‘Sensitivity analysis’ section (Figure 6), and thus lower levels of uncertainty reduction were obtained for them. The exceptions are the oxygen half saturation of heterotrophs, *K _{O,H}*, and the methanol half saturation of methylotrophs, *K _{M,M}*, which showed around 70% reduction in the 95% posterior interval relative to their prior ranges. The oxygen mass transfer rate, *K _{L,O2}*, also showed a large reduction in its posterior interval, although this could be due to the lack of prior knowledge about its value. As for the parameters related to the methylotrophic process, the posterior ranges for *μ _{M}*, *b _{M}*, *Y _{M}*, and *K _{M,M}* were significantly reduced with respect to the given prior range, especially for *μ _{M}* and *b _{M}*, where the 95% Bayesian credible interval is almost outside the prior range, indicating poor prior knowledge about these parameters that can be improved by applying the Bayesian parameter estimation method.

For the second scenario (labeled as Low Frequency (weekly) Obs. Data in Figure 5), the spatial profile data were reduced to one-third of their original size (from 1,650 data points to 450 data points) and the effluent observed data were reduced to weekly intervals instead of their original daily intervals. As can be seen in Figure 5, the posterior ranges obtained with the low frequency observed data are, as might be expected, wider for almost all of the parameters, especially for *μ _{H}*, *K _{O,H}*, *μ _{M}*, *K _{M,M}*, *Y _{M}*, *η _{h}*, and *K _{L,O2}*. However, in some cases, the minimum to maximum ranges are slightly shifted. This observation indicates that high frequency observed data carry a higher amount of information about most of the parameters.

A common practice when calibrating an over-parameterized model such as an ASM is to fix the parameters that are deemed to have lower sensitivity and to calibrate the model using only the high-sensitivity parameters (Sin *et al.* 2005). This practice is helpful because it reduces the dimension of the search space, makes the calibration more computationally efficient, and reduces the chance of becoming trapped in local minima. To study the effect of this decision, the analysis was also performed by considering only the most sensitive parameters (Figure 6). The specific growth rate and decay rate of all types of bacteria (*μ _{H}*, *b _{H}*, *μ _{M}*, *b _{M}*, *μ _{A}*, and *b _{A}*), the heterotrophic and methylotrophic yield coefficients (*Y _{H}* and *Y _{M}*), and the oxygen mass transfer rate coefficient (*K _{L,O2}*) were treated as unknown parameters. The remainder of the parameters were fixed at the maximum likelihood estimates found by the GA. The resulting 95% intervals for this scenario are shown in Figure 5 (labeled as Most Sensitive Parameters). As shown in Figure 5, the 95% posterior intervals obtained when only the nine most sensitive parameters are estimated mostly overlap with the results obtained when all the parameters are estimated, with the ranges being mostly narrower when only nine parameters are considered. As expected, the narrower credible intervals obtained for *μ _{H}* and *μ _{M}*, and to some degree *Y _{M}* and *K _{L,O2}*, compared to the case when all parameters are included in the analysis (in the scenario of high frequency observed data), show that when only the most sensitive parameters are considered in the uncertainty analysis, the uncertainty assigned to these parameters can be under-estimated. For a few parameters, the uncertainty increases when only the sensitive parameters are considered. This can be due to the model's lower flexibility to reproduce the data, which introduces larger observed error standard deviations.

In all the cases, a moderate to small amount of information was obtained for the majority of the parameters. This is likely due to: (a) the model being over-parameterized with respect to the amount and the diversity of data and (b) the fact that the operational conditions of the full-scale bioreactors are fairly constant and the observed data are non-informative regarding some of the parameters under those conditions. The ability of the observed data to inform the values of the parameters (i.e., by providing a narrower posterior interval in comparison to the prior interval) depends on several factors including: (1) the sensitivity of the measured constituents with respect to the parameters, (2) the correlation between the parameters (collinearity), and (3) the weight of the observed constituents that are most sensitive with respect to the parameter.
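The uncertainty reductions quoted above compare the width of the 95% posterior interval with the prior range. A minimal Python sketch of that comparison is given below. It assumes an equal-tailed interval computed from posterior samples and takes the full width of a bounded prior as the reference; both choices, the function names, and the sample values are illustrative assumptions rather than the procedure used in this study.

```python
def credible_interval(samples, level=0.95):
    """Equal-tailed credible interval from posterior samples (nearest rank)."""
    ordered = sorted(samples)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lower_index = int(round(tail * (n - 1)))
    upper_index = int(round((1.0 - tail) * (n - 1)))
    return ordered[lower_index], ordered[upper_index]


def interval_reduction(samples, prior_lower, prior_upper, level=0.95):
    """Fractional narrowing of the posterior interval relative to the prior range."""
    low, high = credible_interval(samples, level)
    return 1.0 - (high - low) / (prior_upper - prior_lower)


# Hypothetical posterior samples spread evenly over [0.4, 0.6]
# for a parameter with a prior range of [0, 1].
posterior = [0.4 + 0.2 * i / 1000 for i in range(1001)]
reduction = interval_reduction(posterior, prior_lower=0.0, prior_upper=1.0)
```

Here the 95% interval is about 0.405 to 0.595, so the reduction relative to the prior range is about 81%. A value near zero would indicate that the data carried little information about the parameter.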

### Sensitivity analysis

Figure 6 shows the ranked SSF for each parameter. Among the stoichiometric and the rate parameters, *Y _{M}*, *Y _{H}*, *μ _{A}*, *b _{A}*, *μ _{M}*, *b _{M}*, *μ _{H}*, and *b _{H}* showed a higher level of sensitivity compared to the other parameters. For all of the half saturation parameters, the model showed relatively lower sensitivities. The model also showed a relatively higher sensitivity with respect to the oxygen mass transfer rate coefficient than for other parameters. In general, the sensitivity analysis was in agreement with the uncertainty reduction, as the parameters with higher sensitivity achieved a higher level of uncertainty reduction. Nevertheless, sensitivity is not the only factor in the uncertainty analysis. A high to moderate level of information, as determined by the reduction of the 95% posterior interval relative to the prior, was obtained for most high-sensitivity parameters, specifically *μ _{H}*, *K _{L,O2}*, and *K _{NH,A}*. The exceptions are *K _{O,H}* and *K _{M,M}*, for which a significant level of information was obtained despite a small sensitivity factor. It is speculated that this is because the prior interval attributed to these parameters was exceptionally wide due to a lack of adequate prior knowledge from the literature about their values. The small (effectively zero) measured effluent concentrations of methanol also force *K _{M,M}* to be small, narrowing its posterior interval.

In general, a positive correlation between SSF and reduction in the credible interval can be seen. The global sensitivity with respect to most of the parameters for which the information gain was non-significant was found to be small. Global sensitivities of the parameters estimated here represent sensitivities under the operational conditions during which the data were collected. This means that as long as the operational condition of the bioreactor is within the range experienced during the data collection period, the influence of the values of parameters with small information gain (lower sensitivity) will not be substantial. However, evaluating the impact of other operational conditions on the value of these parameters requires further study.

The global sensitivity results are fairly consistent with previous studies. In other studies, the maximum growth rates, decay rates, and biomass yields have been reported as high-sensitivity parameters in the ASMs. Kim *et al.* (2010) reported *i _{Xp}*, *η _{g}*, *k _{NH}*, *b _{A}*, *μ _{A}*, *k _{x}*, *η _{h}*, and *f _{p}* as parameters with large effects on the overall output, while Sharifi *et al.* (2014) found *Y _{A}*, *b _{H}*, *Y _{H}*, and *μ _{A}* to be highly sensitive parameters for ASM1 based on a local sensitivity analysis at the optimal value of the parameters. Based on a local sensitivity analysis, Fang *et al.* (2010) reported that effluent COD and ammonia/ammonium (NH) concentrations in ASM3 are more sensitive with respect to *μ _{A}*, *μ _{H}*, *Y _{A}*, and *Y _{H}*.

### Correlation between parameters

Posterior results can be used to estimate the posterior correlation between parameters (or parameter collinearity). A high correlation between two parameters means that both can be changed together without a noticeable change in the model output or in its overall ability to reproduce the data. For the sake of brevity, the correlation matrix is provided in the supplementary information (available with the online version of this paper). Significant posterior correlations exist between *μ _{H}* and *b _{H}*, *μ _{H}* and *K _{S}*, *b _{H}* and *b _{M}*, and *μ _{M}* and *K _{M,M}*. Overall, posterior correlations were smaller than in the case where synthetic (model generated) data were used as the observed data (Sharifi *et al.* 2014). This may be due to greater heterogeneity and structural errors associated with the real observed data, which widen the ranges of the parameters and thereby reduce the correlation factor.
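Posterior correlations of this kind can be computed directly from the MCMC samples. The Python sketch below uses the Pearson correlation coefficient, which is an assumption because the text does not specify the correlation measure. The parameter names and sample values are hypothetical and serve only to show the calculation.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equally long sample lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(chains):
    """`chains` maps a parameter name to its list of posterior samples."""
    names = list(chains)
    return {
        (a, b): pearson_correlation(chains[a], chains[b])
        for a in names
        for b in names
    }


# Hypothetical posterior samples: mu_H and b_H move together, K_S does not.
chains = {
    "mu_H": [1.0 + 0.01 * i for i in range(100)],
    "b_H": [0.5 + 0.005 * i for i in range(100)],
    "K_S": [8.0 + (-1) ** i for i in range(100)],
}
matrix = correlation_matrix(chains)
```

In this toy example the entry for (`mu_H`, `b_H`) is 1, reflecting samples that compensate for each other perfectly, while the entries involving `K_S` are close to zero.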

## CONCLUSIONS

In this study, a stochastic parameter estimation method was applied to measured data collected from the NDN reactors at the Blue Plains advanced WWTP in Washington, DC. Wastewater characteristics, including sCOD, VSS, nitrite/nitrate, ammonia, and methanol concentration, were measured inside the reactor in different tanks, locations, and times, as well as in the effluent wastewater. The goal of this study was to evaluate whether effluent and spatial profile data obtained from a full-scale plant could provide information about the values of model parameters and to quantify the level of confidence they can provide for each parameter relative to its prior knowledge. The following general conclusions were obtained from the presented study.

The model was generally able to capture the magnitude and trends of the observed data.

The data were able to significantly improve the level of confidence for some of the model parameters, as determined by the increase in the entropy of the posterior relative to the prior distribution of the parameters.

The uncertainty of the model predictions was reduced by applying the posterior distributions of the parameters instead of the prior ranges.

Backward uncertainty assessment reveals the correlations between the parameters, representing how the combined variation of two or more parameters can result in the similar reproduction of observed data.

By and large, the overall sensitivity of the model output with respect to each parameter as measured by an SSF can explain the ability of data to narrow down the ranges of each parameter.

The posterior range of methylotrophic parameters was found to be narrowed down substantially, which reflects the poor prior knowledge about these parameters.

The number and the frequency of observed data can influence the level of information gained about the parameters. The results showed that a higher frequency of observed data sampling (daily vs. weekly) can narrow down the posterior distribution.

## ACKNOWLEDGEMENTS

This project was supported by the District of Columbia Water and Sewer Authority and partially by the DC Water Resources Research Institute.