## Abstract

Online model predictive control (MPC) of water resource recovery facilities (WRRFs) requires simple and fast models to improve the operation of energy-demanding processes, such as aeration for nitrogen removal. Selected elements of the activated sludge model number 1 modelling framework for ammonium and nitrate removal were included in discretely observed stochastic differential equations in which online data are assimilated to update the model states. This allows us to produce model-based predictions including uncertainty in real time while it also reduces the number of parameters compared to many detailed models. It introduces only a small residual error when used to predict ammonium and nitrate concentrations in a small recirculating WRRF facility. The error when predicting 2 min ahead corresponds to the uncertainty from the sensors. When predicting 24 hours ahead the mean relative residual error increases to ∼10% and ∼20% for ammonium and nitrate concentrations respectively. Consequently this is considered a first step towards stochastic MPC of the aeration process. Ultimately this can reduce electricity demand and cost for water resource recovery, allowing the prioritization of aeration during periods of cheaper electricity.

## INTRODUCTION

Mathematical modelling of water resource recovery facilities (WRRFs) is a widely established discipline for research, plant design, optimization, simulation of process control strategies, etc. For these purposes there are many models to choose between, such as the activated sludge models (ASM), the anaerobic digestion models (ADM), the University of Cape Town (UCT) model or the TU Delft phosphorous removal model (Wentzel *et al.* 1992; Henze *et al.* 2000; Batstone *et al.* 2002; Gernaey *et al.* 2004; Meijer 2004; Hu *et al.* 2007). These complex models have differences in focus and as a result, in their structure. Hence choosing a model structure is, as with all modelling tasks, crucial to the outcome of the project. One important thing to include in the choice of a suitable model is the number of states and parameters. On the one hand more states and parameters leads to a more detailed model. However, on the other hand more details introduce more inputs that need to be distinguished and therefore estimated, measured or, if this is not possible, guessed at (Vanrolleghem *et al.* 1995). Furthermore, numerically solving large models with many states leads to long simulation times which can be demanding for data-driven optimizations, which need to be run at short time intervals (seconds to minutes). Although not yet used in the online operation of WWRFs, models can also be used to forecast future variables of interest for use in model predictive control (MPC), which means they should be fast and adaptable to online data.

The activated sludge model number 1 (ASM1) (Henze *et al.* 1987) describes organic matter degradation, nitrification and denitrification in activated sludge bioreactors. The model contains 13 states variables and 19 parameters. One of the most important challenges in using ASM1 is arguably attributing the many stoichiometric and kinetic parameters (Gernaey *et al.* 2004). The information needed for the characterization of these can come from three sources (Petersen *et al.* 2002): (1) default values from the literature, (2) full-scale plant data such as those collected by online sensors, and (3) information obtained from laboratory-scale experiments. The type of data and calibration framework to use is highly dependent on the intended use (e.g. Petersen *et al.* 2002). While (1) might be good for educational purposes or comparison of control strategies (e.g. Gernaey *et al.* 2014), optimization of processes with respect to a specific plant requires (2) and/or (3) (Petersen *et al.* 2002).

MPC aims to predict processes as a function of potential control actions and then choose the best control scenario based on the optimization of some objective function. In WRRFs this can translate to real-time modelling and forecasting of plant performance based on aeration control, optimizing electricity costs and effluent. When it comes to the selection of a suitable model for WRRF MPC strategies, the structure of states and parameters becomes particularly important. This is for the following two reasons: firstly, because parameters should be statistically identifiable from online data to take proper advantage of the real-time setting and secondly, because the computational requirements should be sufficiently low to allow for real-time, recursive simulation of several control scenarios. This means that a good online model should not have strong correlations between parameters, which is the case for parameters of ASM1 (Sharifi *et al.* 2014). Furthermore, the calibration should only depend on online data to avoid delays in updating the model, and hence the recalibration routine should not depend on information obtained from laboratory-scale experiments.

ASM1 has already been simplified to a linearized version to provide faster and yet reliable predictions (Smets *et al.* 2003). Furthermore, ASM1 has been reduced in an effort to make more parsimonious models (e.g. Mulas *et al.* 2007; Cadet 2014). However, the focus of these models is not online operations, i.e. to be updated with online data only. Online model applications are here managed by the use of stochastic, data-driven modelling (DDM) techniques. Many DDM methods exist, depending on the purpose and data availability (e.g. Dürrenmatt & Gujer 2012) and they are generally good alternatives when mechanistic models are not available or not valid (Gernaey *et al.* 2004). Since the detailed mechanistic understanding of the activated sludge process (ASP) already exists, the use of DDMs would ignore all the existing empirical process knowledge of nonlinearities and correlations.

Discretely observed stochastic differential equation (SDE) based models are often referred to as stochastic ‘grey-box’ (GB) models because the structure of the models represents both the physical/chemical/biological, deterministic (‘white-box’) understanding of the processes and the statistical, stochastic (‘black-box’) information indicated by data. Parameter calibration can be managed in the SDE-GB model by e.g. combining extended Kalman filtering (EKF) techniques and maximum likelihood estimation. This can be done statistically directly from online data by using e.g. the frameworks suggested by Kristensen *et al.* (2004), Tullekin (1993) or Jazwinski (1970). Furthermore, the EKF allows for optimal state estimation and handles additive noise effectively. Thordarson *et al*. (2012), Del Giudice *et al*. (2015) and Carstensen (1994) concluded that in terms of process prediction and control, SDE-GB models of the wastewater processes perform significantly better than traditional black-box models, such as ARMAX models, and also used them to statistically identify Monod-kinetic parameters from online measurements in an ASP (Carstensen *et al.* 1995). SDE-GB models have also been used to model incoming ammonium loads and first-flush phenomena (Bechmann *et al.* 2000; Halvgaard *et al.* 2017), and to forecast rainfall-runoff flow and volume in sewer systems for use in real-time optimization (Thordarson *et al.* 2012; Löwe *et al.* 2016).

In this paper we show that ASM1 can be rewritten as a simpler SDE-GB model that is applicable to online MPC purposes by treating state variables that show only slow and minor changes over short time horizons as model parameters that are kept fixed or intermittently re-estimated using online data. Thus, changes that occur slowly over weeks or months, such as changes in biomass, temperature, maintenance, wastewater composition etc., will be included in the parameters that are re-estimated intermittently with data from the previous few days. The small error introduced by this simpliﬁcation is estimated by a stochastic diffusion process and consequently it can be managed in the control setup. Following this methodology it is possible to create a stochastic ASM with only three states representing ammonium concentration, nitrate concentration and available oxygen. This model can then be used to optimize the ammonium and nitrate removal process within an MPC approach.

This article presents a simple ASM based on SDEs, which uses flow and aeration data as input and assimilates online measurements of ammonium and nitrate to update model states and thus prepare for providing the best possible forecasts at each time step. The model gives reliable online forecasts of the ammonium and nitrate removal process from a few minutes to up to 24 hours ahead and considers measurement errors. It was developed and tested with data from a small recirculation WRRF with alternating operation. The simplicity of the model makes it a general tool that can be useful in recirculation facilities with different configurations without changing the model setup.

## CASE STUDY: NØRRE SNEDE WRRF

The model is developed and tested with data from Nørre Snede WRRF, which is located in central Jutland, Denmark. The plant is designed to handle a maximum capacity of 9,700 population equivalent (PE) and the current load is approximately 4,000 PE.

### Operation and design

The WRRF includes several typical treatment processes that the wastewater goes through before discharge. Listed in order from when the wastewater enters the process, these are pretreatment, grit removal and grease trap, chemical dosage, nitrification/denitrification and secondary treatment. The nitrification/denitrification in the Nørre Snede plant happens in a process tank with a total volume of 3,500 m^{3}. The tank is divided into three smaller chambers operating under different conditions. This is illustrated in Figure 1, which also shows that the aeration tank is equipped with nutrient sensors, aeration equipment, a recirculation pump and rotors that control the flow direction/velocity (direction shown with arrows in the figure; rotors are located at the bridge).

The facility is currently operated with a rule-based control strategy, as described by Isaacs & Thornberg (1998), Zhao *et al.* (2004) and Kim *et al.* (2014), for example. In this case the control switches aeration on/off as a function of online ammonium and nitrate concentration measurements. Therefore the conditions switch between anoxic and aerobic and the cycle lengths depend on the conditions in the process tanks. This is managed in the control platform STAR Utility Solutions™ (Sørensen *et al.* 1994; Nielsen & Önnerth 1995).

### Data

The current control of the plant (i.e. actuator settings controlling aeration and inlet flow) is updated every 2 min and as result, aeration and inflow data are available every 2 min. The control rules are based on ammonium and nitrate signals, which are only sampled every 5 min directly in the aeration tank, meaning that observations are sampled at a different frequency to the control sampling. Calibration of sensors happens automatically two to four times per day, resulting in 30–60 min with no new observations. There is a response time from when aeration starts/stops until this is observed by the sensors. This is due to hydraulics in the tank and processing time in the sensors (Rieger *et al.* 2003). This response time is estimated using the method suggested by Stentoft *et al.* (2017): an estimate from the point at which conditions are shifting to the point at which a change in the trend in measurements is observed. Flow data are available at the outlet (after the settler) and change between 0 and ∼45 m^{3}/h because of a pumping scheme. To account for this scheme, flow data are filtered by a second-order Fourier series. The available data used in this study are summarized in Table 1.

Symbol . | Description . | Sample frequency . | Unit . | Uncertainty . |
---|---|---|---|---|

Q | Effluent Flow | 2 min | m^{3}/L | Unknown |

O | Actuator setting | 2 min | mgO_{2}/L | 0 |

MsNH | Measured ammonium | 5 min | mgN/L | ±3% |

MsNO | Measured nitrate | 5 min | mgN/L | ±5% |

Symbol . | Description . | Sample frequency . | Unit . | Uncertainty . |
---|---|---|---|---|

Q | Effluent Flow | 2 min | m^{3}/L | Unknown |

O | Actuator setting | 2 min | mgO_{2}/L | 0 |

MsNH | Measured ammonium | 5 min | mgN/L | ±3% |

MsNO | Measured nitrate | 5 min | mgN/L | ±5% |

The uncertainty is based on the information available from the sensor manuals (HACH Lange Aps 2013, 2014).

## THEORY AND METHODOLOGY

This section includes a description of the SDE-GB model, a simplification of the ASM1 model with noise terms added, the inclusion of aeration control and inflow as input. Finally, the parameter estimation is briefly described.

### Stochastic grey-box model

*Diffusion,*represents model approximations and undescribed noise disturbances, i.e. changes in biomass efficiency, unmodelled inflow, uncertainty of input variables , or true stochastic behaviour in the processes.

*Measurement noise,*represents the noise occurring due to imperfect accuracy of the measuring equipment, i.e. measurement uncertainty in the ammonium and nitrate sensors.

### Simplification of ASM1

Following the notation proposed by Corominas *et al.* (2010) the ordinary differential equations that govern ammonium, nitrate and oxygen in ASM1 can be written in a Gujer matrix, as presented in the Supplement (available with the online version of this paper), together with a brief description of ASM1 parameters and state variables. The complexity of these equations is considered an obstacle for use in a real-time setting since many of the variables are unmeasured and consequently constants would be difficult to distinguish. We therefore make simplifications to obtain a more suitable model. The main assumption in this simplification is that the model parameters will be re-estimated frequently, and therefore several state variables of ASM1 will become constant and some parameters will become unimportant.

The rate,

*ρ*_{4}, which governs ammonification of soluble organic nitrogen can be ignored. This is considered reasonable as the ammonification rate parameter*k*is typically small compared to the process rates of nitrification and denitrification (Henze_{am}*et al.*1987).The state variable

*S*is constant throughout the day. In practice it will follow a diurnal pattern similar to that of_{B}*S*, but since_{NH4}*S*is not measured these will be difficult to distinguish._{B}The state variables governing active heterotrophic and autotrophic biomasses

*X*and_{OHO}*X*can be considered constant on a daily basis, and hence can be treated as parameters. This is considered reasonable as the biomass is known to only change over longer periods of time._{ANO}The parameter for the relative amounts of N/COD in biomass,

*i*, can be ignored. A stoichiometric calculation by Henze_{N_COD}*et al.*(1987) (assuming a typical cell formation, C_{5}H_{7}O_{2}N) indicates that*i*is 0.086. This is very small compared to_{N_COD}*1/Y*which is approximately 4.2, and therefore it will be difficult to estimate._{ANO}The half-velocity parameters

*K*and_{O2,OHO}*K*for oxygen use are considered to be equal (_{O2,ANO}*K*=_{O2,OHO}*K*=_{O2,ANO}*K*), as Henze_{O2}*et al.*(1987) argue that they are not, quantitatively, that different.

*Kc*is introduced because the state

_{NH4,ANO}*S*is the Monod term indicating how quickly the process is running relative to the maximum rateThe seven new parameters to estimate online are therefore

_{O,MO}*θ*,

_{i}*K*,

_{NO3,OHO}*K*and

_{NH4,ANO}*Kc*where

_{NH4,ANO}*θ*relates to the original ASM1 parameters aswhere C

_{i}_{1}and C

_{2}are correction factors that are introduced because

*S*is the relative amount of oxygen.

_{MO,O}### Aeration control and inflow

*C_(O,Q)*is a function that describes the effect of inflow and aeration control on the given state. These functions are here determined from the literature. More specifically, the two-films theory (Lewis & Whitman 1924) and diurnal variations in ammonium concentration and constant (low) nitrate concentrations in the incoming wastewater (Henze & Comeau 2008). This means that

*C*and

_{NH4}(O,Q)*C*are given aswhere

_{NO3}(O,Q)*μ*and

_{NH4,in}, μ_{NO3,in}, s_{i}*c*are parameters relating to the inflow. Note that

_{i}*μ*is typically ∼0 (e.g. Henze & Comeau 2008). The parameters

_{NO3,in}*r*and

_{c}*ρ*are related to the recirculation to and the volume of the aeration tank (see Figure 1). The control of the aeration is given aswhere

*k*is a transfer constant (Lewis & Whitman 1924) related to the size and efficiency of the aeration equipment. The maximum value of the relative oxygen state is 1, and hence

_{1}*S*is set to 1 and should not be estimated.

_{O,MOmax}### Stochastic ASM

*α*and

*C*govern the ASP, the aeration and the inflow as described in previous sections. To avoid negative noise and to make estimation of small noise processes easier, the diffusion terms are estimated as exponential parameters (i.e. ). The system is discretely observed through ammonium and nitrate sensors in the aeration tank. The measurements (

*MsNH*and

*MsNO*) from these relate to the system aswhere

*ε*and

_{NH4}*ε*are independent and identically distributed (i.i.d.)

_{NO3}*Ɲ(0, 1)*, i.e. the residuals of the measurements are normally distributed with zero mean and

*exp(s*standard deviation. The changes in the states

_{1,NH4}), exp(s_{2,NO3})*dS*,

_{NH4}*dS*and

_{NO3}*dS*are given as state variables where ɷ

_{MO,O}_{1}, ɷ

_{2}and ɷ

_{3}are 1-dimensional standard Wiener processes and

*exp(s*and

_{11}), exp(s_{22})*exp(s*represent the deviation of these processes.

_{33})### Online parameter estimation in stochastic ASM

The stochastic model presented fits the general model structure for continuous-discrete stochastic state space models, i.e. a model of the state variables in continuous time and measurements of some of the states at discrete times. The R-package CTSM-R (Kristensen *et al.* 2004; Juhl *et al.* 2016) can manage just this kind of system, and is therefore used to estimate parameters and predict the effect of control. This paper provides only a brief summary of how the package works and how it is used here. For further information on this, see CTSM-R (2018).

*ε*

_{t}*=*

*y*

_{t}– ŷ_{t}_{|t-1}(

*ŷ*

_{t|t-1}*=*

*E(y*) and

_{t}|y_{t-1}, θ)*R*

_{t|t-1}*=*

*V(y*and

_{t}|y_{t-1}, θ). ε_{t}*R*are computed by means of a version of the EKF (e.g. Jazwinski 1970). The likelihood function can be simplified to a simpler log-likelihood function by conditioning on y

_{t|t-1}_{0}and taking the negative logarithm. However, this rewriting is omitted here. The parameter estimates are then obtained by minimizing this log-likelihood.

*C*is the Jacobian of the observation equation,

*h*. The innovation given byThe Kalman gain,

*K*, for the EKF is then calculated asand the system is updated is the state estimates (

_{k}*S*,

_{NH4}*S*and

_{NO3}*S*). The state prediction is done numerically bywhere

_{O,MO}*A(t)*is the Jacobian of the drift term

*f*. This Jacobian is calculated using a method based on Speelpenning (1980). In calculations of the Jacobian it is assumed that

_{i}(t,…)*x*=

*k|k-1, u = uk, t = tk*and the parameters,

*θ*, are known. The ordinary differential equations (ODEs) are solved by numerical integration schemes suggested by Hindmarsch (1983) (cited in Kristensen & Madsen 2003, p. 17). This is to ensure an intelligent re-evaluation of

*A*and

*σ*. From this construction we see that the approximation is only good when nonlinearities are not too strong.

The estimation setup implies that initial state and parameter estimates are necessary in the parameter estimation procedure. These can be supplied either as prior distributions or simply as estimates with some maximum and minimum boundaries. For most parameters these initial estimates are based on the literature (e.g. Henze *et al.* 1987; Henze & Comeau 2008). However, a few parameters are unnecessary or cannot be estimated. The initial state values *S _{NH4,0}*,

*S*and

_{NO3,0}*S*can easily be estimated directly from data, as ammonium and nitrate are directly measured and the oxygen signal is known, and hence these are considered unnecessary to estimate. Furthermore, the parameters ,

_{O,MO,0}*θ*and

_{3}*θ*show very small deviations and notably correlation with other parameters. It is also argued by Henze

_{4}*et al.*(1987) that does not need estimating. For these reasons these are kept constant here at [

*K*,

_{NO3,OHO}*θ*,

_{3}*θ*] = [3.0, 5.0E − 6, 1.0], thereby reducing the number of parameters to estimate.

_{4}## RESULTS AND DISCUSSION

Firstly the model is qualitatively evaluated by comparing the model predictions with data and discussing parameter estimates. Secondly, the model is quantitatively evaluated by running it for 1 month and discussing the statistics of the residuals. We stress that the model is run ‘online’ in the sense that parameters are estimated only by minimizing the objective function described in the previous section. Furthermore, the states *S _{NH4,} S_{NO3}* and

*S*, are updated using the EKF whenever a new measurement becomes available. Figure 2 shows an example of one prediction of ammonium at a given inlet flow and aeration signal. The state,

_{O,MO}*S*, is updated with present data and then predicted 2 hours ahead. Clearly, uncertainty increases with increasing forecast horizon.

_{NH4}### Model dynamics

Parameters are estimated with data from a period at the beginning of October 2016, chosen arbitrarily among periods without rain. The length of the parameter estimation period is 4 days and 4 hours (corresponding to 3,000 time steps of 2 min). These parameters are used to predict the concentrations of ammonium and nitrate in the aeration tank. Figures 3 and 4 show predictions of the ASP 60 time steps ahead corresponding to 2 hours, given the aeration signal. This is done for 24 hours, meaning that each time a new measurement becomes available a prediction similar to Figure 2 is made and compared with data. This is done during normal operation of the plant i.e. no rain and no (known) problems.

In Figure 3 it is evident that under normal operation, the modelled ammonium concentration follows the same dynamics as the data. During the aeration phase the ammonium concentration decreases, and when aeration is switched off, NH_{4} increases. In periods when no new data are received (i.e. calibration of the ammonium sensor from 17:30 to 18:30), the model continues to provide reliable estimates. The nitrate concentrations estimated in Figure 3 also follow dynamics similar to those in the data. It is noted that when aeration is off, nitrate decreases and when it is on, it increases. However, a dynamic starting at 06:00 does not follow the behavior shown by the sensor measurements. This period contains a relatively long timespan without aeration that would normally mean denitrification, but in this case we see that nitrate increases. This could be due to some unmodelled dynamics, problems with a drifting nitrate sensor or an unusually large load of nitrate in the influent from industry, for example. Overall, the results show that the uncertainty of the nitrate predictions is greater than the uncertainty of the ammonium predictions, and hence larger deviations from the modelled concentrations are expected. Figure 4 shows the estimates of the unmeasured state, *S _{O,MO}*. It is plotted together with the measured dissolved oxygen (DO) concentrations and the setpoint of the actuator, O. It is clear that when the setpoint is lower, it takes a longer time for

*S*to reach the maximum level. The comparison between the aeration status and the measured oxygen concentrations highlight that short periods of aeration are not registered in the measurements. This is caused by the location of the sensor, which is opposite the aeration grid (see Figure 1). This supports the choice of not including the measured DO as input/state in the model. Therefore the actuator signal is considered superior as it reports all periods of added oxygen and furthermore, it does not have any response time from when air is added until it is observed in the tank.

_{O,MO}### Parameter estimates

The parameters that are estimated in the period mentioned previously are presented in Table 2.

Parameter . | Estimate . | Standard deviation . |
---|---|---|

Simplified ASM | ||

K _{NH4,ANO} [mgN/L] | 4.80 × 10^{−1} | 1.80 × 10^{−1} |

K _{CNH4,ANO} [mgN/L] | 1.05 × 10^{−3} | 6.99 × 10^{−4} |

θ _{1} [mgN/L] | 5.00 × 10^{−2} | 5.39 × 10^{−3} |

θ _{2} [mgN/L] | 2.47 × 10^{−1} | 1.59 × 10^{−2} |

Aeration control and inflow | ||

s _{1} [mgN/L] | 1.14 | 6.63 × 10^{−1} |

s _{2} [mgN/L] | −5.63 × 10^{−1} | 3.73 × 10^{−1} |

c _{1}[mgN/L] | 4.25 × 10^{−1} | 3.24 × 10^{−1} |

c _{2} [mgN/L] | −5.82 × 10^{−2} | 1.75 × 10^{−1} |

μ _{NH4,in} [mgN/L] | 1.35 × 10^{−1} | 6.76 |

μ _{NO3,in} [mgN/L] | 1.10 × 10^{−2} | 2.01 × 10^{−2} |

K _{1} [L/mgO] | 1.12 × 10^{−1} | 6.48 × 10^{−3} |

ρ[] | 1.08 × 10^{−5} | 1.64 × 10^{−5} |

r _{c} [] | 7.21 × 10^{−4} | 4.31 × 10^{−4} |

Noise and diffusion terms | ||

s _{11} [log(mgN/L)] | −4.79 | 7.65 × 10^{−2} |

s _{22} [log(mgN/L)] | −3.20 | 2.06 × 10^{−2} |

s _{33} [] | −2.86 | 2.16 × 10^{−1} |

s _{1,NH} [log(mgN/L)] | −8.66 | 1.26 × 10^{−1} |

s _{1,NO} [log(mgN/L)] | −1.80 × 10^{−1} | 2.56 × 10^{−1} |

Parameter . | Estimate . | Standard deviation . |
---|---|---|

Simplified ASM | ||

K _{NH4,ANO} [mgN/L] | 4.80 × 10^{−1} | 1.80 × 10^{−1} |

K _{CNH4,ANO} [mgN/L] | 1.05 × 10^{−3} | 6.99 × 10^{−4} |

θ _{1} [mgN/L] | 5.00 × 10^{−2} | 5.39 × 10^{−3} |

θ _{2} [mgN/L] | 2.47 × 10^{−1} | 1.59 × 10^{−2} |

Aeration control and inflow | ||

s _{1} [mgN/L] | 1.14 | 6.63 × 10^{−1} |

s _{2} [mgN/L] | −5.63 × 10^{−1} | 3.73 × 10^{−1} |

c _{1}[mgN/L] | 4.25 × 10^{−1} | 3.24 × 10^{−1} |

c _{2} [mgN/L] | −5.82 × 10^{−2} | 1.75 × 10^{−1} |

μ _{NH4,in} [mgN/L] | 1.35 × 10^{−1} | 6.76 |

μ _{NO3,in} [mgN/L] | 1.10 × 10^{−2} | 2.01 × 10^{−2} |

K _{1} [L/mgO] | 1.12 × 10^{−1} | 6.48 × 10^{−3} |

ρ[] | 1.08 × 10^{−5} | 1.64 × 10^{−5} |

r _{c} [] | 7.21 × 10^{−4} | 4.31 × 10^{−4} |

Noise and diffusion terms | ||

s _{11} [log(mgN/L)] | −4.79 | 7.65 × 10^{−2} |

s _{22} [log(mgN/L)] | −3.20 | 2.06 × 10^{−2} |

s _{33} [] | −2.86 | 2.16 × 10^{−1} |

s _{1,NH} [log(mgN/L)] | −8.66 | 1.26 × 10^{−1} |

s _{1,NO} [log(mgN/L)] | −1.80 × 10^{−1} | 2.56 × 10^{−1} |

Parameters are calculated using 3,000 timesteps of data corresponding to 4 days and 4 hours. The period is from the beginning of October 2016.

Some parameters are difficult to evaluate as they represent some catchment/plant-specific information. Nonetheless, parameters are discussed here and in some cases compared with the literature:

The residual errors (which are split into diffusion and measurement noise) are found to be similar to what is estimated for ammonium in the aeration tank by Halvgaard

*et al.*(2017) in another plant with similar sensors and configuration. However, the uncertainty of the measuring equipment is much smaller compared to what is stated by the sensor manufacturer (see Table 1 footnote).The Monod kinetics parameter,

*K*, is found to be similar to what is statistically estimated by Carstensen_{NH4,ANO}*et al.*(1995). Here it was found to be between 0.44 and 0.76 among different plants and catchments. However, it is lower than ∼1, which is what is suggested by Henze*et al.*(1987).The mean incoming concentrations of ammonium,

*μ*and nitrate,_{NH4,in}*μ*seem reasonable as these are similar to what is typically found in a ‘catchment with little industrial activity’ according to Henze & Comeau (2008). The diurnal variations in ammonium,_{NO3,in}*s*,_{1}*s*,_{2}*c*and_{1}*c*are catchment specific. However, these are considered reasonable as they produce a variation that is comparable with the one presented in Henze & Comeau (2008), i.e. a similar shape with peaks in the morning and afternoon._{2}The new parameters

*θ*and_{1}*θ*are difficult to compare with the literature as they depend on many parameters and the states_{2}*X*and_{ANO}*X*. However, following the typical parameter suggestions from Henze_{OHO}*et al.*(1987), these should be 3.33*X*and 0.82_{ANO}*X*respectively._{OHO}The parameters related to the incoming water,

*r*and_{c}*ρ,*are estimated to be 7.21 × 10^{−4}and 1.08 × 10^{−5}respectively. Summing these and multiplying the mean flow with*ρ*we get 1.11 × 10^{−3}. This is slightly more than the expected value of 3.43 × 10^{−4}which is found by dividing mean flow with the volume of the process tank. This difference can be due to the recirculation that happens between the nitrification and denitrification tanks.The relative oxygen transfer rate

*k*depends on many factors such as tank design (e.g. reactor geometry, aeration design), physico-chemical properties (e.g. liquid composition, viscosity, temperature) and the presence of biomass (e.g. Pittoors_{1}*et al.*2014). Therefore it is difficult to determine empirically as it varies between facilities and over time.

### Model performance

The model's predictive ability is tested by re-estimating parameters every hour for a period of 1 month and 1 week, starting late September 2016. The model is then used to predict concentrations of ammonium and nitrate 1, 60 and 720 time steps ahead (corresponding to 2 min, 2 hours and 1 day, respectively). The predictions are compared with data and a 24-hour running mean absolute residual is calculated. Figure 5 illustrates how this changes over time for predictions 2 hours ahead (60 time steps). Table 3 shows the statistics of the mean absolute residual for all the different prediction horizons.

Precipitation^{a}
. | Prediction . | Mean absolute error [mgN/L] . | Relative absolute error [–] . | ||
---|---|---|---|---|---|

– . | No . | Yes . | No . | Yes . | |

Ammonium | 2 min | 0.0264 | 0.0278 | 0.0212 | 0.0223 |

2 hours | 0.0884 | 0.1065 | 0.0709 | 0.0855 | |

24 hours | 0.0925 | 0.129 | 0.0855 | 0.104 | |

Nitrate | 2 min | 0.0804 | 0.081 | 0.0594 | 0.0598 |

2 hours | 0.212 | 0.2399 | 0.157 | 0.177 | |

24 hours | 0.246 | 0.374 | 0.1815 | 0.276 |

Precipitation^{a}
. | Prediction . | Mean absolute error [mgN/L] . | Relative absolute error [–] . | ||
---|---|---|---|---|---|

– . | No . | Yes . | No . | Yes . | |

Ammonium | 2 min | 0.0264 | 0.0278 | 0.0212 | 0.0223 |

2 hours | 0.0884 | 0.1065 | 0.0709 | 0.0855 | |

24 hours | 0.0925 | 0.129 | 0.0855 | 0.104 | |

Nitrate | 2 min | 0.0804 | 0.081 | 0.0594 | 0.0598 |

2 hours | 0.212 | 0.2399 | 0.157 | 0.177 | |

24 hours | 0.246 | 0.374 | 0.1815 | 0.276 |

‘Precipitation’ indicates whether only dry weather periods (No) or all periods are considered (Yes).

^{a}Wet weather periods are as follows: 28 September to 2 October, 15–19 October, 22–23 October.

In Figure 5 it is evident that during some periods (i.e. October 2nd, October 15th and October 22nd) the uncertainty increases. Comparison with rain data supplied by the Danish Meteorological Institute (DMI 2018) shows that many of these periods are characterized by wet weather. This is also indicated in Table 3 where the general picture is that uncertainty increases during wet weather. In Table 3 it is also clear that the relative 2 min uncertainty is ∼2% for ammonium and ∼6% for nitrate. This is comparable with the sensor uncertainty listed in Table 1 and indicates that the 2 min predictions cannot be further improved even with a more detailed deterministic model. The 2-hour and 24-hour predictions perform worse than the sensor uncertainty. This can, on the one hand, indicate that there is room for improvement, but can, on the other hand, also mean that there is some stochastic behavior (e.g. incoming nutrients or biomass efficiency) which is more pronounced when predicting further ahead from the EKF state update. It should be added that the treatment requirements for Nørre Snede WRRF state that during a 24-hour period, the ammonium concentration should be <2 mgN/L and total N should be <8 mgN/L in the outlet. Consequently this means that accurately predicting ammonium is more important because nitrate only affects total N. The relative uncertainties of ammonium and nitrate of <10% and <20% respectively (in dry weather, 24 hours ahead) are considered sufficient for stochastic MPC.

### Towards MPC – simplified and full ASM models

The results in Table 3 are difficult to compare with full ASM models, as to our knowledge there is no framework for making the full ASM models online adaptive to data. However, our results can be compared with data-driven model parameter estimations of full ASMs (i.e. other methods that rely only on data from online ammonium/nitrate sensors). One example of such an approach is provided by Sin *et al.* (2008), where the parameters of an ASM2d model were estimated using only frequently sampled online ammonium, nitrate and oxygen measurements (sampled every 5 min, similar to this study) in the 50,000 PE Haaren WRRF in the Netherlands with alternating control of aeration. Parameters were calibrated using Monte Carlo simulations to minimize a weighted sum of squared errors (WSSE) based on a calibration period of 16,117 measurements (56 days). Model performance was compared with data in a validation period of 9,217 measurements (32 days). The results showed a mean absolute error (MAE) for ammonium of 1.39 mgN/L and 0.98 mgN/L in the calibration and validation periods, respectively. For nitrate concentrations an MAE of 2.56 mgN/L and 2.31 mgN/L were found. These error values are 10 times larger than what we found in this study, cf. Figure 3. Also, the framework presented in Sin *et al.* (2008) differs from this study in that the model states are not updated when new data become available and hence short time predictions in the validation period (up to 24 hours ahead) are not based on all the information available in an online situation. Additionally it is noted that the framework by Sin *et al.* (2008) required some computation time (45 min per simulation (in 2008) on a PC, and 500 Monte Carlo simulations where used to obtain a model), which makes it non-ideal for online applications.

The development of tools for online performance optimization of WRRFs using models is crucial for exploiting the full potential of digitalization. Hence, the development of robust approaches to online identifiable ASMs for improved short horizon predictions is needed. These models should also include additional processes, such as biological removal of COD and P, to achieve an overall improvement of all the removal processes in the plant. This paper provides a first step in this direction with online predictions of ammonium and nitrogen removal.

## CONCLUSION

Grey-box models based on SDEs are efficient tools as they can estimate both processes and noise from real-time data. A stochastic model of an aeration tank is proposed here. The model contains a deterministic term consisting of both a simplified ASM and input functions determining the influence of control and inflow. The model is used to predict the nitrification/denitrification in Nørre Snede WRRF in Denmark as a function of aeration and inflow.

The results show that despite the simple structure of the proposed model, the dynamics of the nutrient concentrations are captured. Quantitative investigations show that the processes are predicted accurately, i.e. 24-hour predictions of the ammonium and nitrate concentrations in the aeration tank are predicted with relative errors of <10% and <20% respectively. Consequently this is considered to be a step towards stochastic MPC of water resource recovery processes.

## ACKNOWLEDGEMENTS

This work was partly funded by the Innovation Fund Denmark (IFD) under File No. 7038-00097B – the first author's industrial PhD: *Stochastic Predictive Control of Wastewater Treatment Processes*. Data for this study were supplied by Ikast Brande Forsyning A/S.

## REFERENCES

*IFAC Proceedings vol. 1*, June 4–6, Cancún, Mexico

*Proceedings of ICA 2017 11-14 June 2017 in Québec City*, Québec, Canada