ABSTRACT
A novel data-driven stochastic state space system for modeling and forecasting the sludge blanket height in secondary clarifiers is presented. The model is trained on sensor measurements of the sludge blanket height and uses as inputs (1) the clarifier sludge mass inflow rate, and (2) the clarifier recycle flow rate. The model's prediction accuracy is evaluated on data from two Danish wastewater treatment plants, for a summer and a winter month, by means of root-mean-square errors and compared with a persistence model. The model consistently outperforms the persistence model in the summer, but only one plant performs well in the winter month. The worst performing plant is challenging to model due to data quality issues and problematic (uneven and time-varying) flow distributions to the clarifiers. This led us to conclude that the best performance and stability is seen to require high data quality and well-controlled flow distribution. In summary the model achieves, in almost all cases, prediction error reductions in the order of 30–50% and 0.1–0.4 m in relative and absolute terms when compared with the predictions from a persistence model.
HIGHLIGHTS
A stochastic state space model of the sludge blanket height in secondary clarifiers at wastewater treatment plants is developed, trained on sensor measurements of the sludge blanket height and using as inputs the plant inflow rate, clarifier feed suspended solids concentration and clarifier recycle flow rate.
The model can forecast the sludge blanket height multiple hours ahead with high accuracy, in particular on plants with good secondary clarifier flow-distribution and where high-quality sensor data is available.
The developed model may be used in a model predictive control setup to improve secondary clarifier performance by taking future predicted behaviour into account when choosing the present control strategy for e.g. bypass fraction or recycle flow rates, which may electively increase plant hydraulic capacity, limiting severe bypass scenarios and reduce eluent concentrations.
INTRODUCTION
Wastewater treatment plants are a fundamental part of modern sanitation systems and serve to reduce the environmental impact on natural water bodies from human activities. The cleansing procedure at wastewater treatment plants follows a series of steps, roughly comprised of (1) mechanical treatment, (2) phase separation, and (3) biological/chemical treatment. The former step employs metal grids/screens to filter out larger objects, the second uses so-called primary clarifiers where courser materials are removed through basic settling, and the latter step employs biological tanks and secondary clarifiers to reduce various undesired chemical compounds and carbonous matters in the wastewater. This research paper focuses on the latter step, in particular on the secondary clarifier. The technique used for biological treatment is a widespread two-stage technique known as the activated sludge process, which is carried out in the connected biological basins and secondary clarifiers. In the biological basin the wastewater is aerated in a repeated nitrification/denitrification process to increase growth rates of naturally occurring bacteria, that aid in devouring the carbonaceous matter. The wastewater is subsequently led into the secondary clarifier where the bacteria flocs, referred to as sludge, are separated from the water through gravitational settling to produce a cleansed water outflow (effluent). It is paramount that the activated sludge process is performing well, and in particular the secondary clarifier, since it is known to be the primary bottleneck of the volumes of wastewater a plant is able to handle (Stukenberg et al. 1983; Ekama et al. 1997; Ramin 2014). The secondary clarifier serves three primary purposes (Ekama et al. 1997): (1) producing a decontaminated effluent, (2) thickening activated sludge, and (3) storing sludge during peak loading periods. The settling dynamics of sludge in the clarifier is such that a clear sludge-water interface is established, at a particular height, across which the sludge concentration changes rapidly. This level is referred to as the sludge blanket height. A high sludge blanket is correlated with high concentration of bacteria (called suspended solids) in the effluent (Ekama et al. 1997) why it is of interest to keep it at a sufficiently low level. A fundamental challenge for treatment plants is that the designed hydraulic capacity i.e. the maximum permissible flow rate through the plant, is often surpassed during events of heavy precipitation, compromising the activated sludge operation. In a worst-case scenario it can be necessary to skip the activated sludge process altogether, a process known as bypass, to prevent diluting the sludge mass, or to avoid increased effluent concentrations as a result of a high sludge blanket. It is financially infeasible to overcome this challenge by building larger treatment plants, because the required volumes are immense. In the pursuit of limiting human environmental impact treatment plants are under increasingly stringent demands to further reduce effluent contamination and lower bypass volumes. It is thus crucial to continue to develop and improve the operation and control of treatment plants, and in particular the activated sludge operation, in order to better exploit plant hydraulic capacities and reduce effluent contamination.
There has been substantial research into clarifier behavior and settling dynamics and we provide a brief overview of the history in the following. The earliest research by Coe & Clevenger (1916) discussed solids capacities in the layers of the clarifier, but the sedimentation theory was not thoroughly discussed until Anderson & Gould (1945), with the introduction of the fundamental solid-flux theory in Kynch (1952), and the further elaboration of so-called hindered settling by Vesilind (1974). Towards the 1990s research focused on improving clarifier design based on the theory (Fitch 1966; Kos 1977; Tekippe & Bender 1987), and an operational so-called state point strategy for dealing with overloaded clarifiers were demonstrated in Keinath et al. (1977) and Keinath (1985, 1990). The state point strategy lead to the first online control strategies known as step feed to prevent bypass during wet weather conditions (Thompson et al. 1989). In combination with a rule-based four step control system (Balslev et al. 1994) developed an online control strategy for the secondary clarifier, which laid the foundations for the clarifier control implemented at various Danish treatment plants (due to Krüger Veolia’s digital software tool STAR Utility Solutions™). One-dimensional concentration models based on the sedimentation theory of Kynch (1952) were addressed extensively in the 1990s by Takács et al. (1991), Hamilton et al. (1992), and Watts et al. (1996) and others, but these models were primarily adapted using steady-state data, although a dynamic model was introduced by Clarcq et al. (2003). The Takács et al. (1991) models are obtained from a spatial finite difference scheme of the advection-diffusion partial differential equation, with incorporation of settling dynamics from empirical flux settling curves and settling parameters. These models continue to form the basis of one-dimensional clarifier modeling, and the more recent work in the literature improves on the mathematical descriptions of, e.g., mechanistic compression dynamics and solids dispersion, and generally focuses on appropriate mathematical formulations to ensure well posedness and stability. The reader should consult the work of, e.g., Plosz et al. (2011), Bürger et al. (2012), and Guyonvarch et al. (2015) for a solid overview for these modern formulations. The newer contributions in the literature also considers uncertainty quantification (Guyonvarch et al. 2020; Zhou & Li 2023) and the effect of certain filamentous bacteria on settling properties (Qiu et al. 2023). The literature also contains many three-dimensional computational fluid dynamics models, e.g., Xu et al. (2022), that are used for optimizing design and similar research. The reason that these models are not used here, is due to their large computational complexity, making them unviable for online operations that require fast continuous calculations.
The general challenges with respect to sludge blanket height prediction is firstly that clarifier behavior in operational settings is very complicated, and one-dimensional models must (obviously) neglect more complicated features such as, e.g. turbulence, geometric properties, or uneven sludge distributions, and consequently suffer accuracy impairments. A primary difficulty for the models found in the literature is furthermore that inferring the sludge blanket height ultimately comes down to inferring a single point on the estimated sludge concentration profile curve. The models are also difficult to calibrate because continuous concentration profile measurements are generally not available. The models are relatively large with approximate 10–100 states, so optimization across multiple weeks of training data is computationally quite expensive. Similarly expensive is forecasting, which is essential for model predictive control where the control signal(s) must be repeatedly optimized over. To overcome these prediction challenges, we present a novel data-driven model which specifically targets the sludge blanket height directly, as opposed to indirectly through concentrations. A primary advantage here is that sludge blanket measurements are used directly to calibrate the model. The model furthermore only consists of a single state equation, and is thus fast and easy to evaluate in online settings making it ideal in a model predictive control setup. A challenge with the proposed model, as opposed to the existing models, is that it lacks clear physical interpretation, being based on a very simple qualitative understanding of the clarifier dynamics, which inevitably neglects more complicated dynamics. Due to the models simplicity high quality data is crucial to be able to identify appropriate correlations between the exogenous model variables (plant inflow, clarifier suspended solid concentrations, clarifier recycle flow rate) and the endogenous sludge blanket height. A challenge in this regard is that there is a very limited understanding of the quality of the blanket measurements and their representativeness. The measurements are conducted only at one particular location, and thus wrongly assumes a uniform blanket height, and accuracy impairments due to, e.g., turbulence in the water column is unknown. There is currently, to the best of the authors’ knowledge, no similar model in the literature, so no direct comparisons can be made.
The model is a one-dimensional continuous-discrete state space system (Jazwinski 1970) consisting of a stochastic differential equation (SDE) and an algebraic observation equation (AOE). We refer to the model as belonging to the class of gray-box models, combining black-box statistical models with traditional ‘white-box’ differential equation models. A stochastic differential equation can be viewed as a generalization of an ordinary differential equation with a natural incorporation of uncertainty through the addition of a so-called diffusion term whose increments are assigned a probability distribution. The solution to a stochastic differential equation at every point in time is therefore a probability distribution, rather than a single point. This allows for much more model flexibility when fitting to data where the model can adapt locally to observations, and that results in parameter estimates that better reflect the true system. This is in contrast to ordinary differential equations where local changes to the solution curve, to better match data, will have global implications (Møller 2011). In essence the role of the stochastic differential equation is to transform the data-fitting procedure into a weighted fit based on the uncertainty (the probability distribution) assigned by the stochastic differential equation, and the uncertainty of the observations. The developed model is fitted to data by estimating the parameters that minimize the negative log-likelihood of independent one-step state transition probabilities, where state filtration is carried out using an Extended Kalman Filter. The procedure takes advantage of algorithmic differentiation and the speed of C++, which is made available through the RpackageTMB (Kristensen et al. 2016). The work presented is inspired by the work in Stentoft et al. (2019, 2021) where a similar model for the biology was developed for forecasting and controlling the aeration process in the nitrification/denitrification step of the biological treatment basins. The authors also demonstrated the ability to implement various model predictive control strategies tailored towards, e.g. cost-savings, or reduced carbon emissions by considering electricity prices. The overarching goal here is to construct and link together similar model predictive control frameworks for various units at the treatment plant to increase performance, but also to embed flexibility in the sense of Junker et al. (2020) into treatment plants such that the operation can respond to variations in energy prices.
The paper is organized as follows: In Section 2 we present the two wastewater treatment plants whose data the present analysis is based on, and discuss properties of this data. In addition to that we briefly discuss theory of stochastic state space systems, likelihood estimation, the developed model, theoretical details, and numerical settings. In Section 3, we present estimation results, model residuals, showcase example predictions, prediction accuracy in terms of root-mean-square error , and finally consider predictions with uncertain (forecasted) inputs. The section is concluded with a discussion of the results, the chosen methods and computation techniques, subtle challenges with regards to model training and more. Finally, a conclusion is given in Section 4.
METHODS
The results presented in this article are computed using the statistical software language R (R Core Team 2023), the plots/graphics are created using the ggplot2 package (Wickham 2016), tables are created using the knitr (Xie 2014) and kableExtra (Zhu 2021) packages, and likelihood calculations are based on the TMB package (Kristensen et al. 2016).
Model implementation and workflow
The model is implemented as follows: First, a likelihood function for the observed data, based on the Extended Kalman Filter algorithm similar to that discussed in Brok et al. (2018) is implemented in C++. The C++ script uses among other things a set of macros made available by TMB. The reader is referred to TMBs GitHub Pages documentation for further information about implementation procedures. Once this is done the likelihood function, and its gradient and hessian, is made available by TMB for use in R. This allows for straight-forward minimization of the likelihood function and thus obtaining the best fitting parameter and the associated states. Once the parameters are obtained predictions can be made by computing prior mean and variance state estimates through the Extended Kalman Filter algorithm, which amounts to solving a set of moment ODEs (17). This entire procedure is made easily available through the authors’ own (under-development) package ctsmTMB, which has been used to generate the results presented in this article, and the reader is referred to the package’s GitHub Page for more information. In addition an example script is also provided here which demonstrates how estimation and prediction is carried out.
Treatment plants and data
. | PE capacity . | Lines . | Clarifiers . | Shape . | Dimensions (meters) . |
---|---|---|---|---|---|
Damhusåen | 350.000 | 4 | 24 | Rectangular (H×L×D) | 5×20×3 |
Kolding Central | 125.000 | 1 | 4 | Circular (Dia×D) | 30×4 |
. | PE capacity . | Lines . | Clarifiers . | Shape . | Dimensions (meters) . |
---|---|---|---|---|---|
Damhusåen | 350.000 | 4 | 24 | Rectangular (H×L×D) | 5×20×3 |
Kolding Central | 125.000 | 1 | 4 | Circular (Dia×D) | 30×4 |
The model development presented in this article has been based on 6 months of data collected from July to December in 2,019 and 2022, at Damhusåen and Kolding Central, respectively. The data were acquired from the two treatment plants through Krüger Veolia’s Hubgrade™ cloud solution. The sludge blanket measurements are collected by an ultra-sonic sensor (Hach 2023) installed above the surface of the clarifiers, and calculated based on the return time of an emitted ultra-sound signal. We use these calculated values directly as the sludge blanket height, without any further processing. Each data signal is sampled roughly every 2 min but in an unsynchronized fashion, so joint time-stamps for all signals are achieved by linear interpolation. The data was subsequently aggregated into 10 min intervals, using median values, to ease the computational effort. A few example tests demonstrated that the results (i.e. the model parameter estimates) were unaffected by this aggregation, arguably since blanket dynamics are much slower. The two selected periods were chosen because the data contained limited signal artifacts such as drop-outs, unrealistic jumps or peculiar oscillations. In this article, however, we focus just on August and December to exemplify the model’s properties and predictive performance. The purpose of selecting these two months is to showcase performance differences caused by seasonality i.e. ‘summer’ and ‘winter’, and the reason for comparing these two treatment plants is due to a clear difference in the plant operating conditions. There are four operating lines at Damhusåen but we only consider one of the lines in this article for sake of simplicity. The second operation line was deliberately not picked since it stood out as having particularly large dispersion, but we picked arbitrarily among the remaining three lines, so results are most likely representative of the general behavior. The reader may appreciate the dispersion differences by comparing the chosen data shown in Figure 2 with the discarded line placed in Figure 9 of Appendix A. The figures contain time series data of the sludge blanket heights for each operation line together with the (normalized) mass inflow rates . The measured sludge blanket heights are shown as gray-colored regions spanning minimum to maximum, but the individual sludge blanket time series are omitted. The intention here is to emphasize the blanket dispersion between lines and plants, which evidently is much greater at Damhusåen than at Kolding Central. The gray-colored regions in Figure 2(c) and 2(d) are very thin indicating close-to-identical behavior between all clarifiers. In contrast these regions are seen to be much larger at Damhusåen in particular in Figure 2(b). The mass inflow rates are included in the figures to showcase its correlation to the sludge blanket height. The recycle flow rates are omitted to avoid cluttering. The plant inflow rate and suspended solids concentrations are also omitted in favor of the mass inflow rate, which we expect sums up their contributions. That being said there may be more complicated interactions from the terms individually that we suppress in favor of a clearer interpretation. Generally we expect the sludge blanket height to mirror the behavior of the mass inflow rate. This expectation is generally met in the presented data, but there are evidently also some more complicated behavior present in many situations. In the following we highlight some of the scenarios where the sludge blanket response is interesting, or unexpected, in order to emphasize some of the challenges in the data.
(1) August 2–8 and 21–31: The sludge blankets are much higher in the former period than the latter for similar mass inflow rates
(2) August 8–23: The sludge blankets and mass inflow peaks coincide on the 16th, 18th, 19th and 21st, but are delayed in time on the 13th–14th and 17th–20th.
(3) August 11–13: The sludge blankets are much higher at the end of the month, even though the mass inflow rate is similar.
(4) August 28–29: The sludge blankets do not respond to the small peak in the mass inflow rate.
(5) December 1–31: The sludge blankets are gradually increasing throughout the month although the non-peak mass inflow rate remains roughly the same.
(6) December 15–16: The sludge blankets do not respond to the large increase in the mass inflow rate.
(7) December 18–20: The sludge blanket height reductions here are (very) large when compared to the complete lack of blanket response on the 21st–22nd where a similar low mass inflow rate is obtained (although only for a short duration).
Kolding Central:
(1) August 5–7 and 11–15: The blanket heights are lower in the former period when compared to the latter even though the mass inflow rate is comparable.
(2) August 1–15: The diurnal blanket oscillations reach a new level from the 8th and onwards even though the mass inflow rate appears unchanged during the period.
(3) August 3, 5, 16–17 and 28: The blanket peaks that occurs on the 16th–17th exhibit a larger time-delay than on the other days relative to the mass inflow rate peak.
(4) December 2–3 vs 19–20 and 9–10 vs 14–15: The blanket responses on the latter dates are larger than the responses on the former even though the mass inflow rates are similar.
(5) December 19–31: The blanket dynamics seem to change abruptly after the 19th, and the sensitivity to the mass inflow rate appears to increase. In particular, the sludge blanket response to the mass inflow rate is much greater on the 29th–30th than earlier in the month.
(6) December 24: The blanket heights decreased substantially even though the change in the mass inflow rate was minor.
Theory
Optimizer and parameter settings
The parameter estimation was performed using the nlminb optimization algorithm (Gay 1990) available from the stats package (R-Stats-Documentation 2023). The parameters to optimize are with fixed observation noise at Kolding Central and Damhusåen, respectively. While the observation noise parameter can be estimated too, it is often helpful to reduce the number of noise parameters since these can be difficult to identify separately. The chosen value for was based on the reported accuracy level of a commercially available sludge blanket detector (Hach 2023). This should be interpreted in terms of the associated 95% confidence interval i.e. the sludge blanket detector has an accuracy of . The optimization was initialized with the parameter values . The lower and upper parameter boundaries were set at , somewhat arbitrary bounds for and with ’s unconstrained. We noted that optimizer stability required low initial values of ’s, otherwise the optimizer explodes. A district advantage of the nlminb optimizer is its possibility of using the objective function hessian during optimization. This is particularly ideal here because TMB also returns the hessian through algorithmic differentiation. The hessian-aided optimization was found to be superior to the gradient-only optimization insofar as it was actually able to converge to a true minimizer. The gradient-based search although faster was only able to locate the same minimizer as the hessian-aided optimization for one particular month. In that case (November, 4,000 observations) the computation time using the hessian increased to seconds up from seconds using only the gradient. As an example of the behavior for the other months we highlight the month of August of Figure 2: In this case, the hessian-aided and gradient-only optimizations required seconds, respectively, but although nlminb reported convergence in both cases (both X-convergence and relative convergence), a closer inspection on the maximum gradient components revealed and , respectively, proving that the gradient-only minimizer was false. The negative log-likelihood difference was also substantial for the two with and , respectively. A similar pattern was seen in the other 5 months in the available data set.
RESULTS AND DISCUSSION
The following section presents an analysis of the modeling results and includes estimated parameter values, example predictions, and an evaluation of the model performance. In order to avoid an excessive number of figures in the main article the reader is referred to Appendix B for the results from Damhusåen. The parameter estimates and example predictions are made based on the data from a single selected clarifier to exemplify the model properties, since it is impractical to include results for all (4 or 6) available clarifiers. The results shown are representative of all clarifiers on Kolding Central, but this is not the case for Damhusåen. The reason that Damhusåen results are not representative is due to the large blanket variation observed there, as shown in the data in Section 2.2, which is believed to be caused by the plant’s flow distribution characteristics. We will discuss this further in Section 3.3.
Model estimation and validation
We present parameter estimates and associated statistics in Tables 2 and 4, but emphasize again that the parameter estimates at Damhusåen are not representative of all clarifiers. To emphasize this point consider the range of values obtained from estimating on all Damhusåen clarifiers for August; , and , and for December; , and . We note in particular that for December the values of and span large ranges, indicating different sensitivity to changes in the mass inflow rate, and the recycle flow rate, respectively. The differences in the time-constant is also large ( versus ), implying that the sludge blanket responds much slower/faster in certain clarifiers. Returning to inspecting the parameter estimates in the two tables, we draw the following conclusions:
(1) The estimated time-constants for the two plants are found to be in the order of with mean values across all clarifiers of and at Kolding Central and Damhusåen, respectively. According to the model the stationary sludge blanket height determined by is thus reached to 95% of its value in under fixed inputs. Judging from various peaks in the considered data (Figure 2) the estimate for Kolding Central seems appropriate, while it is more difficult to tell from Damhusåen where certain situations (e.g., after the peaks on the 1st, and the 20th of August) indicate that such a process takes closer to 36–48 h instead.
(2) The estimates of and carry the expected signs (positive and negative, respectively), indicating that if the mass inflow rate increases then so does the (stationary) sludge blanket height, and vice versa for the recycle flow rate.
(3) The larger dispersion at Damhusåen is seen to result in process noise parameters that are times larger than those at Kolding Central, an increase from approximately up to .
August . | December . | ||||||||
---|---|---|---|---|---|---|---|---|---|
. | Estimate . | Std. error . | t-test . | Scaled estimate . | . | Estimate . | Std. error . | t-test . | Scaled estimate . |
1.596502 | 0.031724 | 50 | 2.101859 | 0.034789 | 60 | ||||
−2.229816 | 0.060314 | 37 | −3.095960 | 0.110200 | 28 | ||||
0.000154 | 0.000005 | 33 | 0.883888 | 0.000275 | 0.000012 | 23 | 2.757868 | ||
−0.009058 | 0.000818 | 11 | −0.711820 | −0.003085 | 0.000280 | 11 | −0.459864 | ||
−3.357198 | 0.031832 | 105 | −2.972295 | 0.022831 | 130 | ||||
0.050000 | 0.050000 |
August . | December . | ||||||||
---|---|---|---|---|---|---|---|---|---|
. | Estimate . | Std. error . | t-test . | Scaled estimate . | . | Estimate . | Std. error . | t-test . | Scaled estimate . |
1.596502 | 0.031724 | 50 | 2.101859 | 0.034789 | 60 | ||||
−2.229816 | 0.060314 | 37 | −3.095960 | 0.110200 | 28 | ||||
0.000154 | 0.000005 | 33 | 0.883888 | 0.000275 | 0.000012 | 23 | 2.757868 | ||
−0.009058 | 0.000818 | 11 | −0.711820 | −0.003085 | 0.000280 | 11 | −0.459864 | ||
−3.357198 | 0.031832 | 105 | −2.972295 | 0.022831 | 130 | ||||
0.050000 | 0.050000 |
Predictive performance with perfect input forecast
The results overall show significant improvements in the RMSE over a persistence model for the ‘fast dynamics’ period with the exception of December on Damhusåen where results show little to no improvement (and even a worsening for Clarifier 1). The ‘slow dynamics’ period shows vastly different results at the two plants, with clear improvements at Kolding Central and much worse accuracy at Damhusåen. This latter result reveals that the model predictions do not agree with the stationary level of the observations as demonstrated in Figure 13 in Appendix B, and the model seems to wrongly predict minor blanket movement. The four missing curves in Figure 14(b) reveals challenges with missing data for this particular period, an issue that is indicative of the data conditions and variability of data between clarifiers at Damhusåen in general.
. | Kolding Central . | Damhusåen . | ||||||
---|---|---|---|---|---|---|---|---|
. | RRMSE (%) . | DRMSE . | RRMSE (%) . | DRMSE . | ||||
Pred. (hours) . | August . | December . | August . | December . | August . | December . | August . | December . |
2.5 | −52 | −39 | −0.11 | −0.11 | −33 | −3 | −0.14 | −0.03 |
5.0 | −66 | −46 | −0.24 | −0.21 | −42 | −14 | −0.26 | −0.08 |
10.0 | −73 | −50 | −0.33 | −0.31 | −49 | −16 | −0.39 | −0.12 |
. | Kolding Central . | Damhusåen . | ||||||
---|---|---|---|---|---|---|---|---|
. | RRMSE (%) . | DRMSE . | RRMSE (%) . | DRMSE . | ||||
Pred. (hours) . | August . | December . | August . | December . | August . | December . | August . | December . |
2.5 | −52 | −39 | −0.11 | −0.11 | −33 | −3 | −0.14 | −0.03 |
5.0 | −66 | −46 | −0.24 | −0.21 | −42 | −14 | −0.26 | −0.08 |
10.0 | −73 | −50 | −0.33 | −0.31 | −49 | −16 | −0.39 | −0.12 |
A negative sign indicates that the proposed model has a lower RMSE and vice versa. The reader is referred to the bulk text for further details.
Discussion
We shall start by addressing performance, and differences therein, at the two treatment plants. First, the results presented in the previous section reveal that the performance in general is better in the summer month, although Kolding Central also showed good performance in December. For Damhusåen the prediction accuracy was significantly reduced in December with two out of six clarifiers performing worse than, or identical to, the persistence model, and the remaining four clarifiers showing only marginal accuracy improvements. A possible explanation for this seasonal variation may be due to more complicated settling dynamics during winter which our proposed model is unable to capture well. It should be mentioned however that an additional challenge for Damhusåen in December was a (relatively) large number of missing observations. Secondly, it is evident when comparing the two plants that model performance at Kolding Central is superior. This fact is particularly emphasized by the accuracy for the ‘slow dynamics’ periods, which reveals that the model is able to capture the smaller blanket dynamics and the stationary blanket level very well. The finding that performance at Damhusåen is worse is not unexpected based on the large blanket variability at Damhusåen in the data presented in Figures 2 and 9. The model would only hypothetically be able to account for these individual sludge blanket differences if they were caused by differences in the recycle flow rate alone. The differences cannot arise from the mass inflow rate, since this signal is comprised of a plant-wide inflow (same for all 24 clarifiers at the plant) and a line-wide suspended solids concentration (same for all six clarifiers in a line). It is important to note that this challenge cannot be solved by measuring the two inputs on clarifier level since these unobtainable signals would then have to be forecasted. The modeling task at Damhusåen is further complicated by the fact that the distribution of the plant inflow to the operation lines (and clarifiers) is known to be non-uniform and time varying, which inevitably obscures parts of the correlation between inputs and observations. This flow distribution property is also what we believe to be the root cause of the large sludge blanket dispersion visible in the data for Damhusåen. Thirdly, the model stability was seen to be excellent at Kolding Central and very challenged at Damhusåen. In further details, we saw no period at Kolding Central were the model did not converge to a ‘true’ minimum, by which we mean an estimate whose maximum gradient component is sufficiently small. In contrast convergence to non-true optima or errors in the gradient/hessian calculations during optimization (which leads the optimization to a halt) were common when working with the Damhusåen data. This is inevitably tied to the data properties with larger dispersion, more frequent number of longer periods with missing observations, more missing observations in general, and the presence of sudden jumps in the blanket data. The stability was however drastically improved by increasing the observation variance (from to ), but errors and convergence to false minima remained for the more challenging data (such as December 27th–31st) that had many missing observations and larger jumps. In conclusion we have seen that high quality data without too many missing values and jumps and a well-controlled clarifier flow distributions is necessary in order for the model to be both stable and well-performing, especially during winter where more complicated settling dynamics may be present.
The present work lacks a satisfying treatment of the presented model’s prediction uncertainty, although we briefly mentioned that the predictive distributions appear both underdispersed and overdispersed based on the results in Figures 5 and 12, respectively. We note that the outliers in the former figure are found primarily during periods where the blanket moves so it would seem sensible to have a diffusion function in (8) that incorporates this addition uncertainty. Various attempts were made to this end by expanding σ as a function of the inputs, e.g., , with the interpretation that increased flow rates creates additional turbulence in the water column, but all of these attempts led to a decrease in the prediction accuracy, and were thus abandoned. A correct uncertainty quantification is very important both because this is an essential part of the underlying Kalman filter and heavily affects the individual likelihood contributions, but also for the purpose of predictive control where a natural control objective is to have a particular quantile remain under the top of the water column.
In regard to model training, a subtle challenge is the implicit emphasis on having higher prediction accuracy in the upper layers of the clarifier, where there is a greater risk of sludge overflow. The actuality of this challenge rests on the (unknown) assumption that the dynamics at higher levels are significantly different from those at lower levels. If that is the case then it may be difficult to learn the upper-layer dynamics in periods where the blanket is seldomly excited to those levels, and that may decrease prediction accuracy when it is needed most. This challenge leads us to ask the following two questions:
(1) How can we correctly measure the model prediction accuracy if the accuracy at lower levels is of little interest?
(2) What can be done in terms of model training to direct focus on performance in the upper clarifier layers?
A technical detail worthy of mentioning, although common in practice, is the fact that the objective function to be minimized is based on the likelihood of one-step-ahead residuals as opposed to using multi-steps-ahead. The latter would be more natural since it matches the intended purpose of the model which is to produce forecasts. In the present case the relevant steps-ahead would be 30–60 which corresponds to 5–10 h (10 min per step). The use of such an objective function will however lead to correlation between all residuals in the time series significantly complicating the likelihood computation. That being said, naive implementations that do not take residual correlation into account could be further investigated. Such an implementation is arguably no worse than the colored residuals we have presented in Figures 4 and 11.
These non-normal residual structures remain an issue which should be addressed. We have considered other observation noise distributions through state-transformations, in particular log and logit domains. A detailed account of these can be found in Appendix C. The conclusion of this work was, however, that there was only a slight reduction in the auto-correlation, but a large decrease in model stability. As a final comment we remark that since the model purpose is prediction accuracy improvements in the one-step-ahead residuals alone are not enough but must also be accompanied by improvements in accuracy since this is the ultimate objective.
Finally, for a future outlook on the perspectives here, we mention that Thilker et al. (2021) showed the possibility of incorporating input forecasts into a stochastic differential equation through additional states. In the present case it would be obvious to try to incorporate the plant inflow into the model, although this will require additional observations, such as radar precipitation data. If successful, such an approach could generalize the modeling procedure across many wastewater treatment plants allowing a much easier implementation of the model and an associated clarifier predictive control strategy, limiting the required number of sensors and measurement devices.
CONCLUSION
A novel data-driven stochastic state space system for modeling and forecasting the sludge blanket height in secondary clarifiers have been presented. The model is trained on measurements of the sludge blanket height, and uses as inputs: (1) the mass inflow rate and (2) the clarifier recycle flow rate. The model prediction accuracy was observed to be great, relative to that of a persistence forecast, at a treatment plant with high data quality and well behaved clarifiers. The accuracy was more challenged at a treatment plant with lower data quality and uneven flow-distribution, in particular during winter, but the summer performance remained good. The model stability was similarly challenged at this more difficult-to-model plant, but was excellent on the former plant. The model residual analysis (one-step in-sample prediction errors) revealed a systematic pattern in the auto-correlation and deviated from the assumed normal distribution which indicates that there are missing drivers to be accounted for, but this issue could not be resolved. In conclusion, the presented model has the potential to be used in a model predictive control setting to improve secondary clarifier performance. As a concrete example a bypass algorithm could be developed whose objective is to maintain some prediction quantile of the sludge blanket height under the clarifier height, by allowing some fraction of the inflow to be bypassed. A similar strategy controlling the recycle flow rate, and trying to limit the effluent suspended solids could be used. Consequently the authors regard the presented work as a step towards the development of tomorrow’s modern data-driven wastewater treatment plants.
ACKNOWLEDGEMENTS
The data analysed in this paper were provided by Biofos (Damhusåen) and BlueKolding (Kolding Central). The work was partly funded by Biofos A/S and Krüger A/S in relation to the first author’s Ph.D studies. The work on developments of the used gray box modeling tools is supported by ARV (EU H2020 101036723), and ELEXIA (EU Horizon Europe 101075656). The authors acknowledge a conflict of interest insofar as the third and fourth co-authors are employed at Krüger A/S.
CONFLICT OF INTEREST
The authors have a conflict to declare.
DATA AVAILABILITY STATEMENT
Data cannot be made publicly available; readers should contact the corresponding author for details.