ABSTRACT
This study explores various approaches to formulating a parallel hybrid model (HM) for Water and Resource Recovery Facilities (WRRFs) that merges a mechanistic and a data-driven model. In the study, the HM is constructed by training a neural network (NN) on the residual of the mechanistic model for effluent nitrate. In an initial experiment using the Benchmark Simulation Model no. 1, a parallel HM effectively addressed limitations in the mechanistic model's representation of autotrophic bacteria growth and the data-driven model's inability to extrapolate. Next, different versions of a parallel HM of a large pilot-scale WRRF are constructed, using different calibration/training datasets and different versions of the mechanistic model to investigate the balance between the calibration effort for the mechanistic model and the compensation by the NN component. The HM can improve predictions compared to the mechanistic model. Training the NN on an independent validation dataset produced better results than training it on the calibration dataset. Interestingly, the best performance is achieved for the HM based on a mechanistic model using default (uncalibrated) parameters. Both long short-term memory (LSTM) and convolutional neural network (CNN) architectures are tested as data-driven components, with a CNN HM (root-mean-squared error (RMSE) = 1.58 mg NO3-N/L) outperforming an LSTM HM (RMSE = 4.17 mg NO3-N/L).
HIGHLIGHTS
In a parallel hybrid model (HM), a data-driven component compensates for structural gaps in a mechanistic model.
A data-driven component of an HM should be trained on an independent dataset.
Integrating an uncalibrated mechanistic model in a parallel HM may lead to better results than applying an overly calibrated mechanistic model.
A convolutional neural network outperforms a long short-term memory network as the data-driven component of an HM.
INTRODUCTION
The wastewater industry faces growing challenges as a result of stringent regulations, exemplified by the European Union's Water Framework Directive (European Commission 2020). These regulations necessitate a focused effort to meet strict water quality standards and to transition towards sustainable, energy-efficient, and circular technologies. In this context, modelling is a powerful tool to support Water and Resource Recovery Facility (WRRF) operators and engineers.
In recent decades, engineers have favoured mechanistic approaches for modelling wastewater treatment processes. In WRRF operations, a transparent methodology is one of the keys to adoption, leading researchers to favour simpler but interpretable methodologies (Newhart et al. 2019). Mechanistic models are based on fundamental engineering and scientific knowledge about the physical, chemical, and biological mechanisms that affect a system, with the relationships themselves defined by modellers and assumed to be known. A generally accepted mechanistic model for the biokinetic processes in the biological reactor of wastewater treatment plants is the family of activated sludge models (ASMs), of which Activated Sludge Model No. 1 (ASM1) is the most commonly used (Henze et al. 2000; Sin & Al 2021). The ASM1 was primarily developed to model the removal of organic carbon and nitrogen, but it also aims to accurately describe sludge production and oxygen consumption (Gernaey et al. 2004). The model uses a chemical oxygen demand (COD)-based modelling technique where all organic matter is expressed as equivalent amounts of COD, as it provides a link between electron equivalents in the organic substrate, the biomass, and the oxygen used.
Mechanistic models require relatively limited data input, have high interpretability, and can extrapolate the process performance to a wide variety of process operating conditions. However, these models often simplify the complex processes in the WRRF, such as aeration, mixing, or aggregation of particulates (Schneider et al. 2022). The application of mechanistic models can therefore be limited by knowledge gaps or the (over)simplification of complex processes. Some relevant processes are still not understood clearly enough to be included in a simulation model, such as the formation of nitrous oxide and the conversions within the biological phosphorus removal process (due to the variability in the metabolism of phosphate-accumulating organisms) (Regmi et al. 2019). This results in only a partial description of the process or system, which can be valuable but may not be robust to significant changes (Schneider et al. 2022). In addition, mechanistic models require extensive and time-consuming parameterization, calibration, and validation. Parameter estimation based on numerical optimization algorithms may also lead to non-unique parameter estimates and many local optima (Dochain & Vanrolleghem 2001; Hvala & Kocijan 2020), which in turn limits the predictive power of the model.
Data-driven approaches are gaining attention due to improved sensor technologies since these types of models are particularly successful when dealing with problems involving large and high-quality datasets (Cheng et al. 2020; Therrien et al. 2022; Khalil et al. 2023). Modern data-driven models are typically developed using (machine learning) algorithms that do not consider fundamental mass, charge, and energy balances. In contrast to mechanistic models, the structure of data-driven models is directly derived from data and does not require extensive knowledge about the system. Depending on the chosen complexity, data-driven models can theoretically approximate any underlying dynamics from the data. However, data-driven models have low knowledge-based interpretability and need high-quality datasets with sufficient representative dynamics, which can be a challenge due to harsh sensor conditions in WRRFs. They are generally also limited in extrapolation power (Schneider et al. 2022).
Hybrid modelling offers a way to combine the advantages of mechanistic and data-driven models. Hybrid models (HMs) integrate both modelling techniques and thereby benefit from the strengths of each (Von Stosch et al. 2014). Mechanistic models conserve critical process knowledge and maintain extrapolation capabilities, while data-driven approaches can discover hidden relationships and patterns that may not be sufficiently captured by mechanistic models. The application of hybrid modelling in the domain of WRRFs has the potential to foster automation (Rodriguez-Roda et al. 2002), increase efficiency, and increase the predictive power of models (Quaghebeur et al. 2022; Serrao et al. 2023).
Three different architectures of HMs can be distinguished: serial, parallel, and surrogate models. In a serial HM, the output of one model is used as input to the other model (Psichogios & Ungar 1992). Generally, the output of the data-driven model is used as the input to the mechanistic model since the data-driven component is capable of predicting the dynamics of certain subprocesses that are not accurately modelled by the mechanistic component, such as poorly defined reaction kinetics (Lee et al. 2002; Lee et al. 2005). In cases where the overall structure of the mechanistic model is inaccurate or the origin of the mechanistic model gap is unknown, the parallel arrangement can partially compensate for any structural mismatch in the mechanistic model (Oliveira 2004). In a parallel architecture, the HM aims to optimize predictive accuracy without necessarily deepening the understanding of underlying processes. This shift towards prioritizing predictive accuracy over in-depth process understanding is particularly beneficial in applications where immediate decision-making or control actions are critical, for example, in model predictive control. In a parallel HM, a data-driven component is trained to predict the residuals between the mechanistic model outputs and the measurement data. The predicted residuals are then fused with the mechanistic model outputs, usually by addition. Finally, surrogate models are data-driven models that are trained on the output of a mechanistic model to create a computationally more efficient model, allowing rapid and/or large-scale simulations (Forrester et al. 2006). The importance of surrogate models is expected to rise with the growing use of digital twins for real-time simulations, as they can be a valuable tool in this context.
Neural differential equations have recently emerged as a novel approach in hybrid modelling for wastewater treatment, learning the residuals of dynamics rather than the residuals of the state (Quaghebeur et al. 2022).
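The parallel arrangement described above can be illustrated with a minimal sketch. It is not the model used in this study: the mechanistic model, the measurements, and the straight-line residual model (standing in for the NN) are all hypothetical placeholders chosen so that the three steps (compute residuals, fit the data-driven component, fuse by addition) are easy to follow.

```python
def fit_line(x, y):
    """Ordinary least squares for y ~ a*x + b (simple stand-in for an NN)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def mechanistic(x):
    """Hypothetical mechanistic model that misses part of the process."""
    return 2.0 * x


# Hypothetical measurements: the "true" process contains an extra term
# (0.5*x + 1.0) that the mechanistic model does not describe.
inputs = [0.0, 1.0, 2.0, 3.0, 4.0]
measured = [2.0 * x + 0.5 * x + 1.0 for x in inputs]

# Step 1: residual between measurements and mechanistic model output.
residuals = [m - mechanistic(x) for x, m in zip(inputs, measured)]

# Step 2: train the data-driven component on the residuals.
slope, intercept = fit_line(inputs, residuals)


def residual_model(x):
    """Data-driven prediction of the mechanistic model residual."""
    return slope * x + intercept


def hybrid(x):
    """Step 3: fuse both components by addition."""
    return mechanistic(x) + residual_model(x)


print(hybrid(5.0))
```

In this toy setting the residual is exactly linear, so the hybrid prediction recovers the full process; with real data the residual model only captures the part of the mismatch that is learnable from its inputs.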
HMs have proven useful in a number of applications within the field of wastewater treatment. One example is the creation of influent generator models, which have been developed using different data-driven methods to predict the incoming flow rate and pollutant loads to WRRFs (Flores-Alsina et al. 2014; Zhu et al. 2015; Li & Vanrolleghem 2022). Another is the development of soft-sensing models. Soft-sensing models, often data-driven, can predict variables that serve as input information to controllers, which are used in conjunction with ASMs (Wang et al. 2022). The implementation and advancement of HMs can also encourage and enhance the effective deployment of digital twin technology in WRRFs (Torfs et al. 2022) and real-time model predictive control strategies (Serrao 2023).
Currently, the predominant architecture of HMs for wastewater treatment processes is the parallel HM (Piuleac et al. 2012; Cheng et al. 2021; Serrao et al. 2023), although instances of the serial architecture and the combination of parallel and serial models can also be found (Karama et al. 2001; Hvala & Kocijan 2020). The principal focus of the current research is predicting effluent quality (Piuleac et al. 2012; Serrao et al. 2023). Studies on HMs for WRRFs often provide little detailed documentation regarding the specific training and calibration efforts for both model components. Despite the application of different types of models as data-driven components of an HM in various studies, a consensus on this matter remains elusive. Although applications of HMs in WRRF modelling are increasing, research specifically addressing the most effective approach to developing such models is underdeveloped.
Schneider et al. (2022) listed various critical challenges in the field of hybrid modelling of WRRFs that need corresponding development efforts. Good Modelling Practice guidelines for the development of mechanistic activated sludge models are well established (Rieger et al. 2012). While there are as yet no official guidelines specifically tailored to data-driven models for WRRFs, several well-established documents provide best practices for setting up data-driven models in a general context (Hastie et al. 2009; Chollet 2021). The integration of hybrid modelling paradigms into existing frameworks for WRRF modelling raises important considerations regarding the extension of existing protocols and their transferability to hybrid modelling studies. Another challenge for HMs is that uncertainty arises from both the mechanistic and data-driven components, and studies often do not quantify this uncertainty, which limits the assessment of trust in HMs. Also, balancing complexity between the mechanistic and data-driven components requires more research to determine the acceptable level of error that can be compensated by the data-driven model. One aspect to address regarding this challenge is the establishment of a calibration protocol for HMs, with guidelines on the effort required for and the order of calibration of the mechanistic model compared to the data-driven model, as well as the data requirements for these calibration efforts. Selecting a suitable architecture presents another challenge in the development of HMs. It is important to ascertain the specific purposes for which parallel and serial models should be employed. Identifying the most effective data-driven models for distinct purposes is also crucial in achieving the desired outcomes.
This research aims to investigate how various factors impact the development of parallel hybrid modelling architectures to improve the predictive power of wastewater treatment plant models. The study is performed on a unique, continuously monitored pilot-scale wastewater treatment plant, called pilEAUte (Québec, QC, Canada). For this pilot-scale plant, a mechanistic model was developed in previous research (Kirim 2022) and is used in this work. However, accurate effluent quality prediction remains a challenge for this model. The first objective of this study is to explore the feasibility of developing a parallel HM for enhancing effluent quality prediction in the wastewater treatment plant. In addition, the optimal model structure of the data-driven component is explored by comparing the performance of a parallel HM with a long short-term memory recurrent neural network (LSTM-RNN) as the data-driven component and that of a parallel HM with a convolutional neural network (CNN) as the data-driven component. Finally, best practice in setting up parallel HMs with respect to the calibration and training efforts for the mechanistic and data-driven components is explored.
METHODOLOGY
Case study and mechanistic model
An HM is developed for a pilot-scale WRRF installed at Université Laval (Québec, Canada), called pilEAUte. The plant continuously receives domestic wastewater from a student residence and a kindergarten and rain runoff from a parking lot on the university campus. The WRRF treats 12 m3 of wastewater per day, and the configuration consists of a pumping station, a storage tank, a primary settler, and a biological reactor with five basins upstream of the secondary settling tank. For a more detailed description of the pilEAUte setup, refer to Kirim (2022). The plant is capable of carbon and nitrogen removal using a predenitrification configuration. The aeration flow rate in each basin is controlled using mass flow controllers based on the oxygen concentration in basin 4, with a setpoint of 3 mg/L during the study. Each basin is equipped with individual airflow lines, which are adjusted using ratio controllers. The biological reactors have an internal recycle system that circulates water from the last basin to the first basin, and settled sludge is recycled from the secondary clarifier to the first basin. The pilEAUte is monitored with sensors at the outlet of the primary clarifier, the biological reactors, the recycle flows, and the outlet of the secondary clarifiers (Appendix A).
Table 1. Calibrated biokinetic model parameter values (Kirim 2022)
Parameter | Description | Default value | Calibrated value | Unit | Deviation from default (%)ᵃ |
---|---|---|---|---|---|
b_H | Decay coefficient for heterotrophic biomass | 0.62 | 0.70 | 1/d | 12 |
f_XI | Fraction of biomass converted to particulate inert matter | 0.1 | 0.13 | – | 26 |
k_h | Maximum specific hydrolysis rate | 3 | 3.20 | gCOD/(gCOD*d) | 6 |
b_NH | Decay coefficient for NH4 oxidizing autotrophic biomass | 0.05 | 0.06 | 1/d | 18 |
b_NO | Decay coefficient for NO oxidizing autotrophic biomass | 0.033 | 0.04 | 1/d | 19 |
K_HNO2_NO | Nitrous acid half-saturation coefficient for NO oxidizing autotrophic biomass | 0.000872 | 0.000061 | gCOD/m3 | 174 |
K_NH3_NH | Ammonia half-saturation coefficient for NH4 oxidizing autotrophic biomass | 0.75 | 0.0057 | gNH3-N/m3 | 197 |
K_SH | Substrate half-saturation coefficient for heterotrophic biomass | 20 | 8.74 | gCOD/m3 | 78 |
mu_H | Maximum specific growth rate for heterotrophic biomass | 6 | 4.77 | 1/d | 23 |
mu_NH | Maximum specific growth rate for NH4 oxidizing autotrophic biomass | 0.8 | 0.71 | 1/d | 12 |
mu_NO | Maximum specific growth rate for NO oxidizing autotrophic biomass | 0.79 | 0.95 | 1/d | 18 |
K_NO2_H | Nitrite half-saturation coefficient for denitrifying heterotrophic biomass | 1 | 3.31 | gCOD/m3 | 107 |
K_O_NH | Oxygen half-saturation coefficient for NH4 oxidizing autotrophic biomass | 0.6 | 0.25 | gO2/m3 | 82 |
K_O_NO | Oxygen half-saturation coefficient for NO oxidizing autotrophic biomass | 1.5 | 0.27 | gO2/m3 | 139 |
K_OH | Oxygen half-saturation coefficient for heterotrophic biomass | 0.2 | 0.10 | gO2/m3 | 67 |
n_NO2 | Correction factor for anoxic growth of heterotrophs on nitrite | 0.6 | 0.92 | – | 42 |
n_NO3 | Correction factor for anoxic growth of heterotrophs on nitrate | 0.6 | 0.49 | – | 20 |
ᵃ The deviation from default is calculated by $\frac{|A - B|}{(A + B)/2} \times 100$, with A being the default parameter value and B being the calibrated parameter value.
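The deviations listed in the table are consistent with a symmetric percent difference, i.e. the absolute difference between the default and calibrated values divided by their mean. The short check below illustrates this for three rows of the table; the function name is chosen here for illustration only.

```python
def percent_deviation(default, calibrated):
    """Absolute difference relative to the mean of both values, in percent."""
    return abs(default - calibrated) / ((default + calibrated) / 2) * 100


# (default value, calibrated value) taken from three rows of the table
rows = {
    "b_H": (0.62, 0.70),
    "K_SH": (20, 8.74),
    "K_NH3_NH": (0.75, 0.0057),
}

for name, (default, calibrated) in rows.items():
    print(name, round(percent_deviation(default, calibrated)))
```

Because the difference is divided by the mean rather than by the default value, this measure is bounded at 200%, which is why parameters calibrated to values orders of magnitude below their defaults (such as K_NH3_NH) approach but do not exceed that limit.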
Timeline with the time periods used for the different mechanistic modelling purposes (Kirim 2022).
Calibrated mechanistic model results and measurements for effluent nitrate during the validation period (Kirim 2022).
Because it is currently unclear which structural component is missing in the mechanistic model and causes its suboptimal prediction of effluent nitrate, a parallel HM is implemented to address these discrepancies. A parallel HM is particularly advantageous in this context, as it can address inaccuracies in the overall structure of the mechanistic model or fill gaps of unidentifiable origin.
Parallel hybrid model setup
Recurrent neural networks (RNNs) are models that have at least one feedback loop, which allows them to take some temporal context into account in their decision function. Hence, RNNs are able to ‘remember’ information through time, which makes them a useful tool for time series forecasting (Goodfellow et al. 2016). The RNN cell encounters difficulties with long sequences because of vanishing or exploding gradient issues. During backpropagation, gradients can either become too small (vanish), resulting in inadequate weight updates, or too large (explode), causing unstable weight matrices. These issues stem from gradients being intractable and limiting the RNN's capacity to capture long-term dependencies effectively (Hewamalage et al. 2021). LSTM-RNNs address these issues as they are capable of learning long-term dependencies. Long short-term memory (LSTM) cells have an internal recurrence (a self-loop), in addition to the outer loop of the RNN. The cells have the same inputs and outputs as an ordinary recurrent network, yet each cell features additional parameters and a set of gating units that manage the information flow. The gates enable these networks to reject irrelevant information from the past and remember information in the current state (Goodfellow et al. 2016). LSTM networks have the ability to remember more than 1,000 time steps, depending on the complexity of the network's architecture (Hochreiter & Schmidhuber 1997). CNNs are often used for processing multidimensional inputs. CNNs use convolutional layers containing filters to identify important features in the input data. In addition, pooling layers are employed to summarize these features and extract the most prominent ones within a given locality (Goodfellow et al. 2016). Recently, CNNs have also gained attention as a valuable tool for time series forecasting (Wang et al. 2017; Durairaj & Mohan 2022). 
CNNs train filters that represent recurrent patterns in the series and use them to predict future values (Koprinska et al. 2018). They can also be effective for handling noisy time series by removing noise at each layer and extracting only the meaningful patterns (Borovykh et al. 2017).
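The filter and pooling operations mentioned above can be made concrete with a small sketch on a one-dimensional series. It is illustrative only: the kernel here is a fixed averaging filter, whereas a CNN learns its filter weights during training, and the input series is a hypothetical placeholder. As in most deep learning libraries, the "convolution" is computed as a sliding dot product without flipping the kernel.

```python
def conv1d(series, kernel):
    """Slide the kernel over the series (no padding) and take dot products."""
    width = len(kernel)
    return [
        sum(k * v for k, v in zip(kernel, series[i:i + width]))
        for i in range(len(series) - width + 1)
    ]


def max_pool1d(series, size=2):
    """Keep the largest value in each non-overlapping block of `size`."""
    return [
        max(series[i:i + size])
        for i in range(0, len(series) - size + 1, size)
    ]


# Hypothetical noisy series alternating between 0 and 3
signal = [0.0, 3.0, 0.0, 3.0, 0.0, 3.0, 0.0]

# Fixed averaging filter of width 3 (a CNN would learn these weights)
features = conv1d(signal, [1 / 3, 1 / 3, 1 / 3])
pooled = max_pool1d(features, size=2)

print(features)
print(pooled)
```

The averaging filter damps the oscillation in the input, and the pooling step then summarizes the filtered series by its most prominent local values, which mirrors the noise-reduction and feature-extraction roles described above.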
For the development of the NNs, Keras (version 2.11.0, Chollet 2015), a high-level interface of the TensorFlow platform (version 2.11.0, Abadi et al. 2016), is used. To integrate the Python-developed NN models with the mechanistic model, the WEST Tornado kernel is called through the command line interface. A Windows batch file is created to set the path to the Tornado kernel library and execute the XML file linked to the WEST model. The input variables for the NNs are preprocessed as described by Kirim (2022). Moreover, input data are standardized (centring and scaling) and subsampled from the original 1-min interval to a 10-min interval to reduce the computational burden. The input data are reshaped into a 3D tensor to match the format required by LSTM-RNNs and one-dimensional CNNs. The mean absolute error (MAE) is used as training loss and is minimized using the adaptive moment estimation (Adam) optimizer, and models are trained using a batch size of 64. Early stopping is used to determine the optimal number of training epochs. The number of layers is manually tuned first (between 1 and 5). The number of LSTM hidden units, the CNN kernel size and number of filters, the activation function (ReLU or tanh), and the learning rate of the Adam optimizer are then optimized using a random search tuner from the KerasTuner framework (O'Malley et al. 2019). As a regularization procedure, a drop-out layer with a drop-out rate of 0.1 is placed between consecutive layers of the NNs. All hyperparameters of the NNs are tuned using a separate validation period. The evaluation metrics are the coefficient of determination (R2), which estimates the proportion of variance accounted for by the model, and the MAE and root-mean-squared error (RMSE), which provide information about the magnitude of the error.
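The preprocessing and evaluation steps described above can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the study's pipeline: subsampling is done here by keeping every tenth sample (the text does not specify whether values were picked or averaged), the window length of six time steps is arbitrary, and the input array is a hypothetical placeholder.

```python
import numpy as np


def standardize(x, mean=None, std=None):
    """Centre and scale each column; pass training statistics to reuse them."""
    mean = x.mean(axis=0) if mean is None else mean
    std = x.std(axis=0) if std is None else std
    return (x - mean) / std, mean, std


def subsample(x, step=10):
    """Keep every `step`-th row (assumption: picking, not averaging)."""
    return x[::step]


def make_windows(x, y, n_steps):
    """Return a (samples, timesteps, features) tensor and aligned targets."""
    windows = np.stack([x[i:i + n_steps] for i in range(len(x) - n_steps + 1)])
    targets = y[n_steps - 1:]  # target at the last time step of each window
    return windows, targets


def mae(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))


def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r2(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Hypothetical 1-min data: 120 minutes, 2 input variables
raw = np.arange(240, dtype=float).reshape(120, 2)
target = raw[:, 0]

scaled, train_mean, train_std = standardize(raw)
x10, y10 = subsample(scaled), subsample(target)
windows, targets = make_windows(x10, y10, n_steps=6)

print(windows.shape, targets.shape)
```

The resulting three-dimensional array follows the (samples, time steps, features) layout expected by recurrent and one-dimensional convolutional layers in Keras. The statistics returned by `standardize` would normally be computed on the training period only and reused for the validation and test periods.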
Hybrid model setup with benchmark simulation model
In an initial experiment, a proof of concept is performed based on the well-established Benchmark Simulation Model no. 1 (BSM1) (Gernaey et al. 2014). The objective is to illustrate the capabilities of a parallel HM to capture missing process dynamics in the context of wastewater treatment as well as to assess the data requirements for training such a parallel HM. This experiment is important given the data constraints observed in the pilEAUte case study, where only a 2-month dataset was available for HM development. In the BSM1 experiment, which uses a dataset spanning a 2-week period, the goal is to assess the feasibility and performance of constructing a functional HM with data over a constrained time horizon. If a parallel HM can be successfully implemented for BSM1 using 2 weeks of high-frequency data, then developing a parallel HM based on 2 months of data with a similar frequency would be reasonable. The BSM1 serves as a test model for control strategies within a wastewater treatment setup featuring two anoxic tanks, three aerated tanks, and one settler. The ASM1 has been selected to describe the biological phenomena occurring in the biological reactor tanks. Realistic influent dynamics are available for the benchmark model, e.g. dry weather, or rainy weather (a combination of dry weather and a long rain period). With a configuration similar to the pilEAUte, this plant proves highly beneficial for initial explorations of an HM. The primary focus centres on predicting the NO3-N concentration in the effluent as this is also the variable of interest in the pilEAUte case study. First, synthetic data of the behaviour of a wastewater treatment plant using the full BSM1 are generated in WEST 2023 (DHI, Hørsholm, Denmark). The model simulated 150 days with constant influent to obtain a steady state. Subsequently, simulations were conducted for 2 weeks using the BSM1 dry weather influent, and an additional 2 weeks were simulated using the BSM1 rain weather influent. 
The data are sampled every 15 min, and Gaussian noise with mean μ = 0 and standard deviation σ = 0.25 is added to the effluent NO3-N data.
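The noise step can be sketched as follows. The constant "clean" series is a hypothetical placeholder for the simulated effluent NO3-N signal, and the fixed random seed is an assumption added only to make the example reproducible.

```python
import random

SAMPLES_PER_DAY = 24 * 60 // 15  # 15-min sampling interval


def add_gaussian_noise(signal, sigma=0.25, seed=0):
    """Add zero-mean Gaussian noise with standard deviation `sigma`."""
    rng = random.Random(seed)  # fixed seed: assumption for reproducibility
    return [value + rng.gauss(0.0, sigma) for value in signal]


# Hypothetical placeholder for two weeks of simulated effluent nitrate
n_samples = 14 * SAMPLES_PER_DAY
clean = [10.0] * n_samples

noisy = add_gaussian_noise(clean)
noise = [n - c for n, c in zip(noisy, clean)]

mean_noise = sum(noise) / len(noise)
std_noise = (sum((e - mean_noise) ** 2 for e in noise) / len(noise)) ** 0.5
print(n_samples, round(mean_noise, 3), round(std_noise, 3))
```

A two-week period at a 15-min interval yields 1,344 samples, which gives a sense of the amount of data available to the NN in this experiment.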
Hybrid model setup with pilEAUte model
Overview of the studied parallel hybrid model developed to predict effluent nitrate.
RESULTS AND DISCUSSION
Parallel hybrid model proof of concept on benchmark model
Results of the BSM1 experiment: hybrid model output and data-driven model output compared to mechanistic model output with wrong dynamics and the dataset for effluent nitrate for the training (dry weather) and test (rainy weather) period. The rain event during the wet weather conditions is indicated by the grey box.
In an additional experiment, μA is restored to its calibrated value of 0.5 d−1. Simultaneously, the ASM mechanistic model is modified by adjusting the value of the KNH parameter from 1 to 5 mg NH4-N/L, thereby introducing an alternative variation in the system's dynamics. Subsequently, an HM is constructed using this modified mechanistic model. The outcomes of this experiment are shown in Appendix B. This experiment yielded results similar to those observed in the first experiment. Although the HM exhibits somewhat inferior performance during the rain event within the test period compared to the HM in the first experiment, it still outperforms both the modified mechanistic model and the data-driven model.
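To give a sense of how strongly this parameter change alters the model dynamics, the sketch below evaluates the Monod term S_NH / (K_NH + S_NH) that limits autotrophic growth on ammonia in ASM1, for the original and the modified half-saturation coefficient. The ammonia concentrations are arbitrary illustrative values, not results from the study.

```python
def monod(substrate, half_saturation):
    """Monod limitation term: 0 at no substrate, approaching 1 at excess."""
    return substrate / (half_saturation + substrate)


K_NH_ORIGINAL = 1.0  # mg NH4-N/L
K_NH_MODIFIED = 5.0  # mg NH4-N/L

# Arbitrary ammonia concentrations (mg NH4-N/L) for illustration
for s_nh in (0.5, 1.0, 5.0, 20.0):
    original = monod(s_nh, K_NH_ORIGINAL)
    modified = monod(s_nh, K_NH_MODIFIED)
    print(f"S_NH={s_nh:5.1f}  original={original:.3f}  modified={modified:.3f}")
```

At low ammonia concentrations the modified coefficient reduces the growth term substantially (for example from 0.5 to about 0.17 at 1 mg NH4-N/L), whereas at high concentrations both versions approach 1, so the introduced model error is itself dependent on the operating conditions.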
In this part of the study, it was demonstrated that the application of a parallel HM allows for a significant reduction in RMSE, even when both components of the HM are given the same input data. Furthermore, it was shown that with a training period of only 10 days of high-frequency data, the HM achieved a good performance, evidenced by an RMSE of 0.64 mg NO3-N/L during the testing phase.
pilEAUte case study
The work of Kirim (2022) revealed that, despite an extensive and thorough calibration procedure, the mechanistic model of the pilEAUte plant still exhibits limited predictive power for effluent nitrate. This limitation was the driving force behind the investigation undertaken in this study, aiming to explore the potential and identify an optimal methodology for developing a parallel HM for a WRRF case study. The purpose of this HM is to compensate for the missing information in the mechanistic model by incorporating additional information and ultimately enhancing the overall predictive power of the model.
Data-driven models
Timeline with the time periods used for the different mechanistic modelling (Kirim 2022) and data-driven modelling purposes. The data-driven model is trained and validated using the calibration period of the mechanistic model and tested using the validation period of the mechanistic model.
LSTM-RNN data-driven model output and CNN data-driven model output compared to mechanistic model output and measurements for effluent nitrate for the training and test period. The NNs are trained on the calibration dataset of the mechanistic model.
Hybrid models using calibration dataset
LSTM-RNN hybrid model output and CNN hybrid model output compared to mechanistic model output and measurements for effluent nitrate for the training and test period. The NNs are trained on the calibration dataset of the mechanistic model.
The results of this experiment indicate that the data-driven components of the HM were not able to learn relevant missing dynamics from the remaining error of the mechanistic model in the calibration period. It should be noted that the mechanistic model itself has already been extensively calibrated to this data period. This raises the question of whether sufficient relevant structural information remains in the residual of this calibration period for the data-driven models to capture. In addition, the question arises to what extent the missing dynamics and knowledge in the output of the mechanistic model have already been corrected for during the calibration itself by shifting certain parameters away from their 'true' values. In such a case, the data-driven component in the HM faces a challenging task. To accurately predict unseen periods, it must achieve two objectives: (1) recognize and correct for the parameter compensation of the mechanistic model, and (2) incorporate the missing dynamics. These requirements are demanding, and there is a realistic possibility that the data-driven component fails to predict accurately in this case.
Hybrid model with validation dataset
Timeline with the time periods used for the different mechanistic modelling (Kirim 2022) and the hybrid modelling purposes. In this specific experiment, the CNN component of the hybrid model is trained on an independent dataset (the validation period of the mechanistic model), validated on the calibration period of the mechanistic model, and tested on the initialization period of the mechanistic model.
CNN hybrid model output compared to mechanistic model output and measurements for effluent nitrate for the training and test period. The CNN component is trained on an independent validation dataset.
The findings of these experiments suggest that training an NN in a parallel HM configuration on the same dataset as used for calibration of the mechanistic model may not yield effective results, as the remaining error during this period contains limited structural information for the model to learn. Training the data-driven component of an HM on an independent dataset appears to provide more relevant information, allowing the NN component to learn some of the structurally missing dynamics. However, an extensive calibration of the mechanistic model prior to considering a parallel HM still carries the risk of wrongly fitting the mechanistic model because of structural deficiencies, which hampers the overall HM performance.
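The assignment of periods to roles described above can be sketched as follows. This is a minimal illustration only: the period boundaries (given as sample indices), their chronological ordering, and the names `periods`, `roles`, and `split_by_period` are hypothetical and are not taken from the study.

```python
import numpy as np

# Hypothetical period boundaries as (start, stop) sample indices.
# The real period lengths and dates are not reproduced here.
periods = {
    "initialization": (0, 200),
    "calibration": (200, 600),
    "validation": (600, 1000),
}

# Role of each mechanistic-model period for the data-driven component,
# as described in the text: train on the independent validation period,
# validate on the calibration period, test on the initialization period.
roles = {"train": "validation", "validate": "calibration", "test": "initialization"}

def split_by_period(series, periods, roles):
    """Return one contiguous slice of the series per NN role."""
    splits = {}
    for role, period_name in roles.items():
        start, stop = periods[period_name]
        splits[role] = series[start:stop]
    return splits

series = np.arange(1000)  # stand-in for a residual time series
splits = split_by_period(series, periods, roles)
for role, values in splits.items():
    print(role, len(values))
```

Keeping each role in its own contiguous block avoids leaking information between the period used to calibrate the mechanistic model and the period used to train the NN.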
Analysis of mechanistic model structure deficiencies
For the mechanistic model of Kirim (2022), potential calibration issues can be observed in two different submodels: the hydraulic and the biokinetic models. The hydraulic model was established using data obtained from two tracer tests, which led to the determination of various backflows between the basins that were incorporated into the final model configuration. The results of the hydraulic model can be found in Appendix D. They demonstrate that the hydraulic model prediction is still suboptimal. In the first tracer test dataset, the model consistently overestimates the observed peak in each tank. For the second dataset, the shape of the predicted peak deviates from the observed peak in each tank. A suboptimal hydraulic model will influence the subsequent calibration of the biokinetic model parameters, as they may be used to compensate for missing dynamics in the hydraulic model. The calibrated parameters of the biokinetic model are presented in Table 1. It can be observed that the calibrated values for the model parameters related to the biokinetics of the two-step nitrification process are significantly lower than their default values (factor 10–100 difference). The same holds for the oxygen half-saturation concentrations (KO,NH and KO,NO) (factor 3–5 difference). This could be an indication that the biokinetic model parameters are suboptimally calibrated to compensate for missing phenomena in the model structure, resulting in an overall negative impact on the model's predictive power.
In the next part of the study, the potential impact of wrongly fitting the mechanistic component in HMs is explored. The goal is to identify the optimal balance between the calibration effort for the mechanistic model and acceptable compensation by the NN component. This investigation involves comparing the training of the NN component on the residual of the calibrated mechanistic model with that of an uncalibrated mechanistic model. To accomplish this, the mechanistic model's biokinetic parameters were set to their default values (Table 1), and the backflows included in the hydraulic model were removed. The model was re-executed, generating new mechanistic model predictions for effluent nitrate. The removal of the backflows had no impact on the model predictions. However, resetting the biokinetic parameters to their default values did result in a significant difference in the output of the mechanistic model. The uncalibrated model performs worse than the calibrated one, as expected.
A CNN is then trained and validated on the residual of the uncalibrated model. The NN is thus trained to predict the residual between the effluent nitrate output of the mechanistic model with default parameters and the corresponding measurements. The predicted residuals are then added to the output of the uncalibrated mechanistic model. The results are compared with those from the previous experiment, in which the CNN of the HM was trained on the residual between the output of the calibrated mechanistic model and the measurements. This experiment is performed twice: once with the NN component trained on the calibration dataset of the mechanistic model and once with it trained on the validation dataset of the mechanistic model.
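The parallel HM procedure described above (compute the residual, fit a data-driven model to it, add the predicted residual back to the mechanistic output) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the data are synthetic, the "mechanistic model" is a toy expression that deliberately lacks one term, and a least-squares polynomial fit stands in for the CNN. All names (`u`, `y_mech`, `features`, and so on) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data, for illustration only (not the pilot-plant data).
t = np.arange(400)
u = np.sin(2 * np.pi * t / 50)                   # hypothetical input signal
y_meas = 10 + 2 * u + 1.5 * u**2 + rng.normal(0, 0.1, t.size)  # "measurements"
y_mech = 10 + 2 * u                              # toy mechanistic model missing the u**2 term

def features(u):
    """Feature matrix for the stand-in regressor (assumption: replaces the CNN)."""
    return np.column_stack([np.ones_like(u), u, u**2])

train, test = slice(0, 300), slice(300, 400)

# Step 1: residual between measurements and mechanistic model output.
residual = y_meas - y_mech

# Step 2: fit the data-driven component on the training-period residual.
coef, *_ = np.linalg.lstsq(features(u[train]), residual[train], rcond=None)

# Step 3: hybrid prediction = mechanistic output + predicted residual.
y_hm = y_mech + features(u) @ coef

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

rmse_mech = rmse(y_meas[test], y_mech[test])
rmse_hm = rmse(y_meas[test], y_hm[test])
print(f"test RMSE mechanistic: {rmse_mech:.2f}, hybrid: {rmse_hm:.2f}")
```

In this toy setting the structural gap is fully visible in the residual, so the stand-in regressor recovers it easily. The experiments in this section concern the harder case where calibration has already distorted the residual.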
Accuracy metrics for comparison of mechanistic model (mech.), CNN hybrid model based on the calibrated mechanistic model (HM cal.), and CNN hybrid model based on the uncalibrated mechanistic model (HM uncal.) for training on the calibration dataset
Period | Model | RMSE (mg NO3-N/L) | MAE (mg NO3-N/L) | R2
---|---|---|---|---
Train | Mech. | 2.30 | 1.87 | 0.17
Train | HM cal. | 0.99 | 0.77 | 0.84
Train | HM uncal. | 1.39 | 1.05 | 0.70
Test | Mech. | 4.38 | 3.66 | <0
Test | HM cal. | 5.37 | 4.46 | <0
Test | HM uncal. | 1.58 | 1.29 | 0.23
The metrics correspond to the results in Figure 11.
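For reference, the three accuracy metrics can be computed as in the sketch below. It assumes the standard definitions (in particular R2 = 1 − SS_res/SS_tot), which the text does not spell out, and uses hypothetical numbers rather than the study's data. It also illustrates why R2 can be reported as "<0": a prediction that is worse than simply using the mean of the measurements yields a negative value.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-mean-squared error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(y_true - y_pred)))

def r2(y_true, y_pred):
    """Coefficient of determination, assuming the 1 - SS_res/SS_tot definition."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Hypothetical values, for illustration only.
y_true = np.array([4.0, 5.0, 6.0, 5.0])
y_good = np.array([4.2, 4.9, 5.8, 5.1])   # small errors
y_biased = y_true + 3.0                   # right dynamics, constant offset

for name, y_pred in [("good", y_good), ("biased", y_biased)]:
    print(name, rmse(y_true, y_pred), mae(y_true, y_pred), r2(y_true, y_pred))
```

The biased prediction reproduces the dynamics exactly but is offset by more than the spread of the measurements, so its R2 is strongly negative even though the shape is correct.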
CNN hybrid model output trained on the residual of a calibrated mechanistic model and CNN hybrid model output trained on the residual of an uncalibrated mechanistic model compared to mechanistic model output and measurements for effluent nitrate for the training and test period. The CNNs are trained on the calibration dataset of the mechanistic model.
CNN hybrid model output trained on the residual of a calibrated mechanistic model and CNN hybrid model output trained on the residual of an uncalibrated mechanistic model compared to mechanistic model output and measurements for effluent nitrate for the training and the test period. The CNNs are trained on an independent validation dataset.
These experiments show that an HM trained on the residual of the uncalibrated mechanistic model can improve the accuracy of the predictions and capture the right dynamics, whereas the HM trained on the residual of the calibrated model fails to compensate for the missing dynamics of the mechanistic model. Moreover, the predictions of the latter HM remain rather close to those of the mechanistic model.
The results of this experiment suggest that the residual of the uncalibrated mechanistic model contains valuable information and patterns that are not present in the residual of the calibrated model. By incorporating this residual information into the training process, the NN is able to improve its predictive capabilities, resulting in superior performance during the test period. The HM trained on the uncalibrated mechanistic model (Figure 11) performs better than both the CNN and LSTM purely data-driven models (Figure 7) in terms of prediction accuracy and capturing the dynamics. This suggests that the mechanistic model still offers valuable information to the HM, even though the performance of the stand-alone mechanistic model is far from optimal. The results of these experiments thus support the hypothesis that the mechanistic model may have been wrongly fitted. This is always a risk when a mechanistic model is unable to describe some dynamics. Detailed calibration of the mechanistic model with several unidentifiable parameters may lead to overfitting, making it more difficult for a parallel NN component to learn the missing model dynamics. The present work shows that a parallel HM approach can compensate for non-calibrated phenomena. This could potentially save time in the calibration effort of mechanistic models as well. Hvala & Kocijan (2020) set up an HM based on a default mechanistic model and pre-set input wastewater characterization. This model showed a prediction accuracy comparable to the prediction accuracy of an HM with tuned mechanistic model parameters, indicating that no prior complex tuning of the mechanistic model is needed for good HM performance.
Hybrid model with CNN and LSTM trained on uncalibrated mechanistic model residual
To further investigate the findings of the first experiments, where the performance of an HM with CNN was compared with an HM with LSTM, an additional HM is developed with an LSTM as the data-driven component. In this experiment, the LSTM is also trained on the residual of the uncalibrated mechanistic model. The goal of this experiment is to confirm that the inferior performance of the LSTM HM compared to the CNN HM can be attributed to the nature of the data-driven component in the HM, rather than the specific characteristics of the calibrated mechanistic model.
CNN and LSTM hybrid model output trained on the residual of an uncalibrated mechanistic model compared to mechanistic model output and measurements for effluent nitrate for the training and the test period.
LSTM-RNNs are typically used to predict time series due to their ability to capture long-term dependencies in sequential data. However, in this study, the LSTM-RNN fails to learn meaningful patterns and dependencies from the input data during the training phase. As a result, when faced with the test data, the model lacks the necessary knowledge and context to accurately recognize and predict patterns in the residuals. This observation is noteworthy because the existing literature suggests that LSTM-RNNs perform well as data-driven components in HMs, as demonstrated by Jia et al. (2021) and Dong et al. (2023). The precise cause of the failure of the LSTM remains undetermined. It is possible that the dataset in this study is not large and diverse enough for the LSTM-RNN to learn something useful. In contrast, the HM with a CNN as the data-driven component is able to learn important dynamics in the residuals and reproduce them during the test period. The CNN has the ability to concentrate on important time series features and can extract these features through its learning process. The convolution filters in a CNN can be useful for identifying patterns at different timescales or detecting specific events in the time series. This was also confirmed by Wang et al. (2019). However, additional research is needed to draw definitive conclusions regarding the optimal data-driven component for an HM. A holistic approach using comprehensive evaluation criteria is essential to compare the performance of data-driven models within this context.
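How convolution filters of different widths respond to events and to slower patterns can be illustrated with a small sketch. The signal is synthetic (a hypothetical 24-sample cycle plus a step change), and the two filters are set by hand for illustration, whereas a CNN would learn its filter weights from the data.

```python
import numpy as np

# Synthetic signal: a periodic cycle (period of 24 samples) plus a step change.
n, step_at = 120, 60
t = np.arange(n)
signal = 0.5 * np.sin(2 * np.pi * t / 24)
signal[step_at:] += 1.0

# Short filter: first difference, responds to abrupt events.
# With mode="valid", edge[i] = signal[i + 1] - signal[i].
edge_kernel = np.array([1.0, -1.0])
edge = np.convolve(signal, edge_kernel, mode="valid")
detected_step = int(np.argmax(np.abs(edge))) + 1

# Long filter: 24-sample moving average, removes the cycle and keeps the slow trend.
# With mode="valid", trend[i] = mean(signal[i:i + 24]).
trend_kernel = np.ones(24) / 24
trend = np.convolve(signal, trend_kernel, mode="valid")

print("step detected at sample", detected_step)
print("trend before and after the step:", round(trend[0], 3), round(trend[-1], 3))
```

The short filter isolates the step while largely ignoring the slow cycle, and the long filter does the opposite. A CNN stacks many such filters and tunes them during training, which is one plausible reason it can pick out recurrent features in the residual.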
The use of a relatively short dataset (2 months) may be seen as a limitation in this study. The dataset was constrained to the spring season due to frequent sensor errors and is thus not representative of a full year of data with seasonal influences. However, it should be stressed that data requirements for (hybrid) models can vary significantly based on the modelling objective and particularly the length of the prediction horizon that is envisioned. For applications requiring short-term predictions (such as model predictive control or short-term operational optimization in digital twins), a shorter training period could provide adequate results as long as sufficient (and relevant) dynamics are present in the training data. As in most modelling studies, retraining and/or recalibration of the model will become necessary at some point. The length and information content of the training dataset may influence the frequency at which recalibration or retraining is needed. For the current study, both the BSM1 experiment using 2 weeks of data, and the pilEAUte case study using 2 months of data indicate that parallel HMs can successfully be set up with datasets over a limited time horizon. Further testing on a longer dataset (for example, a full year) could be beneficial to evaluate the parallel HM's performance across different conditions, thereby ensuring its robustness and determining the need and frequency of retraining.
CONCLUSION
In this study, parallel HMs are developed in which the data-driven component of the HM is trained to forecast the residual between the mechanistic model output and the measurements for effluent nitrate. A first theoretical experiment based on BSM1 demonstrated the predictive capability of a parallel HM, which addressed both the limitations in the mechanistic model's representation of autotrophic bacteria growth and the data-driven model's inability to extrapolate. Constructing an HM of the pilEAUte showed that a parallel HM requires separate datasets for calibration of the mechanistic model and training of the data-driven model. Further improvement in HM performance was achieved when the CNN component of a parallel HM was combined with an uncalibrated mechanistic model. This indicates that HMs may be more difficult to construct when the mechanistic component is wrongly calibrated with uncertain or unidentifiable parameters. It also indicates that balancing calibration efforts between the mechanistic and data-driven components of an HM is crucial for optimal final model performance. An iterative approach is recommended over a sequential calibration procedure.
Finally, different choices for the data-driven component were tested for their ability to reveal the hidden dynamics from WRRF time series. Comparing the LSTM-RNN and CNN HMs, the CNN HM outperformed the LSTM HM in learning meaningful patterns and predicting accurate dynamics in wastewater treatment system time series with recurrent phenomena. This indicates that CNN is more suitable as a data-driven component for parallel HMs in this context than LSTM-RNN. When developing a model for wastewater treatment processes, thoughtful allocation of efforts is essential, potentially giving preference to refining HMs, as they may provide a more effective path for improvement than excessive calibration efforts in mechanistic models.
FUTURE PERSPECTIVES
Further research is required to reach the full potential of hybrid modelling. Implementing a parallel HM for a full-scale treatment plant and examining various strategies for its setup could offer additional valuable insights into the optimal construction of an HM.
Currently, limited research is available on determining the most effective data-driven components in parallel HM structures. This study indicates the potential of a CNN as the data-driven component of a hybrid modelling approach, but further exploration of alternative options may lead to even better methods. In recent years, CNN-LSTM models have emerged as a valuable data-driven approach for time series prediction. Such a model combines the strengths of both architectures: the feature extraction capabilities of the CNN and the ability of the LSTM to capture long-term dependencies in input time series data.
This study aimed to identify the most effective methods for constructing a parallel HM. However, a general framework for integrating mechanistic and data-driven models still needs to be developed. It should address key considerations such as data compatibility, model selection, parameter estimation, model validation, and interpretation of the HM results. By developing this framework, researchers and practitioners can ensure a systematic and reliable integration of data-driven and mechanistic models, leading to improved accuracy and understanding of complex systems.
ACKNOWLEDGEMENTS
Jan Verwaeren received funding from the ‘Onderzoeksprogramma Artificiële Intelligentie (AI) Vlaanderen’. The work of Peter Vanrolleghem was supported by the Natural Sciences and Engineering Research Council of Canada Discovery Grant RGPIN-2021-04347 ‘Towards digital twin-based control of Water Resource Recovery Facilities – Methods supporting the use of adaptive hybrid digital twins’.
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.
CONFLICT OF INTEREST
The authors declare there is no conflict.