## ABSTRACT

Model parameter estimation is a well-known inverse problem when single-valued point data are available as observations of the system performance. However, classical statistical methods, such as the minimization of an objective function or maximum likelihood, are no longer straightforward when measurements are imprecise in nature. Typical examples of the latter include censored data and binary information. Here, we explore Approximate Bayesian Computation as a simple method to perform model parameter estimation with such imprecise information. We demonstrate the method for the example of a plain rainfall–runoff model and illustrate its advantages and shortcomings. Last, we outline the value of Shapley values to determine which types of observation contribute to the parameter estimation and which are of minor importance.

## HIGHLIGHTS

Approximate Bayesian Computation is a likelihood-free approach for inverse parameter estimation.

The method is both computationally simple and versatile in using imprecise information.

Censored and binary data are useful observations for parameter estimation.

Shapley values allow us to estimate the significance of a specific type of observation.

The Monte-Carlo based method is computationally expensive.

## INTRODUCTION

Model parameter estimation by inverse modeling is the adjustment of unknown parameter values to obtain a match between model predictions and measurement data, i.e., observations (Refsgaard 1997; Šimůnek & De Vos 1999; Tscheikner-Gratl *et al.* 2016). Classical inverse modeling relies on paired prediction–observation data in the form of single-valued points, and the observations are assumed to be uncensored. Two approaches are popular for solving the calibration problem: first, the minimization of an objective function and, second, maximum likelihood. The first method selects an objective function that is related to the deviations between observations and model predictions; the best fit of the parameter values is achieved by minimizing this objective function (Marquardt 1963; Madsen *et al.* 2002). The second method determines the best fit by maximizing the likelihood, i.e., the probability of the model predictions matching the observations (Beven & Binley 1992). Many textbooks are available on calibration by inverse modeling (Beck & Arnold 1977; Tarantola 2005; Beven 2010; Aster *et al.* 2018, etc.) and the reader is expected to be familiar with the approach.

For any practical application, it has to be recognized that measurements are inherently inaccurate (see, e.g., McMillan *et al.* 2018 for hydrological data) and – in principle – classical calibration methods are no longer straightforwardly applicable. Still, as long as single-valued point data (or a time series thereof) are available, the most common approach is to ignore errors in the parameter estimation and accept the resulting uncertainty in model prediction. More recently, authors such as Beven & Binley (1992), Abbaspour *et al.* (1997), Higdon *et al.* (2004), Gupta *et al.* (2005), Todini (2008) and Dotto *et al.* (2011, 2012) discuss methods to take uncertainties in measurements as well as deficits in model structure into account (Deletic *et al.* 2012). The underlying idea is to assign uncertainties to the parameters so that the model predictions fit the observations as well as presumptions on inaccuracy and other prior information (Abbaspour *et al.* 2007). With increasing model complexity and linked or coupled models, the propagation of uncertainties through models becomes more important (Tscheikner-Gratl *et al.* 2019). At the same time, data collection and handling methods are improving. However, there are still uncertainties related to data availability (Hajihassanpour *et al.* 2023), and even when data availability is good, it may be difficult to process all available data sets (Tscheikner-Gratl *et al.* 2016). Finally, recent developments include the application of artificial intelligence algorithms to calibration and uncertainty assessment problems (Garzón *et al.* 2022), attempts to assess long-term prediction uncertainties related to global change (Kourtis & Tsihrintzis 2021) and efforts to improve robustness under deep uncertainty (Casal-Campos *et al.* 2018).

However, there are instances where point measurements are no longer available, i.e., the system performance cannot be measured exactly and the information is imprecise. In statistical terms, we can consider imprecise observations as being censored, meaning that only part of the information is available (see, e.g., Gijbels 2010). This is a common problem in various urban drainage applications. For example, in deterioration models of ageing infrastructure, censored data are often discussed in relation to the problem that data are only available for parts of the system. If this introduces a bias into the data, e.g., if data are only collected within a certain area, or only older pipes with certain materials are inspected, this can lead to biased data-driven models (Scheidegger *et al.* 2011). Other options are simplified data such as proxy data or binary data (Wani *et al.* 2017). However, despite being imprecise, such information is still considered to be hard data – as compared to soft data. Seibert & McDonnell (2002) define soft data as ‘qualitative knowledge from the experimentalist that cannot be used directly as exact numbers but that can be made useful through fuzzy measures of model simulation and parameter value acceptability’. Thus, in this context, one can turn soft data into imprecise information by subjective judgement of the associated probabilities.

Different approaches have been pursued for parameter estimation using such imprecise information: Otto & Bandemer (1986) use fuzzy theory, Wang (2019) evidence theory, while Sankararaman & Mahadevan (2012) provide a general framework based on both least-squares optimization and likelihood assessment. Wani *et al.* (2017) develop a methodology for censored and binary data by defining a likelihood function and using Bayesian inference for the estimation.

In this paper we demonstrate the use of a likelihood-free approach for this problem, that is Approximate Bayesian Computation (ABC) (Nott *et al.* 2012; Vrugt & Sadegh 2013; Kavetski *et al.* 2018). The virtue of the approach is its intuitive angle toward using imprecise information and the simple computational implementation. In the following, we first present the theoretical background and then apply the methodology to two case studies. The first one is a simple but real-world parameter estimation problem that arose in wastewater-based epidemiology, while the second shows the application to a semi-virtual urban drainage catchment – using measured rain series as input data. In order to present a unified framework, we also demonstrate how to use Shapley values to identify which observations are supporting the parameter estimation and which are of little importance.

## METHODS

### Typology of imprecise information

- **Interval data**: The data are bracketed, i.e., the measurement value lies within an interval defined by lower and upper bounds. Typically, the boundary values are estimated as confidence intervals from observations or prior knowledge, e.g., CI 2.5–97.5%. Also, soft data can be transformed into (hard) interval data; an example would be the expectation that data from a highly unreliable sensor can still be bracketed within a prediction uncertainty.

- **Probability distribution**: This information is given for the classical description of measurement uncertainty and assumes that observations can be described with a probability distribution. For example, the uncertainty in a sensor measurement could be assumed to be normally distributed. Note that for a uniform distribution, the information is equal to interval data.

- **Binary information**: The measured quantity is either yes/no or value A/value B. The simplest form is information on the state of a system, e.g., for a certain rain event, we either experience a combined sewer overflow or not, or flooding on the surface or not.

- **Boundary information**: The data are censored in such a way that measurements are only known to lie below or above a defined boundary of the true value. For example, the runoff from an urban catchment cannot be lower than the dry weather baseflow. Boundary information is a special case of interval data.

### Maximizing similarity

The classical approach maximizes the similarity between model predictions and observations by means of an objective function. A popular choice is the Nash–Sutcliffe efficiency *NSE*:

$$NSE = 1 - \frac{\sum_{t=1}^{T}\big(Y_{O,t}-Y_{m,t}\big)^{2}}{\sum_{t=1}^{T}\big(Y_{O,t}-\bar{Y}_{O}\big)^{2}}$$

where *t* = time from *1..T*; *Y*_{O} is the observation, *Y*_{m} the model prediction and Ȳ_{O} the mean of the observations. The objective function is a measure of the success of the calibration. Next, the optimal parameter set *Θ* is determined by maximizing *NSE* toward the best fit *NSE* *=* *1*, which is the same as minimizing the *RMSE*, using repeated model simulations with varying *Θ*.
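As an illustration, the two objective functions can be computed as follows (a minimal sketch; the variable names are ours):

```python
import math

def rmse(y_obs, y_mod):
    """Root mean square error between observations and model predictions."""
    n = len(y_obs)
    return math.sqrt(sum((o - m) ** 2 for o, m in zip(y_obs, y_mod)) / n)

def nse(y_obs, y_mod):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 means the model
    performs no better than the mean of the observations."""
    mean_obs = sum(y_obs) / len(y_obs)
    sse = sum((o - m) ** 2 for o, m in zip(y_obs, y_mod))
    ssm = sum((o - mean_obs) ** 2 for o in y_obs)
    return 1.0 - sse / ssm

y_obs = [1.0, 2.0, 3.0, 4.0]
y_mod = [1.1, 1.9, 3.2, 3.8]
```

Since *NSE* only rescales the sum of squared errors by the variance of the observations, maximizing *NSE* and minimizing *RMSE* select the same parameter set.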

It is possible to adapt this traditional approach in inverse modeling to imprecise information by using the following algorithm:

1. Transform the imprecise observations *Y*_{O} into a matching distribution *q(Y*_{O}*)* (e.g., uniform between lower/upper bounds for interval data) and define an initial set of parameter values *Θ* as well as a (prior) distribution for the parameters π(*Θ*).
2. Compute the resulting probability distribution of the model predictions *p(Y*_{m}*)*, e.g., by sampling from π(*Θ*) using Monte-Carlo simulation.
3. Compute a similarity metric of the two distributions *q(Y*_{O}*)* and *p(Y*_{m}*)*.
4. Repeat steps 2 and 3 with successive improvement of *Θ* until the similarity is maximized.
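Step 3 requires a distance between the two empirical distributions; one possible choice (our assumption – the algorithm does not prescribe a specific metric) is the two-sample Kolmogorov–Smirnov distance:

```python
import bisect

def ks_distance(a, b):
    """Two-sample Kolmogorov-Smirnov distance between the empirical CDFs
    of two samples; 0 = identical, 1 = fully separated."""
    a, b = sorted(a), sorted(b)
    def ecdf(s, x):
        # fraction of samples in s that are <= x
        return bisect.bisect_right(s, x) / len(s)
    # the maximum CDF difference is attained at one of the sample points
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in a + b)
```

Minimizing this distance between samples from *q(Y*_{O}*)* and *p(Y*_{m}*)* corresponds to maximizing the similarity in step 4.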

### Approximate Bayesian Computation

As demonstrated by Wani *et al.* (2017), the maximum likelihood method can be extended to use censored and binary information for parameter estimation. Here we take a slightly different approach (Nott *et al.* 2012), which is denoted as likelihood-free Bayesian inference or ABC. The method was originally introduced in the 1990s as a rejection technique, thus bypassing the requirement to compute a likelihood function (Tavaré *et al.* 1997; Pritchard *et al.* 1999). The prototype rejection algorithm is as follows (adapted from Ke & Tian 2019):

Given the observations *Y*_{O} and a prior distribution for the parameters π(*Θ*):

1. Generate a sample *Θ*′ from the prior distribution π(*Θ*).
2. Simulate using *Θ*′ to get the model prediction *Y*_{m}.
3. Accept the sample if *Y*_{m} *=* *Y*_{O} for binary data, or if *Y*_{m} lies within the bounds of *Y*_{O} for interval data; otherwise reject the sample.
4. Repeat steps 1 to 3 until the required number of posterior samples is obtained.
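The rejection scheme can be sketched with a toy one-parameter model (the model, the observation values and all names are our illustration, not the implementation of the case studies):

```python
import random

random.seed(42)

def model(theta):
    """Toy 'model': predicts a peak flow (continuous) and whether an
    overflow occurs (binary) for a given parameter value."""
    peak = 2.0 * theta          # continuous prediction
    overflow = peak > 1.5       # binary prediction
    return peak, overflow

# Imprecise observations: the peak is only known as an interval,
# the overflow state is a binary observation.
peak_interval = (1.6, 2.4)      # lower/upper bound
overflow_obs = True

def abc_rejection(n_posterior, prior=(0.0, 2.0)):
    accepted = []
    while len(accepted) < n_posterior:
        theta = random.uniform(*prior)          # 1. sample from the prior
        peak, overflow = model(theta)           # 2. simulate
        if (overflow == overflow_obs            # 3. binary match and
                and peak_interval[0] <= peak <= peak_interval[1]):
            accepted.append(theta)              #    interval match: accept
    return accepted

posterior = abc_rejection(200)
```

Every accepted sample reproduces both the binary and the interval observation, so the list `posterior` directly approximates the posterior distribution of the parameter.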

The result is a posterior distribution of the model parameters *Θ* that match the imprecise information. Note that the ABC methodology usually applies a summary statistic for acceptance (Ke & Tian 2019), much in the same way as the objective function mentioned above. However, it is a virtue of the simple ABC rejection scheme that the information on the acceptance of a sample is not transformed by complex functions but can be used directly. Compared to standard Bayesian inference, the benefit is that no assumption on the distribution of the model residuals is necessary. The basic rejection scheme is thus well suited for the assessment of imprecise information.

The basic ABC rejection method may result in a huge computational burden if the acceptance rate is low, as a suitable number of posterior samples is necessary. Just for the sake of an estimate: given an acceptance rate of 1%, aiming to achieve 10^{2} posterior samples would already require 10^{4} simulations. Thus, many improvements to the simple rejection scheme have been proposed, most importantly MCMC-ABC (Marjoram *et al.* 2003) and sequential Monte-Carlo (SMC) methods (Sisson *et al.* 2007). In order not to overload the paper, we refrain from discussing (and implementing) possible improvements and use the plain rejection scheme.

Another point that needs to be addressed is the statistics on the resulting posterior distribution of the parameters. For the assessment of the parameter estimation and the comparison of methods, one might not be able to use the full posterior distribution of the parameters but needs a resulting set of values *Θ*_{R}. If a summary statistic is available, one can resort to this information to determine the best-fitting parameter set. However, if we have more than one parameter and lack such a summary statistic, this is no longer straightforward. Intuitively – and ignoring covariance – we can either use the median (50%ile values) or the mode (maximum density value) of the posterior probability density function. For the problems at hand, the median values gave more convincing results and have thus been used throughout.
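Extracting the median and a histogram-based mode from a one-dimensional posterior sample can be sketched as follows (a minimal illustration; the bin count and all names are our choice):

```python
import statistics

def posterior_point_estimates(samples, bins=20):
    """Median and (histogram-based) mode of a 1-D posterior sample."""
    med = statistics.median(samples)
    lo, hi = min(samples), max(samples)
    width = (hi - lo) / bins or 1.0   # guard against a degenerate sample
    counts = [0] * bins
    for s in samples:
        i = min(int((s - lo) / width), bins - 1)
        counts[i] += 1
    k = counts.index(max(counts))
    mode = lo + (k + 0.5) * width     # centre of the fullest bin
    return med, mode
```

For multi-parameter problems the estimates are taken per parameter, which – as noted above – ignores covariance between the parameters.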

### Shapley value

Especially for imprecise measurements, it is common in practice that different sources of information (types of observations, i.e., input time series) are available and used for parameter estimation. The general attitude is that the more, the better. However, this simple approach falls short, as inverse model parameter estimation is usually nonlinear. Depending on both the information available and the structure of the model (including the modeling aim), it is important to identify which observations support the parameter estimation and which are of little importance. Lundberg & Lee (2017) and Sundararajan & Najmi (2020) demonstrate that Shapley values (Shapley 1953) are a measure of the importance of a specific feature (here: observation type) to model predictions.

The Shapley value *ϕ*_{i} of a player *i* is computed according to the following relation (Shapley 1953):

$$\phi_i(v) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(n-|S|-1)!}{n!}\,\big(v(S \cup \{i\}) - v(S)\big)$$

where *N* *=* *{1,..,n}* is the set of *n* players, *S* a subset of *N* not containing *i* (indicating all possible coalitions, with |*S*| denoting the number of players in each coalition) and *v* the characteristic (objective) function. In our framework a player represents a certain type of observation (input time series) and the payoff is a measure quantifying the result of the parameter estimation; for the latter we can, for example, use the metric of the objective function, i.e., *RMSE* or *NSE*. Note that the interpretation of Shapley values can require a complex framework – especially for a larger number of *n* (Lundberg & Lee 2017) – but for the problem at hand, we can resort to a direct comparison of the Shapley values.

### Computational implementation

The computational burden of Monte-Carlo based procedures limits applications to fast models and/or requires the use of efficient algorithms and parallel computing. The numerical scheme was coded in ANSI C and tested on a computer with an Apple M1 Max 10-core @3.22 GHz processor and 32 GB LPDDR5-6400 RAM. The runtime for processing the catchment hydrology case study is 70.3 s for 200,000 Monte-Carlo samples – for the calibration rain series with a length of 3 months and a 10 min timestep resolution. Since the implementation is numerically efficient, the performance is fast enough for the analysis. Parallelization of the code would further improve the computational speed, as the computational complexity is *O(N)* with *N* being the number of Monte-Carlo samples.

## RESULTS

### Case study 1 – Wastewater-based epidemiology

The first case study stems from wastewater-based epidemiology (Rauch *et al.* 2022). Assuming that each infected person is shedding a certain load of genetic material per time (*L*_{shed}) into the sewer system, and neglecting all uncertainties and losses, we get a simplified measure of the number of infected persons in the watershed (*IP*):

$$IP = \frac{L_{virus}}{L_{shed}}\,P \qquad \text{with} \qquad L_{virus} = Q \cdot c_{virus}$$

where *Q* = flow volume in L/P/d; *c*_{virus} = measured virus concentration in gene copies/L; *L*_{virus} = measured virus load in gene copies/P/d; *IP* = number of infected persons in the watershed; *P* = total population in the watershed; and *L*_{shed} is the individual virus shedding load in gene copies/P_{infected}/d.

Given precise measurements (*IP* and *L*_{virus}), we could compute *L*_{shed} directly. However, in reality, the information is imprecise at best. For Austria, interval data (95% CI 2.6–3.5%) are available for the percentage of infected persons in the total population in the period 12th–14th November 2020 (Rauch *et al.* 2024). For the viral load measurement, we have data from 22 wastewater treatment plants in the same period – covering 4.1 million PE, i.e., approximately 50% of the Austrian population (see Figure 2 Left). It is estimated that the measurements (L_{virus}) can be expressed as normally distributed with *μ* = 366 and *σ* = 188 (both in Mcopies/P/d).

The maximizing-similarity approach yields *L*_{shed} = 10^{10.15}, while Figure 3 gives the distribution of the accepted parameter space for ABC, with *L*_{shed} = 10^{10.135} providing the highest density. Both approaches give closely coinciding results, with a relative error of 0.1% for the exponent.
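As a rough cross-check (our illustration, evaluating the simplified relation at the mean measured load only), the implied infected fraction can be computed directly:

```python
# Infected fraction implied by the mean measured load and the estimated
# shedding load (back-of-the-envelope check, not the full calibration).
L_virus = 366e6          # mean measured load in gene copies/P/d
L_shed = 10 ** 10.15     # estimated shedding load in gene copies/P_infected/d
infected_fraction = L_virus / L_shed
print(f"{infected_fraction:.1%}")   # prints "2.6%"
```

The value lies at the lower end of the reported 2.6–3.5% interval, which is plausible since the calibration balances the full measurement distribution rather than its mean only.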

### Case study 2 – Catchment hydrology

^{2}, reservoir coefficient T in sec, retention basin volume V in m

^{3}and maximum flow in the interceptor to the downstream treatment plant I in m

^{3}/s (see Figure 4 Left). The model computes only rainfall–runoff – adding dry weather flow is conceptually simple but does not add to this study.

The model computes the hydrograph from the catchment by means of a linear reservoir (parameters A and T) and the combined sewer overflow (CSO) structure is represented by means of a mass balance model (parameters V and I). For implementation and application examples see Kleidorfer *et al.* (2009a, 2009b) and Dotto *et al.* (2011). As rain input, we use a 3-month period for calibration (see Figure 4 Right) and a 1-year rain series (different year, i.e., no overlap in the information) for verification.
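The model structure can be sketched as follows (a minimal illustration of a linear reservoir feeding a CSO basin; the explicit time discretization and all names are our assumptions, not the published implementation):

```python
def simulate(rain, A, T, V, I, dt=600.0):
    """Linear reservoir (parameters A, T) followed by a CSO basin
    (parameters V, I). rain in m/s per timestep, dt in s.
    Returns the CSO overflow series in m^3/s."""
    q = 0.0        # reservoir outflow in m^3/s
    s = 0.0        # basin filling in m^3
    overflow = []
    for r in rain:
        inflow = r * A                       # effective runoff in m^3/s
        # linear reservoir: dq/dt = (inflow - q) / T (explicit Euler step)
        q += dt * (inflow - q) / T
        # basin mass balance: flow above the throttle I fills the basin
        s = max(s + dt * (q - I), 0.0)
        spill = max(s - V, 0.0) / dt         # volume above V spills as CSO
        s = min(s, V)
        overflow.append(spill)
    return overflow
```

With the reference parameters of Table 1 (A = 10,000 m^2, T = 1,800 s, V = 15 m^3, I = 0.002 m^3/s), dry timesteps produce no overflow while sustained heavy rain fills the basin and triggers a CSO.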

For a reference situation, the parameter values (see Table 1) are chosen to reflect a standard combined sewer catchment according to Austrian guidelines (Kleidorfer & Rauch 2011). Using this reference model, three series of imprecise observations are computed, which are then used as input in the calibration process. The idea is to use the ABC method to inversely estimate the reference parameter values from these observations. As prior information, the parameters are assumed to be uniformly distributed (see Table 1 for lower/upper boundaries).

|  | Reference | Prior LB | Prior UB | CSO | EMF | BET | All |
|---|---|---|---|---|---|---|---|
| **(a) Parameter** |  |  |  |  |  |  |  |
| A in m^{2} | 10000.0 | 5000.0 | 12000.0 | 9845.0 | 10337.0 | 9689.0 | 10499.0 |
| T in s | 1800.0 | 1000.0 | 3000.0 | 2114.0 | 1757.0 | 1701.0 | 1735.0 |
| V in m^{3} | 15.0 | 10.0 | 25.0 | 17.1 | 17.5 | 15.9 | 18.5 |
| I in m^{3}/s | 0.00200 | 0.00100 | 0.00400 | 0.00168 | 0.00250 | 0.00166 | 0.00189 |
| **(b) Result** |  |  |  |  |  |  |  |
| NSE |  |  |  | 0.745 | 0.708 | 0.787 | 0.841 |
| RMSE |  |  |  | 0.430 | 0.460 | 0.394 | 0.339 |
| relDev Par. |  |  |  | 0.122 | 0.119 | 0.079 | 0.093 |
| relDev CSO |  |  |  | 0.000 | 0.125 | 0.063 | 0.063 |


For imprecise measurement, we assume the following three data series of observations. Note that all three observations are not continuous time series data but summarize information to a single data point per rain event. A rain event is defined by an antecedent dry weather period of 240 min:

- **CSO – combined sewer overflow**: The binary information whether a CSO event has occurred for a rain event or not (Wani *et al.* 2017), which is fairly simple information to record (Kleidorfer *et al.* 2009a, 2009b).

- **EMF – empirical maximum flow**: If the runoff from the catchment exceeds a value of *5·I* (i.e., for extreme events), we assume that uncertain information is available (relative error < 50%) on the maximum flow in the catchment during that rain event. Such information could come from a crowd source or simple surrogate sensors. To match reality, we further assume that the information is only recorded randomly in 80% of the cases.

- **BET – basin emptying time**: If the retention basin has filled up, one can easily obtain information on the time required for emptying, i.e., directly by using a water level sensor, or indirectly via the pumping time (if pumps are used for basin emptying) or a flow measurement at the inlet of a treatment plant (an inflow above the base level indicates basin emptying). The emptying time of the basin can be seen as a form of software sensor. We assume that the observation is recorded as soon as the emptying time exceeds 60 min, i.e., if the basin is approximately half full. This threshold is chosen arbitrarily but with the intention to limit the signal to significant events. We further assume that the observation is inaccurate with a deviation of 60 min, i.e., a sample is accepted if the simulated emptying time lies within ±60 min of the observed one.
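The event definition above can be sketched with a simple segmentation helper (our illustration; the function and variable names are assumptions):

```python
def split_events(rain, dt_min=10, dry_gap_min=240):
    """Split a rain series (one value per timestep) into rain events
    separated by an antecedent dry weather period of dry_gap_min minutes.
    Returns a list of events, each a list of timestep indices."""
    gap_steps = dry_gap_min // dt_min
    events, current, dry = [], [], gap_steps
    for i, r in enumerate(rain):
        if r > 0:
            if dry >= gap_steps and current:
                events.append(current)   # dry gap long enough: close event
                current = []
            current.append(i)
            dry = 0
        else:
            dry += 1
    if current:
        events.append(current)
    return events
```

Each event then contributes a single data point per observation type (CSO yes/no, maximum flow if it exceeds the threshold, basin emptying time if recorded).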

Table 1 shows the result of the model parameter estimation effort for the types of observations available (CSO, EMF and BET), where All denotes the situation where all three observations are used. Calibration is done for the calibration rain series. Model performance is then computed by using the resulting parameter set (median values of the posterior samples) and simulating the verification rain series. As performance metrics we use RMSE and NSE of the CSO time series, the mean relative deviation of the parameters and the relative deviation from the true number of CSO events. It is interesting to note that adequate calibration results can be achieved for each of the observation types on its own. Consequently, imprecise information, e.g., binary information on CSO events, is a valuable source of information for parameter estimation.

A note on determining the resulting parameter set *Θ*_{R}: imprecise information inherently does not allow formulating a summary statistic in a practical situation (the metrics of model performance in Table 1 can only be computed because we know the reference system). In this case, we cannot use one unifying metric to identify the best-fitting parameter set but need to select a suitable value from the posterior probability density function of each parameter, e.g., the median or the mode. For this case study, the median value gave slightly better results for the parameter estimation and has thus been used throughout. An alternative option would be to use the entire parameter distribution(s) in a forward modeling approach, obtain a distribution of model outputs and calculate statistics on the model output (median, selected percentiles).

### Significance of observations by means of Shapley values

Unlike standard calibration situations, there is usually more than one type of observation available when using imprecise information for model calibration. However, for nonlinear systems, it is not immediately obvious whether a certain type of observation actually contributes to the calibration. It is thus important to identify which observations support the parameter estimation and which are of little importance. The key problem for this estimation of significance is again the absence of a summary statistic: when using imprecise information, we cannot readily identify one unifying metric to express the system performance. For example, when using the observation type CSO, all accepted samples must reproduce the exact number of CSO events – otherwise they would have been rejected – and thus we have no means to determine the best fit. For EMF and BET the situation is alike, but there is an additional random element included in the rejection.

As a metric we use the *NRMSE* – normalized root mean square error:

$$NRMSE_{j} = \frac{1}{\bar{Y}_{Obs,j}}\sqrt{\frac{1}{NE_{j}}\sum_{t=1}^{NE_{j}}\big(Y_{Obs,t}-Y_{m,t}\big)^{2}}, \qquad j = 1, \ldots, n_{Obs}$$

where *NE* denotes the number of events that are identified in each of the observation types, subscript *Obs* stands for observation and *m* for model; for combinations of observation types, the *NRMSE* values are averaged. In this case study we investigate three observation types, thus *n*_{Obs} *=* *3*.

In the following, we compute the Shapley values for each of the three observation types (CSO, EMF and BET) for the metric *1 – NRMSE* (thus turning the metric positive in the interval [0–1]). Table 2(b) lists all possible orderings of the three types, with the marginal contribution indicating the increase in the metric when a player (here: observation type) is added. For example, for the ordering EMF-CSO-BET (third line in Table 2(b)), the marginal of the observation type CSO is computed as the metric for the combination EMF-CSO (0.872, fourth line in Table 2(a)) minus that for EMF (0.876, second line in Table 2(a)), thus −0.004. The Shapley value is then the mean of all marginal values. Table 2(b) indicates that EMF and BET give slightly better results than CSO and should be preferred as observation input in this case study. The negative values in the marginal contributions of the CSO observation further indicate that the joint use of CSO with both other observation types is subpar to the use of only EMF and BET. Note that this information is gained by analyzing only the accepted samples in the ABC scheme.

| (a) Combination | 1-NRMSE | NSE |
|---|---|---|
| CSO | 0.797 | 0.745 |
| EMF | 0.876 | 0.698 |
| BET | 0.833 | 0.789 |
| CSO-EMF | 0.872 | 0.781 |
| CSO-BET | 0.861 | 0.828 |
| EMF-BET | 0.910 | 0.866 |
| CSO-EMF-BET | 0.904 | 0.845 |

| (b) Combination | Marginal CSO | Marginal EMF | Marginal BET |
|---|---|---|---|
| CSO-EMF-BET | 0.797 | 0.075 | 0.032 |
| CSO-BET-EMF | 0.797 | 0.043 | 0.064 |
| EMF-CSO-BET | −0.004 | 0.876 | 0.032 |
| EMF-BET-CSO | −0.006 | 0.876 | 0.034 |
| BET-CSO-EMF | 0.028 | 0.043 | 0.833 |
| BET-EMF-CSO | −0.006 | 0.077 | 0.833 |
| Mean | 0.268 | 0.332 | 0.305 |

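The Shapley values can be reproduced directly from the coalition metrics in Table 2(a) by averaging the marginal contributions over all player orderings (a sketch assuming *v(∅) = 0*; all names are ours):

```python
from itertools import permutations

# Characteristic function v(S): the 1-NRMSE metric of each coalition of
# observation types, taken from Table 2(a).
v = {
    frozenset(): 0.0,
    frozenset({"CSO"}): 0.797,
    frozenset({"EMF"}): 0.876,
    frozenset({"BET"}): 0.833,
    frozenset({"CSO", "EMF"}): 0.872,
    frozenset({"CSO", "BET"}): 0.861,
    frozenset({"EMF", "BET"}): 0.910,
    frozenset({"CSO", "EMF", "BET"}): 0.904,
}

def shapley(players, v):
    """Shapley value as the mean marginal contribution over all player
    orderings (feasible here since n = 3 gives only 6 orderings)."""
    phi = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            phi[p] += v[coalition | {p}] - v[coalition]
            coalition = coalition | {p}
    return {p: s / len(orders) for p, s in phi.items()}

phi = shapley(["CSO", "EMF", "BET"], v)
```

The computed values (about 0.268 for CSO, 0.332 for EMF and 0.305 for BET) sum to the metric of the full coalition, as required by the efficiency property of Shapley values.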

To test the result of the scheme, a scenario analysis is conducted for the different combinations of observation types, using the information from the calibration effort. Here we use the metric *NSE* computed for the time series of the CSO – reference versus calibrated system – and test the same combinations as for the Shapley values. Table 2(a) corroborates that the best results are achieved when applying only the EMF and BET observations. From this comparison, we conclude that Shapley values can help identify the significance of a certain type of observation in a parameter calibration scheme – with only the imprecise information that is available. The advantage of the Shapley value approach compared to scenario analysis is that all possible combinations are assessed in a formal setting.

## RECOMMENDATION

For parameter estimation with imprecise information, the following sequence of steps is recommended:

1. Irrespective of the following procedure, prior information (lower/upper bounds of values, distribution) on the parameters is to be determined. Likewise, the type of available imprecise information is to be identified, as well as the measured data and their quality (time series of information).

2. Determine whether the problem can be addressed with the classical concept of maximizing similarity, i.e., whether the imprecise information can easily be expressed in statistical terms. Only if this is not the case, use ABC.

3. For maximizing similarity, the underlying minimization procedure results in a clear estimate of the parameter values. For ABC, on the other hand, the result is a sample of acceptable solutions; the actual parameter values then have to be determined from this sample, usually via key indicators of the probability density function of the values (e.g., median or mode).

4. If several types of imprecise information are available, check which combination of the information gives the best results. Intuitively this can be done via scenario analysis, i.e., testing each combination of information sources, but the use of Shapley values is advocated as a formal and established framework.

## CONCLUSION

Model parameter estimation frequently suffers from a lack of precise information. Traditional calibration methods need to be enhanced to cope with such uncertainties. While different options are available, the likelihood-free ABC is a simple yet efficient method to deal with imprecise information. In this paper, we demonstrated that:

- the result of the ABC method matches the reference method of maximizing similarity for the real-world case study ‘wastewater-based epidemiology’;

- good calibration results can be achieved with ABC for a four-parameter rainfall–runoff model if binary and/or censored information is available; the quality of the parameter estimation is computed for a verification situation;

- three different examples of imprecise information are exemplified, i.e., binary data on CSO events, censored data on the maximum flow in the catchment and the emptying time of the basin;

- even if the quality of the calibration cannot be estimated accurately for imprecise information, Shapley values of the posterior distribution of accepted samples in the ABC scheme indicate the significance of a specific type of observation;

- a shortcoming is the computational burden of the Monte-Carlo-based procedure, which limits applications to fast models and/or requires the use of efficient algorithms and parallel computing.

## DATA AVAILABILITY STATEMENT

Data cannot be made publicly available; readers should contact the corresponding author for details.

## CONFLICT OF INTEREST

The authors declare there is no conflict.

## REFERENCES

Tarantola, A. 2005 *Inverse Problem Theory and Methods for Model Parameter Estimation*. SIAM, Philadelphia. ISBN 978-0-89871-792-1.

Ke, Y. & Tian, T. 2019 Approximate Bayesian computational methods for the inference of unknown parameters. In: *2017 MATRIX Annals*. Springer, pp. 515–529.