Abstract
Unscented Kalman filter (UKF) has its origin in transforming the Gaussian random variables for nonlinear estimation and has received little attention in the context of state estimation of conceptual hydrological models. This paper introduces UKF to estimate state variables of a conceptual hydrologic model. A symmetric point approach and a scaling framework are used for performing the sample generation process of UKF. This paper investigates the application of UKF for state estimation with a synthetic case study in which both the simulated state, the true state, and the corrected state are precisely known. The results show that the use of UKF can improve the performance of both the model outputs and the state variables as the difference between the corrected trajectories and the true trajectories decreases rapidly and tends to vanish after only a few iterations. Our results and comparisons also demonstrated the capability and usefulness of UKF for state estimation in two real basins.
INTRODUCTION
Hydrologic models are designed to describe the rainfall-runoff physical process which is generally considered to be highly nonlinear and time-varying (Singh 1964; Pilgrim 1976). Conceptual rainfall-runoff models simplify and conceptualize these complex processes using a set of simple mathematical equations (Moradkhani et al. 2005a). The main purpose of these models is to predict the future evolution of the system variables. Conceptual rain-runoff models are generally reported to be robust and reliable when applied to flood forecasting (Hsu et al. 1995). However, it is generally accepted that the satisfactory application of such a model can be hampered by many factors including uncertainty in the inputs (Bárdossy & Das 2006; Kavetski et al. 2006), inadequacy of the model which generally includes uncertainty in the model structure (Gupta et al. 2012), uncertainty in the model parameters (Vrugt et al. 2003), and uncertainty in the initial model conditions. As a result, the error correction and updating techniques have been applied to handle these problems. The Auto Regressive time series models have been used to produce improved performance of predictions by predicting the future discrepancies between the model simulations and the observations (Abrahart & See 2000; Broersen & Weerts 2005; Valipour et al. 2013). The sequential data assimilation (DA), which incorporates the information from observations into model simulations, has demonstrated its applicability in improving model performance (McLaughlin 2002; Kuczera et al. 2006; Liu & Gupta 2007). A few sequential DA techniques have been introduced in the hydrologic literature which mainly includes the dynamic identifiability analysis (Wagener et al. 2003), the parameter estimation approach based on the information localization (Vrugt et al. 2002), and the Bayesian recursive estimation technique (BaRE) (Misirli et al. 2013). The standard Kalman filter is proposed to provide a recursive solution to the discrete-data linear filter problem (Kalman 1960). Several nonlinear extensions of Kalman filter such as extended Kalman filter (EKF) (Jazwinski 1970) and ensemble Kalman filter (EnKF) (Evensen 1994) have been utilized in the state estimation due to the nonlinearity of the hydrologic system. The applications of EKF in the hydrological literature have been reported by a few researchers (Entekhabi et al. 1994; Walker & Houser 2001). It can be considered as a ‘first-order’ approximation to the optimal form since EKF linearizes the nonlinear system (state equation and measurement equation) using the ‘first-order’ approximations (Welch & Bishop 1995). However, these approximations can introduce considerable errors, which may lead to deterioration in the performance and even divergence after a few iterations (Evensen 1994; Reichle et al. 2002a, 2002b). The EnKF addresses the approximation problems by incorporating an ensemble of the state, which is generated by utilizing the Monte Carlo generation (Evensen 1994; Burgers et al. 1998; Van Leeuwen 1999). Some applications have been seen in the hydrological literature due to its simple procedures and capability for handling the nonlinear system (Burgers et al. 1998; Reichle et al. 2002a, 2002b; Moradkhani et al. 2005b; Sun et al. 2009; Lü et al. 2013). In practice, implementation of EnKF may be complicated by the ‘inbreeding’ problem since the ensemble is updated with a gain obtained from the same ensemble in the analysis step, which can lead to underestimation of the error variances for small ensemble sizes (Van Leeuwen 1999). Particle filtering (PF) is another alternative DA technique (Arulampalam et al. 2002). In PF, the sequential importance sampling (SIS) is often used to achieve the optimal representation of the posterior probability distribution. For SIS, the so-called degeneracy problem typically arises, resulting in only a small number of particles being valid for the procedure. In practice, the sampling importance resampling (SIR) is proposed to handle this problem. The SIR technique has been successfully utilized to achieve state-parameter estimation of a hydrologic model (Moradkhani et al. 2005a). However, this resampling procedure can lead to another problem, which is known as the sample impoverishment problem (Liu & Gupta 2007).
Another well-known nonlinear extension to the Kalman filter is the unscented Kalman filter (UKF) (Julier et al. 1995; Julier & Uhlmann 1997). Its usefulness to nonlinear systems and efficient sampling strategy have led to extensive applications in many branches of science and engineering (St-Pierre & Gingras 2004; Wu & Smyth 2007; Kandepu et al. 2008; Chatzi & Smyth 2009; Chowdhary & Jategaonkar 2010; Valverde & Terzija 2011). For example, the UKF has been utilized to estimate hydraulic conductivity (Sun & Sun 2015). However, so far the UKF has received little attention in the context of state estimation of hydrological models (Liu et al. 2012).
In this paper, we strive to investigate the application of UKF for state estimation of hydrologic models. We seek here to: (1) introduce the basic theory of UKF and the framework of the application of UKF in state estimation in a conceptual hydrological model; and (2) evaluate its performance with a synthetic case study and real basins.
METHODOLOGY
Limitations of the Kalman filter (KF) and the nonlinear extensions
The Kalman filter is proposed to address the problem of estimating the state of a linear dynamic process which can be described by the linear stochastic equations. For the nonlinear cases, the EKF has been widely applied to these cases by linearizing the estimation around the current estimate (Welch & Bishop 1995). However, there has been a long discussion about the divergence and unstable performance of EKF (Jazwinski 1970; Gauthier et al. 1993; Miller et al. 1994; Julier et al. 1995). The main purpose of EKF is to propagate the expected value and the covariance of the state (the probability distribution function) through the first-order linearization of the operators for the measurement model and the state transition model. However, this can introduce considerable errors in the face of strong nonlinear relationships since the higher order partial derivatives of the functions are neglected directly. In addition, the propagation process of the distributions of the state variables can introduce errors in the true posterior mean and covariance. This flaw can lead to unstable performance and even divergence of the EKF. Despite this main flaw, other problems such as the difficulty of calculating the Jacobian matrix also limit its applications (Da Ros & Borga 1997). The EnKF was then proposed to address the nonlinear problems mentioned above (Evensen 1994; Burgers et al. 1998). This technique has received much attention in the field of hydrology due to its relative simplicity and robustness when addressing the nonlinear filtering problem (Reichle et al. 2002a; Moradkhani et al. 2005b; Weerts & El Serafy 2006; Clark et al. 2008; Xie & Zhang 2010). The essential difference between EKF and EnKF lies in the approaches for propagating the probability distribution functions. Unlike the approximation implemented by local linearization, the EnKF utilizes a sampling strategy to propagate the probability distribution functions (Liu et al. 2012). While the EnKF may be a robust and popular technique for the nonlinear estimation problem, there have been some discussions about the ‘inbreeding’ problem. The main concern associated with this problem is that the gain for updating an ensemble is obtained from the same ensemble and then a modified version of EnKF named double DEnKF has been proposed to address this problem (Houtekamer & Mitchell 1998). A theoretical analysis shows that the DEnKF has only partly alleviated the problem, rather than solving it completely (Van Leeuwen 1999). Using small ensemble sizes can lead to a systematic underestimation of the error variances.
Unscented Kalman filter
The UKF was proposed as a new nonlinear extension of the Kalman filter in Julier & Uhlmann (1997). The UKF utilizes the unscented transform (UT) technique to propagate the mean and the covariance of the state, which uses a small set of weighted sample points (sigma points) to capture and propagate the expected value and covariance of the probability distributions (Julier et al. 2000). The main difference between this method and Monte Carlo-based methods such as EnKF and DEnKF is that the samples used in UKF are deterministic rather than random (Julier et al. 2000). An appealing advantage of the UT technique is that for any inputs, the approximations are accurate to at least the second order in the face of an arbitrary nonlinear function. Moreover, for Gaussian inputs, the approximations are accurate to at least the third order. Higher accuracy can be achieved by applying different sampling strategies and changing the corresponding parameters. Various algorithms can be utilized to generate the sigma points, which include the symmetric point algorithm (Julier & Uhlmann 1997), the minimal skew simplex point algorithm (Julier & Uhlmann 2002), the spherical simplex point algorithm (Julier 2003), etc. The main distinction of these algorithms is the number of sigma points used to propagate the expected value and covariance. For an n-dimensional state (space), the symmetric point algorithm requires 2n + 1 sigma points. However, for the minimal skew simplex point algorithm and the spherical simplex point algorithm, only n + 2 sigma points are required (Julier & Uhlmann 2002; Julier 2003). In addition, there are potential problems of numerical instability with the minimal skew simplex point algorithm (Julier 2003). A general framework was proposed to incorporate high order information of the system without additional computational costs while assuring the positive semi-definiteness at the same time under certain assumptions (all of the untransformed weights are non-negative) (Julier 2002).
General procedures of unscented Kalman filter











Sampling algorithms and scaling framework

















Note that in practice the positive semi-definiteness of the covariance may be violated after a few iterations and then the filtering process breaks down since the Cholesky factorization cannot be operated. For these cases, we restart the UKF with the state obtained by the XAJ model.
CASE STUDIES
To examine the capability of the UKF for state estimation, we present a synthetic study and a real case study using a simple conceptual rainfall-runoff model. The first study mainly serves to evaluate the performance of state estimation since we cannot get access to the actual state in the real cases. In the second case, we further applied the UKF to two real basins to test its performance and applicability. The Nash–Sutcliffe efficiency (NSE) (Nash & Sutcliffe 1970) and the root mean squared error (RMSE) were used as performance criteria in the synthetic case and the real case. The NSE value is a widely used measure that serves to compare the performance of a given model to the mean of the observations (Schaefli & Gupta 2007). An NSE value of 1 indicates a perfect match of the model output to the observed data, while an NSE value of 0 means that the performance of the model is only as accurate as the mean of the observations. In this study, the RMSE was utilized to evaluate the performance of both the state estimation and the output estimation, while the NSE was only used to evaluate the output (discharge) estimation.
Hydrological model
The Xinanjiang (XAJ) rainfall-runoff model was used in this study. The XAJ model is a simple conceptual rainfall-runoff model developed by Zhao (1992) which has been widely used in most humid and semi-humid regions in China (Hu et al. 2005; Zhang et al. 2012; Shi et al. 2013; Bai et al. 2016; Jiang et al. 2017; Sun et al. 2018b). The model used in this study consists of a runoff generation module (SRP), a runoff separation module (SOR), a flow concentration module (SFC) and a channel routing module (CR) (Figure 2). Table 1 lists the physical descriptions and the units of the parameters. For detailed information about the model, we refer to Zhao (1992) and Shi et al. (2013).
Physical descriptions of the parameters for the XAJ model
Parameter . | Units . | Description . |
---|---|---|
WUM | mm | Areal mean tension water capacity of the upper layer |
WLM | mm | Areal mean tension water capacity of the lower layer |
WDM | mm | Areal mean tension water capacity of the deeper layer |
B | / | Exponent of the tension water capacity distribution curve |
C | / | Coefficient of deep evapotranspiration |
IMP | / | Ratio of impervious area |
KC | / | Ratio of potential evapotranspiration to pan evaporation |
SM | mm | Areal mean free water capacity of the surface soil layer |
EX | / | Exponent of the free water capacity curve |
KI | / | Outflow coefficients of the free water storage to interflow |
KG | / | Outflow coefficients of the free water storage to groundwater |
CS | / | Recession constant of the surface water storage |
CI | / | Recession constant of the interflow storage |
CG | / | Recession constant of the groundwater storage |
CL | / | Recession constant of the lag and route method |
Lag | hour | Lag time of the lag and route method |
Parameter . | Units . | Description . |
---|---|---|
WUM | mm | Areal mean tension water capacity of the upper layer |
WLM | mm | Areal mean tension water capacity of the lower layer |
WDM | mm | Areal mean tension water capacity of the deeper layer |
B | / | Exponent of the tension water capacity distribution curve |
C | / | Coefficient of deep evapotranspiration |
IMP | / | Ratio of impervious area |
KC | / | Ratio of potential evapotranspiration to pan evaporation |
SM | mm | Areal mean free water capacity of the surface soil layer |
EX | / | Exponent of the free water capacity curve |
KI | / | Outflow coefficients of the free water storage to interflow |
KG | / | Outflow coefficients of the free water storage to groundwater |
CS | / | Recession constant of the surface water storage |
CI | / | Recession constant of the interflow storage |
CG | / | Recession constant of the groundwater storage |
CL | / | Recession constant of the lag and route method |
Lag | hour | Lag time of the lag and route method |
Structure of the XAJ model. Note that the runoff generation module used in this study includes the three-layer evaporation module.
Structure of the XAJ model. Note that the runoff generation module used in this study includes the three-layer evaporation module.
Descriptions of the variables in Equation (15) are presented in Table 2. The time designations are omitted for simplicity.
Descriptions of the state variables
Module . | Variable . | Units . | Description . |
---|---|---|---|
SRP | W | mm | Tension water storage |
SRP | WU | mm | Tension water storage of the upper layer |
SRP | WL | mm | Tension water storage of the lower layer |
SOR | S | mm | Free water storage |
SFC | QS | m3/s | Surface flow |
SFC | QI | m3/s | Interflow |
SFC | QG | m3/s | Groundwater flow |
Module . | Variable . | Units . | Description . |
---|---|---|---|
SRP | W | mm | Tension water storage |
SRP | WU | mm | Tension water storage of the upper layer |
SRP | WL | mm | Tension water storage of the lower layer |
SOR | S | mm | Free water storage |
SFC | QS | m3/s | Surface flow |
SFC | QI | m3/s | Interflow |
SFC | QG | m3/s | Groundwater flow |
Synthetic case
In this study, we present a synthetic case study to demonstrate the usefulness of the UKF by using this technique and the XAJ model. This synthetic case study is similar to the synthetic case study used in Moradkhani et al. (2005a), where the synthetic study was designed to demonstrate the capability for estimating the state variables and parameters. In this study, we only focus on state estimation and therefore the main difference between these two cases is that rather than sampling the starting points for parameters, for this study, starting points for state variables were sampled from the state space. Another essential difference is that in this study the sampling process was conducted by a Latin hypercube sampling algorithm rather than a normal random sampling algorithm due to its efficient stratification properties (Helton & Davis 2003). The true values of the state and the measurement were obtained by a free run of the XAJ model using a predefined set of parameters, model inputs and starting points for the state variables. Inputs to the model are precipitation and measured pan evaporation. We utilized the mean precipitation of a real flood event, which was selected from the flood events of Qilijie basin. Qilijie basin is a Chinese humid basin with an area of 14,787 km2, which is located in Fujian province in southwest China. Detailed information on this flood event is presented in Table 3.
Information about the precipitation data of the synthetic flood event
Start time . | End time . | Duration (hour) . | Sum (mm) . | STD (mm) . |
---|---|---|---|---|
1989/5/11 9:00 | 1989/6/3 8:00 | 552 | 347.3 | 1.12 |
Start time . | End time . | Duration (hour) . | Sum (mm) . | STD (mm) . |
---|---|---|---|---|
1989/5/11 9:00 | 1989/6/3 8:00 | 552 | 347.3 | 1.12 |
STD represents the standard deviation.




The procedures for generating the weight factors of the starting points are illustrated by Figure 3.
Procedures for generating the weight factors of perturbed initial conditions.
As evident from Figure 3, for j = 2, 3, 4, 5, 6 and 7, the starting points are directly generated by multiplying the weight values and the upper bounds. For j = 1, is the sum of the perturbed values of tension water storage of the three layers. Table 4 lists the parameters of this synthetic model.
Parameters of the synthetic model
Parameter . | Value . |
---|---|
WUM | 40 |
WLM | 100 |
WDM | 20 |
B | 0.4 |
C | 0.12 |
IMP | 0.02 |
KC | 0.9 |
SM | 20 |
EX | 1.5 |
KI | 0.3 |
KG | 0.4 |
CS | 0.93 |
CI | 0.95 |
CG | 0.98 |
CL | 0.4 |
Lag | 2 |
Parameter . | Value . |
---|---|
WUM | 40 |
WLM | 100 |
WDM | 20 |
B | 0.4 |
C | 0.12 |
IMP | 0.02 |
KC | 0.9 |
SM | 20 |
EX | 1.5 |
KI | 0.3 |
KG | 0.4 |
CS | 0.93 |
CI | 0.95 |
CG | 0.98 |
CL | 0.4 |
Lag | 2 |





Real case
Study areas
Two climatologically diverse basins were selected in this study, where snowmelt is not a dominant factor in runoff generation. Shaowu basin is located in the northwestern part of Min River basin and has an area of 2,745 km2. This basin is dominated by a subtropical monsoon climate and spring and summer are the wettest seasons. We next implement this methodology to Chai River basin which is generally regarded as a sub-basin of Liao River basin. The methodology performance was tested using data from Chai River basin with an area of 1,355 km2, located in one of the semi-humid regions in eastern China. Chai River basin is dominated by a temperate continental monsoon climate, resulting in abundant rainfall in summer. Thus, both a humid basin and a semi-humid basin are included in this study. The hydrological model used in the real case study is operated in a lumped and event-based way throughout the paper. We have collected 17 flood events and 33 flood events for Shaowu basin and Chai River basin, respectively.
Data basis
The annual precipitation in Shaowu basin ranges from 1,156 to 2,423 mm for the period 1988–1999. For Shaowu basin, we used eight precipitation stations and one hydrologic station (the outlet station of Shaowu basin). The outlet station is the only available station to obtain the measured pan evaporation data of Shaowu basin and the potential evaporation for this basin is evaluated by Equation (16).
The long term mean annual precipitation of Chai River basin ranges from 494 to 1,185 mm. Precipitation inputs were based on five stations for the period 1975–2016. Observed runoff data was based on the Chaihepu hydrologic station, which is located at the outlet of the basin. Note that for Chai River basin, since we cannot get access to the measured pan evaporation data, the potential evaporation inputs were obtained using an empirical method in Yingpan evaporation experimental station, which is located in Liaoning, China. A brief description of this method is illustrated in Figure 4.
Schematic for obtaining the potential evaporation for Chai River basin.
Figure 5 depicts the locations of two basins and the corresponding stations.
To simplify the parameter calibration problem, some of the insensitive parameters are set to plausible constants and are not further calibrated. This applies for the parameters WUM, WLM, WDM, B, C, IMP and EX. In addition, the sum of KI and KG is set to a fixed value (0.7) and therefore we only calibrated KI in this study. The ‘Lag’ L can be directly obtained by using the ‘trial-and-error’ method since this parameter is a positive integer, which mainly serves to determine the ‘translation’ effect of the hydrograph. The remaining parameters were calibrated using the Particle swarm optimization algorithm (Kennedy 2011). To reduce the dimensionality of the parameters, the parameter fields are assumed uniform. The optimization process was operated by maximizing the error function, as measured by NSE. The optimization algorithm uses fifty particles and the value of maximum iterations (epochs) is set to 500. Table 5 lists the parameters for the two basins.
Model parameters for Shaowu basin and Chai River basin
Parameter . | Shaowu basin . | Chai River basin . |
---|---|---|
WUM | 20 | 20 |
WLM | 80 | 100 |
WDM | 20 | 40 |
B | 0.4 | 0.4 |
C | 0.16 | 0.12 |
IMP | 0 | 0.03 |
KC | 0.6 | / |
SM | 10 | 30 |
EX | 1.5 | 1.5 |
KI | 0.3 | 0.58 |
KG | 0.4 | 0.22 |
CS | 0.8 | 0.85 |
CI | 0.82 | 0.976 |
CG | 0.97 | 0.999 |
CL | 0.1 | 0.4 |
Lag | 4 | 3 |
Parameter . | Shaowu basin . | Chai River basin . |
---|---|---|
WUM | 20 | 20 |
WLM | 80 | 100 |
WDM | 20 | 40 |
B | 0.4 | 0.4 |
C | 0.16 | 0.12 |
IMP | 0 | 0.03 |
KC | 0.6 | / |
SM | 10 | 30 |
EX | 1.5 | 1.5 |
KI | 0.3 | 0.58 |
KG | 0.4 | 0.22 |
CS | 0.8 | 0.85 |
CI | 0.82 | 0.976 |
CG | 0.97 | 0.999 |
CL | 0.1 | 0.4 |
Lag | 4 | 3 |
RESULTS AND DISCUSSION
Synthetic case
Figure 6 clearly shows that the corrected outputs (discharge) of all samples converge to the measurement shortly. After a few iterations (about 15 time steps), the corrected outputs are almost identical to the true measurement. Obviously, the original outputs (free of updating) differ from the measurement. This indicates that the original performance is considerably sensitive to the starting points while these perturbations can be gradually reduced by utilizing the UKF.
Corrected results using the UKF and the original results (free of updating) for hydrograph prediction (20 sets). Dashed line represents the measured discharge. (a) Precipitation data of the synthetic study. (b) Short dot line represents the results of the original case (free of updating). (c) Solid line represents the results of the corrected case (with UKF).
Corrected results using the UKF and the original results (free of updating) for hydrograph prediction (20 sets). Dashed line represents the measured discharge. (a) Precipitation data of the synthetic study. (b) Short dot line represents the results of the original case (free of updating). (c) Solid line represents the results of the corrected case (with UKF).
Figure 7 shows a scatter plot where the original performance of the model output is plotted against the corrected performance (measured by NSE and RMSE). Throughout the paper, it is assumed that simulation using UKF results in sets suffixed by ‘_UKF’, while simulation without UKF results in sets suffixed by ‘_O’.
Scatter plots of the original performance and the corrected performance. (a) Comparison of the result sets measured by RMSE. (b) Comparison of the result sets measured by NSE.
Scatter plots of the original performance and the corrected performance. (a) Comparison of the result sets measured by RMSE. (b) Comparison of the result sets measured by NSE.
Figure 7 clearly shows that for all the 20 sets of starting points, using UKF has significantly improved the original performance. In our case, the NSE values of original case (free of updating) range from 0.7567 to 0.9975 with the mean equal to 0.9163, while the values of corrected case range from 0.9954 to 0.9997 with the mean equal to 0.9986. To evaluate the tracking capacity in detail, the results of state estimation are shown in Figure 8 (W, WU and WL), Figure 9 (S) and Figure 10 (QS, QI and QG). The dashed line represents the true states, the short dot line represents the original states and the solid line represents the corrected states with UKF.


Descriptions of R, RS, RI, RG and PE
Variable . | Units . | Description . |
---|---|---|
R | mm | Runoff |
RS | mm | Surface runoff |
RI | mm | Interflow runoff |
RG | mm | Groundwater runoff |
PE | mm | Net rainfall |
Variable . | Units . | Description . |
---|---|---|
R | mm | Runoff |
RS | mm | Surface runoff |
RI | mm | Interflow runoff |
RG | mm | Groundwater runoff |
PE | mm | Net rainfall |
Obviously, the difference between the true values and the simulated values will rapidly decrease and finally vanish after a few iterations because the coefficient of the first component in Equation (25) is much less than one. Moreover, this may lead to a deterioration in the performance of the model output when utilizing UKF. Even though the correction of S can lead to a slight deterioration in the performance, we still do not intend to conduct UKF without updating S. The main reason for this is that the state of a system (a set of variables) should be able to determine the future response (future state and output) given the inputs and the equations describing the dynamics (Dorf & Bishop 2011). Without this variable (S), the variable is not sufficient to determine the future response.
An interesting observation is that the convergence time of the XAJ output is different from the convergence time of the state variables. The convergence time of the output (after about 15 time steps, see Figure 7) is much earlier than the time of soil moisture (after about 43 time steps, see Figure 8). This may suggest an important fact that the superior performance of model outputs does not necessarily imply that the performance of state variables is also satisfactory. Let us assume that the XAJ model operates without the help of UKF (free of correction) after the convergence time of the output. Clearly, the output will tend to diverge from the true trajectory after only a few iterations since there is a considerable difference between the simulated tension water storage and the true tension water storage. However, if we assume that the XAJ model operates without the UKF after the convergence time of the tension water storage, the superior performance is very likely to be maintained since the simulated state is almost identical to the true state and the only error source is assumed to be the state variables. Figure 11 shows the results of the original state and the corrected state as measured by RMSE.
Scatter plots depicting the results of simulated and observed state measured by RMSE. (a) Comparison of the result sets of W. (b) Comparison of the result sets of WU. (c) Comparison of the result sets of WL. (d) Comparison of the result sets of S. (e) Comparison of the result sets of QS. (f) Comparison of the result sets of QI. (g) Comparison of the result sets of QG.
Scatter plots depicting the results of simulated and observed state measured by RMSE. (a) Comparison of the result sets of W. (b) Comparison of the result sets of WU. (c) Comparison of the result sets of WL. (d) Comparison of the result sets of S. (e) Comparison of the result sets of QS. (f) Comparison of the result sets of QI. (g) Comparison of the result sets of QG.
Real cases
Figure 12 shows a scatter plot of NSE and RMSE between the ‘UKF’ performance (‘NSE: Q_UKF’ and ‘RMSE: Q_UKF’) and the original performance (‘NSE: Q_O’ and ‘RMSE: Q_O’) for Shaowu basin. Clearly, for all flood events using UKF, there is a significant increase in the performance of the model output since all the points are located in the top left of Figure 12(a) and the bottom right of Figure 12(b). NSE values obtained by using UKF range from 0.8702 to 0.9950 with the mean value equal to 0.9738, while for the original case the NSE values merely range from 0.5435 to 0.9666 with the mean value equal to 0.8242. Similar results are obtained for RMSE (see Figure 12(b)).
Scatter plot depicting the results of UKF and the original results as measured by NSE and RMSE (Shaowu basin).
Scatter plot depicting the results of UKF and the original results as measured by NSE and RMSE (Shaowu basin).
Next we apply the UKF and XAJ model in Chai River basin. Figure 13 displays a similar scatter plot.
Scatter plot depicting the results of UKF and the original results as measured by NSE and RMSE (Chai River basin).
Scatter plot depicting the results of UKF and the original results as measured by NSE and RMSE (Chai River basin).
Again, the UKF has improved the performance of the XAJ model for Chai River basin. The mean of NSE values has increased from 0.7821 to 0.9777, and the mean of RMSE values has decreased from 38.0758 to 12.2840 m3/s. Tables 7 and 8 list the results in the terms of NSE and RMSE for both basins. Therefore, these results indicate that the use of UKF has resulted in a considerable increase in the model performance for both basins. Despite the satisfactory performance, note that in one of the flood events of Chai River basin there is a slight deterioration in the model performance when using UKF. This may indicate that the attribution of the uncertainty to only the state estimation is inappropriate due to the fact that uncertainty may stem from several sources: model structure, parameters, states, etc. (Sun et al. 1998, 2012; Clark & Vrugt 2006; Kuczera et al. 2006; Hendricks Franssen & Kinzelbach 2008; Renard et al. 2010; Sun & Sun 2015). For example, in some cases, both the uncertainty of model parameters and model state are dominant but we attribute all the uncertainty to only state uncertainty.
Results of the corrected performance and the original performance measured by NSE and RMSE (Shaowu basin)
. | Shaowu basin . | |||
---|---|---|---|---|
NSE: Q_O . | NSE: Q_UKF . | RMSE: Q_O (m3/s) . | RMSE: Q_UKF (m3/s) . | |
Mean | 0.8242 | 0.9738 | 294.2728 | 115.0182 |
Standard Deviation | 0.1144 | 0.0325 | 123.9293 | 83.3793 |
Minimum | 0.5435 | 0.8702 | 172.4923 | 42.5896 |
Maximum | 0.9666 | 0.9950 | 688.0272 | 335.2415 |
. | Shaowu basin . | |||
---|---|---|---|---|
NSE: Q_O . | NSE: Q_UKF . | RMSE: Q_O (m3/s) . | RMSE: Q_UKF (m3/s) . | |
Mean | 0.8242 | 0.9738 | 294.2728 | 115.0182 |
Standard Deviation | 0.1144 | 0.0325 | 123.9293 | 83.3793 |
Minimum | 0.5435 | 0.8702 | 172.4923 | 42.5896 |
Maximum | 0.9666 | 0.9950 | 688.0272 | 335.2415 |
Results of the corrected performance and the original performance measured by NSE and RMSE (Chai River basin)
. | Chai River basin . | |||
---|---|---|---|---|
NSE: Q_O . | NSE: Q_UKF . | RMSE: Q_O (m3/s) . | RMSE: Q_UKF (m3/s) . | |
Mean | 0.7821 | 0.9777 | 38.0758 | 12.2840 |
Standard Deviation | 0.1557 | 0.0214 | 48.1152 | 14.7709 |
Minimum | 0.2335 | 0.9349 | 3.2442 | 1.8018 |
Maximum | 0.9247 | 0.9969 | 231.6114 | 53.0028 |
. | Chai River basin . | |||
---|---|---|---|---|
NSE: Q_O . | NSE: Q_UKF . | RMSE: Q_O (m3/s) . | RMSE: Q_UKF (m3/s) . | |
Mean | 0.7821 | 0.9777 | 38.0758 | 12.2840 |
Standard Deviation | 0.1557 | 0.0214 | 48.1152 | 14.7709 |
Minimum | 0.2335 | 0.9349 | 3.2442 | 1.8018 |
Maximum | 0.9247 | 0.9969 | 231.6114 | 53.0028 |
CONCLUSIONS
In this study, UKF was used to estimate the states of a conceptual hydrologic model. The symmetric point strategy and the general scaling framework were utilized to conduct the sampling process. A synthetic case was used to evaluate the performance of UKF for estimating the model state. The results showed that the UKF can significantly improve the accuracy of the initial conditions since the difference between the corrected trajectories and the true trajectory decreases rapidly. The application of UKF was further conducted in two real basins to demonstrate the capability of UKF in state estimation. The performance of the XAJ model has been significantly improved with the application of UKF in state estimation. It is worth noting that both state estimation and parameter estimation should be considered in the application of UKF. In this study, however, the hydrological model for the real case study was operated at event scale and the model parameters are usually considered steady during such a short period (Madsen & Skotner 2005; Bárdossy & Singh 2008). Hence, we only focus on state estimation rather than dual estimation (or joint estimation).
We would like to note that an important fact is that for real cases uncertainty may stem from several sources: model structure, parameters, states, etc. (Clark & Vrugt 2006; Kuczera et al. 2006; Weerts & El Serafy 2006). Considering this, it may be inappropriate to attribute all the uncertainty to only state uncertainty, which can result in model states being tweaked to unrealistic values (e.g. the estimated surface flow/runoff is negative) to compensate for other uncertainty (e.g. input uncertainty) (Sun et al. 2018a). This may indicate that the method ignores the physical explanation of the process of runoff generation on a hillslope. Therefore, instead of investigating correcting techniques, we should focus on implementing a better description of the actual physical process. Such problems can lead to ‘false satisfactory’ situations where the right answers are obtained (e.g. considerable improvement in the output performance of hydrological modelling) for all the wrong reasons. In considering this, the choice of the right hydrological model, able to capture the main runoff generation process of the study area, should be as important as (or even more important than) a good correction method algorithm. Therefore, future work is needed to develop this method for uncertainty analysis and take other sources of uncertainty into consideration.
It is also worth noting that the computational cost of UKF can significantly increase when increasing the state dimension (Sun & Sun 2015). As a result, UKF may not be suitable for distributed hydrological models. Considering this limitation, techniques for reducing the computational cost such as the dimension reduction technique may be investigated in future studies. In addition, research aimed at improvements to the current UKF algorithm, including reliable techniques for ensuring the positive semi-definiteness of the covariance during the propagation and the sample procedure, are an open area for further research.