## Abstract

The dynamic system response curve (DSRC) has its origin in correcting model variables of hydrologic models to improve the accuracy of flood prediction. The DSRC method can lead to unstable performance since the least squares (LS) method, employed by DSRC to estimate the errors, often breaks down for ill-posed problems. A previous study has shown that under certain assumptions the DSRC method can be regarded as a specific form of the numerical solution of the Fredholm equation of the first kind, which is a typical ill-posed problem. This paper introduces the truncated singular value decomposition (TSVD) to propose an improved version of the DSRC method (TSVD-DRSC). The proposed method is extended to correct the initial conditions of a conceptual hydrological model. The usefulness of the proposed method is first demonstrated via a synthetic case study where both the perturbed initial conditions, the true initial conditions, and the corrected initial conditions are precisely known. Then the proposed method is used in two real basins. The results measured by two different criteria clearly demonstrate that correcting the initial conditions of hydrological models has significantly improved the model performance. Similar good results are obtained for the real case study.

## INTRODUCTION

Hydrologic models are generally designed to model the rainfall-runoff physical process which is generally considered to be nonlinear and time-varying (Singh 1964; Pilgrim 1976; Vo & Gourbesville 2016; Zhang *et al.* 2018). Conceptual rainfall-runoff models simplify and conceptualize these complex processes using a set of simple mathematical equations (Moradkhani *et al.* 2005a). Conceptual rain-runoff models are generally reported to be robust and reliable in the applications of flood forecasting (Hsu *et al.* 1995). However, it is generally accepted that the satisfactory application of such a model can be hampered by many factors including errors in the inputs (Bárdossy & Das 2006; Kavetski *et al.* 2006), inadequacy of the model which generally includes errors in the model structure (Gupta *et al.* 2012), errors in the model parameters (Vrugt *et al.* 2003), and errors in the model initial conditions (Seo *et al.* 2009).

The error correction (updating) methods have been made to handle these problems. The auto-regressive (AR) time series models have been used to produce an improved performance of predictions by predicting the future discrepancies between the model simulations and the observations (Abrahart & See 2000; Broersen & Weerts 2005; Valipour *et al.* 2013). Some improvements of the Kalman filter (KF) such as extended Kalman filter (EKF), particle Kalman filter (PKF), and ensemble Kalman filter (EnKF) have become successful techniques in the context of hydrological modeling (Reichle *et al.* 2002; Moradkhani *et al.* 2005b; Weerts & El Serafy 2006; Dumedah & Coulibaly 2014; Wang & Babovic 2016).

The dynamic system response curve (DSRC) method has its origin in correcting simulated runoff (Bao *et al.* 2014). Further, the method was used to update areal mean precipitation (Si *et al.* 2015). The DSRC method approximates the nonlinear relationship between the model outputs and the model variables using the first-order Taylor linearization and represents these complex behaviors in simple matrix forms. Then the correction is achieved by solving the equations using the least squares (LS) technique. While DSRC may be a simple approach, there have been some discussions about the flaws of the method (Bao *et al.* 2017; Sun *et al.* 2018). One of the main concerns is its unstable performance of the least square solution caused by the ill-posed property. In such a situation, the regularization techniques are commonly employed to obtain meaningful solutions (Neumaier 1998; Engl 2000; Golub *et al.* 2010).

In this paper, we present a brief description of DSRC and its main problem, then a new version of the DSRC method using truncated singular value method (TSVD) and L-curve criterion to overcome the ill-posedness. Moreover, the new method is extended to correct the initial conditions of a conceptual rainfall-runoff model to improve the performance. The new method is tested with a synthetic case study and a real case study. We present a synthetic case study which serves to test the capability of the new method in the third part. After that, satisfactory results are also obtained when applying the new method and the hydrological model to two real basins having different hydro-meteorological characteristics. Therefore, the capability and usefulness of the new method for estimating initial conditions are demonstrated via a synthetic case study and a real case study.

## METHODOLOGY

### DSRC method and its limitations

*et al.*2015): with and where and are the measurement and the calculated model output (discharge) at time

*m*, respectively,

*x*and are the true model variable and the approximate model variable at time

_{i}*i*, respectively, represents the partial derivatives of the output at time

*m*with respect to the model variable at time

*i*. Applying Equation (1) to all time steps (let

*m =*1, 2, …,

*p*, we can obtain the following equation: where represents the Jacobian matrix, and

**X**represent the measurement vector and actual model variable vector, and represent the output vector and the calculated model variable vector, respectively. Using the LS method, the solution of Equation (4) is given by:

While the DSRC method may be a simple method, there have been discussions about the robustness problem of DSRC. In our previous study, we have demonstrated that the ill-posed property of DSRC is one of the main reasons that lead to the unstable performance (Sun *et al.* 2018). This motivates us to incorporate some sort of regularization technique to stabilize the solution. Several regularization techniques, which mainly include the Tikhonov regularization (Golub *et al.* 2010), the Ridge estimation (Hoerl & Kennard 2000), the truncated singular value decomposition method (Hansen 1990), and some iterative regularization methods (e.g., Levenberg–Marquardt method), have been proposed to handle ill-posed problems. In this paper, we will confine ourselves to the TSVD method since a detailed comparison of regularization techniques is beyond the goal of this paper.

### TSVD method and L-curve criterion

**A**is a coefficient matrix,

**x**is an unknown vector with

*m*elements,

**b**is the vector of measurement (also called right-hand side) which is contaminated with the noise vector. Consider the singular value decomposition (SVD) of the matrix

**A**(Dan 1996): The vectors

**u**

*and*

_{i}**v**

*are the left and right singular vectors of*

_{i}**A**with orthonormal columns, while

*σ*are the singular values with . Using SVD, the solution of LS is rewritten in the form: As evident from the above equation, the solution is likely to be dominated by the perturbations (noise) to

_{i}**b**when some of the singular values are small (Fierro

*et al.*1997), which leads to the large instability of the solution. An efficient method to stabilize the solution is to neglect the smallest singular values (Hansen 1998). Neglecting the smallest singular values, we can obtain the solution of TSVD: where

*k*is the regularization (truncation) parameter. Clearly, the main idea of the TSVD method is to neglect the components of the solution corresponding to the smaller singular values (Xu 1998) and thus an important issue is to choose the proper criterion for choosing the truncation level (i.e., regularization parameter) which determines the number of the smallest singular values to be neglected (Hansen 1987). Unfortunately, there is no general criterion for this problem even though several approaches such as the generalized cross-validation (GCV) (Golub

*et al.*1979), the discrepancy principle (Morozov 1966), and the L-curve criterion (Hansen & O'Leary 1993) have been proposed. The numerical experiment presented in Hansen & O'Leary (1993) shows that the L-curve criterion has an overall more stable performance than the GCV method when using the TSVD method. Hence, in this study, we use the combination of TSVD and L-curve criterion. The underlying idea for the L-curve method is to seek a balance in minimizing both the solution size and the residual size. The plot of L-curve is typically a combination of two regions: the uppermost part which is vertical and the rightmost part which is horizontal (Hansen 1998). Then, the regularization parameter (the truncation level) is chosen by locating the corner (the point of maximum curvature) of the curve. For detailed information about the L-curve criterion and TSVD method, we refer the reader to these references (Hansen 1987, 1990, 1992; Hansen & O'Leary 1993; Xu 1998). For lack of a better name, we will distinguish the proposed method from the DSRC method by calling it TSVD-DSRC.

### TSVD-DSRC method

**X**(1) and are the actual initial state and the calculated initial state

*.*Note that in these equations the subscripts 1, 2, …,

*n*are used to distinguish different variables at the first time step while in the previous discussion (Equations (1)–(3)) the subscripts 1, 2, …,

*n*represent time 1, time 2, …, time

*n*. Similarly, applying Equation (10) to all time steps, we can obtain the following equation in matrix notation:

*k*is the regularization parameter (i.e., truncated level) obtained by the L-curve criterion.

A general schematic representation of the proposed method is illustrated in Figure 1. The parts in the dashed box are designed for the real case rather than the synthetic case since the real case is more complex than the synthetic case (i.e., the total error between the simulation and the observation may stem from several sources such as model structure, parameters, and initial conditions, rather than only initial conditions). As illustrated in Figure 1, for different hydrological models, the only difference in the operation process of the method is the selection of the vector of model initial model conditions, which is determined by the designer based on the model structure of a specific model. It is recommended that the vector of the initial model conditions should be identical with the vector of state variables based on the state-space model where the state variables should be sufficient to determine the future response (the future values of outputs and state variables) given the inputs and the equations describing the system (Dorf & Bishop 2011). Thus, the state variables (hence the vector of initial model conditions) of different hydrological models should be determined by the designer/engineer based on their knowledge about the model.

## CASE STUDIES

To examine the usefulness of the proposed method, we present a synthetic study and a real case study using a simple conceptual rainfall-runoff model. The first study mainly serves to evaluate the performance for estimating the initial conditions since we cannot get access to the actual initial conditions in the real cases. In the second case, we further applied the proposed method to two Chinese basins to test its performance and applicability. The Nash–Sutcliffe efficiency (*NSE*) (Nash & Sutcliffe 1970) and the root mean squared error (*RMSE*) were used as performance criteria for both the synthetic case and the real case. The *NSE* value is a widely used measure that serves to compare the performance of a given model to the mean of the observations (Schaefli & Gupta 2007). An *NSE* value of 1 indicates a perfect match of the model output to the observed data, while an *NSE* value of 0 means that the performance of the model is only as accurate as the mean of the observations.

### Hydrological model

The Xinanjiang (XAJ) rainfall-runoff model is a conceptual rainfall-runoff model proposed by Zhao (1992) and has been extensively used in most humid and semi-humid regions in China. The model used in this study consists of a runoff generation module (SRP), a runoff separation module (SOR), a flow concentration module (SFC), and a channel routing module (CR) using the lag and route method (Figure 2). Table 1 lists the physical descriptions and the units of the parameters. For detailed information about the model, we refer to the following references (Zhao 1992; Shi *et al.* 2013).

Parameter | Units | Description |
---|---|---|

WUM | mm | Areal mean tension water capacity of the upper layer |

WLM | mm | Areal mean tension water capacity of the lower layer |

WDM | mm | Areal mean tension water capacity of the deeper layer |

B | / | Exponent of the tension water capacity distribution curve |

C | / | Coefficient of deep evapotranspiration |

IMP | / | Ratio of impervious area |

KC | / | Ratio of potential evapotranspiration to pan evaporation |

SM | mm | Areal mean free water capacity of the surface soil |

EX | / | Exponent of the free water capacity curve |

KI | / | Outflow coefficients of the free water storage to interflow |

KG | / | Outflow coefficients of the free water storage to groundwater |

CS | / | Recession constant of the surface water storage |

CI | / | Recession constant of the interflow storage |

CG | / | Recession constant of the groundwater storage |

CL | / | Recession constant of the lag and route method |

Lag | hour | Lag time of the lag and route method |

Parameter | Units | Description |
---|---|---|

WUM | mm | Areal mean tension water capacity of the upper layer |

WLM | mm | Areal mean tension water capacity of the lower layer |

WDM | mm | Areal mean tension water capacity of the deeper layer |

B | / | Exponent of the tension water capacity distribution curve |

C | / | Coefficient of deep evapotranspiration |

IMP | / | Ratio of impervious area |

KC | / | Ratio of potential evapotranspiration to pan evaporation |

SM | mm | Areal mean free water capacity of the surface soil |

EX | / | Exponent of the free water capacity curve |

KI | / | Outflow coefficients of the free water storage to interflow |

KG | / | Outflow coefficients of the free water storage to groundwater |

CS | / | Recession constant of the surface water storage |

CI | / | Recession constant of the interflow storage |

CG | / | Recession constant of the groundwater storage |

CL | / | Recession constant of the lag and route method |

Lag | hour | Lag time of the lag and route method |

Module | Variable | Units | Description |
---|---|---|---|

SRP | W | mm | Tension water storage |

SOR | S | mm | Free water storage |

SFC | QT | m^{3}/s | Total flow |

SFC | QI | m^{3}/s | Interflow |

SFC | QG | m^{3}/s | Groundwater flow |

Module | Variable | Units | Description |
---|---|---|---|

SRP | W | mm | Tension water storage |

SOR | S | mm | Free water storage |

SFC | QT | m^{3}/s | Total flow |

SFC | QI | m^{3}/s | Interflow |

SFC | QG | m^{3}/s | Groundwater flow |

*QT*,

*QI*, and

*QG*satisfy the following relationship: where

*QS*represents the surface flow. The time designations are omitted for simplicity.

### Synthetic case

*et al.*(2005a), where the synthetic study was designed to demonstrate the capability for estimating the state variables and parameters. In this study, we focus on estimating the initial conditions and therefore only starting points for the model state were sampled. Another essential difference is that in this study the sampling process was conducted by Latin hypercube sampling algorithm rather than the normal random sampling algorithm due to its efficient stratification properties (Helton & Davis 2003). The true values of the measurement were obtained by a free run of the XAJ model using a predefined set of parameters, model inputs (precipitation and measured pan evaporation), and initial conditions. We utilized the precipitation data of a real flood event in Qilijie basin. The measured pan evaporation is assumed to be a constant value (4) and the potential evaporation is evaluated by: where

*EM*represents the measured pan evaporation and

*K*is one of the model parameters which determines the ratio of potential evaporation with respect to the measured pan evaporation. Qilijie basin is a Chinese humid basin with an area of 14,787 km

^{2}, which is located in Fujian province of Southwest China. Detailed information of this flood event is presented in Table 3. We only utilize the data for the first 500 hours.

Start time | End time | Duration (hour) | Sum (mm) | STD (mm) |
---|---|---|---|---|

1989/5/11 9:00 | 1989/6/3 8:00 | 552 | 347.3 | 1.12 |

Start time | End time | Duration (hour) | Sum (mm) | STD (mm) |
---|---|---|---|---|

1989/5/11 9:00 | 1989/6/3 8:00 | 552 | 347.3 | 1.12 |

*et al.*2006). Specifically, a random weight sample of size 20: is obtained by selecting random numbers from the intervals (0, 0.05), (0.05, 0.1), …, (0.95, 1) since the range [0,1] is divided into 20 disjoint intervals with equal probability. For each interval, one value

*ω*is randomly selected. Then the 20 values for

_{ij}*ω*

_{i}_{1}are randomly paired with the values for

*ω*

_{i}_{2}. After that, these pairs are randomly combined with the values for

*ω*

_{i}_{3}. This process is continued until the sample of Equation (19) is obtained. For detailed descriptions of this sampling method, we refer to these references (Helton & Davis 2003; Helton

*et al.*2006). The upper bounds for the starting points of Equation (16) are given by: where

*WM*is the tension water storage capacity and

*SM*is the free water storage capacity. The calculated values for the initial conditions are then obtained by multiplying the weight values and the upper bounds and the true initial condition is given by:

Table 4 lists the parameters of the XAJ model for the synthetic case.

Parameter | Value |
---|---|

WUM | 40 |

WLM | 100 |

WDM | 20 |

B | 0.4 |

C | 0.12 |

IMP | 0.02 |

KC | 0.9 |

SM | 20 |

EX | 1.5 |

KI | 0.3 |

KG | 0.4 |

CS | 0.93 |

CI | 0.95 |

CG | 0.98 |

CL | 0.4 |

Lag | 2 |

Parameter | Value |
---|---|

WUM | 40 |

WLM | 100 |

WDM | 20 |

B | 0.4 |

C | 0.12 |

IMP | 0.02 |

KC | 0.9 |

SM | 20 |

EX | 1.5 |

KI | 0.3 |

KG | 0.4 |

CS | 0.93 |

CI | 0.95 |

CG | 0.98 |

CL | 0.4 |

Lag | 2 |

^{2}. Measurement noise (additive white noise) is generated to perturb the true measurement. Hence, the measured discharge vector is obtained by: where

**is the measurement noise vector,**

*δ*Q**Q**

*is the true measurement vector (free of noise), and*

_{true}**Q**

*is the measurement noise vector (with noise). The measurement noise is generated with a normal distribution with a mean of zero and the noise variance is obtained by setting equal to 0.01. is the 2-norm.*

_{obs}### Real cases

#### Study areas

Two climatologically diverse Chinese basins were selected where snowmelt is not a dominant factor for runoff generation. Shaowu basin is located in the northwestern part of Min River basin and has an area of 2,745 km^{2}. This basin is dominated by subtropical monsoon climate and spring and summer are the wettest seasons. We next implement this methodology to Chai River basin which is generally regarded as a sub-basin of the Liao River basin. The methodology performance was tested using data from the Chai River basin with an area of 1,355 km^{2}, located in one of the semi-humid regions in eastern China. Chai River basin is dominated by temperate continental monsoon climate, resulting in abundant rainfall in summer. Thus, both a humid basin and semi-humid basin are included in this study. We have collected 17 flood events and 33 flood events for Shaowu basin and Chai River basin, respectively.

#### Data basis

The annual precipitation in Shaowu basin ranges from 1,156 to 2,423 mm for the period 1988–1999 and the corresponding measured pan evaporation ranges from 682 to 893 mm. For Shaowu basin, we used eight precipitation stations and one hydrologic station (the outlet station of Shaowu basin). The outlet station is the only available station to obtain the measured pan evaporation data of Shaowu basin and the potential evaporation for this basin is evaluated by Equation (18). Figure 3 depicts the locations of the two basins and the corresponding stations.

The long-term mean annual precipitation of Chai River basin ranges from 494 mm to 1,185 mm. Precipitation inputs were based on five stations for the period 1975–2016. Observed runoff data were based on the Chaihepu hydrologic station which is located at the outlet of the basin. For the Chai River basin, since we cannot get access to the measured pan evaporation data, the potential evaporation inputs were obtained by an empirical method in Yingpan evaporation experimental station located in Liaoning, China. A brief description of this method is illustrated in Figure 4.

For both basins, the model is operated in a lumped and event-based way. The initial conditions of the model are set empirically. To simplify the parameter calibration problem of the hydrological model, some of the insensitive parameters are set to plausible constants and are not further calibrated (Zhao 1992; Shi *et al.* 2013). This applies to the parameters *WUM*, *WLM*, *WDM*, *B*, *C*, *IMP*, and *EX*. The sum of *KI* and *KG* is set to a fixed value (0.7) and therefore we only calibrated *KI* in this study. The ‘Lag’ *L* can be directly obtained by using the ‘trial-and-error’ method since this parameter is a positive integer which mainly serves to determine the ‘translation’ effect of the hydrograph. The remaining parameters were calibrated using the particle swarm optimization (PSO) algorithm (Kennedy 2011). The optimization process was operated by maximizing the error function, as measured by *NSE*. The optimization algorithm uses 50 particles and the value of maximum iterations (epochs) is set to 500. Table 5 lists the parameters for the two basins.

Parameter | Shaowu basin | Chai River basin |
---|---|---|

WUM | 20 | 20 |

WLM | 80 | 100 |

WDM | 20 | 40 |

B | 0.4 | 0.4 |

C | 0.16 | 0.12 |

IMP | 0 | 0.03 |

KC | 0.6 | / |

SM | 10 | 30 |

EX | 1.5 | 1.5 |

KI | 0.3 | 0.58 |

KG | 0.4 | 0.22 |

CS | 0.8 | 0.85 |

CI | 0.82 | 0.976 |

CG | 0.97 | 0.999 |

CL | 0.1 | 0.4 |

Lag | 4 | 3 |

Parameter | Shaowu basin | Chai River basin |
---|---|---|

WUM | 20 | 20 |

WLM | 80 | 100 |

WDM | 20 | 40 |

B | 0.4 | 0.4 |

C | 0.16 | 0.12 |

IMP | 0 | 0.03 |

KC | 0.6 | / |

SM | 10 | 30 |

EX | 1.5 | 1.5 |

KI | 0.3 | 0.58 |

KG | 0.4 | 0.22 |

CS | 0.8 | 0.85 |

CI | 0.82 | 0.976 |

CG | 0.97 | 0.999 |

CL | 0.1 | 0.4 |

Lag | 4 | 3 |

## RESULTS AND DISCUSSION

### Synthetic case

Figure 5 shows the estimated initial conditions for the variables *W* and *S*. The corrected initial conditions of all samples converge to the true conditions shortly. After about 50 time steps, for both variables, the corrected initial conditions are almost identical with the true values, indicating that there is a significant improvement in the accuracy of initial conditions.

An interesting observation of Figure 6 is that the convergence time of *QT* is different from the convergence time of *QI* and *QG*. The convergence time of *QT* (after about 6 time steps) is much earlier than the time of *QI* (after about 100 time steps) and the time of *QG* (after about 60 time steps). As mentioned above, the model output is directly calculated using the lag and route method with the total flow *QT*. Therefore, using the proposed method has also considerably improved the performance of model outputs due to the satisfactory performance of *QT* (Figure 7).

Figure 7 clearly shows that the corrected outputs (discharge) of all samples converge to the measurement shortly since the *RMSE* decreases rapidly and the *NSE* increases rapidly, respectively. Together with the various convergence time of different variables, this may suggest an important fact that operating the XAJ model using different initial conditions can lead to the similar satisfactory performance of the model outputs.

### Real cases

Figure 8 shows a scatter plot of *NSE* and *RMSE* between the corrected performance (NSE: Q_C and RMSE: Q_C) and the original performance (NSE: Q_O and RMSE: Q_O) for the Chai River basin. Clearly, for most of the flood events using the proposed method, there is an increase in the performance of the model output since there is no point located in the bottom right of Figure 8(a) and the top left of Figure 8(b). For the corrected case, the *NSE* ranges from 0.483 to 0.971, while the *NSE* of the original case merely ranges from 0.233 to 0.971. Similar results are obtained for the criterion *RMSE* (see Figure 8(b)).

Then the proposed method is further applied to Shaowu basin. Figure 9 displays a scatter plot similar to Figure 8.

As expected, for the Shaowu basin, the use of the proposed method has also resulted in improvement in the overall performance by estimating the model initial conditions. The mean *RMSE* decreases from 408.97 m^{3}/s to 396.12 m^{3}/s and the mean *NSE* increases from 0.636 to 0.683 when correcting the initial conditions. Note that for all the flood events of both basins, there is no deterioration in the model performance when using the proposed method as compared to the original performance (free of initial conditions estimation). The reason for this is that we add an additional restriction in the dashed box of Figure 1 (*NSE* ≥ *NSE_O*), indicating the initial conditions are only updated when the new performance (using the updated initial conditions at the current time step) is better than the original performance. In other words, this restriction will assure that there is no deterioration in the model performance when using the performance.

Therefore, these results indicate that the use of the proposed method has improved the model performance for both basins. We would like to note an important fact that as presented in the section ‘TSVD-DSRC method’, the new method is presented in a general perspective and is not restricted to a specific hydrological model (XAJ model was only utilized to test the proposed method). Thus, clearly, this is a general method rather than a specific one. As shown in Figure 1, the proposed method is easy to operate. For different hydrological models, the only different step is to choose the vector of the initial model conditions based on the model structure. Hence, the proposed method is a general method which can be utilized to handle the error in the model initial conditions (hence improve the performance hydrological modeling). In this study, we only test the new method with a simple conceptual hydrological model. In considering this, a logical next step is to test the new method with more physically based distributed models such as the hydrodynamic models (Singh *et al.* 2014; Cea & Bladé 2015; Liang *et al.* 2015; Xia *et al.* 2017; Yu & Duan 2017) and the hybrid models (combining hydrodynamic and hydrological techniques) (Cea *et al.* 2010; Bellos & Tsakiris 2016).

Despite the stable performance, note that for some of the flood events, there is no increase in the model performance when compared to the original case. This may be caused by the fact that errors may stem from several sources: model structure, parameters, states, initial conditions, etc. (Clark & Vrugt 2006; Kuczera *et al.* 2006; Hendricks Franssen & Kinzelbach 2008; Renard *et al.* 2010), which means that the error in the initial conditions is not the only error source. For example, in some cases the model structure error is dominant but we attribute all the errors to the error of initial conditions. Hence, it may be inappropriate to attribute all the errors to only initial conditions. Unfortunately, our capabilities for quantifying model error still show considerable deficits, even though there have been many hydrological models which are generally reported to be reliable (Costabile *et al.* 2012a, 2012b; Liu *et al.* 2012; Singh *et al.* 2014; Liang *et al.* 2015; Bellos & Tsakiris 2016; Xia *et al.* 2017; Yu & Duan 2017).

## CONCLUSIONS

In this study, a new version of the DSRC method using the regularization techniques is proposed to correct the initial conditions of a conceptual rainfall-runoff model. The ill-posed property of the DSRC method can result in unstable performance. To solve this problem, a new regularized scheme is proposed. This new version of the DSRC method uses the truncated singular value decomposition to stabilize solutions and the choice of the regularization parameter is implemented by the L-curve method. A synthetic case was used to evaluate the performance of estimating the initial values of the model state. The results show that the proposed method can significantly improve the accuracy of the initial model conditions since the difference between the corrected trajectories and the true trajectory decreases rapidly and hence improves the model performance. Then, a further application, where the method was applied in two Chinese basins, was used to demonstrate the applicability in real cases which are actually more complex.

Research aimed at improvements to the current algorithm, including reliable techniques for reducing the computational demand and improving the performance for estimating the initial conditions, will continue. Despite the good conformity between the observations and the simulations, our capabilities for distinguishing and quantifying model error still show considerable deficits. Model errors may stem from many sources such as model structure, parameters, states and initial conditions. In considering this, it may be inappropriate to attribute all the errors to the error of initial conditions. Therefore, future work is needed to take other sources of errors into consideration. Also, to obtain an overall better performance, further work is needed to incorporate some additional intrinsic information of the hydrological model such as constraints on model variables. The computational cost of the proposed method can significantly increase when applying to distributed hydrological models. Considering this limitation, techniques for reducing the computational cost, such as dimension reduction technique, may be investigated in a future study.

## ACKNOWLEDGEMENTS

This research is supported by the Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX18_0575), the Fundamental Research Funds for Central Universities (2018B604X14), the National Key R&D Program of China (2016YFC0402703), the National Natural Science Foundation of China (51709077), the National Natural Science Foundation of China (41371048), the National Natural Science Foundation of China (51479062), and the National Natural Science Foundation of China (51709076).