## Abstract

Quantifying forecast uncertainty is of great importance for reservoir operation and flood control. However, deterministic hydrological forecasts do not consider forecast uncertainty. This study develops a conditional probability model based on copulas to quantify forecast uncertainty. Three updating models, namely auto-regressive (AR) model, AR exogenous input model, and adaptive neuro fuzzy inference system model, are applied to update raw deterministic inflow forecasts of the Three Gorges Reservoir on the Yangtze River, China with lead times of 1d, 2d, and 3d. Results show that the conditional probability model provides a reasonable and reliable forecast interval. The updating models both enhance the forecast accuracy and improve the reliability of probabilistic forecasts. The conditional probability model based on copula functions is a useful tool to describe and quantify forecast uncertainty, and using an updating model is an effective measure to improve the accuracy and reliability of probabilistic forecast.

## INTRODUCTION

Flooding is one of the most frequent and severe natural disasters in China; thus, flood forecasting plays a critical role in flood control, disaster reduction, and water resources management (Shen *et al.* 2015; Wu *et al.* 2017). Due to the extreme complexity of hydrological processes and the limitation of human knowledge, there inevitably remains uncertainty in the hydrological model output, which mainly results from the errors of model input, model structure, and model parameter (Liu & Gupta 2007). However, traditional and deterministic hydrological forecasting does not consider forecast uncertainty. Decision-makers cannot make a quantitative assessment of the possible risk of flood control and reservoir operation in the absence of uncertainty information. Incorrect decisions may result in casualties and economic losses (Palmer 2002). Therefore, it is essential to quantitatively assess the inherent uncertainty of hydrological forecasts.

To overcome the shortage of deterministic forecasts, many researchers have concentrated on the study of probabilistic forecast in recent years (Krzysztofowicz 1999; Duan *et al.* 2007; Montanari & Grossi 2008; Todini 2008; Pianosi & Raso 2012; Madadgar & Moradkhani 2014; Khajehei & Moradkhani 2017; He *et al.* 2018; Liu *et al.* 2018). The current hydrological uncertainty study mainly focuses on two approaches. The first one is the total factor coupling approach, which quantifies the uncertainties of each link in the rainfall–runoff process, such as rainfall input uncertainty, model structure uncertainty, and model parameter uncertainty. One example of this approach is ensemble forecast that generates a set of forecast values through different inputs, different models, and different parameter optimization methods to express forecast uncertainty (Palmer *et al.* 2005). Many statistical post-processing methods such as Bayesian model averaging, model conditional processor, and ensemble model output statistics are proposed to accommodate the under-dispersion of ensemble forecasts (Gneiting *et al.* 2005; Raftery *et al.* 2005; Zappa *et al.* 2008; Coccia & Todini 2011). The other one is the total error analysis approach, which directly processes the deterministic forecasts to realize the probabilistic forecasts, such as hydrological uncertainty processor, quantile regression, and Bayesian joint probability model (Krzysztofowicz & Kelly 2000; Weerts *et al.* 2010; Zhao *et al.* 2015). The core of the probabilistic forecast literature mentioned above is to establish multivariate joint distribution, where the multivariate Normal distribution is used commonly (Biondi & Todini 2018). Considering that hydrological variables often do not follow the Normal distribution, many data transformation methods such as Box-Cox transformation (Box & Cox 1964) or Normal quantile transform (Kelly & Krzysztofowicz 1997) have been proposed to handle any marginal distributions.

Most previous probabilistic flood forecasting models only consider the uncertainty of the hydrologic model structure and parameters; this study will consider the uncertainty of hydrologic model structure and parameters, as well as model input. So, a conditional probability model based on copula is developed to post-process daily streamflow forecasts and estimate uncertainty in this study. For the purpose of improving the accuracy and reliability of daily streamflow forecasts, three updating models, including auto-regressive (AR), AR exogenous input (ARXM), and adaptive neuro fuzzy inference system (ANFIS), are applied to correct forecasting errors. The effect of three updating models on the accuracy and reliability of probabilistic forecast is also explored and discussed. The remainder of this paper is organized as follows: the conditional probability model, three updating models, and evaluation criteria are explained in the next section. The study area and the hydrological forecasting model are presented in the section ‘Case studies’. In the following section, the results and discussion are given. Finally, the main conclusions are summarized in the last section.

## METHODOLOGY

### Copula function

*n*-dimensional random variables are , where

*n*is the number of random variables and is the value of random variables . Let be the joint distribution function of random variables . According to Sklar's theorem (Sklar 1959), the multivariate cumulative distribution function (CDF) can be written as follows:where

*C*is the copula function and is the parameter of

*C*. If are continuous, then

*C*is unique.

There are different families of copulas that have been employed in a variety of applications (Nelsen 2006). Of all the copula families, Archimedean copulas find a wide range of applications in hydrology, since they can be easily constructed and possess many good properties like simplicity and adaptability. Gumbel-Hougaard, Clayton, and Frank copulas are three commonly used one-parameter Archimedean copulas, which are widely applied in hydrology by many authors (Zhang & Singh 2006; Duan *et al.* 2016; Zhong *et al.* 2017; Yin *et al.* 2018a, 2018b), and chosen in this study to establish the joint distribution between forecasts and observations. The expression of these copulas and the correlation between their parameters and Kendall correlation coefficients are listed in Table 1. The parameters of the two-dimensional Archimedean copulas can be estimated by Kendall correlation coefficients.

*RMSE*) is used to measure the goodness of fit between the empirical and the theoretical joint distribution (Zhang & Singh 2007), and defined as follows:where and are the empirical and theoretical joint CDF values, respectively. The smaller the value of

*RMSE*is, the better the copula function.

### Conditional probability

Let predictand *H* be the observed discharge whose realization *h* is being forecast. Let estimator *S* be the output discharge generated by a corresponding deterministic forecast model whose realization *s* constitutes a point estimate of *H*. Let random variable *H*_{0} represent the observed discharge at the time *k* = 0 when the forecast is prepared; then *H _{k}* (

*k*

*=*1, 2

*, …, N*) is the observed discharge and

*S*(

_{k}*k*= 1, 2, …,

*N*) is the corresponding deterministic forecast discharge at lead time

*k.*What the rational decision-maker then needs is not a single number

*s*, but the distribution function of predictand

_{k}*H*conditional on

_{k}*S*

_{k}*=*

*s*(Liu

_{k}*et al.*2018).

The solution of conditional probability model based on copula (Equation (5)) is given by the following steps:

- 1.
In the model calibration period, the parameters of marginal distribution and copula function are estimated by the data series of observations (

*H*) and forecast values (*S*), respectively. - 2.In the real-time flood forecasting, the observations are unknown for the lead times of 1d, 2d, and 3d. We discretize and determine the distribution range [
*h*,_{min}*h*_{ma}_{x}] for the observation*H*, according to its real flood probability distribution characteristics. Then the discrete*h*can be expressed as follows:where*h*and_{min}*h*_{ma}_{x}are the minimum and maximum observation,*n*is the number of discretization, and is the discretization spacing. - 3.
Calculate the density function of copula and the density function value , according to the forecast values (

*s*) and the discrete value_{k}*h*. - 4.
Calculate the conditional density function value by Equation (5).

For a given confidence level , let and be the 5% and 95% quantiles of the random variable *h*, respectively. Then the 90% confidence interval of *h* is used to reflect the forecast uncertainty.

### Updating models

#### AR updating model

*t*,

*h*and

_{t}*s*denote the observed and forecasted flow discharge at the current time

_{t}*t*, respectively, is the mean value of forecast error series, and is the distance average series of

*e*, where the mean value of equals zero.

_{t}*p*is the order of the AR model, which can be determined by autocorrelation and partial autocorrelation and Bayesian information criterion (BIC).

#### ARXM model

*p*and

*q*are the orders of auto-regressive and the exogenous input parts of the ARXM, respectively. are the coefficients of the ARXM model, which are estimated by the ordinary least square method.

The ARXM updating model differs from the AR updating model. In the ARXM updating procedure, the model acquires the updated forecasts directly, whereas during the AR updating process, the forecast errors at the current time are firstly estimated, and then added to the original model output to obtain the updated forecasts.

#### Adaptive neural fuzzy inference system

According to the types of fuzzy reasoning and fuzzy if-then rules, the ANFIS can be categorized into three types (Jang 1993). In this study, Takagi and Sugeno's fuzzy if-then rules are used, according to which the output of the whole fuzzy inference system is a weighted average of each rule's output, the weight values of each rule output are determined by the membership function (a mathematical tool used to represent the ‘degree of truth’ of the element belonging to a fuzzy set).

*x*

_{1}and

*x*

_{2}and the fuzzy system has only two if-then rules of Takagi and Sugeno's type. Then, its typical fuzzy inference rule can be described as follows:where

*A*

_{1}and

*A*

_{2}are the linguistic label (small or large) of the inputs and

*B*

_{1}and

*B*

_{2}are the linguistic label (small or large) of the inputs .

As shown in Figure 1, the ANFIS network structure consists of five layers, the functions of each layer node are explained as follows:

*x*

_{1}belongs to the corresponding fuzzy set

*A*, having the following form:where is the membership function of the fuzzy set

_{i}*A*, the Gaussian function is chosen in this study.

_{i}*F*denotes a non-linear function.

_{ANFIS}### Evaluation criteria

#### Deterministic forecast

- 1.Nash-Sutcliffe efficiency (
*NSE*)where and denote the observed and forecasted flow discharges at time*t*, respectively; and is the average value of observed flows. The closer the value of*NSE*is to 1, the better the model performs.*NSE*indicators are sensitive to the evaluation of high flow, while being insensitive to the additive and proportional differences between forecasts and observations. - 2.
- 3.
*RMSE*where*RMSE*represents the standard deviation of the differences between the observed values and simulated values, which is sensitive to outliers. The value closer to zero indicates a better forecast. In the evaluation of medium and high flows, the*RMSE*indicator can better reflect the pros and cons of the model. - 4.Mean absolute error (
*MAE*)where*MAE*is the average absolute error between the observed and forecasted value, it can reflect the forecast error of each point more intuitively. Small*MAE*values are preferred. In the evaluation of low and medium flows, the*MAE*indicator can better reflect the pros and cons of the model. The units of*RMSE*and*MAE*are m^{3}/s.

#### Probabilistic forecast

In this section, continuous ranked probability score (*CRPS*), quantile-quantile (Q-Q) plot, and other indicators are used to evaluate the quality of probabilistic forecast.

- 1.
*CRPS*

*CRPS*is the most widely used criterion to give a comprehensive evaluation of the quality of the probabilistic forecast, which can address the reliability and sharpness of predictive distribution simultaneously (Gneiting & Raftery 2007). The

*CRPS*is defined as follows:where is the cumulative probability function of the forecast at time

*t*; is the observed flow at time

*t*; and denotes the Heaviside step function, which is calculated byFor the deterministic forecasts, the

*CRPS*reduces to generalize the absolute error, which is equivalent to the

*MAE*(Zhao

*et al.*2015; Liu

*et al.*2018). Thus, it can also be used to directly compare the accuracy of the deterministic forecast and probabilistic forecast. Smaller

*CRPS*values indicate better performance.

- 2.
Q-Q plot

*et al.*(2010) is used and calculated as follows:where is the value that the predictive CDF derives at the observation and is the non-exceedance uniform probability of the observation. The value of the ranges from 0 to 1, large values mean it has greater reliability.

- 3.
Assess the 90% confidence interval

*CR*), average band-width (

*B*), average relative band-width (

*RB*), and percentage of observations bracketed by the unit confidence interval (

*PUCI*). These indices are used and calculated as follows:where

*n*is the number of measured data contained in the prediction interval. It describes the ratio of the number of observed points within the confidence interval to the total number of the observed points. The larger the

_{c}*CR*value indicates the higher percentage of the observed flows within the confidence intervalwhere denotes the upper bound of the 90% confidence interval corresponding to the 95% quantile and denotes the lower bound of the 90% confidence interval corresponding to the 5% quantilewhere is the observed flow at time

*t*. The average band-width and the average relative band-width are also commonly used to assess the quality of the confidence interval. For a given confidence level , it is best that the values of

*B*and

*RB*should be as small as possible.

*RB*value indicates that the average relative band-width of the confidence interval is narrower; the larger the

*CR*value means higher percentage of the observations bracketed by the confidence interval. However, Xiong

*et al.*(2009) indicated that as the

*CR*increased, the

*RB*value of the confidence interval also increased. The two indices are often contradictory when assessing the quality of the 90% confidence interval. Therefore, a comprehensive index of the percentage of observations bracketed by the unit confidence interval (

*PUCI*) proposed by Li

*et al.*(2011) was used to evaluate the quality of the uncertainty interval estimate. It can be expressed as follows:The larger the

*PUCI*value, the better the performance of the 90% confidence interval is.

## CASE STUDIES

### Study area and data

The Yangtze River is the largest river in China and the third longest river in the world. The Three Gorges Reservoir (TGR), located in the upper reaches of the Yangtze River, was selected as a case study. It is the largest and the most important water conservancy project in China. The river channel above the dam site of the TGR is 4,500 km long with a contributing drainage area of . The TGR plays an important role in flood control in the middle and lower sections of the Yangtze River, with the combined benefits of power generation, navigation improvement, etc.

The TGR intervening basin, shown in Figure 2, has a catchment area of 55,907 km^{2}, accounting for 5.6% of the TGR basin area. The mean annual rainfall and evaporation in the basin are 1,100 mm and 900 mm, respectively. The mean annual inflow is 14,300 m^{3}/s. The observed and forecasted daily inflows of Cuntan and Wulong hydrological stations, and the daily rainfall data in the intervening basin from 2012 to 2015 are collected for the inflow forecasting of the TGR. The data series from 2012 to 2014 are used for calibration, and the data of 2015 are used for validation. Numerical precipitation forecasting is necessary to prolong the forecast horizon. The numeric weather prediction daily precipitation predictions from the European Center for Medium-range Weather Forecast (ECMWF) in lead times of 1d, 2d, and 3d are used as a hydrological model input in this study. The statistical characteristics of calibration and validation data are also shown in Figure 3. The maximum and minimum flow during the validation period are included in the calibration period, and the samples in the validation period have similar statistical characteristics during the calibration period, indicating that the calibration and validation period are reasonably divided and can be used for model calibration.

### Hydrological forecasting model

*Q*(

_{sx}*t*),

*Q*(

_{ct}*t*), and

*Q*(

_{wl}*t*) denote the inflow of TGR, the flow discharges of Cuntan station and Wulong station at time

*t*, respectively.

The Xinanjiang model proposed by Zhao (1980) is suitable for humid and semi-humid regions and has been widely used for flood simulation and forecasting in China. The model is mainly characterized by the concept of runoff formation as natural storage in humid regions, which implies that runoff is produced after the soil water content reaches field capacity. The storage capacity curve is used to solve the problem of the uneven space distribution of the runoff formation throughout the basin. There are 15 parameters, and each parameter has a defined physical meaning. A more detailed description of the model is available in Zhao (1980).

In the model simulation calibration and verification periods, the observed flood series of the Cuntan and Wulong hydrological stations as well as the observed rainfall of the interval basin area are used as model inputs. The output of the model is the inflow of the TGR. The parameters of the Xinanjiang model and Muskingum method are optimized by the genetic algorithm (Goldberg 1989). The simulation results for the *NSE* and *RE* are 99.05% and −0.57% in the calibration period, 98.30% and −1.57% in the validation period, respectively. The simulation results indicate that the established model can be used as a forecasting scheme in practice.

## RESULTS AND DISCUSSION

### The order of the three updating models

The order of the three updating models can be determined by using the autocorrelation function (ACF), partial ACF (PACF) and the BIC. In the ARXM and ANFIS updating procedure, the recent values of the observed discharge together with the non-updated outputs of the hydrological model are used as inputs, whereas during the AR updating process, the distance average series of the forecast error series is used as its input, the order of models is determined separately. The BIC results of the AR model under different model orders are plotted in Figure 5 (red line). The ACF and PACF of the error time series are shown in Figure 6. In Figure 5, the BIC has the minimum value at the order *p* = 5. In Figure 6, the ACF of the error time series tends to be zero with the increase in the lags. However, the partial autocorrelation coefficient is truncated, which occurs roughly at the order *p* = 5. Therefore, the order of the AR model is determined as 5.

For the ARXM and ANFIS updating models, the order *p* and *q* need to be determined. The ACF and PACF of the observed TGR inflow series are calculated and shown in Figure 7. In Figure 7, the PACF is truncated at the order *p* = 3. The BIC results of different orders *q* at order *p* = 3 are shown in Figure 5 (blue line), on which the BIC has the minimum value at the order *q* = 3. Thus, the order *p* = 3 and *q* = 3 are used to update the forecasts by the ARXM and ANFIS models.

### The marginal distributions

First, the marginal distributions of the observed flow, raw forecasts, and updated forecasts need to be determined. Seven commonly used distributions, including Gamma, Lognormal, Generalized Extreme Value (GEV), Pearson type III (P-III), Gumbel, Weibull, and Log-logistic, are chosen as candidate distributions to fit the marginal distributions of all variables. The parameters of each distribution are estimated by the maximum likelihood estimation method. The modified Shapiro-Wilk test developed by Ashkar & Aucoin (2012) is applied to evaluate the goodness of fit of the probability distributions. The test results of each candidate distribution are listed in Table 2. The best marginal distribution is the one that has the maximum test values. As shown in Table 2, the P-III distribution performs the best fitting with the maximum test values for all variables. The upper tail of the observed flow series is relatively large, which is consistent with the characteristics of the P-III distribution and has been widely used by many authors (Zhong *et al.* 2017; Yin *et al.* 2018a, 2018b). The estimated parameters of P-III distributions for all variables are listed in Table 3.

### Construction of the joint distribution based on copula

Three commonly used Archimedean copula functions, including Gumbel, Clayton, and Frank, are selected to construct the joint distribution of the forecasts and observations. The Kendall correlation coefficient between observed and forecasted series is calculated, and then the parameters of three copula functions are computed, according to the relationship between and . The *RMSE* values for evaluating the goodness of fit between theoretical and empirical joint distributions are calculated for each copula function. The results listed in Table 4 show that the Gumbel copula has the best fit with the smallest *RMSE* values for lead times of 1d, 2d, and 3d. Thus, the Gumbel copula is applied for constructing the joint distribution between observations and forecasts. It can also be seen in Table 4 that the Kendall correlation coefficient between observations and updated forecasts is greater than the original ones, which indicates that the Kendall correlation coefficient between observed series and forecasted series has been improved after error correction. Furthermore, it can be seen in Table 4 that the Kendall correlation coefficient decreases as the lead time increases as expectedly, since forecast quality decreases as the lead time increases.

### Comparison of deterministic and probabilistic forecasting results

#### Deterministic forecasting results

According to the distribution range of the observation, let *h _{min}* = 3,000 m

^{3}/s and

*h*= 90,000 m

_{max}^{3}/s (the cumulative probability of which is above 0.99999). It is theoretically better to select a small discrete spacing. However, smaller discrete spacing introduces more computational complexity into the calculation. Therefore, the discrete spacing is selected in this study. Then the discrete

*H*of each time step can be calculated by Equation (6), where the number of discretization is

*n*= 43,501. Based on the principle of mathematical statistics, the probability median forecast is calculated by Equation (8) and used as the point estimate results of probabilistic forecast. Four evaluation indexes, namely

*NSE*,

*RE*,

*RMSE*, and

*MAE*, are used to evaluate the accuracy of the median forecast. The comparison with the raw forecast is presented in Figure 8, which shows that in terms of all evaluation indicators, the accuracy of probability median forecast is significantly improved compared with the raw forecast. Compared with the raw forecast results, the

*RE*of the probability median forecast are reduced by 0.12%, 3.38%, and 5.01%, and the

*NSE*are improved by 0.1%, 0.6%, and 1.6% for lead times of 1d, 2d, and 3d in the validation period, respectively.

The deterministic evaluation indexes of raw forecast and three updating models for lead times of 1d, 2d, and 3d in calibration and verification periods are listed in Table 5. The comparison between the forecast results of the three updating models and the original model is also shown in Figure 9. It is obvious that the performances of three updating models are superior to that of the raw forecasts in terms of almost all evaluation indicators. The ANFIS updating model performs the best for almost all lead times, while the ARXM updating model performs slightly better than the AR model for 1d, 2d, and 3d lead times. The ANIFIS model performs the best because it not only can effectively identify the non-linear relationship in hydrological process but also effectively classify the inputs into high and low levels, which can consider the relationship between errors and flood magnitude. In addition, according to the results of *NSE* and *RE* indexes, the improvement degree becomes more obvious with the increase in lead time. However, for the *RMSE* and *MAE* indexes, the improvement degree decreases with lead time increases. For the ANFIS updating model, the *MAE* indicator in 1d, 2d and 3d lead times has improved compared with the original forecast, while for the AR and ARXM models, the improvement in the *MAE* index is no longer achieved for long lead times (3d) compared to the original forecast. This indicates that the ability of the updating model for correcting medium and low flow is gradually lost as the lead time increases. Comparing the results of indicators in Figures 8 and 9, it can be found that the traditional updating models, including AR, ARXM, and ANFIS, are better than the copula-based probability model.

#### Probabilistic forecasting results

The conditional probability density and distribution curve of the observation *H,* given *S _{k}* =

*s*, can be obtained based on the conditional probability for given forecasts. Take 1 July 2015 (

_{k}*S*

_{1}= 38,462 m

^{3}/s,

*S*

_{2}= 33,469 m

^{3}/s, and

*S*

_{3}= 26,386 m

^{3}/s) as an example, the PDF and CDF of observation

*H*are presented in Figure 10. As can be seen in Figure 10, the conditional density function becomes fatter with the increase in lead time, indicating that the flood forecasting uncertainty increases as the lead time increases.

In order to evaluate the reliability of the probabilistic forecast, the predictive Q-Q plots for three updating models with different lead times are shown in Figure 11. As the Kolmogorov–Smirnov test is highly sensitive when applied to the large data sets (Serinaldi 2009), none of the Q-Q plots lie within the significance bands. However, overall the Q-Q plots for the updating models are closer to the 1:1 line than the original model for all lead times. It is noted that we cannot obtain enough information from the visual inspection of the Q-Q plot to evaluate their reliability. Therefore, the *CPRS* and are used to quantitatively measure the reliability of the probabilistic forecasts. The values of *CRPS* and for the probability raw forecast and updating models for lead times of 1d, 2d, and 3d are listed in Table 6. The values of for three updating models are closer to 1 than those of the original model, which implies that the probabilistic forecasts are more reliable after forecast correction. It is also shown that the *CRPS* values increase as the lead time increases, which indicates that the probabilistic forecast is less sharp and reliable at longer lead times. Compared with the corresponding raw deterministic forecast, the *CRPS* values of the probabilistic forecast in the validation period are reduced by 25.76%, 26.86%, and 26.37% for lead times of 1d, 2d, and 3d, respectively. It indicates that the conditional probability model based on copula is a useful tool for quantifying and assessing forecast uncertainty. Table 6 also shows that as the accuracy of deterministic forecast increases after error correction, the corresponding probabilistic forecast becomes more reliable. This is because the conditional probability model achieves probabilistic forecast on the basis of given deterministic forecast and the reliability of probabilistic forecast depends largely on the accuracy of deterministic forecast. In terms of the comprehensive index *CRPS*, the ANFIS model performs the best in almost all lead times.

The evaluation index results of 90% confidence interval at different lead times are illustrated in Figure 12. The average and relative band-widths gradually increase as lead times increase, indicating that the inflow forecast uncertainty increases. The *B* and *RB* values of three updating models are smaller than that of the original forecast. The ANFIS updating model has the lowest *B* and *RB* values among the three updating models. In terms of *CR* values, the coverage rate of different updating models in all lead times is close to or exceeds the given confidence level 90%, indicating that the 90% forecast interval is reasonable and reliable. The *PUCI* indicator is used to compare the performance of different updating models. As shown in Figure 12, the *PUCI* values of probabilistic forecast decrease as lead times increase, which means that the forecast uncertainty of reservoir inflow increases with lead times. On the other hand, the *PUCI* value of updated forecast increases significantly compared with that of the raw deterministic forecast, which implies that the probabilistic forecasts are more reliable after error correction. In terms of *PUCI* value, the ANFIS model performs the best with the largest *PUCI* value among the three updating models. There is no significant difference between the ARXM and AR models.

The 90% confidence intervals of the raw forecast model and ANFIS updating model in the flood season in the validation period are plotted in Figure 13. It is observed that the 90% confidence interval covers most of the observed flows, indicating that the attained probabilistic forecast interval is reliable. The band-width of 90% confidence intervals attained by the updating models is narrower than the original. Furthermore, the 90% confidence intervals become wider at a longer lead time, indicating that the forecast uncertainty becomes larger as lead time increases.

## CONCLUSIONS

Quantifying the inherent uncertainty in flood forecasting is of great importance for flood control decision-making and water resources management. Properly communicating uncertainties in flood forecasting provides the basis for reservoir operation, and thus prevents casualties and economic losses. However, traditional flood forecasting does not consider predictive uncertainty. This study develops a conditional probability model based on copula to quantify and estimate predictive uncertainty. Three updating models, namely the AR model, ARXM model, and ANFIS model, are applied to correct raw forecasts. The TGR in China is selected as a case study. The main conclusions of this study can be summarized as follows:

- 1.
Three updating models can significantly improve the raw forecast results in terms of the deterministic evaluation criteria. The degree of improvement in forecast accuracy by the ANFIS updating model outperforms the ARXM and AR models.

- 2.
The conditional probability model based on copula is a useful tool for post-processing deterministic forecast to quantify forecast uncertainty, providing both a deterministic forecast value along with a reasonable and reliable uncertainty interval. The updated probability forecasting results show higher accuracy and reliability than that of raw forecasts in terms of

*CRPS*, , and*PUCI*, indicating that the updating models are a useful tool to improve the reliability of probabilistic forecast. The ANFIS model is superior to the ARXM and AR models in terms of comprehensive indicator*CRPS*and*PUCI*in almost all lead times. - 3.
The conditional probability forecast model based on copula can quantitatively assess forecast uncertainty, and the three updating models can improve the accuracy and reliability of the probability forecast. In terms of deterministic flood forecasting, the conditional probabilistic forecasting model based on copula can improve the forecast accuracy; however, the degree of improvement by the three updating models is more significant.

In summary, the developed conditional probability model combined with the updating models quantify the hydrological forecasting uncertainty accurately and reliably. Like most existing studies, this paper focuses on evaluating the pros and cons of probabilistic forecasting schemes; however, how to effectively take advantage of the risk information provided by probabilistic forecasts in the reservoir operation decision-making process needs further study.

## ACKNOWLEDGEMENTS

This study is financially supported by the National Key R&D Project (2016YFC0402206) and National Natural Science Foundation (51539009, 51879192) of China. We are very grateful to the editor and anonymous reviewers for their valuable comments and constructive suggestions, and Dr Bryan O'Reilly for proofreading that helped us to greatly improve the manuscript.