## Abstract

Uncertainty analysis is important and should always be considered when using models for flood forecasting. In this paper, the ‘Principal Components Analysis-Hydrologic Uncertainty Processor’ (PCA-HUP) was developed for probabilistic flood forecasting (PFF) and evaluated in the middle Yellow River, China. Owing to severe sediment erosion, small and medium floods drain in the main channel (normal floods), while large floods spill over the bank and drain in the river floodplains (overbank floods). Practical routing methods were therefore used to provide the deterministic flood forecasting (DFF) input for PCA-HUP, which quantifies the forecast uncertainty and provides PFF results. A comparison of the DFF and PFF outputs indicated that PFF could also provide a deterministic hydrograph of good accuracy. To explore how the performance of DFF and PFF decays with increasing lead time, the lead times *n* = 1, 6 and 10 hours were chosen for comparison. The results suggested that the performance of both DFF and PFF decayed as the lead time increased. This study demonstrated the practicability of PCA-HUP in operational forecasting for both normal and overbank floods in the middle reach of the Yellow River.

## INTRODUCTION

Flood forecasting has been an operational support to real-time reservoir management, flood alert and flood risk reduction since the 1970s, and has been studied extensively in the past decades. Currently, the results of most operational flood forecasting (discharge or water level) are in deterministic form, namely the forecast is based on one or more single-valued model outputs; this is defined as deterministic flood forecasting (DFF). Many hydrologic models have been developed for flood forecasting (Todini 2007; Qu *et al.* 2009). Conceptual models were developed based on different simplifications and assumptions of actual hydrologic processes (Zhao 1992; Todini 1996; Li *et al.* 2014b). In order to simulate or forecast the real hydrologic process in more detail, physically based distributed hydrologic models were developed with the integration of remote sensing and geographic techniques (Abbott *et al.* 1986a, 1986b; Li *et al.* 2014a).

However, flood forecasting is always accompanied by uncertainties from inputs, model structure and model parameters, owing to the complexity of natural hydrologic processes and the limits of human knowledge of physical hydrology (Jiang *et al.* 2013, 2016; Yu *et al.* 2015; Li *et al.* 2017). More and more hydrologists realize that uncertainty analysis should always be considered when using a rainfall-runoff model for flood forecasting. Furthermore, some studies have attempted to produce flood forecasting results with uncertainty information, i.e. in probabilistic form (Krzysztofowicz 1999; Kavetski *et al.* 2006; Ajami *et al.* 2007). Such models or methods of probabilistic flood forecasting (PFF) quantify the forecast uncertainty by providing estimates of the probability density function (PDF) or cumulative distribution function of the predictand (Bogner & Pappenberger 2011). A quantitative assessment of forecast uncertainty, in the form of a variance or confidence interval, can then be obtained in addition to the single-valued predictions of DFF. Thus, PFF could provide rich information on the predictand for decision-making in flood control (Liang *et al.* 2012).

The methods of PFF could be categorized into two types. One is the ‘element coupled’ approach. In this approach, each element of the rainfall-runoff process that brings uncertainty into the final forecast is identified, and the PDF of each element is then estimated and coupled within a hydrologic model to obtain the PDF of the predictand. For example, the popular uncertainty analysis method named generalized likelihood uncertainty estimation (GLUE) (Beven & Binley 1992) analyzes the model parameter uncertainty based on Monte Carlo simulations. Kavetski *et al.* (2006) analyzed the uncertainty from multiple sources in rainfall-runoff modeling with the framework of Bayesian Total Error Analysis (BATEA). Similarly, Ajami *et al.* (2007) proposed the Integrated Bayesian Uncertainty Estimator (IBUNE), which considers the uncertainty from precipitation input and parameters with the SCEM-UA technique (Vrugt *et al.* 2003) and the uncertainty of model structure with the Bayesian Model Averaging method.

Another type of PFF method is the ‘error analysis’ approach, which analyzes the characteristics of the bias or error between the DFF output and the observation and then generates results in probabilistic form. A typical ‘error analysis’ method is the Bayesian Forecasting System (BFS) (Krzysztofowicz 1999, 2002), which provides a general methodology for PFF based directly on model forecasts and observations. In BFS, the total predictive uncertainty in flood forecasting is characterized as a Bayesian combination of forecast precipitation uncertainty and hydrologic uncertainty, which are processed separately through a precipitation uncertainty processor and a hydrologic uncertainty processor (HUP), respectively. The main advantage of BFS is its ‘model-free’ framework, which could be integrated with the results from any deterministic rainfall-runoff model because it only analyzes the deterministic model's output. In the calculation of HUP, however, the prior distribution and likelihood function are assumed to be linear in the normal space. This explicit linear relationship among independent variables in the likelihood function would cause multicollinearity in the regression equation (Kumar 1975). In the HUP model, the parameters of the regression equation are estimated by the least squares method. Because of the multicollinearity, this may make the regression equation unstable and further decrease the forecast accuracy (Llorca 1999).

To eliminate the influence of multicollinearity on the parameter estimation of HUP and improve the prediction accuracy, the ‘Principal Components Analysis-Hydrologic Uncertainty Processor’ (PCA-HUP) was developed as a PFF model based on the traditional HUP model and principal components analysis. In this paper, the middle reach of the Yellow River was selected as an example, and the PCA-HUP model was used to analyze the errors in DFF results and obtain the posterior distribution of the forecasting variables. To our knowledge, PFF has not previously been used in operational flood forecasting in the middle reaches of the Yellow River, so this study is a first attempt. The remainder of this paper is structured as follows. First, the study reach, data and methodology are described. Second, results of DFF and PFF are presented and discussed. Finally, the main conclusions are drawn.

## MATERIALS AND METHODS

### Study area

The Lower North Mainstem (LNM) is defined as the main stream between the Longmen and Tongguan stations in the middle reach of the Yellow River. The Tongguan hydrologic station at the outlet controls the inflows from upstream of Longmen station and the runoff from the Wei, Fen and Beiluo River basins. Floods at the Tongguan station are mainly produced by the inflow from the drainage area of Longmen station and the flow from the tributary basins. The flood hydrograph of the LNM reach is characterized by rapid rise and recession, especially when the flood is generated by the inflow at Longmen station, with the flood duration usually less than 3 days (MYRHB-YRCC 2005). The LNM reach is 136.5 km long with a river bed gradient of 0.03–0.06%, and the watershed area is 4,080 km^{2} (Figure 1).

A unique characteristic of the LNM reach is the catastrophic channel change due to the sediment deposition and erosion of the middle Yellow River. The channel width at the Longmen station is only about 130 m, but suddenly extends to about 3,000 m after flowing to the Yumenkou section at the mountain col. This leads to a decrease of flow velocity and sediment transporting capacity and an increase of sediment deposition along the channel. The width of the following channel ranges between 3 and 19 km, due to severe sediment erosion. Thus, small floods would drain in the main channel while large floods would spill over the bank and drain in river floodplains on the two sides of the main channel of the LNM reach. With respect to magnitude, these two types of floods could be defined as normal flood and overbank flood, respectively, and they are interpreted in Figure 2.

### The DFF methodology

The main problem of flood forecasting at the Tongguan station is the channel routing. Thus, observed flow at four upriver hydrologic stations could be used as the input data for the flood forecasting of Tongguan station. This method has been used in operational flood forecasting by the Hydrology Bureau of the Yellow River Conservancy Commission (YRCC). According to the flood characteristics, YRCC simplifies the flood forecasting problem of the LNM reach to three routing calculations: (1) flood routing from the Longmen and Hejin stations (sum of the flows at these two stations) to Tongguan station, with the assumption that the routing effect from Hejin station to the mainstem of the Yellow River could be neglected; (2) flood routing from Zhuangtou station to Tongguan station; and (3) flood routing from Huaxian station to Tongguan station. The sum of these three routing results is taken as the total forecast flow at Tongguan station. This flood forecasting method (including its assumptions) is supported by observations and plays a significant role in the operational forecasting of YRCC (MYRHB-YRCC 2005; Yuan *et al.* 2015a, 2015b, 2016). The hydrographs of the upstream stations (Longmen, Hejin, Zhuangtou and Huaxian) could be forecasted by the Operational Flood Forecasting System of YRCC with different lead times. For a specific flood, the maximum lead time was equal to the duration of the flood event. The forecasted inflow at the upstream stations was used as the input of the channel routing model. Thus, the forecasted outflow at the Tongguan station had the same lead time as the inflow series. For the flood routing calculation of a specific reach, different methods should be used for normal and overbank floods. The routing method may also be adapted to the characteristics of the flood and the channel reach. The methods used for DFF in this study are described below.

#### Muskingum routing methods

In the Muskingum routing equation,

$$O_2 = C_0 I_2 + C_1 I_1 + C_2 O_1$$

*I* and *O* are the inflow and outflow of a reach, with the subscripts 1 and 2 denoting the current and next time steps, respectively; *Δt* is the time interval of the calculation; and *C*_{0}, *C*_{1} and *C*_{2} are coefficients that can be calculated from the parameters *k* and *x*. *Δt* is assumed to be equal to the *k* value to ensure the correctness of the linear finite-difference solution. When applied to a relatively long channel reach, the original method loses accuracy in flood forecasting. Thus, the Muskingum segmentation method was developed to overcome this pitfall (Gill 1978). A long channel reach is divided into *m* segments, within each of which the longitudinal distribution of river flow can be treated as approximately linear, and the routing calculation is then conducted consecutively for these *m* segments to obtain the final outflow. In this consecutive routing calculation, *O*(*t*) is the time series of outflow at the downstream section of the reach; *t* is the time step; the coefficients *C*_{0}, *C*_{1} and *C*_{2} are determined using *k* and *x*, which are assumed to be constant for each segmental reach; and *i* is a counting variable of the segmental reach (*i* = 1, 2, 3, …, *m*).

In a compound channel, the parameters depend on the layered flow *q*_{i}, i.e. *k*_{i} = *f*(*q*_{i}) and *x*_{i} = *f*(*q*_{i}). This variant of the routing calculation is defined as the layered Muskingum method. For a specific compound open channel reach, the total inflow is divided into *p* layers (*p* = 2 for the overbank case in Figure 2). The layered inflow is routed to the downstream section separately, with the subscripts *i* and *t* representing the layer number and the time step, respectively. The final total outflow at the downstream section at time *t* is obtained by summing all layered outflows *O*_{i,t}:

$$O_t = \sum_{i=1}^{p} O_{i,t}$$
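The single-reach recursion and its consecutive application over *m* segments can be illustrated with a minimal sketch. The coefficient formulas below are the standard textbook expressions for *C*_{0}, *C*_{1} and *C*_{2} in terms of *k*, *x* and *Δt*; they are not given in the text above, so they are an assumption here, as are the function names, the steady-state initial condition and the hypothetical inflow series.

```python
def muskingum_coefficients(k, x, dt):
    """Return (C0, C1, C2) for storage constant k, weighting factor x and
    time step dt (k and dt in the same unit). Standard textbook form."""
    denom = k - k * x + 0.5 * dt
    c0 = (0.5 * dt - k * x) / denom
    c1 = (0.5 * dt + k * x) / denom
    c2 = (k - k * x - 0.5 * dt) / denom
    return c0, c1, c2


def route_reach(inflow, k, x, dt):
    """Route an inflow series through one reach: O2 = C0*I2 + C1*I1 + C2*O1."""
    c0, c1, c2 = muskingum_coefficients(k, x, dt)
    outflow = [inflow[0]]  # assumption: initial steady state, outflow = inflow
    for t in range(1, len(inflow)):
        outflow.append(c0 * inflow[t] + c1 * inflow[t - 1] + c2 * outflow[-1])
    return outflow


def route_segments(inflow, k, x, dt, m):
    """Segmentation sketch: apply the single-reach routing m times, the
    outflow of one segment becoming the inflow of the next (k, x constant)."""
    flow = list(inflow)
    for _ in range(m):
        flow = route_reach(flow, k, x, dt)
    return flow


# Hypothetical hourly flood wave (m3/s); dt is set equal to k as in the text
inflow = [500, 500, 1500, 3000, 2200, 1400, 900, 600, 500, 500, 500, 500]
outflow = route_segments(inflow, k=1.0, x=0.2, dt=1.0, m=3)
```

Because the three coefficients sum to one, a constant inflow is passed through unchanged, and with all coefficients positive the routed peak cannot exceed the inflow peak. A layered variant would call `route_reach` once per flow layer with layer-specific `k` and `x` and add the layered outflows.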

#### Empirical method of storage–flow median line

The relationship between the storage *WT* and the water level *H* (*H* ∼ *WT*) could be established according to the channel vertical sections observed before the flood season. Furthermore, the relationship between channel flow *q* and water level *H* (*H* ∼ *q*) at a specific control section is also established. These two relations together yield the basic formula of the storage–flow median line. It should be noted that the method of storage–flow median line rests on two assumptions: (a) the process of an overbank flood is treated as flood routing in a reservoir, i.e. the floodplain is generalized as a complete reservoir unit, with the outflow point at the middle of the downstream section; (b) the relationships between the water level at the reservoir outflow point and the corresponding flow at the upstream/downstream stations could be established and extended to the high-flow interval through Manning's formula.

### The PFF methodology

#### HUP model

In this study, the HUP model was coupled with the channel routing models, whose forecasting results were used as its input. In general, the bias in flow observation is much smaller than that in observed or forecast precipitation. Thus, it is acceptable for the PFF analysis of the LNM reach to take no account of precipitation input uncertainty. The main methodology of HUP is described briefly below.

*H*_{0} is the observation (with errors) at the start time of forecast; the variables *H*_{n} and *S*_{n} (*n* = 1, 2, …, *N*) are the truth that needs to be forecasted and the forecast from the deterministic model, while *h*_{n} and *s*_{n} are their values, respectively; and *N* is the lead time. At any time step *n*, the posterior PDF of *H*_{n} could be deduced according to the Bayesian theorem under the conditions *H*_{0} = *h*_{0} and *S*_{n} = *s*_{n}:

$$\phi_n(h_n \mid s_n, h_0) = \frac{f_n(s_n \mid h_n, h_0)\, g_n(h_n \mid h_0)}{\int_{-\infty}^{\infty} f_n(s_n \mid h_n, h_0)\, g_n(h_n \mid h_0)\, \mathrm{d}h_n} \qquad (6)$$

This equation expresses the posterior PDF of *H*_{n}, *ϕ*_{n}, as the Bayesian combination of the prior PDF *g*_{n} and the likelihood function *f*_{n} with the total probability formula. The functions *g*_{n} and *f*_{n} describe the natural uncertainty of samples as prior information and the hydrologic uncertainty of DFF, respectively. The posterior PDF therefore includes the information from both prior knowledge and samples. Equation (6) shows that the posterior PDF of *H*_{n} is calculated based on the observation *H*_{0} at the start time of forecast and the corresponding forecasted value from DFF, *S*_{n} (at time *n*). Therefore, HUP provides the results of PFF based on the outputs from DFF.

In the transformation to the normal space, *Q* represents the standard normal distribution while *q* is the corresponding PDF; *W*_{n} and *X*_{n} represent the normal quantiles of *H*_{n} and *S*_{n}, respectively, obtained through their marginal distribution functions (*Γ* for *H*_{n}). In the transformation space, the prior distribution and likelihood function for *W*_{n} and *X*_{n} could be estimated, and then the posterior PDF is solved according to Equation (6). The procedure for determining the prior distribution and likelihood function is described below.

*Deduction in the transformation space.* Substituting the prior PDF and likelihood function into Equation (6), one obtains the transformed posterior PDF of *W*_{n}.
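The deduction in the transformation space can be sketched numerically. The sketch below assumes the usual normal-linear structure of HUP: a first-order Markov prior, *W*_{n} given *W*_{0} distributed as N(*c*^{n}*w*_{0}, 1 − *c*^{2n}), and a linear likelihood, *X*_{n} given *W*_{n} and *W*_{0} distributed as N(*a*_{n}*W*_{n} + *d*_{n}*W*_{0} + *b*_{n}, *σ*_{n}^{2}). These forms, the prior coefficient `c`, the function names and the reading of *σ*_{n} as a standard deviation are assumptions, not statements taken from the text. Under them, the posterior of *W*_{n} is normal and follows from the standard Gaussian conjugate update.

```python
import math
from statistics import NormalDist


def hup_posterior(x_n, w_0, n, c, a_n, d_n, b_n, sigma_n):
    """Posterior mean and standard deviation of W_n given X_n = x_n, W_0 = w_0.

    Assumed prior:      W_n | W_0      ~ N(c**n * w_0, 1 - c**(2n))
    Assumed likelihood: X_n | W_n, W_0 ~ N(a_n*W_n + d_n*W_0 + b_n, sigma_n**2)
    """
    t2 = 1.0 - c ** (2 * n)                  # prior variance
    denom = a_n ** 2 * t2 + sigma_n ** 2
    A = a_n * t2 / denom                     # weight of the forecast x_n
    B = -a_n * b_n * t2 / denom              # constant term
    D = (c ** n * sigma_n ** 2 - a_n * d_n * t2) / denom  # weight of w_0
    T = math.sqrt(t2 * sigma_n ** 2 / denom)  # posterior standard deviation
    return A * x_n + D * w_0 + B, T


def posterior_quantile(p, mean, sd):
    """Quantile of the normal posterior in the transformation space."""
    return NormalDist(mean, sd).inv_cdf(p)


# Likelihood parameters for n = 6 h as listed in Table 2; c, x_n and w_0
# are hypothetical illustration values.
mean, sd = hup_posterior(x_n=0.8, w_0=0.5, n=6, c=0.95,
                         a_n=0.8715, d_n=-0.0240, b_n=-0.0001, sigma_n=0.2304)
lower = posterior_quantile(0.05, mean, sd)   # lower end of a 90% interval
median = posterior_quantile(0.50, mean, sd)
upper = posterior_quantile(0.95, mean, sd)
```

The quantiles are still in the transformation space; mapping them back to discharge requires the inverse of the fitted marginal distribution of *H*_{n}, which is not reproduced in this sketch.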

#### PCA–HUP model

As mentioned above, the prior distribution of *W*_{n} and the likelihood function of *X*_{n} are assumed to be linear in the transformation space (Equations (8) and (10)). This explicit linear relationship among the independent variables in the likelihood function would cause multicollinearity in the regression equation. In the HUP model, the parameters of the regression equation are estimated by the least squares method. Because of the multicollinearity, this may make the regression equation unstable and further decrease the forecast accuracy of HUP. In order to eliminate the effects of multicollinearity, the original solving algorithm (least squares method) of the likelihood function in HUP could be replaced with PCA. Because PCA extracts components from a set of variables (here, those relating to the parameter estimates), it could be used for data compression and statistical analysis of the likelihood function instead of the original least squares method in HUP. The general procedure of PCA is described below.

In the PCA transformation, *L*_{g} = (*L*_{g1}, *L*_{g2}, …, *L*_{gp}) are the characteristic vectors, *g* = 1, 2, …, *p*; *F*_{1} is the first principal component; *F*_{2} is the second principal component; and *F*_{p} is the *p*th principal component. Generally, the number of principal components, *n*, could be determined using the variance contribution rate.
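How the variance contribution rate selects the number of components can be illustrated for the simplest collinear case, two correlated predictors such as *W*_{n} and *W*_{0}. The sketch below is an assumption-laden illustration rather than the authors' algorithm: it works on the 2×2 correlation matrix, whose eigenvalues are 1 + |*r*| and 1 − |*r*| in closed form, and the 85% cumulative threshold, the function names and the sample values are hypothetical.

```python
import math


def correlation(u, v):
    """Pearson correlation coefficient of two equally long series."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def pca_two_predictors(u, v):
    """PCA of the correlation matrix [[1, r], [r, 1]] of two predictors.
    Returns eigenvalues in descending order and their unit eigenvectors."""
    r = correlation(u, v)
    s = 1.0 if r >= 0 else -1.0
    inv = 1.0 / math.sqrt(2.0)
    eigenvalues = [1.0 + abs(r), 1.0 - abs(r)]
    eigenvectors = [(inv, s * inv), (inv, -s * inv)]
    return eigenvalues, eigenvectors


def n_components(eigenvalues, threshold=0.85):
    """Smallest number of components whose cumulative variance contribution
    rate reaches the threshold (0.85 is an assumed, commonly used value)."""
    total = sum(eigenvalues)
    cumulative = 0.0
    for count, lam in enumerate(eigenvalues, start=1):
        cumulative += lam / total
        if cumulative >= threshold:
            return count
    return len(eigenvalues)


# Hypothetical, strongly collinear normal quantiles W_n and W_0
w_n = [-1.2, -0.5, 0.1, 0.6, 1.4]
w_0 = [-1.0, -0.6, 0.0, 0.7, 1.3]
eigenvalues, eigenvectors = pca_two_predictors(w_n, w_0)
kept = n_components(eigenvalues)
```

With two nearly collinear predictors almost all variance loads on the first component, so a single component is retained and the regression is no longer fitted on two redundant variables.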

## RESULTS AND DISCUSSION

### Model application in DFF

#### Model setup in the LNM reach

In the DFF of the LNM reach, flood events are processed with different routing models depending on the flood type (normal or overbank). The total flood calculation at the Tongguan station proceeds as follows. For the normal floods, the Muskingum segmentation method was applied to all three channel reaches (flow at Hejin station is added to Longmen station directly). According to the analysis of flood observations, overbank floods need only be considered for the Longmen–Tongguan reach and the Huaxian–Tongguan reach. The routing calculations for overbank floods in these two reaches are conducted through the empirical method of storage–flow median line and the layered Muskingum method, respectively.

For overbank floods in the Longmen–Tongguan reach, the entire channel reach is divided into four sub-reaches according to the location of water level stations (Figure 3). The method of storage–flow median line is applied to each sub-reach. In the first sub-reach, the inflow is the sum of flows at Longmen and Hejin stations, and its outflow is the inflow of the second sub-reach. The subsequent sub-reaches are treated in the same way, and the last one outputs the hydrograph at the Tongguan station.

At the Tongguan station, a total of 43 large flood events (with peaks of >3,000 m^{3}/s) that occurred in the flood seasons between 1981 and 2007 were used in this study (see the descriptions in the next sub-section ‘Model calibration and validation’). Among these events, 24 and 19 floods were mainly generated by the flow from Huaxian station and Longmen station, respectively. There are 15 normal floods and nine overbank floods from the Huaxian station, and the numbers of these two flood types from Longmen station are 14 and 5, respectively. The time step of the flood forecasting calculation is 1 hour. In DFF, three indices were used to evaluate the model performance: (a) Nash–Sutcliffe coefficient (*NS*, *NS* = 1 for a perfect model match); (b) relative error of peak flow (*RE*_{peak}, %); and (c) bias of peak occurrence time (*BIAS*_{POT}, hour).
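The three indices can be computed from an observed and a forecast hydrograph as in the minimal sketch below. The sign conventions (forecast minus observation, so that a negative *RE*_{peak} means an underestimated peak and a negative *BIAS*_{POT} an early peak), the function names and the sample hydrographs are assumptions made for illustration.

```python
def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe coefficient; 1 means a perfect match."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def relative_peak_error(obs, sim):
    """Relative error of peak flow in percent (assumed: forecast minus observed)."""
    return 100.0 * (max(sim) - max(obs)) / max(obs)


def peak_time_bias(obs, sim, dt_hours=1.0):
    """Bias of peak occurrence time in hours (assumed: forecast minus observed)."""
    return (sim.index(max(sim)) - obs.index(max(obs))) * dt_hours


# Hypothetical hourly hydrographs (m3/s)
obs = [500, 1200, 3200, 2500, 1500, 900]
sim = [520, 1000, 2600, 2900, 1600, 950]
ns = nash_sutcliffe(obs, sim)
re_peak = relative_peak_error(obs, sim)   # (2900 - 3200) / 3200 = -9.375 %
bias_pot = peak_time_bias(obs, sim)       # forecast peak one step late: 1.0 h
```

For this example the peak error lies within the ±20% range quoted from GB/T 22482-2008 in the calibration discussion.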

#### Model calibration and validation

In this paper, the model parameters were calibrated manually, based on experience with model application. Three objective functions (*NS*, *RE*_{peak}, and *BIAS*_{POT}) were used to find the optimal parameter set. Figure 4 shows the comparison between the hydrographs from the Muskingum segmentation method and the observations for the normal floods. In the calibration, the range of *RE*_{peak} of floods from Huaxian station was between −33 and 11% with an average absolute bias of 9%, while the corresponding range for floods from Longmen station was between −6 and 10% with an average absolute value of 5%. This suggested that most flood events had a qualified bias of peak value (±20%). In China, [−20%, +20%] are the permissible error thresholds for evaluating the model accuracy in hydrological forecasting as stated in the official standard (*Standard for hydrological information and hydrological forecasting* (*P.R.China*), GB/T 22482-2008, released on Nov. 04, 2008). For the index of *BIAS*_{POT}, the range for floods from Huaxian station was between −1 and 24 h with an average absolute value of 6.3 h, and the *BIAS*_{POT} range for floods from Longmen station was between −4 and 2 h with an average absolute value of 3.2 h. Thus, the results show that, as a whole, the Muskingum segmentation method has a favorable performance in the calibration of DFF of normal floods. The average *NS* values were 0.95 and 0.98 for the floods from Huaxian and Longmen stations, respectively. Such high *NS* values are evidence of the good match between the DFF results and the observations. In the validation, the range of *RE*_{peak} of floods from Huaxian station was between −30 and 15% with an average absolute value of 18%, and the corresponding range for floods from Longmen station was between −5 and −1% with an average absolute value of 3%. The range of *BIAS*_{POT} for floods from Huaxian station was between −20 and 26 h with an average absolute bias of 14 h, while the corresponding range and average absolute value for floods from Longmen station were [−21, −2] and 7.8 h, respectively. The average *NS* values were 0.95 and 0.86 for floods from Huaxian and Longmen stations, respectively. Although an average error of 14 hours in the flood peak time seems unsafe for the public, flood control decision making depends on the integrated assessment of forecasting accuracy (such as *NS*, *RE*_{peak}, and *BIAS*_{POT} in this study). The integrated accuracy of the DFF results was found to be acceptable. Moreover, flood forecasting in the study reach has been a troublesome task for many years, and the accuracy of the current operational forecasting system of the study reach is relatively poor. Consequently, despite the relatively large value of *BIAS*_{POT} in validation, the results still suggested the applicability of the Muskingum segmentation method in DFF for normal floods.

For the application of the layered Muskingum method, six and three overbank floods from the Huaxian station were used for calibration and validation, respectively, as shown in Figure 5(a). Figure 5(b) shows the DFF results of the method of storage–flow median line for the overbank floods from the Longmen station. In the calibration, the range of *RE*_{peak} for floods from Huaxian station was between −31 and 3% with an average absolute value of 16%, and the corresponding range and average absolute value for Longmen station were [−1, 19%] and 11%, respectively. For the index of *BIAS*_{POT}, the ranges for overbank floods from Huaxian and Longmen stations were [−18, 9] and [−2, 4] h, respectively, while their average absolute values were 7.3 and 2 h, respectively. The average *NS* values were 0.93 and 0.95 for floods from Huaxian and Longmen stations, respectively, showing good agreement between the DFF results and observations. There was an obvious underestimation of higher flood events in Figure 5(a). A possible reason is the larger bias in observed high flow than in observed low and medium flow in the middle reaches of the Yellow River. The Yellow River has some of the most serious channel erosion of any river in the world, and the middle reaches are the main source of its sediment load. The channel erosion is more serious for large floods than for normal floods, which produces larger uncertainty in the observed data of high flow than of normal floods. In addition, in regions with an infiltration-excess runoff mechanism, the flood forecasting problem is very complex, especially for large floods. This may also be a reason for the underestimation in calibration. As a whole, the results proved the practicability of the layered Muskingum method in DFF of overbank floods in the Huaxian–Tongguan reach. In the validation, the range of *RE*_{peak} of floods from Huaxian station was between 2 and 6% with an average absolute value of 4%, while the range for Longmen station was between −7 and −4% with an average absolute value of 6%. The DFF results for overbank floods therefore satisfy the accuracy requirement (<20%). For the index of *BIAS*_{POT}, the range for Huaxian station was between −2 and 9 h with an average absolute value of 4.3 h, while the corresponding range and average absolute value for Longmen station were [0, 1] and 0.5 h, respectively. The average *NS* values were 0.99 and 0.97 for Huaxian and Longmen stations, respectively. The results suggested that this empirical method based on the reservoir flood routing calculation could be used in operational DFF.

### Parameter estimation of PCA-HUP

The same datasets of flood events for calibration as in the DFF analysis were used for the parameter estimation of PCA-HUP. The marginal distribution function *Γ*_{n} of *H*_{n} could be obtained based on observed flow data {*h*_{n}} at the Tongguan station. As described above, *Γ*_{n} was estimated through the PCA algorithm within PCA-HUP based on the theoretical log-Weibull distribution and the empirical distribution of *H*_{n}. PCA-HUP can provide probabilistic forecasting results for different lead times. In this paper, the parameter estimates for lead times *n* = 1, 6 and 10 hours are shown as examples for comparison. The PFF results for the corresponding lead times of DFF are also presented, and the *Q*_{50%} hydrographs from PFF were selected for comparison with the DFF results. The estimates of PCA-HUP parameters are listed in Table 1. The fits of the empirical distribution and the theoretical log-Weibull distribution for *H*_{0}, *H*_{n} and *S*_{n} at the lead times *n* = 1, 6 and 10 hours are plotted in Figure 6. According to the Kolmogorov–Smirnov test, the log-Weibull distribution fits the observed frequencies well, which indicates that the selection of the log-Weibull distribution and the parameter estimates are reasonable.

Table 1: Estimates of PCA-HUP parameters (marginal distributions of *H*_{0}, *H*_{n} and *S*_{n})

| *n* (h) | *H*_{0}: α | *H*_{0}: β | *H*_{0}: ς | *H*_{n}: α | *H*_{n}: β | *H*_{n}: ς | *S*_{n}: α | *S*_{n}: β | *S*_{n}: ς |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1.91 | 6.16 | 4.68 | 2.08 | 5.99 | 14.37 | .76 | 5.24 | 12.65 |
| 6 | 1.19 | 6.88 | 16.36 | 2.37 | 5.70 | 14.08 | 3.37 | 4.65 | 11.42 |
| 10 | 0.50 | 7.58 | 18.01 | 2.64 | 5.44 | 13.60 | 3.64 | 4.39 | 10.81 |


The observed flow series, *H*, is transformed to the normally distributed flow series, *W*. The correlation analysis between the adjacent value series of *H* and *W*, i.e. *H*_{1} ∼ *H*_{0} and *W*_{1} ∼ *W*_{0}, is shown in Figure 7. The high correlation coefficients of R^{2} = 0.98 for both *H*_{1} ∼ *H*_{0} and *W*_{1} ∼ *W*_{0} at *n* = 1 hour show the clear characteristic of a Markov process. The correlation coefficients were R^{2} = 0.76 and R^{2} = 0.77 for *H*_{6} ∼ *H*_{0} and *W*_{6} ∼ *W*_{0} at *n* = 6 hours, respectively, and R^{2} = 0.55 and R^{2} = 0.57 for *H*_{10} ∼ *H*_{0} and *W*_{10} ∼ *W*_{0} at *n* = 10 hours, respectively. The correlation coefficients clearly decayed with increasing lead time. However, all correlation coefficients were in acceptable ranges, which supports the reasonableness of the normal-linear assumption in the estimation of the prior distribution.
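The transformation and the lagged correlation check can be sketched as follows. The paper transforms flows through the fitted log-Weibull marginal distribution; the sketch below substitutes an empirical distribution with the plotting position rank/(n + 1), which is an assumption made to keep the example self-contained, and the flow values and function names are hypothetical. Ties are not treated specially.

```python
import math
from statistics import NormalDist


def normal_quantile_transform(values):
    """Empirical normal quantile transform: rank -> plotting position
    rank/(n + 1) -> standard normal quantile (ties keep their input order)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    w = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        w[idx] = NormalDist().inv_cdf(rank / (n + 1))
    return w


def lag_correlation(series, lag):
    """Pearson correlation between series[t] and series[t + lag], lag >= 1."""
    a, b = series[:-lag], series[lag:]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    saa = sum((p - ma) ** 2 for p in a)
    sbb = sum((q - mb) ** 2 for q in b)
    return sab / math.sqrt(saa * sbb)


# Hypothetical hourly flows at the outlet station (m3/s)
flows = [800, 950, 1400, 2300, 3100, 2900, 2400, 1900, 1500, 1200, 1000, 900]
w = normal_quantile_transform(flows)
r_lag1 = lag_correlation(w, 1)   # correlation of W_1 ~ W_0
r_lag6 = lag_correlation(w, 6)   # correlation of W_6 ~ W_0
```

Squaring these coefficients gives values comparable in kind to the R^{2} figures quoted above, although the numbers from this toy series carry no meaning for the study reach.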

The PCA algorithm was applied to solve the linear relationship between *X*_{n} and *W*_{n} in Equation (10). The parameter estimates for the lead times of 1, 6 and 10 hours are listed in Table 2. As an example, the three-dimensional correlation analysis among *W*_{0}, *W*_{n} and *X*_{n} at *n* = 1, 6 and 10 hours is plotted in Figure 8. The linear relationships among the scattered points clearly weakened with increasing lead time. Nevertheless, a linear correlation could be seen by visual inspection, which to a certain degree supports the linear assumption of Equation (10) that has been accepted by many previous studies.

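Table 2: Parameter estimates of the likelihood function (Equation (10)) for lead times of 1, 6 and 10 hours

| *n* (h) | *d*_{n} | *a*_{n} | *b*_{n} | *σ*_{n} |
|---|---|---|---|---|
| 1 | 0.0594 | 0.7937 | 0.0000 | 0.0160 |
| 6 | −0.0240 | 0.8715 | −0.0001 | 0.2304 |
| 10 | −0.0585 | 0.8946 | −0.0002 | 0.4302 |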


### Analysis of PFF

The uncertainty bounds of PFF were evaluated with two indices, the containing ratio (*CR*) and relative band-width (*RB*) (Xiong *et al.* 2009; Li *et al.* 2017). The formulae for *CR* and *RB* are expressed as:

$$CR = \frac{n_c}{n} \times 100\%$$

$$RB = \frac{1}{n} \sum_{i=1}^{n} \frac{b_i}{O_i}$$

where *n*_{c} is the number of observed discharges at hourly time step enveloped by the 90% confidence interval; *n* is the number of all the observed discharges at hourly time step; *b*_{i} is the band-width of the prediction bound at time *i*; and *O*_{i} is the corresponding observed discharge.
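A minimal sketch of the two indices follows, assuming that *CR* is the percentage of observations falling inside the prediction bound and that *RB* is the mean of *b*_{i}/*O*_{i}, as the symbol definitions suggest. The function names and the sample bounds are hypothetical.

```python
def containing_ratio(obs, lower, upper):
    """CR in percent: share of observations enveloped by the prediction bound."""
    n_c = sum(1 for o, lo, hi in zip(obs, lower, upper) if lo <= o <= hi)
    return 100.0 * n_c / len(obs)


def relative_bandwidth(obs, lower, upper):
    """RB: mean ratio of band-width (upper - lower) to observed discharge."""
    ratios = [(hi - lo) / o for o, lo, hi in zip(obs, lower, upper)]
    return sum(ratios) / len(ratios)


# Hypothetical hourly observations and 90% prediction bounds (m3/s)
obs_q = [1000, 2000, 3000, 2000]
lower_q = [900, 1800, 3100, 1700]
upper_q = [1100, 2300, 3500, 2200]
cr = containing_ratio(obs_q, lower_q, upper_q)      # 3 of 4 enveloped: 75.0
rb = relative_bandwidth(obs_q, lower_q, upper_q)
```

A high *CR* is only informative together with a small *RB*: a very wide bound envelops every observation but says little about the flow.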

The PCA-HUP model processes the DFF flow results of the 43 floods and outputs the posterior PDFs of flow at each time step; the datasets of flood events for calibration and validation are the same as in the DFF analysis. In this paper, for a specific flood, the 90% confidence interval is used as the forecast uncertainty bound, and the hydrograph of the 50% quantile (same as the expected value) is taken as the deterministic forecast from PFF (*Q*_{50%}), which could be compared with the DFF result. The comparisons between DFF and PFF (*Q*_{50%}) are listed in Table 3.

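Table 3: Comparison between DFF and PFF (*Q*_{50%}) results. *A*_{RE}, *A*_{BIAS} and *A*_{NS} are the averages of |*RE*_{peak}| (%), |*BIAS*_{POT}| (h) and *NS*; hxn and lmn denote normal floods from Huaxian and Longmen stations, and hxo and lmo the corresponding overbank floods.

| Forecast | Flood events | Calibration *A*_{RE} | Calibration *A*_{BIAS} | Calibration *A*_{NS} | Validation *A*_{RE} | Validation *A*_{BIAS} | Validation *A*_{NS} |
|---|---|---|---|---|---|---|---|
| DFF | hxn | 9 | 6.3 | 0.95 | 18 | 14.0 | 0.95 |
| DFF | lmn | 5 | 3.2 | 0.98 | 3 | 7.8 | 0.86 |
| DFF | hxo | 16 | 7.3 | 0.93 | 4 | 4.3 | 0.99 |
| DFF | lmo | 11 | 2.0 | 0.95 | 6 | 0.5 | 0.97 |
| PFF | hxn | 8 | 6.1 | 0.96 | 10 | 16.8 | 0.97 |
| PFF | lmn | 15 | 2.2 | 0.97 | 14 | 6.8 | 0.90 |
| PFF | hxo | 13 | 15.0 | 0.95 | 4 | 3.7 | 0.99 |
| PFF | lmo | 9 | 3.3 | 0.95 | 17 | 1.0 | 0.97 |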


Table 3 and Figure 9 show the PFF results of 29 normal floods from both Huaxian and Longmen stations with lead time of the duration of each event. In the calibration, the range of *CR* was between 45 and 100% with an average value of 92% for normal floods from the Huaxian station. The range of *CR* for normal floods from Longmen station was between 88 and 100% with an average value of 96%. For the measure of *RE*_{peak}, the range of the *Q*_{50%} hydrographs in PFF was between −26 and 5% with an average absolute bias of 8% for normal floods from the Huaxian station. The range of *RE*_{peak} for normal floods from Longmen station was between −26 and −6% with an average absolute bias of 15%. Results show that the *Q*_{50%} hydrograph in PFF had a higher accuracy of *RE*_{peak} than DFF for normal floods from the Huaxian station but a lower accuracy for Longmen station. For the measure of *BIAS*_{POT}, the range of the *Q*_{50%} hydrographs in PFF was between 0 and 28 h with an average absolute bias of 6.1 h for normal floods from the Huaxian station. The range of *BIAS*_{POT} for normal floods from Longmen station was between −1 and 3 h with an average absolute bias of 2.2 h. Results show that the *Q*_{50%} hydrograph in PFF had nearly the same accuracy of *BIAS*_{POT} as DFF for normal floods from the Huaxian station but a higher accuracy for Longmen station. The average absolute *NS* of these two cases (i.e. normal floods from the Huaxian and Longmen stations) were 0.96 and 0.97, respectively, which shows nearly the same accuracy as the DFF results. Thus, the results show that the PCA-HUP model had a favorable performance in the calibration of PFF of normal floods.

In the validation, the range of *CR* was between 97 and 100% with an average value of 99% for normal floods from the Huaxian station. The range of *CR* for normal floods from Longmen station was between 54 and 99% with an average value of 83%. For the measure of *RE*_{peak}, the range of the *Q*_{50%} hydrographs in PFF was between −16 and 14% with an average absolute bias of 10% for normal floods from the Huaxian station. The range of *RE*_{peak} for normal floods from Longmen station was between −25 and −7% with an average absolute bias of 14%. Results show that the *Q*_{50%} hydrograph in the calibration and validation had a higher accuracy of *RE*_{peak} than DFF for normal floods from the Huaxian station but lower than the corresponding value of the Longmen station. For the measure of *BIAS*_{POT}, the range of the *Q*_{50%} hydrographs in PFF was between −29 and −10 h with an average absolute bias of 16.8 h for normal floods from the Huaxian station. The range of *BIAS*_{POT} for normal floods from Longmen station was between −21 and 0 h with an average absolute bias of 6.8 h. Results show that the *Q*_{50%} hydrograph in the validation had a slightly lower accuracy of *BIAS*_{POT} than DFF for normal floods from the Huaxian station but a slightly higher accuracy for Longmen station. The average absolute *NS* of these two cases (i.e. normal floods from the Huaxian and Longmen stations) were 0.97 and 0.90, respectively, which were slightly higher than the corresponding values of DFF. As a whole, *Q*_{50%} hydrographs in PFF provided slightly better deterministic forecasts than DFF for normal floods.

Table 3 and Figure 10 show the PFF results of 14 overbank floods from both Huaxian and Longmen stations with lead time of the duration of each event. In the calibration, the range of *CR* was between 62 and 100% with an average value of 87% for overbank floods from the Huaxian station. The range of *CR* for overbank floods from Longmen station was between 85 and 92% with an average value of 88%. For the measure of *RE*_{peak}, the range of the *Q*_{50%} hydrographs in PFF was between −26 and −4% with an average absolute bias of 13% for overbank floods from the Huaxian station. The range of *RE*_{peak} for overbank floods from Longmen station was between −11 and 5% with an average absolute bias of 9%. The *Q*_{50%} hydrographs in PFF also show better accuracy in the measure of *RE*_{peak} (narrower varying range) than DFF for the overbank floods from the Huaxian and Longmen stations. For the measure of *BIAS*_{POT}, the range of the *Q*_{50%} hydrographs in PFF was between −29 and 18 h with an average absolute bias of 15 h for overbank floods from the Huaxian station. The range of *BIAS*_{POT} for overbank floods from Longmen station was between −2 and 6 h with an average absolute bias of 3.3 h. Results show that the *Q*_{50%} hydrograph in the calibration had a poorer accuracy of *BIAS*_{POT} than DFF for floods from Huaxian and Longmen stations. In addition, the average *NS* was 0.95 for PFF of overbank floods at both Huaxian and Longmen stations. The accuracy comparison between PFF and DFF was similar to the case of normal floods.

In the validation, the range of *CR* was between 97 and 100% with an average value of 99% for overbank floods from the Huaxian station. The range of *CR* for overbank floods from Longmen station was between 95 and 100% with an average value of 98%. For the measure of *RE*_{peak}, the range of the *Q*_{50%} hydrographs in PFF was between −6 and 5% with an average absolute bias of 4% for overbank floods from the Huaxian station. The range of *RE*_{peak} for overbank floods from Longmen station was between −21 and −13% with an average absolute bias of 17%. Results show that the *Q*_{50%} hydrograph in the validation had a lower accuracy of *RE*_{peak} than DFF. For the measure of *BIAS*_{POT}, the range of the *Q*_{50%} hydrographs in PFF was between −3 and 7 h with an average absolute bias of 3.7 h for overbank floods from the Huaxian station. The range of *BIAS*_{POT} for overbank floods from Longmen station was between 0 and 2 h with an average absolute bias of 1 h. Results show that the *Q*_{50%} hydrograph in the validation had nearly the same accuracy of *BIAS*_{POT} as DFF. The average *NS* values of these two cases (i.e. overbank floods from the Huaxian and Longmen stations) were 0.99 and 0.97, respectively, which leads to the same conclusion. As a whole, it could be concluded that the *Q*_{50%} hydrograph in PFF could provide a good accuracy of deterministic forecast results for both normal and overbank floods.
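The three deterministic indices used in the comparison above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: *RE*_{peak} is taken as the relative error of the peak flow in percent, *BIAS*_{POT} as the forecast peak time minus the observed peak time in hours, and *NS* as the Nash-Sutcliffe efficiency. The function names, the sign conventions and the toy hydrographs are hypothetical.

```python
import numpy as np

def re_peak(obs, sim):
    """Relative error of peak flow in percent (negative = peak underestimated)."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 100.0 * (sim.max() - obs.max()) / obs.max()

def bias_pot(obs, sim, dt_hours=1.0):
    """Forecast peak time minus observed peak time, in hours (assumed sign)."""
    return (int(np.argmax(sim)) - int(np.argmax(obs))) * dt_hours

def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

# Toy hourly hydrographs, for illustration only.
obs = np.array([1.0, 3.0, 5.0, 3.0, 1.0])
sim = np.array([1.0, 2.0, 4.0, 4.5, 1.0])

print(re_peak(obs, sim), bias_pot(obs, sim), nash_sutcliffe(obs, sim))
```

Applied to the *Q*_{50%} hydrograph and to the DFF hydrograph of the same event, these functions give the per-event values whose ranges and averages are summarized in the text.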

### Model performance decay with lead time increasing

In this paper, the DFF and PFF results with lead times *n* = 1, 6 and 10 hours were calculated to analyze the performance decay with increasing lead time. Table 4 shows the statistical results of DFF of 43 floods with lead times *n* = 1, 6 and 10 hours. The range of *RE*_{peak} was between −5 and 1% with an average absolute value of 1% for *n* = 1 h, between −22 and 8% with an average absolute value of 6% for *n* = 6 h, and between −28 and 8% with an average absolute value of 9% for *n* = 10 h. For the measure of *BIAS*_{POT}, the range for *n* = 1 h was [−10, 2] h with an average absolute bias of 1.3 h, and the ranges for *n* = 6 and 10 hours were [−24, 56] and [−20, 45] h with average absolute biases of 7.0 and 8.5 h, respectively. In addition, the range of *NS* was between 0.73 and 0.99 with an average value of 0.98 for *n* = 1 h, between 0.65 and 0.99 with an average value of 0.96 for *n* = 6 h, and between 0.60 and 0.99 with an average value of 0.95 for *n* = 10 h. The accuracies of all three indices decrease as the lead time increases. It can therefore be concluded that the performance of DFF decays with increasing lead time.

| *n* (h) | *R*_{RE} (%) | *A*_{RE} (%) | *R*_{BIAS} (h) | *A*_{BIAS} (h) | *R*_{NS} | *A*_{NS} |
|---|---|---|---|---|---|---|
| 1 | [−5,1] | 1 | [−10,2] | 1.3 | [0.73,0.99] | 0.98 |
| 6 | [−22,8] | 6 | [−24,56] | 7.0 | [0.65,0.92] | 0.96 |
| 10 | [−28,8] | 9 | [−20,45] | 8.5 | [0.60,0.95] | 0.95 |


In addition, Table 5 shows the statistical results of PFF of 43 floods with lead times *n* = 1, 6 and 10 hours. The range of *RB* was between 0.12 and 0.29 with an average value of 0.2 for *n* = 1 h, the corresponding range for *n* = 6 h was between 0.33 and 0.79 with an average value of 0.59 and the corresponding range for *n* = 10 h was between 0.37 and 1.15 with an average value of 0.68. Results showed that the accuracy of *RB* decreased with the lead time increasing. For the measure of *CR*, the range was [67, 100%] with an average value of 94% for *n* = 1 h, the corresponding ranges for *n* = 6 and 10 hours were [55, 100%] and [50, 100%] with average values of 91 and 90%, respectively. It can be seen that the *CR* decreased with the lead time increasing. For the measure of *RE*_{peak}, the range of *n* = 1 h was between −5 and 1% with an average absolute bias of 1%, the corresponding range of *n* = 6 h was between −25 and 8% with an average absolute bias of 6% and the corresponding range of lead time *n* = 10 h was between −29 and 8% with an average absolute bias of 10%. The range of *BIAS*_{POT} for *n* = 1 h was [−10, 2] h with an average absolute bias of 1.5 h, the corresponding ranges for *n* = 6 and 10 hours were [−25, 58] and [−21, 46] h with an average absolute bias of 7.4 and 9.8 h, respectively. In addition, for the measure of *NS*, the range was between 0.73 and 0.99 with an average value of 0.98 for *n* = 1 h, the corresponding range for *n* = 6 h was between 0.68 and 0.99 with an average value of 0.97 and the corresponding range for *n* = 10 h was between 0.61 and 0.99 with an average value of 0.96. Results showed that the accuracies of these three indices decreased with the lead time increasing. As a whole, the performance of PFF decayed with the lead time increasing.

| *n* (h) | *R*_{RB} | *A*_{RB} | *R*_{CR} (%) | *A*_{CR} (%) | *R*_{RE} (%) | *A*_{RE} (%) | *R*_{BIAS} (h) | *A*_{BIAS} (h) | *R*_{NS} | *A*_{NS} |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | [0.12,0.29] | 0.20 | [67,100] | 94 | [−5,1] | 1 | [−10,2] | 1.5 | [0.73,0.99] | 0.98 |
| 6 | [0.33,0.79] | 0.59 | [55,100] | 91 | [−25,8] | 6 | [−25,58] | 7.4 | [0.68,0.92] | 0.97 |
| 10 | [0.37,1.15] | 0.68 | [50,100] | 90 | [−29,8] | 10 | [−21,46] | 9.8 | [0.61,0.95] | 0.96 |

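
The two probabilistic indices reported in Table 5 can be sketched as follows. The snippet reads *CR* as the percentage of observations that fall inside the uncertainty bound and *RB* as the mean width of the bound relative to the observed flow. Both readings are assumptions made for illustration, and the definitions given earlier in the paper take precedence. The function names and the toy data are hypothetical.

```python
import numpy as np

def containing_ratio(obs, lower, upper):
    """Percentage of observations lying within [lower, upper] (assumed CR)."""
    obs, lower, upper = (np.asarray(a, float) for a in (obs, lower, upper))
    inside = (obs >= lower) & (obs <= upper)
    return 100.0 * inside.mean()

def relative_bandwidth(obs, lower, upper):
    """Mean bound width divided by the observed flow (assumed RB)."""
    obs, lower, upper = (np.asarray(a, float) for a in (obs, lower, upper))
    return float(np.mean((upper - lower) / obs))

# Toy bound for four time steps; the third observation falls outside it.
obs = np.array([10.0, 20.0, 30.0, 40.0])
lower = np.array([8.0, 18.0, 31.0, 35.0])
upper = np.array([12.0, 22.0, 35.0, 45.0])

cr = containing_ratio(obs, lower, upper)
rb = relative_bandwidth(obs, lower, upper)
```

Under these readings a good probabilistic forecast combines a high *CR* with a small *RB*, which is why the two indices are reported side by side.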

Graphs of the indices involved in both DFF and PFF are shown in Figure 11. The accuracies of all three indices decreased with increasing lead time, which confirms that the performances of both DFF and PFF decayed as the lead time increased. Moreover, it was found that the accuracies of *Q*_{50%} were slightly lower than those of DFF in *RE*_{peak} and *BIAS*_{POT}, while *Q*_{50%} had a slightly higher accuracy than DFF in *NS*. As a whole, PFF could provide deterministic forecasting results with good accuracy.
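The decay can also be checked directly from the average values in Tables 4 and 5. The short Python sketch below copies those averages and tests whether each index worsens at every step from *n* = 1 to 6 to 10 h; the helper `decays` and the dictionary layout are hypothetical names introduced only for this illustration.

```python
import numpy as np

# Average indices copied from Table 4 (DFF) and Table 5 (PFF, Q50% hydrograph),
# ordered by lead time n = 1, 6 and 10 h.
dff = {"RE_peak": [1, 6, 9], "BIAS_POT": [1.3, 7.0, 8.5], "NS": [0.98, 0.96, 0.95]}
pff = {"RE_peak": [1, 6, 10], "BIAS_POT": [1.5, 7.4, 9.8], "NS": [0.98, 0.97, 0.96]}

def decays(values, higher_is_better):
    """True if the index worsens strictly with each increase in lead time."""
    steps = np.diff(np.asarray(values, float))
    return bool(np.all(steps < 0)) if higher_is_better else bool(np.all(steps > 0))

for name, table in (("DFF", dff), ("PFF", pff)):
    for index, values in table.items():
        # NS is better when larger; the two error measures are better when smaller.
        print(name, index, decays(values, higher_is_better=(index == "NS")))
```

The check uses only the tabulated averages, so it says nothing about individual events, whose ranges widen with lead time as reported in the tables.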

To explore the relationship between *CR* and *RB*, the values of these two indices under different confidence intervals are listed in Table 6. As the confidence intervals narrowed, the *CR* and *RB* at all lead times decreased. Figure 12 shows that *RB* had a strong linear correlation with *CR*; that is, *RB* increased as *CR* increased. Moreover, the regression lines of *RB* moved down with increasing lead time, which also indicated that the performance of PFF decayed with increasing lead time.

| Confidence interval (%) | *n* = 1 h: CR (%) | *n* = 1 h: RB | *n* = 6 h: CR (%) | *n* = 6 h: RB | *n* = 10 h: CR (%) | *n* = 10 h: RB |
|---|---|---|---|---|---|---|
| 90 | 100 | 0.15 | 100 | 0.42 | 100 | 0.45 |
| 80 | 100 | 0.12 | 97 | 0.33 | 95 | 0.35 |
| 70 | 97 | 0.10 | 95 | 0.27 | 89 | 0.29 |
| 60 | 88 | 0.08 | 84 | 0.22 | 82 | 0.23 |
| 50 | 76 | 0.06 | 76 | 0.17 | 63 | 0.19 |
| 40 | 63 | 0.05 | 61 | 0.13 | 51 | 0.15 |


The measures were calculated for all the 43 flood events.
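
The linear relationship between *CR* and *RB* can be explored with the values in Table 6 alone. The Python sketch below fits a least-squares line *RB* = slope × *CR* + intercept for each lead time and reports the Pearson correlation coefficient. It is only an illustration: the regression lines in Figure 12 may have been fitted to different data, and the variable names are hypothetical.

```python
import numpy as np

# CR (%) and RB from Table 6, for confidence intervals 90, 80, 70, 60, 50 and 40%.
table6 = {
    1: ([100, 100, 97, 88, 76, 63], [0.15, 0.12, 0.10, 0.08, 0.06, 0.05]),
    6: ([100, 97, 95, 84, 76, 61], [0.42, 0.33, 0.27, 0.22, 0.17, 0.13]),
    10: ([100, 95, 89, 82, 63, 51], [0.45, 0.35, 0.29, 0.23, 0.19, 0.15]),
}

fits = {}
for n, (cr, rb) in table6.items():
    cr, rb = np.asarray(cr, float), np.asarray(rb, float)
    slope, intercept = np.polyfit(cr, rb, deg=1)  # least-squares straight line
    r = np.corrcoef(cr, rb)[0, 1]                 # Pearson correlation coefficient
    fits[n] = (slope, intercept, r)
    print(f"n = {n:2d} h: RB = {slope:.4f} * CR + {intercept:.3f}, r = {r:.2f}")
```

A positive slope for every lead time corresponds to the statement that *RB* increases with *CR*: a wider bound contains more observations, so coverage and sharpness have to be traded off when a confidence level is chosen.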

## CONCLUSIONS

In this paper, the performance of the developed PCA-HUP model was tested based on the DFF results in LNM of the middle reach of Yellow River, China. Several deterministic routing approaches used in the operational flood forecasting, including the Muskingum methods and the empirical method of storage–flow median line, were applied to 43 large floods including both normal and overbank floods. The DFF outputs were considered as the input of PFF (i.e. PCA-HUP model). The deterministic hydrograph from PFF was compared to DFF for evaluating the accuracy in large floods. In addition, the model performances of DFF and PFF with lead time increasing were also investigated. The main conclusions of this study are as follows:

1. The practical routing methods were used to provide the DFF input for PCA-HUP. Three indices (*NS*, *RE*_{peak} and *BIAS*_{POT}) were used to evaluate the model performance. The average *NS* values were ≥0.86 and ≥0.97 for normal and overbank floods, respectively. Results proved the applicability of the DFF method in the study river reach.
2. Within the ‘model-free’ framework of BFS, PCA-HUP provides the posterior PDFs of forecasted flow based on the DFF results from the practical routing methods. Thus, both uncertainty information (e.g. 90% confidence interval) and deterministic forecasts (e.g. *Q*_{50%} hydrograph) could be obtained from the PDFs of forecasted flow. The average value of CR of observations within the 90% confidence interval (*CR*) was greater than 82% for all cases (normal and overbank floods at different stations for both calibration and validation periods). The average *NS* values were ≥0.90 and ≥0.97 for normal and overbank floods, respectively, according to the *Q*_{50%} hydrograph. It indicated that the *Q*_{50%} forecast could provide slightly better deterministic forecasting results than DFF in the study reach.
3. A comparative investigation was conducted for the analysis of the DFF and PFF performances at lead times of *n* = 1, 6 and 10 hours. Results showed that the performances of both DFF and PFF decayed with the lead time increasing.
4. In the investigation of the performance decay of PFF with lead time increasing, an exploration of the relationship between the *CR* and *RB* was also presented. Results showed that *RB* had a strong linear correlation with *CR*. This finding would help the decision maker analyze the abundant information of PFF results.

As a consequence, PCA-HUP can be used for the operational forecasting of both normal and overbank floods in the middle reach of Yellow River. When combined with the deterministic models, PCA-HUP could provide both deterministic forecast results (e.g. 50% quantile hydrograph) and fundamental uncertainty information of forecast. This study could be useful for flood control and water resources management under changing environments.

## ACKNOWLEDGEMENTS

This study was supported by the National Key R&D Program of China (2016YFC0402706, 2016YFC0402709), National Natural Science Foundation of China (41730750, 51179046), Public Welfare Industry Special Fund Project of Ministry of Water Resources of China (201401034, 201501004), and the Fundamental Research Funds for the Central Universities of China (2016B10914).
