ABSTRACT
The UK standard for estimating flood frequencies is set out in the Flood Estimation Handbook (FEH) and its associated updates. Estimates inevitably carry uncertainty from sampling error as well as from model and measurement error. Using resampling approaches adapted to the FEH methods, this paper quantifies the sampling uncertainty for single site, pooled (ungauged) and enhanced single site (gauged pooling) estimates, and compares it across catchment types. The study builds on previous progress towards easily applicable quantification of uncertainty in FEH-based estimates. Where previous studies have provided simple analytical expressions for quantifying uncertainty in single site and ungauged design flow estimates, this study provides an easy-to-use method for quantifying uncertainty in enhanced single site estimates.
HIGHLIGHTS
Bespoke bootstrap methods for quantifying uncertainty for ungauged and enhanced single site FEH design flow estimation.
Comparison of flood estimation uncertainty across catchment types.
Simple equations to derive variance and standard error for enhanced single site design flow estimates.
INTRODUCTION
- 1.
describe bootstrap methods for estimating fse for the FEH08 SS, UG and ESS cases;
- 2.
apply the methods of point 1 and compare the uncertainty across these cases at sites considered suitable for pooling (National River Flow Archive);
- 3.
compare the uncertainty across different catchment types; and
- 4.
develop a simple expression for quantifying uncertainty for the ESS estimates.
The paper has four main parts. The first summarises the data used in the study. The second describes the fse and the bootstrap methods applied to estimate it for the SS, UG and ESS cases; this method section also details the approach used to compare fse across catchment types and the analytical expression for calculating fse in the ESS case. The third summarises the results, and the last provides some concluding remarks.
STUDY AREA AND DATA
Parameter | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. |
---|---|---|---|---|---|---|
QMED | 0.1 | 14.81 | 49.22 | 100.45 | 134.76 | 1,027.88 |
L-CV | 0.068 | 0.168 | 0.2 | 0.2101 | 0.243 | 0.544 |
L-SKEW | −0.208 | 0.115 | 0.191 | 0.19 | 0.263 | 0.69 |
The analysis of the NRFA data for this study was undertaken using Base R (R Core Team 2019) and the UKFE R package (Hammond 2020).
QUANTIFYING UNCERTAINTY OF DESIGN FLOWS
BOOTSTRAPPING TO APPROXIMATE THE SAMPLING DISTRIBUTION
- 1.
Randomly select from the sample with replacement N times (where N is the sample size)
- 2.
Repeat step 1 M times to create M bootstrapped samples
- 3.
Calculate the design flow for each of the M samples to approximate the sampling distribution of the design flow
- 4.
Create log-transformed residuals by subtracting the log mean of the sampling distribution from each of the log-transformed design flows
- 5.
The fse is then the exponent of the standard deviation of the log-transformed residuals (Equation (3)).
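The five steps above can be sketched in a few lines of Python. This is a minimal illustration only, not the code used in the study (which was run in R): the function name `bootstrap_fse`, the default of 500 resamples and the AMAX values are assumptions made for the example, and the median is used as the estimator so that the result is a single site QMED fse.

```python
import math
import random
import statistics


def bootstrap_fse(amax, estimator=statistics.median, m=500, seed=1):
    """Approximate the fse of an estimate by random resampling (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(amax)
    # Steps 1-3: M resamples of size N drawn with replacement, one estimate each.
    estimates = [estimator(rng.choices(amax, k=n)) for _ in range(m)]
    # Step 4: log-transformed residuals about the log mean.
    logs = [math.log(q) for q in estimates]
    log_mean = statistics.fmean(logs)
    residuals = [v - log_mean for v in logs]
    # Step 5: the fse is the exponent of the standard deviation of the residuals.
    return math.exp(statistics.stdev(residuals))


# Hypothetical AMAX series (m3/s), for illustration only.
amax = [112.0, 95.3, 140.8, 88.1, 173.4, 101.9, 126.5, 97.7, 150.2, 119.6,
        83.4, 134.0, 108.8, 161.1, 92.6]
fse_qmed = bootstrap_fse(amax)
```

Passing a different `estimator` (for example, a function returning the 100-year design flow from a fitted distribution) would give the fse for that quantity instead.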
Faulkner & Jones (1999) and Burn (2003) applied a balanced resampling approach where, in place of steps 1 and 2 above, the AMAX series is repeated M times, permuted and split into M samples. Balanced resampling ensures that each value in the AMAX series appears an equal number of times in the union of the resampled datasets. For this study, balanced resampling was compared with random resampling by calculating the single site QMED fse, applying both methods twice to each of the 545 AMAX series (providing four distributions of fse). A Kruskal–Wallis test found no significant difference between the four fse samples (p-value = 0.86).
To maintain intersite correlation within the pooled group, Burn (2003) also applied vector resampling, whereby each year across the AMAX series in a pooling group is resampled together. Intersite dependence within a pooling group increases the uncertainty in comparison to the same pooling group with no dependence, because there is less information when deriving the L-moment ratios. Conversely, calculated uncertainty would appear to decrease. To see why, take an extreme example where every site in a pooling group is perfectly correlated. The L-CV and L-SKEW would be the same for each site (there would be less variance), but the sample size would remain the same. The standard error is a function of variance and sample size and is decreased by a reduction in variance or an increase in sample size. Therefore, a pooling group with perfectly correlated sites provides no more confidence in the growth curve estimation than one of the sites alone, yet the estimated uncertainty would appear to decrease in comparison. Maintaining intersite dependence within the bootstrapping procedure underestimates the uncertainty for groups with dependence, as does non-vectorised bootstrapping.
Fundamentally, intersite dependence undermines the benefits of RFA, hence Hosking & Wallis (1997, p. 8) list independence at different sites as an assumption of the index flood-based RFA. Thankfully, no bias is caused by a violation of this independence assumption and no two AMAX samples within a pooling group will be perfectly correlated. Therefore, the addition of a further site (assumed to have the same scaled distribution as the subject site) to a pooling group will always provide useful information. However, it would be sensible to consider highly dependent catchments when forming the pooling group, because replacing a highly dependent catchment with an independent site will reduce the uncertainty. In this study, random resampling has been applied and the estimation of fse assumes intersite independence.
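For reference, the balanced resampling of Faulkner & Jones (1999) and Burn (2003) described above (repeat the series M times, permute, split into M samples) can be sketched as follows. The function name and the example values are hypothetical; this is an illustration of the idea rather than the implementation compared in this study.

```python
import random


def balanced_resamples(amax, m=500, seed=1):
    """Return M resamples in which every AMAX value appears exactly M times overall."""
    rng = random.Random(seed)
    n = len(amax)
    pool = list(amax) * m   # repeat the AMAX series M times
    rng.shuffle(pool)       # permute the repeated series
    # Split the permuted series into M samples of the original length N.
    return [pool[i * n:(i + 1) * n] for i in range(m)]


# Hypothetical AMAX values, for illustration only.
samples = balanced_resamples([12.1, 15.4, 9.8, 20.3], m=50)
```

Each resample can then be passed to the design flow estimator in place of the randomly drawn samples of steps 1 and 2.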
THE METHOD FOR UNGAUGED CATCHMENTS
This approach to approximating the ungauged QMED estimate sampling distribution is used as part of the process for calculating fse for UG pooling analysis. For calculating the fse for a UG pooled estimate, the following steps were applied (where N equals 500):
- 1.
Bootstrap each site of the pooling group individually to create N new samples of the same size
- 2.
Create N new pooling groups from the bootstrapped samples of each site
- 3.
Undertake a weighted (based on pooling group weightings) random selection of a single site from each of the N pooling groups and calculate the growth factor for each
- 4.
Sample from the QMED sampling distribution (Equation (5)) N times
- 5.
Multiply the results of step 4 by the results of step 3 to derive N estimates, which approximate the sampling distribution
- 6.
Apply Equation (3) to the results of step 5 to derive the fse.
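The six steps above might be sketched as below. Several elements are assumptions made for illustration: Equation (5) is stood in for by a log-normal QMED sampling distribution with a given fse, the GLO growth factor is written in the median-standardised form usually quoted for the FEH, the pooling group is synthetic, and all function names, weights and flows are hypothetical. Urban adjustment and donor transfer are omitted.

```python
import math
import random


def lmoment_ratios(sample):
    """Sample L-CV and L-SKEW via probability weighted moments."""
    x = sorted(sample)
    n = len(x)
    b0 = sum(x) / n
    b1 = sum(i * v for i, v in enumerate(x)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * v for i, v in enumerate(x)) / (n * (n - 1) * (n - 2))
    l2, l3 = 2 * b1 - b0, 6 * b2 - 6 * b1 + b0
    return l2 / b0, l3 / l2


def glo_growth_factor(lcv, lskew, t):
    """GLO growth factor for return period t, standardised by the median (assumed form)."""
    k = -lskew
    if abs(k) < 1e-9:  # limiting (logistic) case as k tends to zero
        return 1 + lcv * math.log(t - 1)
    s = math.sin(math.pi * k)
    beta = lcv * k * s / (k * math.pi * (k + lcv) - lcv * s)
    return 1 + (beta / k) * (1 - (t - 1) ** (-k))


def ug_fse(pooling_group, weights, qmed_est, qmed_fse, t=100, n=500, seed=1):
    """Bootstrap fse of an ungauged pooled design flow (illustrative sketch)."""
    rng = random.Random(seed)
    logs = []
    for _ in range(n):
        # Steps 1-2: bootstrap every site to form one new pooling group.
        group = [rng.choices(site, k=len(site)) for site in pooling_group]
        # Step 3: weighted random selection of one site and its growth factor.
        site = rng.choices(group, weights=weights, k=1)[0]
        growth = glo_growth_factor(*lmoment_ratios(site), t)
        # Step 4: draw QMED from the ASSUMED log-normal sampling distribution.
        qmed = qmed_est * math.exp(rng.gauss(0.0, math.log(qmed_fse)))
        # Step 5: one design flow estimate (stored on the log scale).
        logs.append(math.log(qmed * growth))
    # Step 6: exponent of the standard deviation of the log residuals.
    mean = sum(logs) / n
    return math.exp(math.sqrt(sum((v - mean) ** 2 for v in logs) / (n - 1)))


# Synthetic pooling group and hypothetical weights, for illustration only.
gen = random.Random(42)
pooling_group = [[gen.lognormvariate(4.0, 0.35) for _ in range(years)]
                 for years in (45, 38, 52, 30, 41)]
weights = [0.30, 0.25, 0.20, 0.15, 0.10]
fse_100 = ug_fse(pooling_group, weights, qmed_est=55.0, qmed_fse=1.46)
```

Because the QMED draw and the growth factor are sampled independently, the resulting fse reflects both sources of sampling uncertainty.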
For the ungauged pooled estimates (of the 545 catchments detailed above – assumed ungauged), the default FEH08 pooling groups were used and if the site was urban, an urban adjustment was applied to the ungauged QMED estimate and growth curve (Wallingford HydroSolutions 2016). Where a donor or two donors were applied, the closest rural sites (Bayliss et al. 2006) were used. The single donor method was that of Environment Agency (2008) and the two-donor method was that of Kjeldsen (2019). The GLO distribution was used for estimating the growth curve (Equation (1)) with the L-moments method and UG weighted L-CV and L-SKEW (for more detail, see Environment Agency 2008).
METHOD FOR GAUGED CATCHMENTS
The approach for the ESS case is more straightforward and the following steps were applied (where N equals 500).
- 1.
Bootstrap each site of the pooling group individually to create N new samples of the same size
- 2.
Create N new pooling groups from the bootstrapped samples of each site
- 3.
Estimate N growth curves, one for each pooling group using the ESS weighting method
- 4.
Bootstrap the at-site sample N times and calculate the median for each sample to approximate the gauged QMED sampling distribution
- 5.
Sample from the QMED sampling distribution N times
- 6.
Multiply the results of step 5 by the results of step 3 to derive N design flow estimates
- 7.
Apply Equation (3) to the results of step 6 to derive the fse.
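A minimal sketch of the seven ESS steps is given below. It is illustrative only: the FEH08 ESS weighting is replaced by a single set of hypothetical weights applied to both L-CV and L-SKEW, the GLO growth factor uses the median-standardised form usually quoted for the FEH, the data are synthetic, and the function names are assumptions. The at-site record is taken to be the first member of the pooling group.

```python
import math
import random
import statistics


def lmoment_ratios(sample):
    """Sample L-CV and L-SKEW via probability weighted moments."""
    x = sorted(sample)
    n = len(x)
    b0 = sum(x) / n
    b1 = sum(i * v for i, v in enumerate(x)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * v for i, v in enumerate(x)) / (n * (n - 1) * (n - 2))
    l2, l3 = 2 * b1 - b0, 6 * b2 - 6 * b1 + b0
    return l2 / b0, l3 / l2


def glo_growth_factor(lcv, lskew, t):
    """GLO growth factor for return period t, standardised by the median (assumed form)."""
    k = -lskew
    if abs(k) < 1e-9:  # limiting (logistic) case as k tends to zero
        return 1 + lcv * math.log(t - 1)
    s = math.sin(math.pi * k)
    beta = lcv * k * s / (k * math.pi * (k + lcv) - lcv * s)
    return 1 + (beta / k) * (1 - (t - 1) ** (-k))


def ess_fse(at_site, pooling_group, weights, t=100, n=500, seed=1):
    """Bootstrap fse of an ESS design flow (illustrative sketch)."""
    rng = random.Random(seed)
    total = sum(weights)
    growth = []
    for _ in range(n):
        # Steps 1-2: bootstrap every site to form one new pooling group.
        group = [rng.choices(site, k=len(site)) for site in pooling_group]
        ratios = [lmoment_ratios(site) for site in group]
        # Step 3: growth factor from weighted L-CV and L-SKEW (hypothetical weights).
        lcv = sum(w * r[0] for w, r in zip(weights, ratios)) / total
        lskew = sum(w * r[1] for w, r in zip(weights, ratios)) / total
        growth.append(glo_growth_factor(lcv, lskew, t))
    # Step 4: bootstrap the at-site record to approximate the gauged QMED distribution.
    qmed_dist = [statistics.median(rng.choices(at_site, k=len(at_site)))
                 for _ in range(n)]
    # Step 5: sample from the QMED sampling distribution N times.
    qmed = rng.choices(qmed_dist, k=n)
    # Step 6: N design flow estimates.
    flows = [q * g for q, g in zip(qmed, growth)]
    # Step 7: subtracting the log mean does not change the standard deviation,
    # so the fse is the exponent of the standard deviation of the log flows.
    return math.exp(statistics.stdev(math.log(q) for q in flows))


# Synthetic pooling group and hypothetical weights, for illustration only.
gen = random.Random(42)
pooling_group = [[gen.lognormvariate(4.0, 0.35) for _ in range(years)]
                 for years in (45, 38, 52, 30, 41)]
at_site = pooling_group[0]
weights = [0.50, 0.15, 0.15, 0.10, 0.10]
fse_100 = ess_fse(at_site, pooling_group, weights)
```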
For the pooled ESS estimates (of the 545 catchments detailed above), the default FEH08 pooling groups were used. If the gauged catchment was urban, it was included in the pooling group and deurbanised before an urban adjustment was made to the final growth curve (Wallingford HydroSolutions 2016). The GLO distribution was used for estimating the growth curve (Equation (1)) with the L-moments method and ESS weighted L-CV and L-SKEW (for more detail, see Environment Agency 2008).
AN EASY-TO-USE EQUATION FOR ESS VARIANCE
COMPARISON OF RESULTS ACROSS CATCHMENT TYPES
A comparison of uncertainty across small catchments (<25 km²) (Faulkner et al. 2012), permeable catchments (BFIHOST > 0.65) (Faulkner & Barber 2009) and urban catchments (URBEXT2000 > 0.03) (Bayliss et al. 2006) was undertaken to determine whether they differ significantly from catchments that are larger, primarily rural and not considered permeable. To ensure that the sample size did not influence the comparison, the following steps were taken to approximate the sampling distribution of median fse for comparison (using catchment size as an example):
- 1.
Split the gauges into small and large catchments
- 2.
List the sample sizes for the small catchments
- 3.
Find, for each sample size in step 2, the large catchments with the same sample size
- 4.
Create a sample of large catchments, the same size as the small catchments sample (N), by randomly sampling from the results of step 3. List the fses from the sample of large catchments
- 5.
Repeat step 4 a total of 500 times and derive a median fse for each of the 500 samples (providing the sampling distribution for the large catchments)
- 6.
Resample with replacement N*500 fse from the small catchment fses and calculate the median for each of the 500 samples (providing the sampling distribution for the small catchments)
- 7.
Compare the two distributions.
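The matching procedure above can be sketched as follows. This reflects one reading of steps 3 and 4, stated here as an assumption: for each small catchment, one large catchment with the same record length is drawn at random, and small catchments with no matching record length are skipped. The function name and the (record length, fse) pairs are hypothetical.

```python
import random
import statistics


def median_fse_distributions(small, large, reps=500, seed=1):
    """small, large: lists of (record_length, fse). Returns two lists of median fse."""
    rng = random.Random(seed)
    # Steps 2-3: index the large-catchment fses by record length.
    by_length = {}
    for length, fse in large:
        by_length.setdefault(length, []).append(fse)
    # Record lengths of small catchments that have a large-catchment match
    # (unmatched lengths are skipped, an assumption of this sketch).
    matched = [length for length, _ in small if length in by_length]
    small_fses = [fse for _, fse in small]
    small_medians, large_medians = [], []
    for _ in range(reps):
        # Steps 4-5: a size-matched sample of large catchments and its median fse.
        large_sample = [rng.choice(by_length[length]) for length in matched]
        large_medians.append(statistics.median(large_sample))
        # Step 6: resample the small-catchment fses with replacement.
        resample = rng.choices(small_fses, k=len(small_fses))
        small_medians.append(statistics.median(resample))
    return small_medians, large_medians


# Hypothetical (record length, fse) pairs, for illustration only.
small = [(30, 1.14), (42, 1.11), (35, 1.16), (30, 1.12)]
large = [(30, 1.10), (30, 1.09), (35, 1.12), (42, 1.08), (42, 1.10), (50, 1.07)]
small_med, large_med = median_fse_distributions(small, large)
# Step 7: compare the two distributions, here by the difference in their medians.
difference = statistics.median(small_med) - statistics.median(large_med)
```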
These steps were undertaken to compare the UG, SS and ESS 100-year fse across the different catchment types.
RESULTS AND DISCUSSION
Results for ungauged uncertainty
Table 2 provides the resulting fse for ungauged QMED estimates and the mean fse for the UG pooled estimates of longer return periods.
No. of donors | 2 | 5 | 10 | 20 | 50 | 100 | 200 | 500 | 1,000 |
---|---|---|---|---|---|---|---|---|---|
0 | 1.46 | 1.469 | 1.488 | 1.513 | 1.560 | 1.606 | 1.661 | 1.752 | 1.835 |
1 | 1.42 | 1.432 | 1.449 | 1.475 | 1.523 | 1.569 | 1.627 | 1.721 | 1.804 |
2 | 1.41 | 1.412 | 1.431 | 1.458 | 1.507 | 1.553 | 1.613 | 1.706 | 1.789 |
Results for gauged uncertainty
Table 3 provides six-number summaries of the ESS fse estimated across the 545 sites suitable for pooling for a range of return periods (T).
T | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. |
---|---|---|---|---|---|---|
5 | 1.017 | 1.055 | 1.071 | 1.082 | 1.094 | 1.599 |
10 | 1.021 | 1.060 | 1.078 | 1.089 | 1.101 | 1.685 |
20 | 1.025 | 1.068 | 1.084 | 1.096 | 1.108 | 1.675 |
50 | 1.030 | 1.078 | 1.095 | 1.107 | 1.121 | 1.621 |
100 | 1.033 | 1.085 | 1.105 | 1.117 | 1.131 | 1.685 |
200 | 1.038 | 1.095 | 1.116 | 1.128 | 1.142 | 1.671 |
500 | 1.047 | 1.108 | 1.130 | 1.143 | 1.160 | 1.657 |
1,000 | 1.055 | 1.119 | 1.142 | 1.156 | 1.172 | 1.666 |
As can be seen in Figure 5, the benefit of ESS analysis over SS increases significantly after the 20-year return period. That is on average; sites with very few years of data would benefit at shorter return periods.
Result and example of use for an easy-to-use equation for ESS variance
As a comparison, the single site estimate of variance (Kjeldsen 2021) for the same AMAX (the L-SKEW is 0.1429, and β is 0.18133) is 933. ESS bootstrapped intervals were derived 30 times and the mean lower and upper 95% intervals were 156 and 217 m³/s. When using Equation (11), more scrutiny should be applied where the AMAX statistics fall outside the range of those in Table 1. Although the author considers Equation (11) to be readily applicable, given the dispersion of the estimates as flows increase, it is acknowledged that the model could be improved. Consideration of weights in the pooling groups and the individual variance of estimates from pooling group members may be fruitful in this regard. Such investigation may also provide a variance estimator for individual ungauged pooling design flows as a function of pooled parameter estimates.
Results of uncertainty across catchment type
The increased uncertainty in rural catchments when compared with urban catchments in the ESS case (Figure 11, right plot) appears to be due to the fse of the observed QMED being greater in rural catchments (Figure 12, right plot). Why this is the case is not clear. It may be due to control structures, flood management schemes and abstractions for urban areas that reduce the variance in flow even at the 2-year level.
CONCLUSION
Bootstrap methods, bespoke to the FEH08 pooling procedures, have been detailed and applied to the SS, UG and gauged pooling (ESS) cases to derive fses for design flow estimates. The resulting fse estimates have been compared across the FEH08 cases (SS, UG and ESS) and across catchment types for all sites considered suitable for pooling (while accounting for sample size). Building on the work of previous studies, it was established that there is a need and a trend for providing easy-to-use equations for estimating uncertainty for design flow estimates. Given the complexity of the FEH procedures, such equations have been a long time coming since the FEH of 1999. From the initial work of Kjeldsen (2015), Dixon et al. (2017) established quadratic expressions for fse estimates in the case of UG pooling groups. Here, updated versions have been established by applying an approach that specifically considers the uncertainty associated with each pooling group. Kjeldsen (2021) established an easy-to-use equation for the SS case, assuming a GLO distribution. This SS variance equation was adapted here for use with design flow estimates in the case of gauged pooling groups (ESS). For the quantification of uncertainty, there is now an analytical and easy-to-apply set of equations for single site, ungauged and gauged FEH-based design flow estimation. Quantification of uncertainty for estimated design flows, and importantly quantification that is simple to apply, will enable flood risk management authorities and engineers to make better decisions when considering flood risk management infrastructure.
ACKNOWLEDGEMENTS
The author would like to thank all those responsible for the availability of the National River Flow Archive. The author would also like to thank the two anonymous reviewers for their helpful comments.
DATA AVAILABILITY STATEMENT
All relevant data are available from an online repository or repositories (https://nrfa.ceh.ac.uk/peak-flow-dataset).