## Abstract

Simulating rare widespread hydrological events can be difficult even with the use of modelled data such as the UKCP18 12 km regional climate projections. To generate larger event sets for application in catastrophe modelling, two statistical approaches are highlighted and applied to widespread GB-generated flooding events using a grid-based hydrological model and UKCP18 regional projections. An Empirical Copula method was applied on a national scale, generating over 600,000 events across two time-slices (1980–2010 and 2050–2080). This was compared to model-generated events and showed good matching across time-slices and ensemble members, although lacked some ability to describe the least-rare events. The Empirical Copula was also compared to an implementation of a conditional exceedance model. This model was much more computationally intensive so was restricted to Northwest England but offered the ability to be tuned more finely through choices of marginal distributions. Analysing over 11,000 events, it also matched well with the Empirical Copula and model-generated events but under-represented the smallest events. Both approaches require a broad dataset to draw from but showed reasonable efficacy. For simple statistics, the Empirical Copula shows the potential to be a powerful tool in exploring spatial structure over large regions or at a fine spatial resolution.

## HIGHLIGHTS

The Empirical Copula and the Heffernan–Tawn model both can be used effectively to generate national-scale extreme flooding event sets for risk analysis.

The Empirical Copula proved more stable over larger areas due to much lower computational costs.

The methods were used to investigate and highlight the differences in the spatial structure of widespread flooding events between 2010 and 2080.

## INTRODUCTION

Quantifying risk due to extreme high flow events on a regional or national scale requires knowledge of the correlation of floods occurring in many places at once. One common approach, often applied in catastrophe modelling, is to refer to a suite of observed or derived scenarios, and measure the frequency of key features, including the loss associated to an event, to understand the return period or likelihood of an event generating a similar amount of loss occurring. Sayers *et al.* (2002) discuss the benefits having a risk-centred approach to flood hazard management. However, naively using sets of observed events, or events modelled on current or previous climatic, meteorological or hydrological conditions may neglect the possibilities of changes over time. Climate change, and its impact on global hydrology, is a clear driver of this, and much work, such as the Third Climate Change Risk Assessment (Slingo 2021) makes use of long time series and ensembles of model runs to perform analysis of future scenarios.

Flood risk assessments sometimes suffer from a lack of modelling of spatial dependence, due to comprising a collection of local-scale risk assessments, rather than a harmonious national-scale one. Keef *et al.* (2009, 2011) and Villarini *et al.* (2011) showed that in the UK and United States that there is positive dependence between locations for the most extreme events, and so there has been a drive to perform risk analysis (via catastrophe modelling via event sets, or otherwise) on a national scale through the use of hydrological modelling to describe the impacts of flooding.

As discussed in Zanardo & Salinas (2022), a large event set is essential in catastrophe modelling to be able to determine the relative frequency of events with different levels of risk or loss, as there is no well-known natural approach to theoretically model the frequency of any specific class of event. There are several approaches to generating a large event set for risk analysis. A perturbed parameter ensemble (PPE) can be used (Kelder *et al.* 2020; Kay *et al.* 2021) to cover a wide set of possible outcomes, especially if only one hydrological model is used. Alternatively, stochastic weather generators can use random processes and Monte Carlo approaches to generate a large number of possible rainfall events to feed into a hydrological model (Wright *et al.* 2020; Matte *et al.* 2021). Instead of directly using high-dimensional or high-resolution observed or modelled events and time series, various methods apply interpolation (Towe *et al.* 2016) or kriging (Laaha *et al.* 2014) to estimate likely values of flow between observed locations, preserving hydrological coherence.

Climate models, however, are often focussed on short time windows (<100 years) and therefore each model run is likely to have a very small number of the most extreme events, which can often vary greatly in their characteristics (duration, area, peakedness), and so fail to capture the full range of possible extreme events. To this end, this paper proposes two statistical approaches (as opposed to hydrological process-based models) to specifically generate a much wider range of extreme events for use in risk analysis. In this paper, the focus is on method comparison rather than examining changes in spatial coherence over time. Two methods, the Heffernan–Tawn (HT) model and the Empirical Copula approach, will be defined and applied on a national and regional scale across Great Britain, and a comparative analysis was undertaken, using a grid-based hydrological model to produce a reference dataset to seed the two statistical methods, and form a reference dataset with which to compare the two approaches.

## DATA

### UKCP18 climate data

UK Climate Projections 2018 (UKCP18) provides information on potential changes in a range of climate variables over the 21st century, via a number of different products (Murphy *et al.* 2018; Lowe *et al.* 2018). These data have previously been used to analyse multiple hazards in the UK and how they are expected to differ in the future due to climate change (Kay 2021; Kay *et al.* 2021; Arnell *et al.* 2021).

The UKCP18 Regional projections (Met Office Hadley Centre 2018) comprise a 12-member PPE of the Hadley Centre Regional Climate Model (RCM), nested in an equivalent PPE of the respective Global Climate Model (GCM). This work uses ensemble members 01–15, excluding 02, 03, and 14. GCM ensemble members 02, 03, and 14 do not have an RCM nested within, so have no regional projections. Ensemble member 01 uses the standard parameterisation. The ensemble covers the period December 1980 to November 2080 under an RCP8.5 emissions scenario (Representative Concentration Pathway) (Riahi *et al.* 2011). The rainfall, temperature, and potential evapotranspiration (PE) used in the present work is on a 12-km spatial resolution and a daily timestep covering England, Scotland, and Wales for a synthetic 360-day year (30 days per month). The data were available re-projected from the native climate model grid to a 12 km grid aligned with the GB national grid, with the exception of the PE data, which was computed from other UKCP18 variables using a modified Penman–Monteith approach (Hough *et al.* 1997). To allow easy comparability between time-slices, observed flow time series were not used in this paper since these are not available for the 2051–2080 time-slice.

## METHODS

### Grid-to-Grid model

The Grid-to-Grid hydrological model (Bell *et al.* 2009) is a grid-based runoff-production and flow-routing model, where flow is routed from grid-cell to grid-cell. It models the river flow across Great Britain based on data such as meteorological, topographical, soil type, and land-cover data. It requires input time series of gridded precipitation, daily maximum and minimum temperature, and PE. In total, 19,914 grid-cells were identified on the river network with at least 50 upstream grid-cells (equating to a theoretical catchment area of 50 km^{2}). Daily flow time series for baseline (1981–2010) and future (2051–2080) time-slices for these 19,914 grid-cells.

The next step of the pipeline took these daily outputs of flow and extracted ‘widespread, extreme events’.

### Reference event extraction

Full details are provided in the companion paper (Griffin *et al.* 2022b), with a brief summary provided here. *At each grid-cell, a threshold (POT2) was chosen such that daily flow exceeded it on 2 days per year on average during the baseline time-slice. This corresponds to a daily probability of exceedance (PoE) of approximately 0.56%. The same threshold was used for both time-slices for comparability.* Events were defined as periods of consecutive days where more than 0.1% of the 19,914 river grid-cells across GB exceeded the threshold at the same time. PoEs were initially calculated using daily data. To convert from daily to annual return periods, a Poisson approximation was used based on a 360-day year: .

The events were then summarised by taking the maximum flow at each grid-cell over the duration of the event. Events were restricted to 2 weeks (14 days) to mitigate the possibility of two consecutive but independent events being counted as a single event. From this, annual PoEs and return periods were computed. In order to simulate national events, the Empirical Copula and the HT model need marginal distributions for each grid-cell that cover the entire flow regime. Since no clear probability distribution can accurately cover the full regime of flow while giving the appropriate heavy tails at the upper end, a mixed distribution was selected, as in Towe *et al.* (2019). This is a mixed distribution combining an empirical distribution describing each time series for values of flow below threshold, and a GPa distribution on those values above the threshold. In the rest of this paper, event extent is only defined by exceedances, so the choice of an empirical sub-threshold distribution does not affect the analysis of area or peak return period.

*et al.*(2009)), or a distribution that covers the whole period of flow such as Lognormal or Tweedie as discussed in Svensson

*et al.*(2017) for summaries of the daily mean flow. However, we feel that the fit in the upper tail for these distributions is not sufficient for the present work. This mixed distribution gives the full expression for the cumulative distribution function of flow,

*x*:where

*F*is the empirical cumulative distribution function at grid-point

_{i},_{EMP}*i*,

*F*is the cdf for the GPa distribution,

_{i,GPa}*u*is the flow threshold at that grid-point, and , since this investigation uses the POT2 threshold defined previously. The return period

_{i}*T*is given by 1/

*F*, as this paper assumes stationarity during each of the two time-slices, since both event methods outlined below lack the capability to incorporate non-stationarity into the event generation. This assumption of stationarity is a strong one and not necessarily realistic (Milly

*et al.*2008). There are several approaches to describing the ‘return period’ of an event under the assumption of non-stationarity based on the expected time to arrival, expected number of events per period, or a design life level which is based on a probability of failure over a pre-defined period (see Salas

*et al.*(2018) and the references therein).

To ensure a good fit of the GPa distribution for the most extreme events, the 60 largest independent peaks at each grid-point were found using the peak-extraction algorithm as used by the NRFA (https://nrfa.ceh.ac.uk/), which stipulates two requirements for events to be independent at a specific location:

Two peaks are separated by at least three times the time-to-peak (geometric mean of times from the start of rainfall to maximum flow) at that location. As an approximation, this was simplified in this analysis to have a time-to-peak of 7/3 days, so peaks were separated by at least 7 days.

The river flow between two peaks must drop below two-thirds the height of the lower peak.

Events were selected starting from the largest and excluding any dependent peaks before selecting the next largest remaining peak. For a specific event, the flood extent area is the maximum number of 1 km^{2} grid-cells simultaneously above their threshold during a selected event.

For regional events, the reference event set was divided by hydrometric area (Figure 2). In each region, only events which saw threshold exceedances in the specified region were kept, leading to a slightly smaller event set. Since events without exceedances do not contribute to either of the statistical models below, this reduced reference event set did not reduce performance.

### Statistical event generation

In both of the following methods, the events were derived by choosing a ‘seed’ event from the reference set, which was used to simulate a new event more specifically. Since these simulated events were not based on a full-time series and so not technically at any specific time (other than baseline or future), we assigned the season and duration of the simulated events to be the season and duration of the ‘seed’ event.

#### Empirical Beta Copula

The first approach that was implemented to generate new events on a national scale was the Empirical Beta Copula (EC) method (Rüschendorf 2009; Segers *et al.* 2017). This approach, which has not been used in hydrological modelling before, is a fast method to generate widespread flooding events which scales well to these very large datasets. Events in the ‘reference’ (G2G) dataset (of N events) are ranked at each grid-point *i*, so each grid-point has one rank (1 to *N*) for each event *n*. Additionally, a mixed distribution with cdf *G _{i}* as above was fitted to the event set at each grid-point (not the whole time series). A set of

*N*uniform random numbers

*u*in [0,1] are generated for each grid-point and ordered according to the ranks of the reference set. To simulate a new event, an event n is drawn uniformly at random, and the values are the simulated values of flow for that event. Estimated PoEs are based on the mixed distributions

_{i,n}*F*based on the whole time series. This procedure generated 600,000 events: two sets of 25,000 events for each ensemble member, based on baseline and future time-slices.

_{i}#### HT model

The HT model (Heffernan & Tawn 2004) is a conditional exceedance model, designed to describe the probability of one site experiencing a threshold exceedance, given an exceedance is being experienced at another specific site. It was implemented using a modified version of the *texmex* package (Southworth *et al.* 2020), modified to accommodate the larger number of points being investigated; the number of parameters being fitted increases quadratically compared to the number of grid points, and so computational complexity increases similarly. In the present study, there were limitations on computation power, which restricted the work to just the NorthWest (NW) region, as in Figure 2.

*et al.*(2018) used and developed this model using a multivariate Normal distribution, but for the extremes of interest, the GPa is more appropriate. The marginal was then transformed to a one-parameter Laplace distribution, and the spatial dependence structure was fitted using two parameters for each location-pair (

*i*,

*j*), and an estimate of residual distribution. The set of marginal (at-site) distributions along with the spatial dependence structure was used to generate new events as follows. Suppose location 1 is assumed to have the most extreme flow (in terms of PoE) and that flow is above the threshold. A residual

*z*is generated from a high-dimensional normal distribution and an observed flow is simulated. Then, observations at all other locations are generated on the same basis which are then scaled to Laplace observations according to:where

*a*and

_{i}*b*are parameters estimated for each site

_{i}*i*. Simulated events for which

*y*

_{1}is largest are kept, otherwise, they are discarded. This procedure was repeated for each desired event: the designated ‘most extreme’ flow was drawn for each event based on the distribution of most extreme flows in the observed event set.

Using the HT method, 400 extra widespread events were simulated for the NW region, for each ensemble member for baseline and future time-slices (12 × 2 × 400 = 9,600 events). Neal *et al.* (2013) undertook a similar exercise with a bootstrapping approach, rather than the use of an RCM ensemble.

## RESULTS

### National results

^{2}. This is a key advantage to using simulation methods to generate larger sets of extreme events for risk analysis: supplying events that have different combinations of extent and severity not necessarily observed over short time-slices. Additionally, the computational cost (in time) per event was notably lower for the Empirical Copula method than being directly generated from a hydrological model. In practice, there is a trade-off to be made between the number of events required for risk analysis, and the hydrological plausibility, which increases with the number of reference events. Figure 4(b) shows the difference between baseline and future events for the two methods. Both show similar overall patterns, but the EC method seems to show a greater increase in future rare events compared to the reference events. The reference event set also shows a strong decrease in the number of small-extent, low-return period events.

### Regional case study

*cf.*Figure 3(b)); future work in drawing out regional differences would be valuable.

The EC method also appears to match well in terms of area and return period, although both methods struggle to simulate the ‘smallest’ extreme events, due to the conditioning on at least one location exceeding its at-site threshold. The HT method seems to match the reference event set in terms of spread of area-return period combinations (Figure 8) but on each individual margin, there are noticeable differences.

## DISCUSSION AND CONCLUSIONS

Generating widespread extreme events for risk calculation analysis is an exercise in optimisation: the trade-off between getting enough events to give the full breadth of possibilities and having the time and processing power to achieve it. At one end of the scale is the UNSEEN programme (Kelder *et al.* 2020), which has made excellent use of the available processing power, and at the other is risk analysts just making use of observed events, but with immediate results. This paper presents two options which offer a half-way house, generating a wide set of events using a fraction of the processing power and time. In context, each 30-year hydrological model simulation takes on the order of a day, whereas the EC method can generate many more extreme events in a matter of minutes given the same computational resources.

Using the UKCP18 Regional projection data under an RCP8.5 scenario, the Grid-to-Grid hydrological model was run for the 12-member RCM ensemble for two periods: 1980–2010 and 2050–2080. To generate a greater number of possible widespread events, the Empirical Copula and HT model were used on a national and regional scale, respectively.

The analysis shows that there are some differences between the two statistical approaches and how well they match the G2G-derived reference event set (as close as one can get to a ‘truth’). The HT approach (with a more parametric model characterising the inter-site extremal dependence) has the possibility to work extremely well, if the best model can be identified (type of marginal, etc.). The standard R package for undertaking this modelling (*texmex*; Southworth *et al.* 2020) has limitations, but the model itself (as shown here and in Neal *et al.* 2013) has the potential to be effective The EC approach (which directly uses an empirical view of the rank dependence structure) shows the ability to better explore the state space of possible widespread extreme events. However, since choices still have to be made (marginals and thresholds) the use of the EC approach to extrapolate greatly beyond the extremes of the reference event set is not recommended.

A fuller analysis of the differences between baseline and future behaviour of widespread extreme events is covered in Griffin *et al.* (2022b), but a few key points can be observed here. First, there is a slight trend towards widespread events having a greater area but a similar maximum pointwise return period. On a national scale there is also an increased occurrence of the most extreme events (with a return period exceeding 100 years, based on thresholds derived from the baseline period). However, in a closer inspection for the NW region, the most extreme events in terms of return period do not change in distribution between baseline and future, suggesting most of the change in events is in spatial spread or in the structure of the more frequent events. It may also suggest that the big changes in widespread flooding dynamics are occurring outside the NW region.

The Empirical Copula approach shows the ability to simulate a wide range of statistically possible widespread events (under various choices and assumptions), but it would be interesting, and prudent, to test the plausibility of such simulated events by using them as initial/boundary conditions for hydraulic models on a smaller scale; or whether such events could be replicated directly from a hydrological or hydraulic model. There is already a natural extension of the HT model to multi-day events (Keef *et al.* 2009), so it would be of interest to see how well the Empirical Copula approach could be extended to such events. Applications to other types of hazards, such as widespread drought or direct analysis of extreme precipitation, could be fruitful. It would be enlightening to investigate methods of incorporating non-stationarity into both the Empirical Copula and HT methods through the use of non-stationary marginals or copulas (Xu *et al.* 2019). This could also more easily allow the use of design life levels for a better description of return period under non-stationarity.

Future work could investigate whether the Empirical Copula and HT methods work as well on true observed data, however for comparability between time-slices this is not investigated here. Since the principles of the methods are independent of the hydrology, one would expect similar performance, but it would be enlightening to check this. Additionally, it should be noted that all the present work is based on modelled data, and so the application of EC and HT to observed data would be interesting, whether the loss of variance in events would be detrimental to capturing the spatial dependence. Adjusting this for future scenarios is also a highly non-trivial extension.

## ACKNOWLEDGEMENT

Funding for the project was provided through the UK Climate Resilience Programme supported by UK Research and Innovation, the UK Met Office, and from the Natural Environment Research Council.

## AUTHOR CONTRIBUTIONS

E.S. and P.S. managed the project, A.K. ran the hydrological modelling, A.G. ran the event extraction and summary, and performed the statistical analysis. All authors assisted in writing and editing the manuscript.

## DATA AVAILABILITY STATEMENT

All relevant data are available from an online repository or repositories. All relevant data are available from the Environmental Data Centre (Griffin *et al.* 2022a).

## CONFLICT OF INTEREST

The authors declare there is no conflict.

## REFERENCES

*Science*

*Hydrol. Earth Syst. Sci. Discuss. (preprint)*doi:10.5194/hess-2019-358