An updated national-scale assessment of trends in UK peak river flow data: how robust are observed increases in flooding?


A cluster of recent floods in the UK has prompted significant interest in the question of whether floods are becoming more frequent or severe over time. Many trend assessments have addressed this in recent decades, typically concluding that there is evidence for positive trends in flood magnitude at the national scale. However, trend testing is a contentious area, and the resilience of such conclusions must be tested rigorously. Here, we provide a comprehensive assessment of flood magnitude trends using the UK national flood dataset (NRFA Peak Flows). Importantly, we assess trends using this full dataset as well as a subset of near-natural catchments with high-quality flood data, to determine how climate-driven trends compare with those from the wider dataset, which is subject to a wide range of human disturbances and data limitations. We also examine the sensitivity of reported trends to changes in study time window using a 'multitemporal' analysis. We find that the headline claim of increased flooding generally holds up at regional to national scales, although we show a much more complicated picture of spatio-temporal variability. While some reported trends, such as increased flooding in northern and western Britain, appear to be robust, trends in other regions are more mixed spatially and temporally: for example, trends in recent decades are not necessarily representative of longer-term change, and within regions (e.g. in southeast England) increasing and decreasing trends can be found in close proximity. While headline conclusions are useful for advancing national flood-risk policy, for flood-risk estimation it is important to unpack these local changes, and the results and methodological toolkit provided here could provide such supporting information to practitioners.


INTRODUCTION
In early 2020, the UK experienced one of the most severe nationally significant flood events of recent decades (Parry et al. ; Sefton et al. in press). These floods came only 3 months after similarly devastating, and record-breaking, flooding in northern and central England (Muchan et al. ). The summer of 2019 also saw more localised, but dramatic, flooding in similar areas, notably Yorkshire and Lincolnshire.
The term 'unprecedented' has been widely used in connection with these flood events, but one does not have to look far back to find previous 'unprecedented' flooding. The first two decades of the 21st century, in general, have been characterised by many major flood events (see also Hannaford ; Table 1). Traditionally, flood-risk estimation methods assume stationarity, that is, a statistical process with parameters (for example, mean and variance) that do not shift over time (e.g. Slater et al. ). It is important to quantify any apparent non-stationarity in flood records to underpin the development of robust approaches to flood design that can incorporate observed changes to flood regimes over time. (Griffin et al. 2020).
A key focus of this study is testing the resilience of the reported headline message of positive trends in flooding.
Trend detection is a contentious area, and the barriers to observation-based trend analyses are widely reported (e.g. low signal-to-noise ratios commonly seen in hydrological

METHODOLOGY
The methodology we adopt is based on the standard NRFA trend testing approach outlined by Harrigan et al. (a). In brief, the methodology is as follows. We apply monotonic trend tests to the UK-wide floods dataset and examine at-site trends and spatial patterns using several fixed study periods. We also subset this dataset according to membership of the Benchmark network and membership of standard hydrometric regions. We then examine sensitivity to the study period by using a 'multitemporal' analysis that quantifies trends between all possible start and end years in a record. We apply this to the regional groupings of stations and to a selection of very long (>70 years) hydrometric records to provide context for the recent, fixed study periods. The following sections detail this process.

Station selection criteria
To understand the long-term changes in UK flooding, the primary dataset used is the NRFA Peak Flow Dataset Version 8,  Stations were accepted for each period if at least 27 valid AMAX were available and 10% or less of AMAX values were missing during that period.
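The two acceptance criteria stated above (at least 27 valid AMAX, and no more than 10% of AMAX values missing within the period) can be sketched as a simple filter. This is a minimal illustration only, not the authors' code: the function name `accept_station` and the year-to-value dictionary layout are hypothetical, and a missing year is assumed to be represented by `None` or by an absent key.

```python
from typing import Mapping, Optional

MIN_VALID_AMAX = 27          # minimum number of valid annual maxima (from the text)
MAX_MISSING_FRACTION = 0.10  # at most 10% of years may be missing (from the text)


def accept_station(amax_by_year: Mapping[int, Optional[float]],
                   start_year: int, end_year: int) -> bool:
    """Return True if a station meets both criteria for the given period.

    `amax_by_year` maps year -> AMAX value; None or an absent key means the
    value is missing or rejected (a hypothetical data layout).
    """
    years = range(start_year, end_year + 1)
    n_years = len(years)
    # Count the years in the period that have a usable annual maximum.
    n_valid = sum(1 for y in years if amax_by_year.get(y) is not None)
    missing_fraction = (n_years - n_valid) / n_years
    return n_valid >= MIN_VALID_AMAX and missing_fraction <= MAX_MISSING_FRACTION
```

Note that the two conditions interact: for a 30-year period the 27-value minimum is the binding constraint, whereas for a 40-year period the 10% limit is stricter.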
The resulting dataset gives good spatial coverage across the UK, although it is important to note the sparser coverage in Scotland (especially in the west) which simply reflects the currently available Peak Flows network in these areas.
The dataset was stratified in three ways to allow analyses Then, within each region, all records with at least 27 years of valid AMAX data and suitability for either pooling or QMED estimation were accepted. Records with more than 10% missing AMAX values were permitted to maximise the number of AMAX available to contribute to each regional representative year. For each year, the regional AMAX was the median value of all standardised AMAX accepted for that year.
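The construction of a regional representative series (the median, for each year, of all standardised AMAX accepted for that year) can be sketched as follows. The standardisation method is not spelled out in this passage, so the sketch assumes each station's AMAX are divided by that station's median AMAX; that choice, and the function names, are assumptions made for illustration only.

```python
from statistics import median
from typing import Dict, List, Mapping


def standardise(amax_by_year: Mapping[int, float]) -> Dict[int, float]:
    """Scale one station's AMAX by its own median (an assumed standardisation)."""
    station_median = median(amax_by_year.values())
    return {year: value / station_median for year, value in amax_by_year.items()}


def regional_median_series(
        stations: Mapping[str, Mapping[int, float]]) -> Dict[int, float]:
    """For each year, take the median of all standardised AMAX available."""
    values_by_year: Dict[int, List[float]] = {}
    for series in stations.values():
        for year, value in standardise(series).items():
            values_by_year.setdefault(year, []).append(value)
    # Years with a single contributing station simply return that value.
    return {year: median(vals) for year, vals in sorted(values_by_year.items())}
```

Because each year uses whatever stations are available, records with gaps still contribute, which is consistent with the relaxed missing-data rule for the regional series.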
UKBN2 stations are classed as near-natural and with generally good quality data and, thus, are appropriate for identifying climate-driven hydrological trends. However, given the difficulties of finding stations of good quality across the full flow range, the network is stratified into several categories depending on hydrometric performance and artificial influences on the flood and low-flow regimes (Harrigan et al. a). Thus, stations that received a score of 2 (suitable) or 1 (caution) for high flows were included.
The latter are more likely to be subject to some degree of disturbance, but typically the degree of influence is not well known. There are 16 'caution' sites (compared to 98 'suitable'), and they were included to ensure good geographical coverage.

Trend analysis
The method for trends analysis was the rigorous, standar- This is unlikely to be a major issue in our study given the low number of positive serial correlation tests.

Decadal-scale variability and multitemporal analysis
As is widely noted in the literature ( In the multitemporal approach, trends are calculated for all AMAX series for all possible start and end years (with a minimum period length of 27 years) for a total of 231,245 periods. These individual station-by-station multitemporal analyses are not reported in this paper given the sheer amount of data, but can be explored in Griffin et al. (2020). For brevity in this paper, we show multitemporal analyses for each regional median series. Matrix plots are produced, showing start years along the x-axis and end years along the y-axis, where each cell corresponds to a single trend result, coloured according to the MK Z statistic.
As with the LOESS plots, given the wide range of start dates, the multitemporal analyses were started in 1961.
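The multitemporal idea can be sketched in a few lines: compute the Mann-Kendall (MK) Z statistic for every start and end year combination that spans at least 27 years. This is a hedged illustration rather than the authors' implementation: it uses the textbook MK statistic with tie and continuity corrections, applies no adjustment for serial correlation, and measures period length in calendar years (an assumption). The function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Mapping, Sequence, Tuple

MIN_LENGTH = 27  # minimum period length in years (from the text)


def mann_kendall_z(values: Sequence[float]) -> float:
    """Standard Mann-Kendall Z with tie and continuity corrections."""
    n = len(values)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)  # sign of the pairwise difference
    var_s = n * (n - 1) * (2 * n + 5)
    for t in Counter(values).values():   # tie correction; groups of 1 add 0
        var_s -= t * (t - 1) * (2 * t + 5)
    var_s /= 18.0
    if s == 0 or var_s == 0:
        return 0.0
    return (s - 1) / math.sqrt(var_s) if s > 0 else (s + 1) / math.sqrt(var_s)


def multitemporal_z(series: Mapping[int, float],
                    min_length: int = MIN_LENGTH) -> Dict[Tuple[int, int], float]:
    """MK Z for every (start, end) pair spanning at least `min_length` years."""
    years = sorted(series)
    results: Dict[Tuple[int, int], float] = {}
    for start in years:
        for end in years:
            if end - start + 1 < min_length:
                continue
            window = [series[y] for y in years if start <= y <= end]
            results[(start, end)] = mann_kendall_z(window)
    return results
```

Each key of the returned dictionary corresponds to one cell of the matrix plot, with the start year on the x-axis and the end year on the y-axis.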

Long hydrometric records
The multitemporal approach is even better suited to longer series, to understand how representative the post-1960 and post-1970 periods typically used in trend analysis are of much longer-term variability. As noted above, multitemporal analyses by station for the full period of record are available for all sites used in this study, using these same graphics (see Griffin et al. 2020).
To examine changes over a much longer period, nine of the longest available NRFA Peak Flow records were selected, with approximately one per region selected to give good spatial coverage (bearing in mind the low number of available sites with pre-1960 start dates in the dataset). These were selected by comparing the longest records in each region and appraising them for long-term consistency and quality, while still maintaining as long a record as possible. The selection is presented in Table 2.
In some cases, records extend back many decades (generally to the 1920s), but the longest available records from western Scotland began in 1955; in these cases, little is added to the regional-scale multitemporal analysis, but they are included for completeness. For all plots used in this paper, only the post-1920 period is shown for presentation purposes, to avoid plots being dominated by whitespace, even though the full Thames record extends to 1882 and the Wye to 1908.
For the Dee and the Clyde, no AMAX data were available in the NRFA Peak Flows series from 2005 onwards, as they have yet to be updated. For these sites, AMAX were extracted from a separate source, the NRFA Highest

The BFI is a measure of the proportion of the river runoff that derives from stored sources; the more permeable the rock, superficial deposits and soils in a catchment, the higher the baseflow and the more sustained the river's flow during periods of dry weather (https://nrfa.ceh.ac.uk/derived-flow-statistics).
b The average annual rainfall over the catchment for 1961-1990. This statistic is derived from the SAAR map for 1961-1990, a 1-km grid based on data from the Met Office, rather than a catchment rainfall series (https://nrfa.ceh.ac.uk/rainfall-statistics).

Results per station
Of the full set of 753 stations, 587 met the criteria for the The relative slope of the Theil-Sen function ranged from −72 to +117% for the short periods and −60 to +99% for the long periods. TSA always followed the same sign as the MKZ score for each station, except in cases where one or the other was zero, and tended to be greater where the MKZ score was greater, with some exceptions where shallower relative slopes could be considered more significant than steeper relative slopes. The similarity between the sign of MKZ score and the sign of TSA meant that clusters of negative and positive TSA followed the same spatial patterns as negative and positive MKZ scores.
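For readers unfamiliar with the Theil-Sen approach (TSA), the sketch below computes the slope as the median of all pairwise slopes and then expresses it as a percentage change. The exact definition of the relative slope is not given in this passage, so the sketch assumes it is the total change over the elapsed period divided by the series mean; that definition and the function names are assumptions for illustration only.

```python
from statistics import mean, median
from typing import Sequence


def theil_sen_slope(years: Sequence[int], values: Sequence[float]) -> float:
    """Median of the slopes between all pairs of points (years must be distinct)."""
    n = len(years)
    slopes = [
        (values[j] - values[i]) / (years[j] - years[i])
        for i in range(n - 1)
        for j in range(i + 1, n)
    ]
    return median(slopes)


def relative_change_percent(years: Sequence[int], values: Sequence[float]) -> float:
    """Slope times elapsed years, as a percentage of the mean (assumed definition)."""
    slope = theil_sen_slope(years, values)
    elapsed = max(years) - min(years)
    return 100.0 * slope * elapsed / mean(values)
```

Because both the pairwise slopes and the Mann-Kendall statistic are built from the signs of pairwise differences, their signs generally agree, which is consistent with the agreement between TSA and MKZ reported above.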
In addition to the fixed short and long periods, analyses were also performed for an arbitrary 'full' period-of-record.
While this means the at-site results are less comparable in space, it does give a view of trends over the whole available period (as would be used by many if not most practitioners).
For the full period, MKZs ranged from −2.9 to +5.9, and TSA ranged from −85% to +107%. In total, 66.9% of trends were positive (15.9 and 22.2% at the 5 and 10% sig- There are relatively few gauged catchments in the UK,

Results per region

Selected long records
The DCV in the long hydrometric records is shown in Figure 5. These plots confirm that the DCV that influences trends in the regional series post-1960 is also prevalent in the earlier decades. Such plots clearly show evidence of

DISCUSSION
The results of the present study accord with previous research, reaffirming generally positive trends as being the main outcomes for large areas of the UK, especially northern and western regions. In this sense, this study builds on the headline conclusions of the review of Hannaford () and agrees with subsequent work on more updated flood records, as cited in the introduction to this paper. We also find that, in general terms, these positive trends are mostly resilient to changes in the study approach. In particular, a broadly similar message emerges from the full series of more 'noisy' anthropogenically influenced stations compared to those in the UKBN. Furthermore, changes to the time period of analysis (especially changes to the start date of analysis, up to the most recently available data) do not materially change the finding that there is more compelling evidence for an increase in flood magnitude in the UK than for a decrease or no change. These headline findings add to a growing evidence base that suggests that traditional flood frequency analysis approaches, which assume stationarity, may be called into question (e.g. Faulkner et al. a, b).
However, beneath this headline message, this study has examined trend responses for over 700 individual catchments and has examined sensitivity to time window by computing trends for over 200,000 possible start and end dates. Unsurprisingly, it reveals a much more complex picture of spatial and temporal variability in flood magnitude.
First, the national and regional picture is more nuanced than the 'increasing in north and west' headline. For the long fixed period, which represents the best trade-off between study period length and spatial coverage, there is generally an increasing trend across northern and western  In review) or even global (Svensson & Hannaford ) scale. More work is needed to understand the atmosphere-ocean mechanisms that drive flood variability in the UK, on a range of timescales, to support such enhanced flood-risk estimation approaches.

CONCLUDING REMARKS
This paper has provided an up-to-date assessment of flood trends at the national scale. Our results are comparable with previous studies, but we demonstrate the resilience of these findings to important methodological considerations.
However, we also show significant granularity in the regional and national picture and sensitivity to chosen study periods.
To this end, we add considerable value for flood practitioners who must balance local-scale information with this wider national picture. Given the variation in trend responses, we recommend that trend analysis should be undertaken in catchments of interest as part of flood frequency estimation studies. We provide the outputs of this study in an accessible format and in an interactive tool that allows closer appraisal (Griffin ). However, significant obstacles to application remain, not least around the perennial question of attribution of observed changes.