Abstract
Observational trend analysis is fundamental for tracking emerging changes in river flows and placing extreme events in their longer-term historical context, particularly as climate change is expected to intensify the hydrological cycle. However, human disturbance within catchments can introduce artificial changes and confound any underlying climate-driven signal. The UK Benchmark Network (UKBN), designated in the early 2000s, comprised a subset of National River Flow Archive (NRFA) stations that were considered near-natural and thus appropriate for identification and interpretation of climate-driven hydrological trends. Here, the original network was reviewed and updated, resulting in the UKBN2 dataset consisting of 146 near-natural catchments. Additionally, the UKBN2 provides user guidance on the suitability of each station for the assessment of low, medium, and high flows. A trend analysis was performed on the updated UKBN2 dataset and results show that while the strength and direction of changes are dependent on the period of record selected, previously detected patterns of river flow change in the UK remain robust for longer periods (>50 years), despite the recent prevalence of extremes. Such a quality assured observational dataset will provide a foundation for future scientific efforts to better understand the changing nature of the hydrological cycle.
INTRODUCTION
It is expected that anthropogenic climate change will intensify the global hydrological cycle as the world continues to warm (IPCC 2013), thereby increasing the frequency and severity of extremes such as floods (Hirabayashi et al. 2013) and droughts (Prudhomme et al. 2014), although strong regional variability and uncertainties in projections exist (Arnell & Gosling 2016). The recent UK Climate Change Risk Assessment report (ASC 2016) identified both increased flooding and water scarcity among the UK's most important climate change risks. The notable hydrological volatility experienced in the early decades of the 21st century (Hannaford 2015; for a fuller description of these episodes see the National River Flow Archive (NRFA) website: http://nrfa.ceh.ac.uk/occasional-reports) has exposed the UK's vulnerability to hydrological extremes and thus there is a clear scientific and socio-economic need to understand the changing nature of these extremes.
There is a growing body of work using large ensemble modelling approaches suggesting extreme hydrological events can be attributable, in part, to a direct human influence on climate (e.g., Schaller et al. 2016). Given inherent uncertainties introduced through the climate-hydrology modelling chain, and the complex and still poorly understood role of catchments in modifying climate signals, observations remain the foundation for any scientific understanding of climate change impacts on river flows, particularly when justifying costly adaptation plans. However, there are also many challenges involved with detection of a robust long-term climate change signal within observed river flow time-series (Hannaford 2015). For example, the climate change signal is much weaker than background natural decadal climate variability (DCV), especially in ocean-influenced mid-latitude regions such as the UK (Wilby 2006). Additionally, artificial disturbances within catchments (such as urbanisation, deforestation, dam construction, and river engineering) have been shown to substantially alter flow regimes (e.g., Vorogushyn & Merz 2013; Harrigan et al. 2014; Prosdocimi et al. 2015), thus confounding trend detection and attribution. Disentangling the many interacting drivers of change in river flows is a major research challenge, but a first step is using river flow data that are sensitive to climate-driven changes.
Reference Hydrologic Networks (RHNs) provide such fit-for-purpose data as only catchments that can be considered ‘near-natural’ with long and good quality flow records are included. Whitfield et al. (2012) and Burn et al. (2012) review the development and status of national RHNs with the UK Benchmark Network (UKBN) being one of the most established of those contributing to the global RHN effort. The UKBN comprises a subset of gauging stations within the national hydrometric network that is most suited for identification and interpretation of long-term climate-driven hydrological variability and change. The UKBN is of fundamental importance in this regard, given the high population density and long history of settlement and water exploitation in the UK compared to many other countries; human influences on river flow regimes are pervasive, and in many catchments changes in long-term runoff patterns bear little relation to climate variability (Hannaford & Marsh 2006). Benchmark catchments can be considered reasonably free from human disturbances such as urbanisation, river engineering, and water abstractions, and hence can be used for detection of climate-driven changes in river flow. The first iteration of the UKBN, henceforth UKBN1, was designated 15 years ago by Bradford & Marsh (2003) and included 122 catchments that met four primary criteria: (i) relatively natural flow regimes, (ii) good and consistent hydrometric data quality, (iii) relatively long records (ideally > 25 years) and (iv) were representative of UK hydroclimatic conditions with good geographical coverage. The core aim of the UKBN is to strengthen national capability to identify and quantify long-term trends and variability in runoff patterns and hydrological extremes. 
As well as being valuable for the national and international research community, this information is potentially useful for a wide range of practical applications including strategic water resources planning, environmental regulation, flood risk and engineering design, and climate change adaptation planning.
Since the designation of the network, it has been used extensively in trend studies on changes in UK runoff, low flows and droughts, high flows and floods, and seasonal flows (see the review of Hannaford (2015) and references therein). Although designated primarily to support hydrological change detection, RHNs are appropriate for a wide range of applications that require near-natural flow regimes – particularly understanding the climatic processes (Lavers et al. 2010, 2015) and catchment properties (Chiverton et al. 2015) influencing river flows. The network has also been used for quantifying trends in other variables such as water temperature (Orr et al. 2015). The network has fed into various international initiatives, including efforts to quantify river flow trends on the island of Ireland (Murphy et al. 2013), European (Stahl et al. 2010) and, more recently, intercontinental (Hodgkins et al. 2017) scales. UK and European studies using the network have also been cited in global assessments (e.g., IPCC 2013).
Given that 15 years have passed since the original UKBN, there is a clear need to review the stations within the network to ensure they still meet benchmark status and re-evaluate stations that were initially excluded from UKBN1 due to short record lengths. Additionally, Bradford & Marsh (2003) acknowledged in the original benchmark designation that compromises had to be made to ensure good spatial coverage, as relatively few gauged catchments in the UK have near-natural flow regimes and, of these, even fewer are gauged by stations with the ability to measure the full range of flows accurately. This is particularly an issue in the densely populated areas of central, southern, and eastern England. Recognising this challenge, there is an opportunity for a more explicit classification of benchmark status of individual catchments according to suitability for low, medium, and high flows.
Furthermore, many of the previous UK trend studies reviewed within Hannaford (2015) use data ending in the early- to mid-2000s and so it is not known whether the previous reported patterns of change persist when more up-to-date flow data are used, especially given the prevalence of recent hydrological extremes. With the above in mind, the aim of this paper is two-fold:
Review and update the original UKBN of river flow stations, including a new classification of benchmark status at low, medium, and high flows.
Perform a standardised trend analysis on the updated UKBN dataset.
Accordingly, the Designation of the UKBN2 dataset section outlines the review process behind the designation of the updated Benchmark Network. The Trend analysis methods section describes the hydrological indicators, catchments, and trend tests used. The trend analysis is presented in the Results section and interpreted in the Discussion section before suggestions for further research are offered in the Concluding remarks section.
DESIGNATION OF THE UKBN2 DATASET
UKBN review process
Stations within the first iteration of the UKBN were selected from the national hydrometric network following a project beginning in the late-1990s. The benchmark review process involved evaluation of detailed station metadata, inspection of hydrographs, and consultation with those responsible for maintenance and collection of hydrometric data who have vital local knowledge of individual site conditions. The designation of the updated UKBN reported here, henceforth UKBN2, used the original benchmark criteria and UKBN1 stations of Bradford & Marsh (2003) as its foundation but employed a more extensive range of metadata now available on the NRFA, exploiting a number of developments over the last 15 years, including: more detailed station metadata, incorporation of the NRFA Peak Flow database (http://nrfa.ceh.ac.uk/peak-flow-data) which includes gauging station rating curves, improved knowledge of artificial influences (AIs) such as water abstractions and discharges, and increased NRFA spatial and statistical analysis capabilities (Dixon et al. 2013). Given the extent and diversity of both qualitative and quantitative information, as well as the need to exercise expert judgement in many places, a completely objective application of the benchmark criteria was not possible, as was also the case for the original UKBN1 designation. Nevertheless, the decision-making process was supported by a systematic framework underpinned by key evidence sources (Table 1). In step one of the appraisal, stations contained within UKBN1 were allocated one of three initial categories: endorse, review, or omit based on the review process shown in Table 1(A–E). An additional 54 stations that potentially met benchmark criteria were also considered, as they now have a record length >25 years. Of the original 122 stations, 67 were endorsed, 48 required further review, and seven were omitted, resulting in a total of 176 (including the 54 ‘Candidate’ stations) considered in the overall UKBN2 appraisal.
Review process/Source of information | Details | How it was applied to benchmark criteria |
---|---|---|
A General NRFA[a] station metadata[b] | Station description (gauge type, changes in gauging methods/structure over time); Hydrometric description (indicative hydrometric quality of gauge and indication of issues at extreme flow ranges, e.g., high flows bypassing gauging structure); Flow record description (particular measurement issues over time); Flow regime description (highlight AIs that affect runoff); Site photographs (assessment of site conditions, often during past extreme events) | Station failed/given caution if evidence of serious impacts from AIs/performance issues, or, query raised with MAs for further assessment/information |
B NRFA Peak Flow[c] data and metadata | Over 85% of UKBN2 stations are also peak flow stations so have access to rating curves and gauging schedule to assess high flow hydrometric performance/issues; AMAX and POT hydrographs were assessed | Station failed/given caution if site had too few gaugings/too much scatter in gaugings at high flow range, or, query raised with MAs for further assessment/information |
C Hydrometric Data Quality (HDQ) scores | Hannaford et al. (2013b) created a Hydrometric Data Quality (HDQ) score for catchments in England and Wales based on Lamb et al. (2003) Gauging Station Data Quality classifications (GSDQs). These metrics reflect hydrometric performance and data quality including modelled impact of AIs on low flows (i.e., impact of known abstractions, discharges and impoundments at Q95[d]) | Station failed/given caution if substantial evidence of AIs and clear impact on low flow regime, or, query raised with MAs for further assessment/information |
D Visual assessment of GDF and peak flow hydrographs | Assessment of GDFs and peak flows for evidence of hydrometric issues (e.g., high flow truncation, artificial patterns during low flows, effect of urbanisation on flashiness, and temporal homogeneity issues) | Station failed/given caution if non-natural flow response or clear temporal homogeneity issues, if supported by metadata, or, query raised with MAs for further assessment/information |
E Quantitative assessment of GDFs and peak flow time-series | Statistical tests for screening evidence of gradual (using the Mann–Kendall test and the Theil–Sen approach) and abrupt changes (using the Pettitt test) in river flow time-series for low (minimum flow and Q95), medium (Q50[d] and AMF), and high flows (Q05[d] and maximum flow) | Station failed/given caution if non-natural flow response or clear temporal homogeneity issues, if supported by metadata, or, query raised with MAs for further assessment/information |
F Expert consultation with MAs | Query sheet compiled for each MA region based on questions and issues identified in A–E | Station failed/given caution if query confirmed by MA or new information brought to light during this process |
G Synthesis: Identification of benchmark score and benchmark qualifier | Finally, information from A–F was collated along with maps on catchment representativeness/spatial coverage and reviewed together by the project team exercising expert judgement to arrive at the final selection of 146 stations. A benchmark score was given to low, medium, and high flow ranges for each station along with a brief benchmark qualifier to explain why not-suitable or caution flags were warranted[e] | Balanced most natural, best quality records at extremes, longest record length, and hydrological representativeness and spatial coverage. Application of criteria was stricter in regions with many stations and necessarily relaxed, within reason, for regions where few stations met criteria |
[d] Qn is the flow equalled or exceeded n% of the time.
[e] Information available for each station within the UKBN2 station list file (http://nrfa.ceh.ac.uk/benchmark-network).
AIs, artificial influences; AMAX, 15-minute annual maximum flows; AMF, annual mean flow; GDFs, gauged daily (mean) flows; MAs, Measuring Authorities; NRFA, National River Flow Archive; POT, 15-minute peaks-over-threshold flows.
It was apparent in the early stages of the review that compromises were needed in particular regions to achieve an adequate density of benchmark catchments. This primarily reflects both the ubiquitous nature of AIs on flow regimes, and the inherent difficulties of hydrometric measurement in the extreme flow ranges at many UK gauging stations; very few gauging stations can be considered truly ‘full range’ (Marsh 2002). For example, at low flows, hydrometric uncertainty arises due to insensitivity of measuring structures, or wide scatter in spot flow measurements (gaugings) used to derive rating curves, e.g., due to summer weed growth. Low flows are also the most heavily impacted by substantial surface and/or groundwater abstractions within the catchment. For high flows, common issues include unmeasured bypass flow and non-modularity (drowning) at gauging structures (Herschy 2008), or simply an insufficiency of gaugings to accurately define the high flow rating curve.
Given these challenges, the original aspiration (Bradford & Marsh 2003) of full-range benchmark catchments was a major constraint on the network. Recognising this limitation, and the often different uses and user communities for low flow and high flow assessments (e.g., Hannaford et al. 2013b), the UKBN2 model advocates a classification system that allows ‘sub-networks’ to be defined. To facilitate this, and help the user community assess the utility of individual benchmark station records in the presence of these hydrometric challenges, their suitability for analysis at low, medium, and high flow was evaluated.
Any evaluation of the ability of a station to effectively measure extreme flows requires local knowledge of site and catchment conditions. Step two in the benchmark review process (Table 1F) engaged personnel within each of the four UK Measuring Authorities (Environment Agency for England, Natural Resources Wales, the Scottish Environment Protection Agency, and the Rivers Agency for Northern Ireland). A query questionnaire was compiled for each station within the ‘review’ or ‘candidate’ benchmark categories in step one that required deeper expert local knowledge on a site's capability of capturing low and/or high flows.
Bringing knowledge together from steps one and two, the final step (Table 1G) assigned each station a benchmark score based on suitability for analysis of low, medium, and high flows (2 = suitable, 1 = caution, and 0 = not-suitable). Thus a station scoring the maximum of 6 is suitable for use across the full flow regime. Where a station scores 1 or 0 for a category, a brief benchmark qualifier is provided to help end users understand why the time-series might not be suitable for analysis or requires caution, for example, where water abstractions, poor high flow performance/bypassing, or artificial regulation of flows from hydroelectric power schemes were particularly prevalent.
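The scoring scheme lends itself to a simple data structure. The following is a minimal sketch, not the NRFA's own tooling: the class name `BenchmarkScore`, the helper `sub_network`, and the station identifiers are hypothetical, and only the 2/1/0 coding and the maximum total of 6 come from the text above.

```python
from dataclasses import dataclass

# Score coding taken from the text: 2 = suitable, 1 = caution, 0 = not-suitable.
SUITABLE, CAUTION, NOT_SUITABLE = 2, 1, 0


@dataclass(frozen=True)
class BenchmarkScore:
    """Benchmark scores for one station (hypothetical container)."""
    station_id: str  # illustrative identifier, not a real NRFA station number
    low: int
    medium: int
    high: int

    @property
    def total(self) -> int:
        return self.low + self.medium + self.high

    @property
    def full_range(self) -> bool:
        # A total of 6 means the station is suitable in all three flow ranges.
        return self.total == 3 * SUITABLE


def sub_network(stations, flow_range, allow_caution=True):
    """Return station ids usable for one flow range ('low', 'medium' or 'high')."""
    threshold = CAUTION if allow_caution else SUITABLE
    return [s.station_id for s in stations if getattr(s, flow_range) >= threshold]


# Made-up scores purely for illustration.
stations = [
    BenchmarkScore("A", low=2, medium=2, high=2),
    BenchmarkScore("B", low=1, medium=2, high=0),
    BenchmarkScore("C", low=0, medium=2, high=2),
]
print(sub_network(stations, "low"))                        # suitable or caution
print(sub_network(stations, "high", allow_caution=False))  # suitable only
```

Filtering by range in this way mirrors the idea of user-defined sub-networks: a low flow study and a flood study can draw on different station lists from the same dataset.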
The new UKBN2 dataset
The UKBN2 appraisal identified 146 of the 176 stations under review as qualifying for benchmark status (Figure 1 and Table 2). Of these, 80 are considered benchmark across the full flow regime. However, these full range benchmark stations are distributed mainly in the less densely populated western and upland areas of the UK (Figure 1), leaving some strategically important network gaps in central, southern, and eastern England, mainly due to the larger impact of water abstractions and discharges. Nevertheless, adequate spatial coverage is maintained when using the 132 stations classified as suitable or caution for low flows and the 133 stations classified as suitable or caution for high flows. A primary objective of the UKBN2 benchmark scores and benchmark qualifiers is to guide users to the most appropriate sub-network of stations that meet their specific study needs, while highlighting where due care must be exercised when interpreting results from stations flagged as caution or not-suitable.
Benchmark score | Low flow | Medium flow | High flow | Full flow regime |
---|---|---|---|---|
2 (Suitable) | 112 | 141 | 110 | 80 |
1 (Caution) | 20 | 5 | 23 | – |
0 (Not-suitable) | 14 | 0 | 13 | – |
The UKBN2 catchments are mainly relatively small headwater catchments with a median area of 100 km² (ranging from 3 to 1,500 km²) and median altitude of 182 m a.s.l. (ranging from 20 to 650 m a.s.l.). Over 92% of the catchments can be considered ‘essentially rural’ in terms of the Flood Estimation Handbook (FEH) degree of urbanisation criteria (i.e., <2.5% of catchment area urbanised (Institute of Hydrology 1999)). The number of UKBN2 stations active in each year is shown in Figure 2. The majority of stations were opened during the 1960s and 1970s, with only four stations having data before 1950. The mean record length is 46 years, with a minimum of 21 years and maximum of 85 years. Gauged daily mean flow records have high completeness, with a mean of 1.4% of values missing. However, five stations have records with >10% missing (the highest has 30% missing because the station was not operational for 12 years); these stations have strategic value, so the decision whether to exclude them from an analysis will depend on the context. Ten catchments have records with some degree of ephemeral behaviour (presence of zero flows) and a further ten are nested within a larger parent catchment (Figure 1). While these catchments are not appropriate for some applications, there is merit in including them in the network as some users may be particularly interested in the differential responses of headwater to lower catchment locations.
Users are directed to the UKBN section of the NRFA website: http://nrfa.ceh.ac.uk/benchmark-network for the vUKBN2.0 station list (includes basic metadata as well as benchmark scores and benchmark qualifiers for each station), additional user guidance, instructions for downloading the UKBN2 dataset, and for tracking future updates to the network. A version control system has been implemented to ensure reproducibility of subsequent analyses through time and we envisage that on each major update, a routine trend analysis using the methodology outlined below will be undertaken.
TREND ANALYSIS METHODS
The second aim of this paper is to develop a standardised trend analysis procedure to apply routinely to the Benchmark Network, based on established methods within the hydroclimatic literature, with a first application on the newly designated UKBN2 dataset. Various trend assessment methods have been applied to UKBN1 previously. Here, we set out the following as a rigorous, standardised approach focusing on three components aimed at understanding spatio-temporal changes in river flow:
Trend analysis using two fixed periods (short and long) to identify the spatial nature of changes in river flows.
Assessment of temporal variability of changes in light of the known influence of DCV.
Investigation of persistence of trends for the full available time-series.
Hydrological indicators and catchment selection
For each year, a set of 12 hydrological indicators used in Hannaford & Buys (2012) were extracted from gauged daily flow data (last retrieved from the NRFA on 2nd February 2017) covering the full flow regime:
annual low flow: Q95, Q90;
annual medium flow: Q70, Q50 (median), annual mean flow (AMF), Q30;
annual high flow: Q10, Q05;
seasonal mean flow: winter (DJF), spring (MAM), summer (JJA), autumn (SON).
Qn is the flow threshold exceeded n% of the time in each year. We acknowledge that Q95 (and Q90)/Q10 (and Q05) do not necessarily characterise drought/flood events but are, nonetheless, useful indicators for assessing the tendency for changing extremes based on daily flow data. Applications of trend analysis to peak flow (e.g., AMAX, POT) and drought indicators (e.g., the Standardised Streamflow Index) are different cases with particular requirements (i.e., censored data, high numbers of zeros, high spatial and temporal persistence). These are already the focus of other initiatives and so are not considered part of the standard benchmark trend testing methodology advocated here.
Indices were computed on time-series for the full period of record for each of the 146 UKBN2 gauges. Missing data can lead to spurious values of indicators and hence misleading trends. While gap filling is desirable (Harvey et al. 2012) and is part of the NRFA quality control process (Dixon et al. 2013), in practice, it has not been extensively carried out for historic time-series. Missing data were handled by applying a strict rule that less than 10% of data could be missing in any year or season for a flow index to be returned, otherwise the particular year/season was given a missing value flag.
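As an illustration of how an annual indicator and the missing-data rule fit together, here is a minimal sketch in plain Python. The function names are hypothetical, and linear interpolation between ranked values is an assumption, since the text does not state how percentiles were interpolated. Qn, the flow exceeded n% of the time, corresponds to the (100 − n)th percentile of the daily flows in a year.

```python
import math


def flow_quantile(flows, n):
    """Qn: the flow exceeded n% of the time, i.e. the (100 - n)th percentile.

    Linear interpolation between ranked values is assumed here.
    """
    values = sorted(flows)
    pos = (100 - n) * (len(values) - 1) / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    return values[lo] + (values[hi] - values[lo]) * (pos - lo)


def annual_indicator(daily_flows, n, max_missing_percent=10):
    """Return Qn for one year of daily flows, with None marking a missing day.

    Following the rule in the text, a value is returned only if LESS than
    10% of days are missing; otherwise the year is flagged missing (None).
    """
    valid = [q for q in daily_flows if q is not None]
    missing = len(daily_flows) - len(valid)
    # Integer comparison avoids floating-point edge cases at exactly 10%.
    if missing * 100 >= max_missing_percent * len(daily_flows):
        return None
    return flow_quantile(valid, n)


# Synthetic year of 101 'daily' flows with values 1..101 (illustrative only).
flows = list(range(1, 102))
print(annual_indicator(flows, 95))  # low flow indicator
print(annual_indicator(flows, 5))   # high flow indicator
```

The same function could be applied to a season's daily flows, in which case the 10% threshold applies to the days in that season.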
It is not appropriate to analyse all stations for all indices given some stations are flagged as not-suitable for analysis at particular flow ranges. For stations given a benchmark score of 0 (not-suitable) for a range, the indicator was excluded from analysis in the remainder of the paper (i.e., no high flow indicators were calculated for stations with a high flows score of 0). In addition to the full period of record analysis, two set periods (short and long) were chosen, optimising spatio-temporal distribution of stations, for a relative comparison. A 30-year (short) period was selected from calendar years 1985–2014 and a 50-year (long) period from calendar years 1965–2014. As missing values can affect the resulting trends in various ways depending on the extent of missing values and position within a series (Slater & Villarini 2017), a further missing data criterion was applied at this stage whereby a maximum of 10% of indicator values could be missing in either fixed period (i.e., five (three) years for the long (short) period). To allow for as many stations as possible to be included in both periods, particularly in the long period where data are more sparse, stations with start years within two years of the target 1965 and 1985 start years, and/or within one year of the target 2014 end year, were also included in the analysis. However, the combined number of missing values in a series and number of years from relaxed start and end years did not exceed a total 10% limit of the respective long and short period length.
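The fixed-period inclusion rule can be expressed as a short check. This is a sketch under stated assumptions rather than the procedure actually coded by the authors: the function name and the dictionary input are hypothetical, and years lost to a late start or early end are simply counted together with missing indicator values against the 10% limit.

```python
def eligible_for_period(values_by_year, target_start, target_end,
                        start_tolerance=2, end_tolerance=1, max_percent=10):
    """Check whether an annual indicator series qualifies for a fixed period.

    values_by_year maps calendar year -> indicator value, with None (or an
    absent year) meaning no usable value.
    """
    period = range(target_start, target_end + 1)
    valid_years = [y for y in period if values_by_year.get(y) is not None]
    if not valid_years:
        return False
    late_start = valid_years[0] - target_start
    early_end = target_end - valid_years[-1]
    if late_start > start_tolerance or early_end > end_tolerance:
        return False
    # Gaps within the record plus years lost to the relaxed start/end years.
    unusable = len(period) - len(valid_years)
    return unusable * 100 <= max_percent * len(period)


# Illustrative series: starts two years late and has three gaps, giving
# five unusable years out of 50, which is exactly the 10% limit.
series = {year: 1.0 for year in range(1967, 2015)}
for gap_year in (1980, 1990, 2000):
    series[gap_year] = None
print(eligible_for_period(series, 1965, 2014))
```

With a 50-year target period the limit works out at five years, and with a 30-year period at three, matching the figures quoted above.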
Finally, to avoid potentially ‘double counting’ statistically significant trends, only non-nested catchments were used. In cases where the larger parent catchment is flagged as not-suitable for analysis of either high or low flows, but the nested catchment was, the nested catchment was used instead. Application of the above criteria resulted in 116 (short period) and 42 (long period) stations for low flow, 125 (short period) and 46 (long period) for medium flow, and 113 (short period) and 43 (long period) for high flow trend analysis.
Trend analysis tests
Evidence for monotonic trends was assessed using the Mann–Kendall (MK) test (Mann 1945; Kendall 1975), a non-parametric rank-based method that is widely applied in analyses of streamflow (e.g., Hannaford & Marsh 2008; Villarini et al. 2011; Murphy et al. 2013). The standardised MK statistic (MKZs) follows the standard normal distribution with a mean of zero and variance of one. A positive (negative) value of MKZs indicates an increasing (decreasing) trend. Statistical significance was evaluated with probability of Type 1 error set at the 5% significance level. A two-tailed MK test was chosen, hence the null hypothesis of no trend (increasing or decreasing) is rejected when |MKZs| >1.96 using traditional statistical testing.
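For readers unfamiliar with the test, the standardised statistic can be computed as in the sketch below. This is a textbook formulation of the MK test with the usual tie and continuity corrections, written in plain Python for illustration; it is not the code used in this study, and it assumes a complete series with no missing values.

```python
import math
from collections import Counter


def mann_kendall_z(series):
    """Standardised Mann-Kendall statistic for a complete series of values."""
    n = len(series)
    # S counts concordant minus discordant pairs over all i < j.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S, reduced for groups of tied values.
    tie_sizes = Counter(series).values()
    var_s = (n * (n - 1) * (2 * n + 5)
             - sum(t * (t - 1) * (2 * t + 5) for t in tie_sizes)) / 18
    # Continuity correction: shift S one unit towards zero.
    if s > 0:
        return (s - 1) / math.sqrt(var_s)
    if s < 0:
        return (s + 1) / math.sqrt(var_s)
    return 0.0


# A strictly increasing ten-value series: S = 45 and Var(S) = 125.
z = mann_kendall_z(list(range(1, 11)))
print(z, abs(z) > 1.96)  # significant at the 5% level, two-tailed
```

The nested loop is quadratic in the series length, which is unproblematic for annual series of a few decades.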
The MK test requires data to be independent (i.e., free from serial correlation or temporal autocorrelation) as positive serial correlation increases the likelihood of Type 1 errors or incorrect rejection of a true null hypothesis (Kulkarni & von Storch 1995). All indicators were checked for positive lag-1 serial correlation at the 5% level using the autocorrelation function (ACF) on detrended series. The linear trend used to detrend the original time-series was estimated using the robust Theil–Sen approach (TSA) (Theil 1950; Sen 1968). Block bootstrapping (BBS) was used to overcome the presence of serial correlation and involves application of the MKZs statistic to block resampled series that preserve any short-term autocorrelation structure. Following guidance from Önöz & Bayazit (2012) regarding the optimal block length given the sample size and magnitude of temporal autocorrelation coefficient, a block length = 4 was chosen and applied only when a series had statistically significant serial correlation. A robust estimate of the significance of the MKZs statistic was generated from a distribution of 10,000 resamples where the null hypothesis of no trend is rejected when MKZs calculated from original data is higher than the 9,750th largest (statistically significant increasing trend) or lower than the 250th smallest (statistically significant decreasing trend) MKZs value from the resampled distribution under a two-tailed test at the 5% level (Murphy et al. 2013). Results are presented in Tables 3 and 4 for both traditional and BBS MK tests to highlight the impact serial correlation plays, if any, on the statistical significance of trend results. 
Note that the BBS column in Tables 3 and 4 combines statistically significant results from the traditional MK test (for series without significant serial correlation) with those from BBS with L = 4 (for series with significant serial correlation). The same combined results are used to report statistically significant trends in the maps in Figures 3 and 4.
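A minimal sketch of the block bootstrap significance test is given below, assuming overlapping (moving) blocks drawn with replacement; the block construction, function names, random seed, and synthetic series are illustrative assumptions and not a reproduction of the procedure used in this study. The example uses 1,000 resamples to keep the run time short, whereas 10,000 were used in the analysis.

```python
import math
import random
from collections import Counter

def mann_kendall_z(x):
    """Standardised Mann-Kendall statistic with tie and continuity corrections."""
    n = len(x)
    s = sum((x[j] > x[i]) - (x[j] < x[i])
            for i in range(n - 1) for j in range(i + 1, n))
    var_s = n * (n - 1) * (2 * n + 5)
    var_s -= sum(t * (t - 1) * (2 * t + 5) for t in Counter(x).values())
    var_s /= 18.0
    if var_s == 0 or s == 0:
        return 0.0
    return (s - 1) / math.sqrt(var_s) if s > 0 else (s + 1) / math.sqrt(var_s)

def block_bootstrap_mk(x, block_length=4, n_resamples=10000, seed=0):
    """Return (observed Z, verdict) using a block-resampled null distribution."""
    rng = random.Random(seed)
    n = len(x)
    z_obs = mann_kendall_z(x)
    starts = range(n - block_length + 1)   # assumed: overlapping blocks
    n_blocks = math.ceil(n / block_length)
    z_null = []
    for _ in range(n_resamples):
        resampled = []
        for _ in range(n_blocks):
            start = rng.choice(starts)
            resampled.extend(x[start:start + block_length])
        # Shuffling blocks removes the long-term trend but keeps
        # short-term autocorrelation within each block.
        z_null.append(mann_kendall_z(resampled[:n]))
    z_null.sort()
    # 2.5th and 97.5th percentiles (250th and 9,750th smallest of 10,000).
    lower = z_null[max(round(0.025 * n_resamples), 1) - 1]
    upper = z_null[max(round(0.975 * n_resamples), 1) - 1]
    if z_obs > upper:
        return z_obs, "increasing"
    if z_obs < lower:
        return z_obs, "decreasing"
    return z_obs, "none"

# Synthetic series with a clear upward trend plus an oscillation.
series = [0.5 * t + 2.0 * math.sin(t) for t in range(40)]
z_up, verdict_up = block_bootstrap_mk(series, n_resamples=1000)
z_down, verdict_down = block_bootstrap_mk(series[::-1], n_resamples=1000)
```

Because the observed statistic is compared with percentiles of the resampled distribution rather than with ±1.96, a serially correlated series needs a stronger trend to be declared significant.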
| Group | Indicator | Stations (n) | Increasing (sig.; BBS sig.) % | Decreasing (sig.; BBS sig.) % | Magnitude (± bounds) % | Sig. serial correlation % |
|---|---|---|---|---|---|---|
| Low | Q95 | 116 | 71.6 (6.9; 5.2) | 27.6 (0.9; 0.9) | 11.2 (−2.6, 27.6) | 12.9 |
| | Q90 | 116 | 75.0 (4.3; 2.6) | 24.1 (0.9; 0.9) | 13.8 (−0.2, 23.5) | 8.6 |
| Medium | Q70 | 125 | 72.8 (3.2; 2.4) | 27.2 (0.0; 0.0) | 11.2 (−2.5, 22.3) | 9.6 |
| | Q50 | 125 | 71.2 (1.6; 1.6) | 28.8 (0.0; 0.0) | 9.0 (−1.8, 21.5) | 10.4 |
| | AMF | 125 | 83.2 (8.0; 7.2) | 16.8 (0.0; 0.0) | 10.7 (1.8, 20.4) | 14.4 |
| | Q30 | 125 | 74.4 (4.0; 3.2) | 25.6 (0.0; 0.0) | 10.0 (−0.4, 21.8) | 14.4 |
| High | Q10 | 113 | 77.0 (7.1; 7.1) | 23.0 (0.0; 0.0) | 11.1 (0.0, 28.6) | 6.2 |
| | Q05 | 113 | 79.6 (5.3; 5.3) | 20.4 (0.0; 0.0) | 13.0 (3.3, 24.9) | 0.9 |
| Season | DJF | 125 | 88.8 (8.0; 8.0) | 11.2 (0.0; 0.0) | 14.3 (5.5, 25.1) | 0.0 |
| | MAM | 125 | 20.8 (0.0; 0.0) | 79.2 (16.0; 16.0) | −20.1 (−33, −2.9) | 4.8 |
| | JJA | 125 | 75.2 (1.6; 1.6) | 24.8 (0.0; 0.0) | 13.3 (−0.1, 22.2) | 1.6 |
| | SON | 125 | 92.0 (4.0; 4.0) | 8.0 (0.0; 0.0) | 23.2 (11.6, 33.6) | 0.0 |
Direction and significance from Mann–Kendall (MKZs) and magnitude calculated with the relative Theil–Sen approach. Magnitude of change is based on the median with spread (± bounds) given by interquartile range. Per cent of stations statistically significant using block-bootstrapping (BBS) are also shown along with the proportion of series with statistically significant serial correlation.
| Group | Indicator | Stations (n) | Increasing (sig.; BBS sig.) % | Decreasing (sig.; BBS sig.) % | Magnitude (± bounds) % | Sig. serial correlation % |
|---|---|---|---|---|---|---|
| Low | Q95 | 42 | 52.4 (4.8; 2.4) | 47.6 (2.4; 2.4) | 1.2 (−9.4, 14.8) | 31.0 |
| | Q90 | 42 | 52.4 (2.4; 0.0) | 47.6 (0.0; 0.0) | 1.7 (−8.6, 15.5) | 28.6 |
| Medium | Q70 | 46 | 63.0 (8.7; 6.5) | 37.0 (0.0; 0.0) | 1.5 (−8.5, 13.5) | 32.6 |
| | Q50 | 46 | 47.8 (4.3; 4.3) | 52.2 (2.2; 2.2) | −0.9 (−12.5, 6.2) | 26.1 |
| | AMF | 46 | 73.9 (13.0; 13.0) | 26.1 (0.0; 0.0) | 6.8 (0.1, 13.8) | 21.7 |
| | Q30 | 46 | 56.5 (10.9; 8.7) | 43.5 (0.0; 0.0) | 1.8 (−7.6, 10.2) | 19.6 |
| High | Q10 | 43 | 86.0 (16.3; 16.3) | 14.0 (0.0; 0.0) | 11.3 (5.2, 21.1) | 14.0 |
| | Q05 | 43 | 88.4 (27.9; 27.9) | 11.6 (0.0; 0.0) | 13.5 (7.9, 23.8) | 11.6 |
| Season | DJF | 46 | 87.0 (19.6; 19.6) | 13.0 (0.0; 0.0) | 12.7 (4.1, 26.2) | 0.0 |
| | MAM | 46 | 30.4 (0.0; 0.0) | 69.6 (0.0; 0.0) | −10.7 (−19.0, 1.6) | 0.0 |
| | JJA | 46 | 54.3 (2.2; 2.2) | 45.7 (2.2; 2.2) | 1.4 (−11.2, 18.5) | 4.3 |
| | SON | 46 | 82.6 (4.3; 4.3) | 17.4 (0.0; 0.0) | 16.7 (5.5, 25.5) | 0.0 |
RESULTS
Fixed period trends
In low, medium, and high flow indices for the 1985–2014 short period, positive trends are prominent (Table 3). Over 70% of stations report an increasing trend in 11 of the 12 indices, with spring (MAM) mean flow the exception, showing strong and statistically significant decreases (16% of stations under BBS). Eight per cent of stations show statistically significant increases in winter (DJF) mean flow, and the median trend magnitude across the network is +14.3% (+5.5%, +25.1%). For the 1965–2014 long period, increasing trends continue to dominate the majority of medium and high flow indices (Table 4). As in the short period, almost 70% of stations show a decrease in MAM, and while none are statistically significant the UK-wide trend magnitude is −10.7% (−19.0%, +1.6%). The number of stations with increasing and decreasing trends is more even for low flows (Q95, Q90), as well as Q70, Q50, Q30 and JJA; thus, overall increasing and decreasing trend magnitudes tend to cancel each other out, resulting in UK-wide median trend magnitudes in low flow indices of just +1 to 2% over 1965–2014. However, statistically significant increases are found in AMF, Q30, Q10, Q05 and DJF, ranging from 8.7% of stations (Q30) to the highest 27.9% of stations (Q05) under BBS, with median trend magnitudes ranging from +1.8% (−7.6%, +10.2%) to +13.5% (+7.9%, +23.8%), respectively.
Trends are mapped for selected low (Q95), medium (Q50, AMF), and high flow (Q05) indices for both fixed periods in Figure 3. Spatial patterns of trends in the short period (top row) show a spatially consistent increase across the UK, although few of these are statistically significant. On the other hand, for the long period, low flow (Q95) and median flow (Q50) trends (bottom row) show a marked spatial gradient with increases in the north and west and decreases in the south and east of Britain. AMF appears to follow a similar pattern to that of high flows (Q05), but with fewer statistically significant trends, with strongest increasing trends for catchments in Scotland over 1965–2014. Patterns in long period seasonal mean trends (Figure 4) for summer (JJA) reflect the north-west/south-east gradient found in low flows (Q95), whereas decreasing trends in spring (MAM) flows occur across the majority of Britain. The wetter winter (DJF) and autumn (SON) seasons broadly follow the pattern of observed strong increasing trends in AMF and Q05, particularly in Scotland. Overall, the pattern of changes found in AMF is clearly biased towards patterns in wetter seasons and high flows.
It was found that accounting for serial correlation was important. Almost all indices in both short and long periods had stations with statistically significant serially correlated series (at the 5% level). These were most prominent in low flow indices (∼30% of stations in the long period for Q95). There are several cases where the number of statistically significant increasing/decreasing trends was reduced when block-bootstrapping was applied to serially correlated series (Tables 3 and 4).
Temporal variability analysis and persistence of trends
While it is necessary to analyse trends using fixed periods for a relative comparison of direction, magnitude, and spatial patterns, these are just snapshots of how flows have evolved over time, as demonstrated by the marked differences in trends between the two fixed periods (Figure 3). Apparent from the standardised and smoothed series in Figure 5 (left column) is widespread consistency of decadal scale variability across the flow regime from Q95 (top left) to Q05 (bottom left). There is a marked transition from low to high flows in the 1970s as well as an increase in flows in the early 2000s. These have consequences in terms of placing results from the two fixed periods in context of the overall variability (the start years of the two fixed periods are marked as vertical dotted black lines). The dependency of trends on period of record is captured in the trend persistence analysis (Figure 5, right column) whereby MKZs values for series are highly variable through time. For low flows (Q95), trends with start years in the early 1970s result in strong increases, but stations with longer records show this is an artefact of the period used and instead longer-term trends are not increasing strongly in low flows. This is in contrast to high flows (Q05) where longer records tend to show stronger increasing trends.
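The sensitivity of MKZs to the start year can be illustrated with a short sketch that recomputes the statistic for every start year while holding the end of record fixed. The windowing scheme, the minimum record length of 20 years, the function names, and the stepped synthetic series are all assumptions made for illustration; they are not the configuration used to produce Figure 5.

```python
import math
from collections import Counter

def mann_kendall_z(x):
    """Standardised Mann-Kendall statistic with tie and continuity corrections."""
    n = len(x)
    s = sum((x[j] > x[i]) - (x[j] < x[i])
            for i in range(n - 1) for j in range(i + 1, n))
    var_s = n * (n - 1) * (2 * n + 5)
    var_s -= sum(t * (t - 1) * (2 * t + 5) for t in Counter(x).values())
    var_s /= 18.0
    if var_s == 0 or s == 0:
        return 0.0
    return (s - 1) / math.sqrt(var_s) if s > 0 else (s + 1) / math.sqrt(var_s)

def mkz_by_start_year(years, values, min_length=20):
    """Map each start year to the MK Z of the series from that year to the end."""
    result = {}
    for i in range(len(values) - min_length + 1):
        result[years[i]] = mann_kendall_z(values[i:])
    return result

# Synthetic annual series, 1950-2014: high flows, then a low-flow decade
# in the 1970s, then moderately high flows (illustrative values only).
years = list(range(1950, 2015))
values = [12.0] * 20 + [8.0] * 10 + [11.0] * 35
persistence = mkz_by_start_year(years, values)
```

In this synthetic case a record starting in the low-flow decade (`persistence[1970]`) yields a strongly positive statistic, while the full record (`persistence[1950]`) yields a negative one, which mirrors the kind of period-of-record artefact described above.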
DISCUSSION
Our results from the trend analysis of the updated UK Benchmark Network (UKBN2) using recent UK-wide flow records show no fundamental discrepancies with previously published flow patterns in annual low, medium, and high, or seasonal mean flow indices, despite the prevalence of notable hydrological extremes in the most recent decade. Overall indicators of water availability (AMF, Q50, and seasonal mean flows) for benchmark stations are consistent with Hannaford & Marsh (2006) and the UK national outflow series (Marsh & Dixon 2012). AMF has increased across the UK but mostly in Scotland, and follows a similar pattern to winter and autumn mean flow. There is a marked spatial north-west to south-east gradient for summer trends (Figure 4(c)), with much of England showing decreases and the north-west showing increases. The decreasing trend in spring found by Hannaford & Buys (2012) for Britain and Murphy et al. (2013) for the Island of Ireland was also found here and we echo calls for improved understanding of the drivers of these seasonal changes in river flow, which may have important implications for water management and ecology. Generally, the results reinforce earlier findings and strengthen them given the use of a more rigorous, updated Benchmark Network.
One of the most societally relevant impacts of climate change is the expected increase in flooding due to increased precipitation intensity in a warming climate. Compelling evidence from the literature is emerging for flood-related variables in maritime-influenced upland areas in the north and west, including detected increases in observed winter precipitation (Dadson et al. 2017), extreme precipitation (Jones et al. 2014), and high flow and flood indices (Hannaford & Marsh 2008). While it is acknowledged the high flow indices used here do not explicitly characterise flooding, detected changes support the conclusion of a tendency for an increase in high flows over the past 50 years, and this signal is robust when longer records were considered (Figure 5(h)). Nevertheless, there is also remarkable evidence of DCV in the flow series. Trends are not part of a simple linear increase, but form a multitude of flood-rich and flood-poor episodes over time. The increase in high flows in the past decade or so (Figure 5(g)) appears to be part of a flood-rich period from the late-1990s onward (see Wilby & Quinn 2013), largely driven by decadal-scale clustering of flood-generating cyclonic and westerly weather types, which have been linked mostly to changes in the North Atlantic Oscillation (NAO) (Hannaford & Marsh 2008; Svensson et al. 2015). Understanding the drivers and evolution of these periods of flood propensity should be a research priority, especially quantifying the role climate change might play in altering the dynamics as well as interactions with catchment properties; the UKBN2 provides a climate sensitive dataset to contribute to this and results from such analyses as performed here will help inform current science policy-making discourse (e.g., Dadson et al. 2017).
Low flows show few statistically significant decreasing trends, as found in previous studies (Hannaford & Marsh 2006; Hannaford & Buys 2012). However, while not statistically significant, decreasing 50-year trend magnitudes in the English lowlands are in the −10% to −30% range for several catchments (Figure 3(e)), especially for summer (Figure 4(c)), and might be important for water management. Wilby (2006) showed the signal-to-noise ratio for basins in the UK is low, particularly in summer, and that robust statistically detectable trends are not expected for several decades yet. This is further highlighted in Figure 5(a) with strong evidence of DCV, and hence trends are sensitive to the period of record analysed (Figure 5(b)). This is most prevalent for records beginning in the 1960s and 1970s, which is the case for the majority of UK trend studies as hydrometric network expansion coincided with a period of a particularly high degree of natural variability.
In addition to the previous limitations, few catchments in the densely populated region of southern and eastern England can be considered pristine in the strictest sense, so caution must be exercised in interpreting changes in low flows in this region. Nonetheless, the consistent temporal and spatial pattern across low, medium, and high flow indices (Figure 5, left column) is encouraging and suggests that even in the English lowlands river flows are generally reflecting changes driven by climate, rather than from artificial sources (e.g., from groundwater and/or surface water abstraction) which, while controlled as far as possible in the benchmark designation, cannot be ruled out in the catchments flagged as ‘caution’. However, it is challenging based on these results alone to provide clear guidance regarding potential long-term implications for water resources management, so future work that combines innovative observational and modelling approaches using several lines of hydroclimatic inquiry is still needed. It is also noted that the majority of studies examine changes in low flows, rather than actual drought ‘events’, and so such event-based analyses should be another research priority.
While RHNs are vital in hydroclimatology, there are growing calls for the need to also improve our understanding of how the hydrological cycle is responding to rapidly changing human systems (Montanari et al. 2013; Van Loon et al. 2016). By definition, such impacts are removed or controlled in RHNs. This can be seen as an inherent limitation of RHNs, of which end users must be aware when designing their analyses: RHNs typically quantify changes in small, headwater catchments away from the downstream population centres that are most likely to be affected, socio-economically, by any changes in hydrological extremes. For example, there are no UKBN2 catchments >1,500 km² so any study, including the trend analysis here, will be biased towards medium and small catchments, particularly in the south and east of England as abstractions and discharges are less prevalent in headwater catchments. There is also a dearth of very small catchments; only four catchments within the UKBN2 dataset have areas <10 km² and only one of those can be considered upland (elevation >300 m a.s.l.). Therefore, processes operating only at these scales would not be captured. On the other hand, RHNs can provide a near-natural baseline for comparing with human-influenced sites (e.g., using paired ‘impacted’ catchments as in Prosdocimi et al. 2015) or for modelling studies, so can play a vital role even in efforts to quantify human disturbances on the hydrological cycle.
The second iteration of the UKBN has made several improvements since UKBN1, but there are many potential further improvements that could be made to future iterations of the dataset and to how users access it. For example, we anticipate future analytical efforts will undertake comprehensive homogeneity testing and infilling, while a particular focus will be efforts to improve the assessments of AIs. One of the most challenging aspects of the UKBN update was the fragmented quality and availability of information on AIs, especially access to water abstractions and discharges. While some datasets were consulted (e.g., Hannaford et al. 2013b), information is typically based on model estimates of impacts, and not available widely across the UK. Hence, benchmark qualifiers in UKBN2 are necessarily brief and qualitative. Finally, we hope to improve access to benchmark data and analyses through an online NRFA trend facility – following the example of the Australian Bureau of Meteorology's data and trend explorer (Zhang et al. 2016).
CONCLUDING REMARKS
The first designation of the UKBN has proven a valuable dataset that has fed into many national and international scientific studies, several of which are relied upon for making policy and water management decisions on future flood design and long-term drought planning. Results from the trend analysis of the updated UKBN2 have reinforced previous findings. We recognise the UKBN will always remain a work-in-progress as new information about gauging stations and the catchments they drain comes to light, or new techniques for assessing benchmark suitability are developed. A benchmark version control system has been instigated to ensure that minor and major network changes are recorded in a transparent way, that the datasets are easily accessible, and that studies using previous versions remain reproducible. Further information about the UKBN2 and how to access the data can be found here: http://nrfa.ceh.ac.uk/benchmark-network.
A community effort involving both those who collect the data (Measuring Authorities) and those who use it (e.g., researchers and practitioners) would make the process of UKBN evolution and updating more efficient and comprehensive. We hope by releasing the UKBN2 we present an opportunity for the hydrological community to provide ideas, novel methods, and feedback on the current version. The metadata holdings of the NRFA and the knowledge of NRFA and Measuring Authority experts are only one set of performance criteria; there is no doubt a wealth of other local knowledge, and a wide range of initiatives generating useful information about these catchments and the gauging stations that monitor them (e.g., ongoing studies of rating uncertainty such as Coxon et al. (2015), and national-scale modelling studies that could potentially provide naturalised data estimates and degrees of influences such as Rudd et al. (2017)). We therefore invite users to provide information on these catchments, or others that may be candidate benchmark catchments, by contacting the NRFA ([email protected]). We are also interested to understand the range of uses of the network, and invite users to engage with the NRFA team about current and future applications of the dataset.
ACKNOWLEDGEMENTS
This work was funded by NERC National Capability funding to CEH. Statistical analyses and graphics were carried out using the open source R programming language. We thank Measuring Authority personnel for their cooperation and contributions during the UKBN2 review process, and their ongoing support of the Benchmark Network initiative. We thank Cath Sefton and Simon Parry (CEH) for their feedback on the network. We also thank the wider NRFA team for periodic input to and discussion about the evolution of the Benchmark Network. Finally, we thank the three anonymous reviewers for their comments that greatly improved the paper.