A quality-control framework for sub-daily flow and level data for hydrological modelling in Great Britain

The absence of an accessible and quality-assured national flow dataset is a limiting factor in sub-daily hydrological modelling in Great Britain. The recent development of measuring authority APIs and projects such as the Floods and Droughts Research Infrastructure (FDRI) programme aim to facilitate access to such data. Basic quality control (QC) of 15-min data is performed by the data collection authorities and the National River Flow Archive (NRFA). Still, there is a need for a comprehensible and verifiable quality-control methodology. This paper presents an initial assessment of the available data and examines what needs to be done to make the data applicable at a national scale. The 15-min flow series has many internal inconsistencies and mismatches with the NRFA annual maximum values. When producing a QCed dataset, decisions regarding the retention of data values need to be taken and recorded. Furthermore, QC should remove and rectify erroneous values, such as negative and above-world-record flows; and an assessment of homogeneity and truncated values in the stations could be beneficial to flag suspect data. The complex production chain and the changeability of flow and level data make data curation and governance imperative to assure the longevity of the dataset.


INTRODUCTION
It was no surprise that the poll held during the Floods and Droughts Research Infrastructure (FDRI) panel discussion at BHS2022 showed that most hydrologists in the room identified improving the quality of hydrological data as their highest priority. With the turn of the century, hydrological records became longer and more available. With more data, a shift in hydrology has occurred from a demand for more observational data and a 'value of data' approach (Beven & Binley 1992; Beven 1993) to a heavier focus on the uncertainties in the input data before model fitting (Beven 2006). Currently, data quality is considered one of the key factors for the improvement of hydrological predictions and modelling (Beven 2019; Blöschl et al. 2019; Wagener et al. 2021). However, data measurements are surrounded by epistemic uncertainties (Beven 2021) that are often propagated or extrapolated in the generation of datasets that are suitable inputs for hydrological modelling (McMillan et al. 2012, 2018, 2022).
The digital revolution has facilitated data sharing and has allowed the use of computer-intensive techniques in science. Paradoxically, it might have made science less reproducible. Components like data, models, code, and instructions must be made accessible to fully reproduce a hydrological experiment. Although authors and institutions strive to enhance reproducibility, their efforts lag behind the rapid data revolution. While journal policies such as data and code availability upon request improve experiment replicability, they remain insufficient for the majority of cases (Stagge et al. 2019). Trust in experiment reproducibility is dwindling; a survey noted that 90% of scientists believe there is a reproducibility crisis in academia (Baker 2016). This extends to the hydrological sciences, where renowned hydrological journals host only 0.6-6.8% of fully reproducible articles, a key issue being the availability of the data used (Stagge et al. 2019).
Assuring datasets have a certain degree of quality while making them accessible to the public is a challenge in hydrology. Manually checking large datasets is a time-consuming procedure that is subject to human error. Hence, together with manual checks at a local level, done by station operators, performing automatic quality-control checks is necessary before making a dataset available. These checks range from very simple, such as flagging gauged flow or rainfall values that are physically impossible or above a theoretical threshold (Coxon et al. 2015; Gudmundsson et al. 2018; Lewis et al. 2018b; Crochemore et al. 2020; Lewis et al. 2021), to more complex, such as checking whether rainfall gauges are consistent with nearby gauges (Lewis et al. 2021).
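A minimal sketch of the simplest kind of automatic check is given below: flagging values that fall outside a physically plausible range. The threshold values used here are illustrative placeholders, not the thresholds applied by any of the agencies or in the cited studies.

```python
def flag_implausible(values, lower=0.0, upper=80000.0):
    """Flag flow values outside a plausible physical range.

    `upper` is an illustrative ceiling (m3/s) of the same order as
    the largest river flows on record; it is not an official agency
    threshold.  Suspect values are returned as (index, value) pairs
    for review - they are flagged, not automatically deleted.
    """
    return [(i, v) for i, v in enumerate(values)
            if v is not None and (v < lower or v > upper)]

# Example: a negative value and an implausibly large one are flagged.
series = [12.3, 15.1, -0.4, 14.8, 250000.0, 13.9]
print(flag_implausible(series))  # [(2, -0.4), (4, 250000.0)]
```

Returning flags rather than deleting values keeps the decision on retention explicit, which matters later when the QC choices themselves need to be documented.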
Another outcome of the digital revolution has been an incremental increase in the temporal and spatial resolution of data, improving the knowledge of physical processes in hydrology. Notably, the use of sub-daily rainfall observations has improved our understanding of short-duration rainfall extremes and their increasing intensity with global warming (Barbero et al. 2017; Prein et al. 2017; Guerreiro et al. 2018; Fowler et al. 2021). However, this increase in rainfall extremes does not necessarily translate to an increase in flood hazards (Hettiarachchi et al. 2019; Sharma et al. 2021), as there are multiple complex drivers of floods such as snowmelt and antecedent soil moisture conditions (Massari et al. 2014; Arheimer & Lindström 2015; Wasko & Nathan 2019). In smaller, flashier catchments, evidence of flood hazard increases due to climate change is more pronounced, as other drivers play a less important role in their physical processes (Wasko & Sharma 2017; Wasko & Nathan 2019). Despite this, one of the key issues in flood modelling in smaller catchments is the absence of reliable sub-daily resolution flow or level data. This is because daily flood peaks tend to underestimate the true flood peak and potentially rapid rates of rise (Archer & Fowler 2021), even when the data are disaggregated into smaller time-steps, especially in smaller catchments (Chen et al. 2017; Beylich et al. 2021).
Using sub-daily resolution data for flood modelling could deliver a step change in identifying future flood events and mitigating flood damage. The UK already has a range of sub-daily rainfall datasets available, such as INTENSE (gauge data; Blenkinsop et al. 2018) and CEH-GEAR (gridded data; Tanguy et al. 2021), that are quality-controlled (Blenkinsop et al. 2017; Lewis et al. 2018b) and open-source. Historically, the use of sub-daily flow data has been limited to flood estimation methodologies in the Flood Estimation Handbook (FEH), which rely primarily on annual maxima (AMAX) extracted from a sub-daily flow time series. The sub-daily records are available upon request to the relevant authority, with the data being quality-controlled at a local level. Nevertheless, the data are dispersed, and there are no consistency or quality-control checks performed at a national level, nor traceable data versions. Such procedures could be highly beneficial for improving the quality of flow and level data in the UK and are complementary to projects aiming to make UK data more accessible, such as the Floods and Droughts Research Infrastructure (FDRI, https://www.ceh.ac.uk/our-science/projects/floods-and-droughts-research-infrastructure-fdri).
There is an evident shortage of publicly and easily available sub-hourly flow/level data in the UK. The next section presents the currently available data; then, some issues encountered with the datasets are assessed; next, the reasons for these inconsistencies are explained; finally, the article discusses the challenges that need to be addressed to use the UK's 15-min flow data for hydrological modelling.

CURRENTLY AVAILABLE UK DATA
The UK has a wide variety of rainfall and flow products (Figure 1 and Supplementary material, Annex 1). Sub-hourly rainfall datasets are available in several formats, i.e., gauged, gridded and future projections, with potential to be used in a range of hydrological models. Nevertheless, their use at a national scale is limited, in part, by the availability of quality-assured streamflow datasets to validate these models. Continuous hydrological models at a national scale, such as Grid-to-Grid (Bell et al. 2009), FUSE-GB (Lane et al. 2019), LSTM-GB (Lees et al. 2021), and SHETRAN-GB (Lewis et al. 2018a), are set up at a daily temporal resolution, because quality-assured flow datasets, such as CAMELS-GB (Coxon et al. 2020) and the National River Flow Archive (NRFA), are only available at daily time-steps.
Similarly, statistical and hydrological models are utilized for the generation of future flow datasets, e.g., Future-Flows and e-Flag (Figure 1; Haxton et al. 2012; Hannaford et al. 2022). These are subsequently used in predictive applications, such as defining climate change flow allowances (Kay 2021) and identifying potential trends in flows with climate change (Collet et al. 2018). The new UK future climate projections, UKCP18, have been made available at sub-daily temporal resolution, with new convection-permitting climate models applied to downscale regional to local projections (UKCP Local; Fosser et al. 2019). Still, sub-daily future flow predictions are limited by the calibration of hydrological models and, consequently, by the lack of a sub-daily flow database. Thus, even with the enhanced resolution offered by UKCP Local, future flow models are still run on a daily timescale, given the calibration data limitation.
In Great Britain, 15-min flow time series are recorded by the Environment Agency (EA, England), the Scottish Environment Protection Agency (SEPA, Scotland), and Natural Resources Wales (NRW, Wales). In a first step, the 15-min data are quality controlled, periodically, at the agency level. Then, the data are transformed into an AMAX and a peak-over-threshold (POT) dataset by the NRFA. These are the series used in industry-standard methodologies for flood prediction in the UK, with a further effort to quality control these peaks, manually, by the agencies and UKCEH. The procedure entails an annual check of a subset of the whole data available in the archive. Finally, an additional quality assessment is conducted to determine the station's suitability for different analyses. A gauge is labelled as appropriate for QMED if the measurement error for QMED values does not exceed 30%. For stations considered for pooling, measurements of AMAX1, AMAX2, and AMAX3 (the top three annual values) are considered. If these values are deemed accurate, that is, having errors for AMAX2 and AMAX3 below 30%, and AMAX1 being precise enough, the data are designated as suitable for pooling. Despite being semi-qualitative, with no level of confidence associated (Wallingford HydroSolutions 2016), these checks play a crucial role in UK flood design analysis. They enable the categorization of stations based on confidence in higher flows: differentiating between highly confident pooling stations, stations with a reasonable degree of confidence in higher flows suitable for QMED analysis, and stations providing inaccurate data for flows surpassing QMED.
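The categorization described above can be sketched as a simple decision function. The 30% thresholds come from the text; the function itself is illustrative (the actual NRFA assessment is manual and semi-qualitative), and the assumption that pooling stations are a subset of QMED-suitable stations follows the nested categorization described, not an explicit rule.

```python
def classify_station(qmed_error, amax_errors, amax1_precise):
    """Illustrative suitability classification for a gauge.

    qmed_error    : relative error (fraction) on the QMED estimate
    amax_errors   : (amax2_error, amax3_error) relative errors
    amax1_precise : bool, whether AMAX1 is deemed precise enough
                    (the "precise enough" judgement is qualitative)
    """
    suitable_qmed = qmed_error <= 0.30
    # Assumption: pooling suitability implies QMED suitability,
    # reflecting the nested categories described in the text.
    suitable_pooling = (suitable_qmed
                        and all(e < 0.30 for e in amax_errors)
                        and amax1_precise)
    if suitable_pooling:
        return "suitable for pooling"
    if suitable_qmed:
        return "suitable for QMED"
    return "not suitable above QMED"

print(classify_station(0.15, (0.10, 0.20), True))  # suitable for pooling
print(classify_station(0.25, (0.40, 0.10), True))  # suitable for QMED
```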
For continuous data analysis, the 15-min records are available upon station request to the agencies. Nevertheless, it is important to note that the quality control and assurance measures implemented for peak flow values within the NRFA archive do not extend to the 15-min station time series: the NRFA's quality-control efforts focus exclusively on peak flow values and are therefore inapplicable to the continuous time series data.
Recently, access to the continuous data has been facilitated by the development of APIs by SEPA, NRW, and the EA (NRW 2016; SEPA 2022; Environment Agency 2023). However, these APIs are still subject to specific data limitations. For instance, the EA API does not include all stations available in the NRFA annual maximum archive. The SEPA API imposes an initial restriction on the volume of data that can be downloaded within a single day. Also, the NRW API exclusively provides access to the most recent year's 15-min data.

Data used
This study uses 15-min flow and level data from the UK agencies, and the NRFA AMAX flow and level series.
The SEPA 15-min data were downloaded from their API, with a provided access key giving access to a larger number of daily downloads. With the API, all the available gauging stations were downloaded, totalling 315 flow and 390 level stations, with 274/273 level/flow stations identified in the NRFA archive. NRW and the EA provided data upon request, with raw datasets coming from WISKI, a software for hydrological data storage. Due to measuring authority time and workforce constraints, only peak flow stations, both suitable for pooling and QMED, were requested from the EA and NRW: 607/74 flow and 608/76 level stations were identified from the EA/NRW downloads. The data obtained were the continuous/semi-continuous flow and level time series for these stations, alongside the quality-control code assigned at the agency level. Finally, the latest version of the NRFA AMAX series was downloaded from their website.

Pre-treatment of data
Before use, the 15-min time series from the agencies were standardized and joined. Often, local agencies store data with different headers and nomenclature, e.g., some EA regions use separate columns for date and time, while others use a single date-time column. Another occurrence, in very large time series, was the splitting of data into different .csv files that often contained repeated dates. Finally, the flow and level time series were capped at the end of 2021, the last available full year of data, and rounded to two decimal places, for standardization and data storage purposes.
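The standardization steps above can be sketched as a per-record normalization. Column names, date formats, and the `standardise_row` helper are hypothetical illustrations of the idea, not the actual processing code; agency files vary in layout.

```python
from datetime import datetime

def standardise_row(row, date_col="date", time_col=None, value_col="flow"):
    """Standardise one raw record into a (datetime, value) pair.

    Handles the two layouts described in the text: a single
    date-time column, or separate date and time columns.  Values
    are rounded to two decimal places, and records after 2021
    (the last full year of data) are dropped (returned as None).
    """
    if time_col is not None:
        stamp = datetime.strptime(f"{row[date_col]} {row[time_col]}",
                                  "%Y-%m-%d %H:%M")
    else:
        stamp = datetime.strptime(row[date_col], "%Y-%m-%d %H:%M")
    if stamp.year > 2021:                 # cap at end of 2021
        return None
    return stamp, round(float(row[value_col]), 2)

# Separate date and time columns, as in some EA files:
row = {"date": "2003-02-17", "time": "09:15", "flow": "12.3456"}
print(standardise_row(row, time_col="time"))
```

Joining the per-agency files is then a matter of mapping each file's headers onto this common schema before concatenation.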

Mismatch analysis
The next step focused on identifying and understanding why mismatches in the 15-min time series occur. A mismatch is categorized as a time series having duplicate date-time values with different flow or level values. Two types of mismatches were analyzed: (i) duplicate dates with different values in the 15-min time series; (ii) AMAX values that did not match the NRFA AMAX values.
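The first mismatch type can be detected mechanically: collect the distinct values seen at each date-time and report the stamps that carry more than one. This is an illustrative sketch of the check, not the code used in the study.

```python
from collections import defaultdict

def find_mismatches(records):
    """Find date-times that appear more than once with *different*
    values.  Duplicate stamps carrying identical values (the common,
    harmless case) are not reported.

    records: iterable of (timestamp, value) pairs, e.g., as produced
    when an irregular file and a regular 15-min file overlap.
    """
    seen = defaultdict(set)
    for stamp, value in records:
        seen[stamp].add(value)
    return {stamp: sorted(vals) for stamp, vals in seen.items()
            if len(vals) > 1}

records = [("2003-01-04 09:00", 41.2),   # duplicate, same value: fine
           ("2003-01-04 09:00", 41.2),
           ("2003-01-04 09:15", 44.0),   # duplicate, different values
           ("2003-01-04 09:15", 58.7)]
print(find_mismatches(records))  # {'2003-01-04 09:15': [44.0, 58.7]}
```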

Data summary by country and date
Table 1 summarizes the stations available in the level and flow datasets. There is a noticeable difference between Scottish (SEPA) and English/Welsh (EA/NRW) data. The first was sourced directly from their API, and notwithstanding the shorter median length of the dataset, the Scottish data had fewer gaps and fewer stations with >10% of data missing. There were also no mismatches within the Scottish time series. English and Welsh data, gathered upon request to the EA and NRW, presented similar statistics in terms of median length, gaps, and percentage of stations with >10% missing data (Table 1).
The UK has digital 15-min data tracing back to the 1930s (two level stations and one flow station), with an exponential growth of station coverage from the 1950s to the 1990s. Overall, hydrological flow and level data become more consistent in the 1980s, when spatial coverage reaches 80% of its current capacity and most of the dataset has no temporal gaps (Table 2). A further increase in data completeness is perceptible from the 1980s to the 1990s, with stations going from an average temporal coverage of 83% (level) and 82% (flow) to 94%. From the 2000s, the data reach the spatial and temporal completeness levels of today, with more than 90% of the stations having no value gaps within the decade. There is no definitive answer as to why these improvements in station numbers and completeness happened, but some important changes in UK hydrometry and hydrology occurred at the end of the last century: the release of the Flood Studies Report, in 1975, and subsequent investments in gauging stations; the transition from charted data to digital records, which occurred mostly in the 1980s; and an increased focus, from agencies, on accurately recording high flows for flood studies, rather than only low flows for water quality. All of these point to an increased interest in the upkeep of flow and level records.

Mismatches
SEPA data were obtained directly from their API, as a continuous time series, without duplicates. In contrast, the other two agencies had duplicate records in their datasets. The 15-min data from the Welsh and English measuring authorities are categorized into two types of files: an irregular (EA) or 0-s (NRW) time series, covering the earlier period until 2003/04, when the current storage system was implemented. These series, extracted from charted data, occasionally have flows with higher resolution than 15 min, especially during high peaks. However, there is no consistent pattern for these occurrences; they vary from station to station.
The second part of the dataset comprises systematically regular 15-min data, extending to the present day. Since the resolution of the two time series differs (one maintains a constant 15-min resolution, while the other is generally recorded at 15-min intervals but with irregular time-steps during peaks), they have not been merged and are stored separately. The issue of duplicate records arises due to temporal overlaps between the 15-min data and the irregular time series. This inconsistency becomes problematic when the same flow or level series does not present the same value at the same date time-step.
Among the 681 stations in the EA and NRW dataset, 556 had at least one occurrence of duplicate date-times. In most cases, these duplicates contained identical information. However, there were 143 stations where the same date-time indexes held different flow records. Most mismatches took place between the 1970s and the 2000s, with a progressive decrease in the percentage of mismatched stations from the 1970s to the late 1990s (Figure 2). A significant increase in mismatches is observed in the early 2000s, mainly in 2003, coinciding with the transition of data to the new storage system, WISKI. After 2004, only three stations presented overlaps.

DISCUSSION
Why do we need a quality-assured sub-daily flow and level dataset at a national scale?
In the UK, 15-min flow data have wide-ranging use in the calibration of industry-standard models. The FEH statistical method uses 15-min data to derive its POT series (Robson & Reed 1999). This is a necessity, as more than 75% of NRFA catchments have a time to peak (Tp) smaller than 8.25 h (Kjeldsen 2007). On these occasions (Tp < 24 h), daily intervals are insufficient to capture instantaneous peak flows happening during a flood in a catchment. UK sub-daily flow time series have also seen some applicability in academia. For example, Prosdocimi et al. (2015) used sub-daily flow time series to investigate the effects of urbanization on extreme floods, by comparing similar catchments that differed only in their urbanization level. Hourly flow data for the River Axe were used to identify how agricultural land use change could impact runoff in the region (Climent-Soler et al. 2009). Still on the River Axe, rates of rise and their potential changes with land use have also been studied (Archer et al. 2010).
Nevertheless, the lack of an easily accessible and quality-assured sub-daily flow dataset hampers the capability of large-scale national studies. In the U.S., such a dataset (Showstack 2007) has facilitated studies at national and regional scales: for instance, the flashiest catchment types and the cities prone to flash floods have been identified in the continental U.S., with good matches found between places prone to floods and flood fatalities (Smith & Smith 2015); and a study of the seasonality of floods allowed the correlation of location, timing, and drivers across the continental U.S. (Villarini 2016).
Furthermore, works on UK future hydrology have pointed to an overall increase of extreme events, with drought increases being more significant than floods (Collet et al. 2018). However, these studies use a daily timescale, while hourly rainfall extremes have been shown to increase at a higher rate than daily events (Guerreiro et al. 2018; Fowler et al. 2021); consequently, these studies might be underestimating flood events. A sub-daily national flow dataset could improve our understanding of future floods in the UK, showing the benefits of producing an open-source, quality-assured dataset.

Issues with the data - mismatches
Some mismatches have been manually checked, aiming to identify patterns in the dataset and to understand whether there is an identifiable 'truer' value. The reasons for mismatches are highly variable; some examples: failure of the instrument in high flows, such as a shaft encoder slip, rectified in one file but not in the other (NRFA station 33015, 2003-01-04); manual modifications done in one time series and not in the other, such as vertical shifts (NRFA station 47008, 2003-02-17), more often recorded in the 15-min time series and not in the irregular one, as in the sampled station; typos in dates (NRFA station 52014, 1997-10-01); values that were computed but still unchecked in one sheet, while having been checked in the other (NRFA station 52006, November 2000); and rating curve information mistranslations, in which the same station had non-mismatched level data and mismatched flow data (NRFA station 8426, 2005-01-07). After 2003, not only are the number of mismatches and mistakes in the data greatly reduced, but the errors are also more systematic and detectable. For instance, NRFA station 47007, from 2004 to 2015, has one sheet with complete data, matching NRFA POT and AMAX flows, while the second has missing data and values that do not match the NRFA.
Additionally, the irregular time series often has repeated date-time values with different flows/levels. These reflect recordings taken at smaller, uneven time-step intervals, with two recordings in the same minute. In most cases there is no indication that these records are incorrect, with a 'good' quality code from the agency and no visible issues when the flood wave is plotted. In the digitization process, for the purpose of depicting flood waves more accurately, charted data have sometimes been digitized at smaller than 15-min intervals. When the change was abrupt, the same time-step can have two different flow records. Nevertheless, in some cases, there will be an abrupt change in the magnitude of the flow, accompanied by a change in the quality code from good to suspect, indicating errors in the measurements (NRFA station 68007, 1992-09-18).
Regarding NRFA AMAX mismatches, even though both time series originate from the same 15-min dataset, the NRFA AMAX archive has additional quality-control checks and is regularly checked for the identification and removal of 'flawed' data. Before becoming part of the NRFA AMAX archive, the 15-min flow and level stations go through: (1) a selection process within the environmental agencies, aiming to remove stations that have unreliable values for high flows; (2) a periodic verification of the quality of these stations, e.g., SEPA verifies the level stations on a monthly basis and flow stations annually; (3) additional quality control by the NRFA, aiming for consistency in the AMAX and POT values; (4) following the NRFA quality control (QC), some stations will be discarded while others will have their values edited to better reflect reality. Therefore, the mismatches between the NRFA AMAX values and the 15-min records are indicators that, after further QC, the peak flow of the station has been modified.
Integrating NRFA peak flow checks into the 15-min time series poses a challenge, since manual checks correct peak flow values but do not provide continuous flow event corrections. While extra quality-control efforts have been applied to these peaks, they cannot be systematically applied to the whole continuous record. The only automated outcome of these corrections is a flag in the 15-min record, indicating these discrepancies. Another challenge in integrating the NRFA dataset with the available continuous flow dataset is the fluidity of the NRFA data, in which part of the stations are quality checked every year and a new version, with corrected values, is released to the public.

Issues with the data - other checks
Some stations in the 15-min dataset present values that are not physically realistic, that is, negative or higher than world record flow and level values. From initial manual inspection, these do not necessarily need to be discarded. Negative level values that are very close to 0 can reflect limitations in the measurement instrument, while above-world-record values could be a mistake in decimal places. Automatic flags to check and correct or discard these values are necessary.
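The distinction drawn above (correct, flag, or discard) can be sketched as follows. Both thresholds here are hypothetical placeholders, not agency limits: the sensor tolerance stands in for the instrument limitation near zero, and the ceiling for a world-record level.

```python
def qc_level(value, sensor_tolerance=0.05, world_record=25.0):
    """Illustrative treatment of physically unrealistic level values.

    Small negatives within an assumed instrument tolerance (m) are
    clamped to zero; other implausible values are flagged as suspect
    rather than silently discarded, so a human can check, e.g., for
    a decimal-place error.  Returns (value, flag).
    """
    if -sensor_tolerance <= value < 0:
        return 0.0, "corrected"      # instrument noise around zero
    if value < 0 or value > world_record:
        return value, "suspect"      # keep value, mark for review
    return value, "ok"

print(qc_level(-0.02))  # (0.0, 'corrected')
print(qc_level(-1.5))   # (-1.5, 'suspect')
print(qc_level(3.2))    # (3.2, 'ok')
```

Keeping the original value alongside the flag, rather than overwriting it, preserves the audit trail that the governance section below argues for.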
A second quality step is using summary statistics, e.g., mean, min, max, and hydrologically relevant indexes such as the day of minimum/maximum streamflow (Gudmundsson et al. 2018), to identify potentially suspect gauges. Finally, an analysis of homogeneity using statistical tests (Gudmundsson et al. 2018; Crochemore et al. 2020), and the identification of high truncated values by checking streaks of repeated values (Lewis et al. 2021), can be used to further identify data-quality errors.
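The repeated-value check can be sketched as a run-length scan. The minimum run length here is an illustrative choice; a real QC procedure would tune it and restrict the check to high flows, where truncation at a gauge's upper measurement limit occurs.

```python
def repeated_streaks(values, min_len=10):
    """Locate runs of identical consecutive values, which can reveal
    truncation at a gauge's upper measurement limit (cf. Lewis et
    al. 2021).  Returns (start_index, run_length, value) for each
    run of at least `min_len` identical values.
    """
    streaks, start = [], 0
    for i in range(1, len(values) + 1):
        # A run ends at the end of the series or when the value changes.
        if i == len(values) or values[i] != values[start]:
            if i - start >= min_len:
                streaks.append((start, i - start, values[start]))
            start = i
    return streaks

series = [5.0] * 3 + [9.99] * 12 + [6.1, 6.0]   # 12 repeats of 9.99
print(repeated_streaks(series))  # [(3, 12, 9.99)]
```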
Continuous quality checks are performed manually on 15-min data in the UK by the NRFA and the agencies. Furthermore, projects such as the 2016 update to the National Risk Assessment of inland flooding risk (Aldridge et al. 2017) have produced occasional quality checks at a national level. Nevertheless, these have not produced a quality-control methodology that is nationally applicable, open-source, easily updatable and verifiable.
The 15-min dataset needs governance and curation
For a reliable 15-min QCed dataset, decisions on which values to keep and which to flag as suspect will need to be taken (Figure 3). Furthermore, data governance and curation are imperative processes to assure data longevity and usability. Governance and curation processes should include: detailed documentation of every modification made to the data; metadata with the maximum amount of information on the data source, e.g., provenance (EA, NRW, SEPA, and irregular or 15-min), station type, and agency quality-control code; metadata on the quality-control checks done on the data; the code used for processing the data; and formatting of the data aiming for easy accessibility and understanding.
The benefits of having this information available include: the possibility of easily extending the dataset to more stations or time periods; the identification of mistakes and incongruencies in the dataset at a station or systematic scale; the possibility of modifying the dataset according to user needs, such as a change in the resample timescale or the addition of previously removed data; and a deeper user understanding of the capabilities and usability of the dataset.

CONCLUSIONS
The UK has sub-daily flow and level data recorded at a national level; nevertheless, no sub-daily national product is available to the public. Making such a dataset is fundamental for cutting-edge research, allowing the development of high-resolution continuous hydrological models. It is expected that the increase in the temporal resolution of models will help in understanding hydrological processes from the past and in modelling potential changes from climate change, especially in high flow scenarios.
Making a comprehensible continuous sub-daily flow and level national dataset is a challenge. There is much variation in how the data are kept by the different measuring authorities and according to the date of the records. To make the dataset trustworthy, the raw data must be cleaned and standardized; then, a national QC procedure should be applied to remove erroneous data and to flag suspect data. Having an intelligible procedure is crucial to maintaining and improving the quality of the flow and level data. The measuring authorities' time series are under constant review and modification; hence, a procedure that allows data modification both ways is needed: wrong or suspect values detected by the automatic QC procedure should be propagated to the agency time series; and, in the other direction, the detection of incorrect values by the authorities should also be propagated to the QCed time series. From that perspective, data governance and curation, with complete information on the procedures applied to the data, can guarantee the longevity and reproducibility of the dataset and the QC process.
Finally, keeping the QC procedure open-source, verifiable and changeable is important in view of potential modifications to rating curves. A flow dataset, the most common input for hydrological model calibration, is not observationally based, being rather an estimate derived from a rating curve. Rating curves are often changed based on observed changes in the channel. A flexible and updatable QC procedure could accommodate these future changes within the flow dataset.

Figure 1 | Summary of daily and sub-daily rainfall and flow datasets available in the UK; for more information on a particular dataset, please refer to the superscript number and the reference in Supplementary material, Annex 1.

Figure 2 | Percentage of stations where mismatches (NRFA and within time series) occur per year.

Figure 3 | Framework to quality control the datasets of the measuring authorities.

Table 1 | Summary of stations and mismatches in the 15-min flow and level dataset for the countries of Great Britain.

Table 2 | Decadal summary and completion rate (mean, median completion, and stations with full data) of the 15-min flow and level dataset (H = level data, Q = flow data).