The replacement of models by emulators is becoming a frequent approach in environmental science due to the reduction of computational time, and different approaches exist in the water resources modelling literature. In this work, an emulator to mimic a hydrological model at catchment scale is proposed, taking into account the effect of land use on the hydrological processes involved in water balance. The proposed approach is novel for its combination of techniques. The dimension of the temporal model output is reduced via principal component analysis, and this reduced output is approximated using Gaussian process emulators built on a conditioned Latin hypercube design to reflect constrained land use inputs. Uncertainty from both the model approximation and the dimension reduction is propagated back to the space of the original output. The emulator has been applied to simulate river flow in a rural river basin located in south west England, the Frome at East Stoke Total, but the methodology is general. Results showed that the use of the emulator for water resources assessment at catchment scale is an effective approach, providing accurate estimates of the model output as a function of land use inputs, for a reduction of the computational burden.
Environmental modelling, particularly for the simulation of geological and climatic processes, usually requires the use of computationally expensive and long runtime codes. This is an issue for analyses that require a large number of model simulations, such as a sensitivity analysis or model calibration, and such analyses can become prohibitive or even impossible using the original model. Recently, the development of new methodologies has been explored, in order to approximate complex environmental models with simpler and more computationally efficient models (e.g. Castelletti et al. 2009; Fu et al. 2010; Lima et al. 2018). One of the most frequently used approaches is emulation or metamodelling. According to the definition given by Castelletti et al. (2012), an emulator can be considered as a lower-order approximation of a complex, process-based model, designed to be used as a surrogate in order to overcome the large computational burden. In general, the purpose of an emulator is to replicate the functions of one system with a different system that attempts to reproduce its behaviour (Abrahart & See 2007; Stanfill et al. 2015). A variety of different emulators have been developed and tested in a wide range of applications, such as river basin and groundwater management (Solomatine & Ostfeld 2008; Mirfenderesgi & Mousavi 2015; Roy et al. 2016; Christelis et al. 2017), water quality modelling (Castelletti et al. 2010), computational fluid dynamics (Moonen & Allegrini 2015), fire emission modelling (Katurji et al. 2015), and ecological modelling (Leeds et al. 2013).
Hydrological modelling is a field which benefits greatly from the use of emulators, not only for facilitating common tasks such as sensitivity analysis or calibration, but also for other applications, such as the investigation of effects of climate or land use change on water resources. Indeed, this kind of analysis requires a great number of model runs in order to assess the uncertainty related to predictions in potential future scenarios. Metamodelling approaches have been widely applied in various hydrological modelling problems, as reported by Razavi et al. (2012) who reviewed and analyzed a number of studies on emulators in the water resources modelling field. One of the first applications of emulation in hydrology can be attributed to Romanowicz et al. (1996), who developed a distributed flow routing model emulator to explore the accuracy in flood prediction. In order to predict water level in the Severn River basin (UK), Young et al. (2009) presented and applied a nonlinear Dynamic Model Emulator (DME) to reproduce the output of the HEC-RAS model (Brunner 2001). Fraser et al. (2013) proposed the application of a metamodelling approach to improve the performance of a physically-based model in the assessment of the implications of land use changes on peak flows at the catchment scale. Romanowicz & Osuch (2015) developed an emulator based on the MIKE11 model (DHI 1997), obtaining reliable flow predictions for a basin of the river Vistula in Poland. More recently, Machac et al. (2016) and Carbajal et al. (2017) investigated the application of emulation in urban drainage modelling. Machac et al. (2016) proposed an emulator to simplify the application of the Storm Water Management Model (SWMM) to the urban drainage system of the city of Adliswil (Switzerland). Carbajal et al. (2017) compared the performance of two different kinds of emulators, mechanistic and data-driven, to mimic the results obtained by the application of an urban drainage model of two small catchments in Switzerland, finding that data-driven emulation outperforms mechanistic emulation where the mechanistic prior is linear.
In this work, a statistical emulator has been developed to mimic and replace a hydrological model based on a soil moisture balance (Rushton et al. 2006). This model is quite simplistic in nature and has mainly been chosen to demonstrate the emulator methodology. The proposed approach is unique as it combines a range of techniques to develop an efficient emulator, namely: the use of principal component analysis (Jolliffe 2002) to reduce the high dimensional nature of the model output; the use of models from the field of mixture experiments (Cornell 2011) which are appropriate for the constrained land use model inputs; a novel design strategy called conditioned Latin hypercube sampling (CLHS) which combines traditional mixture design with Latin hypercubes; and the use of Gaussian process emulation (Kennedy & O'Hagan 2001; Rasmussen & Williams 2006) for accurate prediction of the model output as well as uncertainty quantification. The uncertainty from both the emulation of the model and the dimension reduction is also explicitly propagated back to the space of the original outputs. The proposed methodology has been applied to simulate river flow in a rural river basin in south west England, the Frome at East Stoke Total. This work focuses on a hydrological application, but the statistical emulator presented here is a general technique which could be used for a range of environmental models. The approach presented here is similar to the data-driven emulator of Carbajal et al. (2017), with the addition of the novel design strategy for the constrained inputs, explicit calculation of the uncertainty propagation, and a different test model.
The paper is organized as follows. In the next section, the area of study and datasets are presented and described. The proposed methodology is explained in the following section. Results are presented and discussed are then discussed and in the next section, applied methodologies and obtained results are briefly summarized.
STUDY AREA AND DATA
The study area is the Frome at East Stoke Total river basin, located in south west England. The catchment area is approximately 415 km2. In this area, land cover includes pasture (90%), arable (8.7%), woodlands (1%), and urban areas (0.3%). The prevalent crop in the arable areas is winter wheat. Figure 1 shows the digital elevation model of the basin, the outlet location and the spatial distribution of land cover.
The geology is characterized by Chalk, Upper Greensand and Gault, Lias and Oolites in headwaters area, while sand, gravels and clays are frequent in the lower part of the catchment. The predominant soil type in the river basin is Rendzina soil, covering 43.2% of the area, whereas the remaining soils are Eutric Cambisols (27.4%), Orthic Podzols (20.8%) and Eutric Gleysols (8.6%).
According to the Köppen classification, the climate of south west England is classified as oceanic, characterized by cool winters and warm summers. The average annual temperature is approximately 10 °C. Summer is the warmest season with an average annual temperature of 15–16 °C. Precipitations are distributed during the whole year. Average annual rainfall ranges between 1,000 and 1,100 mm/year.
The average annual flow at the outlet of the Frome river basin is approximately 6.7 m3/s. In the catchment, the natural flow regime is not affected by upstream water abstractions, nevertheless, a considerable amount of water is abstracted from the chalk area by bore holes and wells.
Daily series of precipitation, temperature, solar radiation, relative humidity and wind speed data have been obtained from a dataset based on the global data of National Centers for Environmental Prediction (NCEP) Climate Forecast System Reanalysis (CFSR), available at the website http://globalweather.tamu.edu/. In the website, the selected weather gauges (all located in the same site with coordinates 50°44′13.19″N and 2°29′59.98″W) are identified with the following IDs: p507-25 (rainfall in mm), t507-25 (maximum and minimum temperatures in °C), r507-25 (relative humidity as a fraction), s507-25 (solar radiation in MJ/m2) and w507-25 (wind speed in m/s). Daily river flow data recorded at the East Stoke Total station (9.2 mAOD) have been provided by the National River Flow Archive of UK. In this dataset, the river flow gauge is identified with the ID 44001 and data can be downloaded at the link https://nrfa.ceh.ac.uk/data/station/meanflow/44001. Land cover information in the area of study have been obtained from Feranec et al. (2016) (downloadable at https://land.copernicus.eu/pan-european/corine-land-cover/clc-2012?tab=download), while soil information has been extracted from the FAO/UNESCO Soil Map of the World (downloadable at www.fao.org/geonetwork/srv/en/metadata.show%3Fid=14116).
The applied model has been developed by Rushton et al. (2006). Based on soil moisture conditions, that are a function of catchment characteristics, such as the crop types in arable areas and the related growth stages, the model is able to simulate runoff and potential groundwater recharge.
The processes included in the soil moisture balance can be summarized in the following steps (Figure 2): precipitation reaches the ground surface and, based on rainfall intensity and soil moisture conditions, a fraction of it can generate runoff, whereas another part infiltrates the soil zone. The amount of infiltrated water becomes available to supply potential evapotranspiration or to be stored as groundwater. Every time the potential evapotranspiration demand has been met, any further infiltration of water will increase the amount of water stored in the soil. This process will continue until the stored water reaches the field capacity, that is the point at which the soil becomes free draining and groundwater recharge occurs. During dry periods, the soil dries because of water uptake by the crop. The deficit of soil moisture represents the amount of water needed to bring the soil back to the field capacity. Whenever enough water is available, actual evapotranspiration will occur at the potential rate.
The model application requires the calibration of six parameters, specifically, θFC, θWP, p, Zr, the bypass fraction BF and the factor for near surface storage FRACSTOR. Model outputs are very sensitive to changes of the BF parameter (Ireson & Butler 2013), representing the proportion of rainfall which becomes bypass flow, that is the amount of water passing directly to the aquifer through the fractures. Bypass flow is responsible for groundwater recharge during summer months, when soil moisture deficits prevent drainage from the soil. FRACSTOR is an empirical factor used to define the proportion of the increase in moisture content that becomes near surface soil storage.
The model previously described has been calibrated and validated. In order to identify the set of model parameters that give a good fit between observed and simulated discharges, a Monte Carlo procedure has been applied, consisting of a random generation of parameter values in fixed ranges of variation. Therefore, each model simulation for calibration has been characterized by a different combination of randomly generated parameters. The achievement of the best set of model parameters required a great number of simulations (10,000), since several combinations of parameters need to be tested.
The fact that the Rushton et al. (2006) model is fast enough to enable a Monte Carlo calibration approach contradicts the need for emulation somewhat. However, the focus of this paper is to outline the methodology of the emulator and its implementation. The aforementioned model is used for demonstration purposes and has been pre-calibrated in order to avoid a biased poor performance of the emulator. Indeed, statistical emulators are commonly employed in the calibration stage itself, but this requires a slightly different approach to the focus here. A number of calibration approaches have been proposed in the emulation literature, including Bayesian calibration (Kennedy & O'Hagan 2001; Wilkinson 2011) and history matching (Craig et al. 1996; Williamson et al. 2015).
Model set up and experimental design
With the calibration complete, the aim is to study the relationship between the remaining four land use input variables and the river flow output. The land use variables are defined as shares in the basin and will be denoted as (woodland), (arable), (pasture) and (urban), or using the vector . The model output is the daily river flow at T time points, denoted as the vector . The Rushton et al. (2006) model will be denoted functionally as . To reduce the computational burden, the model will be replaced by a fast-to-run statistical emulator, denoted . The methodology for building the statistical emulator is presented below under ‘Statistical emulation’. An analysis of the difference in computational time between the model and emulator is given below under ‘Comparing the computational time of model and emulator’. Daily river flow output of the Rushton et al. (2006) model for five settings of the four land use inputs are shown in Figure 3.
The statistical emulator must be constructed using n runs of the model in an experimental design, denoted . The experimental design literature for statistical emulation is vast, but in general it is good practice to run the model at a range of input settings which are space-filling in the region of interest, and Latin hypercubes (McKay et al. 1979) are a popular approach. The number of model runs, n, depends on the computational budget, but Loeppky et al. (2009) suggest the rule , where d denotes the input dimension ( here). The land use inputs for the Rushton et al. (2006) model are non-standard in that they are shares constrained to sum to one, i.e. . This is the focus of the field of mixture experiments (Cornell 2011) and several mixture designs have been suggested. However, the majority of mixture designs, e.g. simplex centroid designs and simplex lattice designs, are similar to factorial or gridded designs, which are known to be substandard for building (Gaussian process) statistical emulators (Urban & Fricker 2010). As far as the authors are aware, equivalents of Latin hypercube designs for mixture experiments do not exist in the literature. Here, a novel strategy for mixture design for emulators is proposed which uses the CLHS algorithm (Minasny & McBratney 2006). The algorithm is implemented in the clhs package (Roudier 2011) in the R programming language (R Core Team 2017). With the design strategy decided it remains to choose the number of training runs (design size). In this work, a CLHS design of size is chosen, following the rule suggested by Loeppky et al. (2009), from candidate mixture points. The experimental design is shown in Figure 4. The rule of thumb is satisfactory in many scenarios, and provides accurate emulator predictions here (see below under ‘Statistical emulation’). However, in cases where there are a large number of inputs or the model is too expensive, it may not be possible to achieve within the computational limitations. A solution in this case would be to reduce the input dimension, for example using active subspaces (Constantine 2015), or to implement smaller experiments over multiple iterations or waves (Williamson et al. 2015).
In summary, the experimental design is a CLHS mixture design of size . The output of the Rushton et al. (2006) model at the experimental design is the matrix , obtained by evaluating for . Since the river flow model output is strictly positive, the logarithm of the model output, , is used for subsequent statistical analysis.
The majority of statistical emulation methods are appropriate for scalar model outputs (Kennedy & O'Hagan 2001). To adhere to this and use standard emulation techniques, the dimension of the temporal river flow model output is reduced. In this work, principal component analysis is employed via the singular value decomposition (SVD) (Jolliffe 2002). Several other decompositions are possible, for example nonnegative matrix factorization (Carbajal et al. 2017). The dimension reduction procedure used here is as follows:
Subtract the column means from to obtain .
Using SVD (or equivalently an eigendecomposition) obtain a column matrix of eigenvectors, , and a diagonal matrix of eigenvalues, . The eigenvalues satisfy .
Select the first K eigenvectors and eigenvalues to explain a proportion P (e.g. 0.9, 0.95, 0.99) of variability in the data, by solving for : . Denote the first K eigenvectors and eigenvalues using the matrices and respectively. Denote the remaining eigenvectors and eigenvalues, , using the matrices and respectively.
Project the data by computing . The columns of are the principal components, denoted .
Transform back to the original dimension of the data by computing . Since a logarithmic transformation was used, an exponential transformation will also be needed here.
By using principal component analysis on it was found that eigenvectors could explain almost all of the variability in the data (). Usual practice is to select the number of eigenvectors to explain a high proportion of variability in the data (e.g. 0.9, 0.95, 0.99). Since almost all the variability could be explained by only two eigenvectors in our example it made sense to go beyond this rule. This is very uncommon and is probably due to the model output being similar across model runs, as we can see in Figure 3. As the land use input is varied the flow output is similar, save for a shift effect. Despite explaining a high proportion of variability, the principal component approach does not guarantee accurate reconstruction of the original data, especially if there is non-linearity present. However, it is shown that the reconstruction is accurate below under ‘Results and discussion’.
Recall that the Rushton et al. (2006) model, , is to be replaced with an emulator, . Using principal components the model output has been approximated by the two principal components, and . To construct a full emulator for the Rushton et al. (2006) model, functional relationships between the land use inputs and the principal components, denoted and , need to be replaced with independent emulators, namely and .
Reconstruction and propagation of uncertainty
Validating the statistical emulator
It is important to validate a statistical emulator to see if it provides an accurate approximation to the model. In this work, leave-one-out cross validation is used to this end (Sammut & Webb 2017). The emulators are validated on the reduced dimensional (principal component) space and the reconstructed space of the original model output. A procedure for leave-one-out cross validation for the emulator in Equation (7) is as follows (for ):
Remove the ith point from the experimental design and the ith point from the principal component output, denote the result as and respectively.
Construct a new Gaussian process emulator conditional on ; denote this . Instead of re-estimating the hyperparameters, simply fix them at the values estimated for the complete output . Denote the posterior mean and covariance functions of as and respectively.
Predict the principal component output at using the emulator. The predictive distribution is Gaussian with mean and variance .
A credible interval for the prediction can be constructed as , where denotes the quantile of the Student's t-distribution with degrees of freedom, and q is the number of hyperparameters ( here).
Plotting leave-one-out cross validation emulator predictions against the corresponding principal component output provides a useful diagnostic validation tool. It is important to also include credible intervals to assess whether the uncertainty of the emulator is well calibrated. In short, a well-calibrated emulator should have predictions close to the line with small uncertainty, or if not, the uncertainty should increase appropriately.
RESULTS AND DISCUSSION
To calibrate the model, daily river flow data recorded over the 2003–2010 period has been used. Table 1 shows the set of parameters that provided the best model performance.
|Parameter .||Description .||Calibrated value .||Measurement unit .|
|p||Depletion factor for RAW calculation||0.495||–|
|Zr||Root zone depth||0.539||m|
|FRACSTOR||Factor for near surface storage||0.081||–|
|Parameter .||Description .||Calibrated value .||Measurement unit .|
|p||Depletion factor for RAW calculation||0.495||–|
|Zr||Root zone depth||0.539||m|
|FRACSTOR||Factor for near surface storage||0.081||–|
At the daily scale, the calibrated model provided an NS coefficient equal to 0.71. The model performance improved at the monthly scale, reaching an NS value equal to 0.81. Figure 5 shows the comparison between the observed river flow and the river flow simulated by the calibrated model in the 2003–2010 period at the daily (Figure 5(a)) and monthly scale (Figure 5(b)).
According to the previously described results, the model is able to reproduce the time pattern of the daily and average monthly river flow and the model performance can be considered acceptable (Ritter & Muñoz-Carpena 2013). Annual average observed river flow is equal to 6.2 m3/s, while the simulated value is 6.3 m3/s.
The set of best parameters were used to validate the model. Model validation has been performed using data measured from 2011 to 2013. Validation NS coefficients were 0.64 and 0.70 for daily and monthly river flow respectively. These values of the efficiency index point out an acceptable model performance.
In this work, Gaussian process emulators are built for each principal component output using the DiceKriging package (Roustant et al. 2012) in R. The package implements maximum likelihood to estimate the hyperparameters , and for the kth principal component output. Hyperparameter estimates for the emulators of the two principal components, , , are given in Table 2.
The marginal log-likelihood of emulators , , with the four covariance function choices are given in Table 3. It is clear that the choice of squared exponential covariance function provides the maximum value of the log-likelihood out of the four options, meaning that the principal component output is very smooth. This is also reflected in the marginal log-likelihood values for the Matérn covariance functions, which increase when the regularity parameter is larger.
|Covariance function .||.||.|
|Covariance function .||.||.|
All covariance functions use automatic relevance determination. The Matérn covariance functions have regularity parameters 1, 3, and 5 respectively.
Leave-one-out cross validation diagnostic plots for the two emulators , , are shown in Figure 6 with 95% credible intervals. It is evident that both emulators accurately predict the principal component output at the withheld design points. In particular, for both principal components the emulator's predictions lie on or very close to the line. In terms of coverage, for the first principal component two of the 40 withheld points lie outside of the emulator's credible intervals, giving a coverage of 95%. For the second principal component, four of the 40 withheld points lie outside of the emulator's credible intervals so the coverage is 90%. In summary, both emulators approximate the principal component output accurately and are well calibrated.
The emulators validate well on the space of the two principal component outputs. However, to be of any use, emulator predictions must be reconstructed and propagated back to the original space of the Rushton et al. (2006) model output: the daily river flow. A procedure for doing this was given above under ‘Reconstruction and propagation of uncertainty’. It is useful to check whether the reconstructions accurately predict the daily river flow at unseen land use inputs, and RMSE and MAE metrics are used to this end. Calculating the RMSE for each of the 40 leave-one-out cross validation points gives values ranging from to (with a mean of ). Similarly, the MAE for each of the 40 leave-one-out cross validation points gives values ranging from to (with a mean of ). The RMSE and MAE are measured in the same units as the model output (m3/s). Given that the range of the Rushton et al. (2006) model output for the Frome basin is approximately between 0 and 25 m3/s, these small values of the RMSE and MAE indicate a very good match between the simulated and emulated daily river flows. The difference in simulated and emulated daily river flow for the smallest and largest RMSE values is shown in Figure 7. The difference is plotted because it was hard to see a difference between the simulated and emulated time series on an absolute scale. For the case of the smallest RMSE value, the emulator predicts the model output almost exactly. For the largest RMSE value, the emulator mainly overestimates the daily river flow in a seasonal pattern, but the error is small and the uncertainty increases accordingly. It was found that larger values of the RMSE tended to correspond with flow output with larger magnitude, indicating that these outputs are harder to predict. However, this relationship was small and is not investigated further here.
Comparing the computational time of model and emulator
With the emulator validation complete, it is useful to investigate the reduction in computational burden it provides. The time it took for the model and emulator to run the 40 experiments in the CLHS design was recorded for 100 repetitions, with the average time for completion compared. In this work, experiments were run on a computer with an Intel Core i7 processor running at 3.4 GHz using 32 GB of RAM, on a Windows 7 operating system. The average completion time for the model and emulator was 8.768 and 2.182 seconds respectively. This equates to approximately a four times speed up in computational time when using the emulator. This reduction is quite small but is not surprising considering that the Rushton et al. (2006) model is a fairly simple hydrological model, mainly used in this paper to demonstrate our emulation approach. Nevertheless, the difference in computational time would become substantial when performing tasks which require large numbers of simulations, or when using the emulator for more complex models.
Due to a rapid increase in the sophistication of environmental modelling, approaches to reduce the computational time of the models have become popular. In this context, statistical emulators are used to provide fast approximations of computationally intensive models to facilitate subsequent analysis. With enough runs of the original model, emulators often offer predictions of comparable accuracy for a fraction of the computational time and, importantly, quantify the uncertainty associated with replacing the model with an emulator.
In this study, a novel emulator methodology has been proposed to provide a fast approximation to a hydrological model available in the literature. The model itself was reasonably simple and fast to run, but provided a good test example to demonstrate the emulator methodology. While emulators have been used for hydrological models before, the approach given here was novel in a number of ways, for example in its utilization and combination of a range of techniques from various fields. Firstly, principal component analysis was used to reduce the dimension of the time series output of the model. Secondly, due to the constrained nature of the land use inputs, the field of mixture experiments was drawn upon: classical mixture models were used alongside a novel mixture design strategy which used a CLHS algorithm to generate a space-filling design tailored for building statistical emulators. Finally, Gaussian process emulators were employed to describe the relationship between the land use inputs and the principal component outputs. The emulator's predictions were then reconstructed back to the space of the daily river flow output of the model. Furthermore, uncertainty due to the use of an emulator and dimension reduction techniques were efficiently propagated.
The statistical emulator presented here was applied to a version of the hydrological model calibrated for the Frome basin using measurements from the East Stoke Total station. Under ‘Results and discussion’ above it was shown first that the calibrated model provided a good description of the observed river flow at a daily and monthly scale, via the Nash and Sutcliffe coefficient. Second, the proposed statistical emulator gave very accurate predictions of the calibrated model output as a function of four land use inputs, as tested by a number of leave-one-out cross validation exercises and diagnostics. Possible future directions of this work would be to test the proposed methodology on hydrological models calibrated to different river basins, or for more complex implementations of the model (for example with a larger number of land use inputs). It is anticipated that the approach outlined here would perform well in these scenarios.
Data and code for the hydrological model and emulator methodology are openly available at https://github.com/neowen7/water-resource-emulation.