Constraining the HBV model for robust water balance assessments in a cold climate

Robust projections of changes in the hydrological cycle in a non-stationary climate rely on trustworthy estimates of the water balance elements. Additional drivers than precipitation and temperature, namely wind, radiation, and humidity are known to have a significant influence on processes such as evaporation, snow accumulation, and snow-melt. A gridded version of the rainfallrunoff HBV model is run at a 1 × 1 km scale for mainland Norway for the period 1980–2014, with the following alterations: (i) the implementation of a physically based evaporation scheme; (ii) a net radiation-restricted degree-day factor for snow-melt, and (iii) a diagnostic precipitation phase threshold based on temperature and humidity. The combination of improved forcing data and model alterations allowed for a regional calibration with fewer calibrated parameters. Concurrently, modeled discharge showed equally good or better validation results than previous gridded model versions constructed for the same domain; and discharge trend patterns, snow water equivalent, and potential evaporation compared fairly to observations. Compared with previous studies, lower precipitation and evaporation values for mainland Norway were found. The results suggest that a more robust and more physically based model for climate change studies has been obtained, although additional studies will be needed to further constrain evaporation estimates.


GRAPHICAL ABSTRACT INTRODUCTION
Many hydrological models were developed for operational water resources management and have accordingly been built to rely only on input data that is commonly available, and to be easy to use. Any increase in model complexity should be justified by an increase in model performance, often measured according to the ability of the model to reproduce daily or monthly catchment runoff (Nash & Sutcliffe ; Lindström et al. ; Ferguson ).
Today, gridded input data are becoming more widely available, either from numerical weather prediction models, reanalysis data such as Era5 (Hersbach et al. ), gridded observational data, e.g. SeNorge2018 (Lussana et al. ), or hybrid products such as HySN (Erlandsen et al. ) and WFDEI (Weedon et al. ). Further, the non-stationarity of the current climate calls for hydrological models with a stronger physical basis and a higher robustness in a wide range of climates (Ferguson ; Clark et al. ).
Hydrological models range from the simplest, datadriven, lumped, and conceptually based water balance models, to those akin to land surface models, where the surface energy balance is solved numerically (see e.g. Kauffeldt et al. ). The different modeling strategies have complimenting merits (Hrachowitz & Clark ). For example, numerically solving the surface energy balance requires an increase in input data requiring higher storage and pre-processing capacity, as well as an increase in model integration time; however, it allows the computation of surface temperature, and imposing a closed surface energy balance, which further constrains the latent heat flux or evaporation estimates. In this paper, the term evaporation encompasses water loss from soil, leaves, lakes, and plant stomata (transpiration).
A large number of hydrological modeling studies involve replacing a rather simple conceptual process description with a more physically based one and comparing the results (Bruland et al. ; Zappa et al. ; Hegdahl et al. ), while other studies compare models of different complexities (e.g. Magnusson et al. ). In cases where a more physically based model was compared with a more conceptually based model and an increase in model performance was not found, it is difficult to say whether this was due to an ill-stated empirical equation or parameter being included, over-parameterization, or that the more physically based process description relied on input variables that were poorly estimated. Thus, an important question raised in Clark et al. () is 'to what extent is additional model complexity supported by the available information on geophysical attributes (topography, vegetation, soils, geology, and fine-scale meteorological data)?'.
A particular challenge for using conceptual, calibrated models for hydrological impact assessment is that parameter values can be overfitted to the climate conditions in the calibration period. Merz et al. () found that calibrated parameters representing snow and soil moisture processes were sensitive to the choice of the calibration period. Milly & Dunne () found that a temperature-index based evaporation parameterization may simulate considerably larger evaporation changes than net radiation changes might justify. Any change in vapor pressure deficit with climate change, or plant physiological mechanisms for preserving water, are also not accounted for in temperature-index based evaporation parameterizations. Besides evaporation-related calibrated parameters, other often calibrated parameters which potentially might be omitted from calibration, or considerably restricted in range, are the precipitation phase threshold temperature and the commonly used degree-day factors for calculating snow-melt.
Precipitation phase may be diagnosed using near-surface temperature and humidity (Jennings et al. ). The degree-day factor, which represents the amount of snowmelt per degree above freezing, may vary considerably depending on catchment, climate, and time-of-year (Kustas et al. ; Merz et al. ). Indirectly, it reflects biases in accumulated snowfall, sublimation and deposition processes not accounted for, spatial and temporal variation in incident longwave and shortwave radiation, vegetation shading, longwave radiation emitted by vegetation, and variation in surface albedo, to name a few. Accordingly, in climate change studies, for some processes in particular, there is a need to move from a simple and conceptually based description to a more robust, physically based one.
In Norway, a gridded version of the conceptual HBV rainfall-runoff model (Beldring et al. , from here on referred to as HBV-B03) has been used to study the effect of climate change on hydrology (see e.g. Hanssen-Bauer et al. ). Until recently HBV-B03 included calibrated, land cover-dependent parameters for precipitation phase diagnosis, the melting temperature of snow, the snow-melt degree-day factor, and for the temperature-based scaling of monthly climatological potential evaporation to provide estimates of evaporation. In Wong et al. (), the existing evaporation routine was discussed as a large source of uncertainty when analyzing end-of-century changes in summer droughts for Norway. Further, the lack of an inline computation of potential evaporation may be particularly unsuitable in cold climatessince potential evaporation is limited by the received incident radiation, which is bound to increase in a warmer climate with reduced snow cover and thus albedo.
There has been a recent effort to improve the physical basis of evaporation estimates in HBV-B03 by implementing a Penman-Monteith (Monteith ) potential evaporation routine. Simultaneously, the number of land cover classes represented by the model was increased from 7 to 19 to allow for more spatial heterogeneity related to natural vegetation cover and land use activity. These alterations are described in Huang et al. (), from here on this version of the model is referred to as HBV-H19. The inclusion of a more detailed land cover description combined with land use dependent, calibrated parameters controlling processes such as snow accumulation and ablation may lead to confounded or poorly constrained parameters, and thus disentanglement problems if the model were to be applied to study the effect of a perturbed land cover. Kustas et al. () suggested an enhanced degree-day factor parameterization, where the degree-day factor is restricted by an additive term relating snow-melt to net radiation. A snowmelt routine where the degree-day factor is restricted by a radiative term allows snow-melt to be influenced by land cover class via albedo, without the need of a land cover class-dependent calibration.
The implementation of more physically based process descriptions as described above is here facilitated by a newly established hybrid method, HySN, for producing gridded estimates of near-surface vapor pressure and incident radiation. HySN was derived by merging reanalysis data with the 1 × 1 km SeNorge data and showed high fidelity when compared with station observations (Erlandsen et al. ).
In this study, we aim at obtaining a robust and more physically based model for studies of changes in water balance elements in a non-stationary climate. The HySN method for deriving estimates of evaporation and incident radiation was paired with an improved version of the SeNorge temperature and precipitation fields, SeNorge2018 ii. a regionally calibrated, radiation-restricted degree-day factor; iii. a diagnostic temperature-and humidity-based threshold for diagnosing precipitation phase.

Study area
Mainland Norway stretches several latitudes, from 58 to 71 North, on the western coast of northern Europe (see Figure 1). Its coastline is lined with fjords, while further inland the Scandinavian Mountains divide the country's western and eastern regions. Norway's location on the eastern end of the North Atlantic and its prevailing westerly winds, together with the Scandinavian Mountains, leads to a high annual precipitation on its western coast, with distinctly lower precipitation rates received leeward of the mountain range. Around a third of precipitation falls as snow. About 38% of the land area is forest covered, while the land surface is dominated by bare rock and shallow deposits (see e.g. Figure 17.1, replication of the Geological Survey of Norway (NGU) sediment map in Weynants et al. ()). Relatively shallow soils with a low water storage capacity in large parts of the country make way for a rapid runoff response to precipitation (e.g. Beldring ), but also for moisture stressed conditions in periods of meteorological drought (Buckland et al. ).

Forcing data
The model is forced with gridded daily temperature and precipitation fields from SeNorge2018, and with surface incident shortwave radiation, surface net longwave radiation, and vapor pressure deficit derived following the HySN method, and with a 1 × 1 km resolution wind data set from the Norwegian Meteorological Institute (MET Norway). Details of the forcing data are given below.  Figure S1).

Klinogrid wind
The wind data set used to force HBV is a high-resolution, quantile-mapping-based gridded data set of near-surface wind speed developed at MET Norway. The daily wind data is available for October 1957 until May 2015 from http://thredds.met.no/thredds/catalog/metusers/klinogrid/ KliNoGrid_16.12/FFMRR-Nor/catalog.html (accessed 13 December 2019).

HySN5
HySN5 is a modified version of HySN, a high-resolution HYbrid SeNorge data set of daily near-surface humidity, surface incident shortwave and longwave radiation, and surface pressure. The data have the same temporal frequency and projection as the SeNorge data sets. It is described and compared with surface observations and other data sets in

HBV-H19
In HBV-H19 (Huang et al. ), HBV-B03 was modified by replacing its simple, temperature-based scaling of potential evaporation (E p ) with a Penman-Monteith big leaf approximation for estimating daily E p . E p parameters were set as fixed parameters based on the literature or physical empirical relationships in a look-up table; however, maximum interception storage was for most land cover classes calibrated according to the region and land cover class. The precipitation phase threshold temperature was set to 0 C, while the threshold temperature for snow-melt and the degree-day factor were calibrated according to the region and land cover class. The minimum and maximum degree-day factor allowed during calibration was 0.0001 and 0.01 m C À1 , respectively ( Table 1 in  The number of land cover and soil types represented by the model was increased to 19 and 12, respectively (see Huang et al. ). The model domain was divided into five calibration regions (see Figure 1(b)), which resulted from k-means clustering of two temperature and two precipitation indices.

HBV-E20
The model version used in this study, HBV-E20, builds upon HBV-B03 and HBV-H19. The land cover and soil type classification are based on the same data sets as in HBV-H19; however, the resulting classification is slightly modified for land cover, while the soil type classification was simplified, allowing a total of seven classes for all of Norway. Figure Table S3.
HBV-E20 further includes changes to the model's snowmelt routine. We implement a radiation-restricted degreeday factor based on Kustas et al. (). A radiation-based melt rate, in meters per day, obtained by converting the R n to snow-melt rate, is added to the common degree-day factor expression: where M is the melt rate per day in meters, C temp is a calibrated degree-day factor, and T 2 melt is the melt temperature of snow, λ f is the latent heat of fusion, 0.334 MJ kg À1 , R n is  in MJ m À2 day À1 , ρ w is the density of water (1,000 kg m À3 ), and C rad is a fraction scaling the radiation term, which is always less than unity (see Supplementary Table S3). C temp is allowed to vary between 0.0014 and 0.0030 m C À1 during the calibration, a considerably smaller range than allowed in e.g. HBV-H19 (0.0001 -0.01 m C À1 ). The inclusion of a radiative term makes it possible for the simulated snow cover to respond to changes in incoming shortwave and longwave radiation, as well as surface albedo. Further, adding a radiative term makes it less likely that an unreasonably large amount of snow remains over the summers, ultimately building up so-called 'snow towers' (see e.g. Figure 6 in Skaugen & Weltzien ()).
The adjustment makes it possible to run the model over consecutive years without zeroing out snow at the beginning of the hydrological year, which has been a common procedure to get rid of 'snow towers' in similar models (Skaugen & Weltzien ). The Supplementary Material includes an example of aggregated SWE in a simplified snow module when a traditional degree-day factor, with a constant melt rate of 2.5 mm C À1 day À1 is employed, and when the new, radiation-restricted degree-day factor is used (Equation (1)).
The snow module of the HBV model is further modified. ing humidity as a predictor of precipitation phase increases its accuracy, as snowfall is more likely in drier rather than more humid environments given the same T2.

Calibration and validation
The model was calibrated with the aim to minimize the regional mean catchment bias in discharge (simulatedobserved daily discharge) and to maximize the regional where r is the Pearson correlation coefficient, σ s and σ o are the simulated and observed standard deviation, and μ s and μ o are the simulated and observed mean. The optimal KGE, and highest possible value, is unity, while it has no lower limit. The KGE is a relative bias measure, so the two optimization goals used in model calibration are not entirely independent.
The model was jointly calibrated for catchments within each of the five calibration regions. The regions following HBV-H19 and are shown in Figure 1 For each of the five regions, only three above-ground parameters were calibrated: a multiplicative correction factor for precipitation, an additional multiplicative undercatch correction factor in case of snowfall, and the snow-melt degree-day factor (C temp ). The six soil parameters described in HBV-B03, FC, beta, perc, kuz, klz, and alpha were individually calibrated for each soil class, except for the glacier and till class, which were merged. Thus, the parameters of a maximum of six soil classes were calibrated for each region, depending on the number of soil classes represented within a region. Lower zone lake runoff response (klz) was set according to that of soil class bog.
The regional calibration was conducted using PEST:

Model-Independent Parameter Estimation and Uncertainty
Analysis (Doherty ).

Model evaluation
In order to assess the HBV-E20 model's ability to simulate relevant hydrological states, fluxes and temporal dynamics, results were evaluated against a variety of observational

RESULTS
The model was calibrated and validated in terms of regional mean KGE and bias of its daily discharge estimates. The Daily discharge, potential evaporation, maximum SWE, and monthly discharge trends

Daily discharge
The results of the model calibration and validation are given in   (Figure 3(a)). The model's simulated maximum SWE is on average 6 cm higher than the observations; however, as seen in Figure 3(b), there is a strongly significant, positive correlation (p < 0.000) between SWE bias and the difference between model grid cell and measurement altitude.
A reasonable agreement is found; however, the simulated E p tends to show higher values than the pan measurements

Model calibration and validation
The herein alterations to the gridded HBV model code implemented in HBV-E20, including a modified physically based potential evaporation routine, a radiation-restricted degree-day factor for snow-melt, and the introduction of humidity as a predictor of precipitation phase led to fewer calibrated parameters. A reduction in the number of parameters reduces the dimensions in parameter space and with that parameter uncertainty, contributing to a more robust model in general, and for climate change impact assessment in particular.
The model achieved a median KGE between 0.74 and 0.78, and a mean bias between À0.1 and À0.8 mm/day, depending on the period and catchments considered.
Though the primary goal of the alterations was to increase the physical robustness of the model under climate change, we find that the HBV-E20 model's performance in simulating daily discharge is equally good or better than previous gridded HBV versions applied for the same domain (see Huang et al.  and references therein). Additionally, the model showed a fair amount of skill in reproducing observed annual maximum SWE, with a correlation of 0.78 and a  (1967)(1968)(1969)(1970)(1971)(1972) shown within black circles overlaid the simulated E p for the land cover class 'Open' (1979)(1980)(1981)(1982)(1983)(1984). The lower box plots show the pan measurements in purple and the simulated Ep for collocated grid cells in green. Please refer to the online version of this paper to see this figure in color: https://doi.org/10.2166/nh.2021.132.
mean difference of 6 cm to the point observations. The improved availability of high-quality forcing data facilitated the implementation of model alterations and thus likely contributed to the improved model performance.
A reasonable agreement was seen between pan evaporation measurements (1967)(1968)(1969)(1970)(1971)(1972) and E p calculated for the land cover class 'Open' (representing short vegetation) (1979)(1980)(1981)(1982)(1983)(1984), with the latter amounting to 95% of the former. The pan evaporation measurements show higher regional variability than the estimates. This may, in part, be explained by the fact that the measurements represent points in the terrain with a varying degree of exposure (Hetager & Lystad ), while the estimates are provided    It should be noted that the estimates provided in the two climate synthesis reports were based on a simple, temperature-dependent E approximation (as outlined in HBV-

CONCLUSIONS
With the availability of updated and improved forcing data, a gridded version of the HBV model has been enhanced to include a physically based evaporation scheme, a net radiation-restricted degree-day factor approach for snow-melt, and a prescribed precipitation phase threshold based on These are the first steps among several that might be undertaken to improve process representation in the model providing more robust long-term water balance estimates in a changing climate. Additional constraints, which might further improve and constrain the physical process description and calibration of the model, include enhanced gridded precipitation data sets, further improved representation and parameterization of the land surface (soil, bedrock, and vegetation), and additional physically based model enhancements.