This study compared hydrological model performances under different sub-annual calibration schemes using two conceptual models, IHACRES and HYMOD. In several publications regarding sub-annual calibration, the authors showed that such an approach generally performed better than the conventional whole period method. Hence, there are advantages in dividing the data into sub-annual periods for calibration. However, little attention has been paid to the issue of how to calibrate the non-continuous sub-annual period. Unlike the conventional calibration which assumes time-invariant parameters for the calibration period, the model parameters vary in sub-annual calibration. We have explored two sub-annual calibration schemes, serial calibration scheme (SCS) and parallel calibration scheme (PCS). We assume that the relationships between the rainfall and runoff could be different for each sub-annual period and consider intra-annual variations of the system. The models are then evaluated for a different validation period to avoid over-fitting and the optimal sub-annual calibration period is explored. Overall, we have found that PCS performed slightly better than SCS and the optimal calibration periods are seasonal and bimonthly for IHACRES and biannual for HYMOD. Since there are pros and cons in both SCS and PCS, we recommend choosing the method depending on the purpose of the model usage.

## INTRODUCTION

Hydrological modelling is an essential tool for understanding the hydrological behaviour of a catchment (Madsen 2000; Wagener *et al*. 2003), and it is a complicated task (De Vos *et al*. 2010). The most common method for identifying the optimised model parameters is calibration against historical observation data. Objective functions, such as the Nash–Sutcliffe efficiency (Nash & Sutcliffe 1970), are used to measure the agreement between the observed and simulated flows. This calibration scheme is widely applied (Sorooshian 1991; Gan & Biftu 1996; Gupta *et al*. 1998, 2009). Validation, in which the model is tested with data outside the calibration period to evaluate its performance, is standard practice in hydrological modelling (Andréassian *et al*. 2009).

A recognised issue in hydrological modelling is uncertainty, which is attributed to model structural errors and parameterisation errors. The uncertainty due to model structural errors is generally quantified by using several different models, and numerous methods have been proposed to quantify the uncertainty of parameterisation. Time-varying parameters, which may arise from catchment change (such as land use/cover change), climate variability and climate change (such as a change in the evapotranspiration dynamics of vegetation due to higher or lower temperatures), may be another source of uncertainty. Such changes may be significant within a year (e.g., contrasting vegetation cover between winter and summer in many parts of the world). Recently, there have been some studies on the stability of model performance and the effect of parameter values (Xu 1999; Li *et al*. 2014; Patel & Rahman 2015; Yan & Zhang 2014). Several reasons for time-varying model parameters have been given (Merz *et al*. 2011). First, the hydrological model has structural errors and the calibrated parameters may change between time periods in order to compensate for them (Wagener *et al*. 2003). Second, changes in catchment characteristics (Brown *et al*. 2005), such as land use and vegetation variations (Merz & Blöschl 2009), can also lead to a change in the calibrated parameters. However, the correlation between parameters is complicated (Wagener 2007), which makes it hard to understand the reason for parameter changes over time (Wagener *et al*. 2010).

Therefore, most parameterisation schemes in hydrological models are based on the assumption that the parameters do not change for the entire calibration period. There has been some research about the adequate data length for calibration (Xu & Vandewiele 1994; Zeng *et al*. 2016). In addition, some researchers have attempted to develop more accurate models by adapting sub-annual calibration, which is based on seasonal or monthly time periods (i.e., non-continuous periods) in order to better simulate temporal variations of the varying catchment conditions within the year. In such a scheme, intra-annual variation of the data is taken into account. For example, Luo *et al*. (2012) examined ten different parameterisation schemes at catchments in Australia. Their results have shown that calibrating the model for each individual month separately produces better model performance than the other schemes, and this is particularly evident for the dry months when the flow is low and difficult to forecast. Levesque *et al*. (2008) conducted seasonal calibration, in which summer and winter seasons were calibrated separately. When the model was calibrated based on the summer (dry period) data, the model performance improved considerably. However, there was no advantage when only the winter (wet period) data were used for calibration compared with the conventional calibration method which used the entire data over the whole period. Paik *et al*. (2005), Kim & Lee (2014), Zhang *et al*. (2015) and Kim *et al*. (2016) also used the seasonal calibration method.

Similar to the sub-annual discrete calibration, Hartmann & Bardossy (2005) investigated the transferability of hydrological models by dividing the observation period into different climatic conditions (i.e., warm, cold, wet and dry). They conducted the calibration only for the chosen years which were discontinuous, and the model was running continuously for the entire observation period. De Vos *et al*. (2010) proposed clustering time series according to hydrological similarities and allowing the parameters to vary over the clusters during calibration. Seiller *et al*. (2012) selected five non-continuous hydrologic years for four contrasting climate conditions: dry/warm, dry/cold, humid/warm and humid/cold. Calibration was done on each period and validation was conducted on contrasting climate conditions. The model was kept running continuously on the entire time series while the optimisation was done only for the chosen years. The above-mentioned studies demonstrate that model parameter values can vary over time in accordance with seasonal variations and indicate that there are some advantages in using this scheme, although further investigation and improvements are needed.

In this study, regarding sub-annual period calibration, we focus on three issues that should be resolved and have not yet been considered in the literature. First, guidance on calibration schemes for non-continuous time periods should be provided. Most studies have conducted calibrations based on only the chosen sub-annual period (i.e., only the selected sub-annual period data are used for parameter optimisation in the objective function) while the entire time series data were used to run the model continuously. However, the downside of such an approach is that there are discontinuities in the time series of soil moisture when the separately calibrated flows from individual sub-annual periods are combined to examine the performance over the entire time period. Another calibration scheme is to optimise all of the varying parameters simultaneously. De Vos *et al*. (2010) divided the data into 12 clusters and set the model parameters differently for each of the 12 clusters, so that the number of degrees of freedom equalled 12 clusters times the number of parameters. These parameters were then optimised simultaneously. These different methodologies point towards the need for guidance on the non-continuous sub-period calibration scheme. Second, most non-continuous sub-period calibration studies are interested not in the model performance over the total time series but in that of particular time periods (e.g., dry season or wet season). The questions are therefore: ‘How can we assess the model efficiency of the entire time series using non-continuous sub-annual period calibration? Is it reasonable to just combine the flows calibrated separately?’ and ‘In this case, what about the issue of discontinuity in soil moisture simulation?’ Third, different sub-annual calibration periods may present different model efficiencies due to underfitting or overfitting, and there may be an optimal period which can be evaluated by using cross validation.

So far, there is no consensus method in the literature regarding how to calibrate non-continuous sub-annual time periods. Given such a background, this paper explores the following questions:

(1) What schemes are applicable for non-continuous sub-annual calibration?

(2) Which schemes are both logical and practical?

(3) What are the pros and cons of these schemes?

(4) What is the optimal time period for sub-annual calibration?

In this study, two sub-annual calibration schemes are employed to explore these questions: the serial calibration scheme (SCS) and the parallel calibration scheme (PCS). The sub-annual calibration has been performed on five different time scales: annual, biannual, seasonal, bimonthly and monthly. We assume that the system is changing for various reasons, such as vegetation change (e.g., seasonal change of deciduous vegetation and crop rotation) and soil structure change (e.g., seasonal soil compaction by farm animals or farming machines). Therefore the system response, i.e., the relationship between rainfall and runoff, may be different for different time scales and there may be an optimal time period for sub-annual calibration. In other words, different sets of varying parameters are optimised to consider intra-annual variations. As the number of sub-annual calibration periods increases (i.e., from annual to monthly), the model is more likely to fit the observations since it has more flexibility to cope with the change of the system. On the other hand, it is also more likely to fit the noise, which is the well-known trade-off between bias and variance in mathematical modelling. Therefore, the models are evaluated for different validation periods to overcome the overfitting (or over-parameterisation) issue and to explore the best sub-annual model.

Both calibration schemes are applied to one catchment located in the southwest of England. Since the main aim of this study is to introduce the concept and the logic of non-continuous sub-period calibration scheme, we believe only one catchment is sufficient to prove the concept. It is hoped that a wide application of the proposed methodology in other catchments under different conditions will help the hydrological community to find useful patterns on this important issue.

## CASE STUDY AREA AND THE HYDROLOGICAL MODELS

### Study area and data

The Thorverton catchment has an area of … km² and is a sub-catchment of the Exe catchment. The Exe catchment is located in the southwest of England with an area of 1,530 km² and an average annual rainfall of 1,088 mm. Figure 1 shows an overview of the Thorverton catchment area. Daily time series of the observed precipitation, potential evapotranspiration and flow data (1961–1990) over the Thorverton catchment were obtained from the UK Met Office, and daily temperature data were downloaded from the UKCP09 gridded observation data sets.

### Hydrological models

#### IHACRES

IHACRES is a conceptual rainfall–runoff model (… *et al*. 1993; Littlewood 1999; Letcher *et al*. 2001; Kim & Lee 2014). The model is composed of a non-linear module and a linear module, as shown in Figure 2, and the model parameters are listed in Table 1. The non-linear module converts rainfall to effective rainfall using the observed rainfall *r*_{k}, the mass balance *c*, the soil moisture index threshold *l* and the power on soil moisture *p*. The soil moisture is updated at each time step with a drying rate that depends on the drying rate at the reference temperature *τ*_{w}, the temperature modulation *f*, the reference temperature and the observed temperature.

The linear module assumes that there is a linear relationship between the effective rainfall and flow. Two components in this module, quick flow and slow flow, can be connected in parallel or in series. In this study, two parallel storages are used in the linear module because they reflect the catchment conditions. The streamflow at time step *k* is made up of the quick flow and the slow flow, each described by a recession rate *α* and a peak response *β*. The relative volumes of quick flow and slow flow follow from these parameters. Therefore, among the four linear module parameters (*α*_{q}, *α*_{s}, *β*_{q}, *β*_{s}), one is determined once the other three are known.

Module | Parameter | Description | Range |
---|---|---|---|
Non-linear | c | Mass balance | 0 to 0.04 |
Non-linear | τ_{w} | Reference drying rate | 0 to 50 |
Non-linear | f | Temperature modulation of drying rate | 0 to 4 |
Non-linear | l | Soil moisture index threshold | 0 to 50 |
Non-linear | p | Power on soil moisture | 0 to 3 |
Linear | α_{q}, α_{s} | Quick and slow flow recession rate | −1 to 0 |
Linear | β_{q}, β_{s} | Fractions of effective rainfall for peak response | −1 to 0 |
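
As an illustration only, the two IHACRES modules described above can be sketched in Python. The sketch assumes the commonly published form of the model; the constant 0.062, the reference temperature `t_ref`, the handling of the threshold *l*, the sign conventions and all function names are assumptions of this sketch and may differ from the equations used in this study.

```python
import math


def ihacres_effective_rainfall(rain, temp, c, tau_w, f, l, p, t_ref=20.0, s0=0.0):
    """Non-linear module: rainfall -> effective rainfall (assumed classic form)."""
    s_prev = s0  # soil moisture index at the previous time step
    effective, moisture = [], []
    for r_k, t_k in zip(rain, temp):
        # Drying rate modulated by temperature; 0.062 and t_ref are assumptions.
        tau_k = tau_w * math.exp(0.062 * f * (t_ref - t_k))
        tau_k = max(tau_k, 1.0)  # guard so that (1 - 1/tau_k) stays in [0, 1)
        s_k = r_k + (1.0 - 1.0 / tau_k) * s_prev
        # Effective rainfall is produced only above the threshold l.
        u_k = (c * (s_k - l)) ** p * r_k if s_k > l else 0.0
        effective.append(u_k)
        moisture.append(s_k)
        s_prev = s_k
    return effective, moisture


def beta_s_from_volumes(alpha_q, alpha_s, beta_q):
    """Fix beta_s so that the quick and slow flow volumes sum to one."""
    v_q = beta_q / (1.0 + alpha_q)  # relative volume of quick flow
    return (1.0 - v_q) * (1.0 + alpha_s)


def ihacres_linear_module(effective, alpha_q, alpha_s, beta_q, beta_s):
    """Linear module: two parallel stores, streamflow = quick flow + slow flow."""
    x_q = x_s = 0.0
    flow = []
    for u_k in effective:
        x_q = -alpha_q * x_q + beta_q * u_k
        x_s = -alpha_s * x_s + beta_s * u_k
        flow.append(x_q + x_s)
    return flow
```

With `beta_s` derived from the other three linear parameters, a unit pulse of effective rainfall is fully recovered as streamflow over a long enough run, which is one way to read the statement that only three of the four linear parameters are free.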

#### HYMOD

HYMOD is a conceptual rainfall–runoff model (… *et al*. 2001; Vrugt *et al*. 2003; De Vos *et al*. 2010). The model parameters are described in Table 2 and the model structure is illustrated in Figure 3. The water storage capacity *C* follows a cumulative distribution function governed by *C*_{max}, the maximum soil moisture storage capacity in the catchment, and *b*_{exp}, which controls the degree of spatial variability of the soil moisture capacity. The excess rainfall is treated as runoff, which is divided into quick flow and slow flow based on the partitioning factor *α*. The runoff is routed through three identical quick flow tanks and a parallel slow flow tank. The flow rates are determined by the recession coefficients *R*_{q} for the quick flow tanks and *R*_{s} for the slow flow tank.

Parameter | Unit | Range | Description |
---|---|---|---|
C_{max} | mm | 1–500 | Maximum soil moisture storage capacity |
b_{exp} | – | 0.01–1.99 | Spatial variability of soil moisture capacity |
α | – | 0.01–0.99 | Quick/slow flow distribution factor |
R_{s} | day | 0.01–0.99 | Recession coefficient for slow flow tank |
R_{q} | day | 0.01–0.99 | Recession coefficient for quick flow tank |
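
The storage capacity distribution and the routing structure described above can be sketched as follows. This is a minimal illustration, not the implementation used in this study: the power-law form of the distribution function is the one commonly quoted for HYMOD, the fraction *α* is assumed to go to the quick flow tanks, each tank is assumed to be a linear reservoir, and the function names are hypothetical.

```python
def storage_capacity_cdf(capacity, c_max, b_exp):
    """Assumed form F(C) = 1 - (1 - C / C_max) ** b_exp for 0 <= C <= C_max."""
    capacity = min(max(capacity, 0.0), c_max)
    return 1.0 - (1.0 - capacity / c_max) ** b_exp


def route_hymod(excess, alpha, r_q, r_s, n_quick=3):
    """Route excess rainfall through n_quick quick tanks in series and one slow tank."""
    quick = [0.0] * n_quick  # storages of the quick flow tanks
    slow = 0.0               # storage of the slow flow tank
    flow = []
    for e in excess:
        # Slow path: a share (1 - alpha) enters the slow tank (assumption).
        slow += (1.0 - alpha) * e
        q_slow = r_s * slow
        slow -= q_slow
        # Quick path: a share alpha cascades through the identical quick tanks.
        inflow = alpha * e
        for i in range(n_quick):
            quick[i] += inflow
            outflow = r_q * quick[i]
            quick[i] -= outflow
            inflow = outflow
        flow.append(q_slow + inflow)
    return flow, quick, slow
```

Because every tank only stores or releases water, the routed flow plus the water left in the tanks equals the total excess rainfall, which gives a simple check on any implementation of this structure.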

## METHODOLOGY

### Optimisation method

For the optimisation algorithm, we used the dynamically dimensioned search (DDS) (Tolson & Shoemaker 2007), which is a simple, single-objective, heuristic global search algorithm. DDS searches the parameter space globally at first and becomes increasingly local as the number of iterations approaches the maximum allowable number of simulations. The transition from global to local search is achieved by probabilistically decreasing the number of model parameters that are perturbed to form the neighbourhood. This helps the search to avoid poor local optima. Candidate parameter values are generated by perturbing the current solution in the randomly selected dimensions, with perturbation magnitudes randomly sampled from a normal distribution with zero mean. More details can be found in Tolson & Shoemaker (2007).
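
The search procedure described above can be sketched in a few lines of Python. This is a minimal sketch of DDS for a minimisation problem, not the code used in this study; the neighbourhood size `r = 0.2`, the reflection rule at the bounds, and the names `dds`, `objective`, `x0` and `seed` are assumptions of the sketch.

```python
import math
import random


def dds(objective, lower, upper, max_evals, x0=None, r=0.2, seed=0):
    """Minimise objective(x) within [lower, upper]; returns (best_x, best_f)."""
    rng = random.Random(seed)
    n = len(lower)
    if x0 is None:
        x0 = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
    best_x = list(x0)
    best_f = objective(best_x)  # the initial evaluation is extra to max_evals
    for i in range(1, max_evals + 1):
        # Probability of perturbing each dimension shrinks from 1 towards 0.
        prob = 1.0 - math.log(i) / math.log(max_evals) if max_evals > 1 else 1.0
        dims = [j for j in range(n) if rng.random() < prob]
        if not dims:
            dims = [rng.randrange(n)]  # always perturb at least one dimension
        candidate = list(best_x)
        for j in dims:
            sigma = r * (upper[j] - lower[j])
            value = best_x[j] + rng.gauss(0.0, sigma)
            # Reflect at the bounds; fall back to the bound if still outside.
            if value < lower[j]:
                value = lower[j] + (lower[j] - value)
                if value > upper[j]:
                    value = lower[j]
            elif value > upper[j]:
                value = upper[j] - (value - upper[j])
                if value < lower[j]:
                    value = upper[j]
            candidate[j] = value
        f = objective(candidate)
        if f <= best_f:  # greedy acceptance of equal or better candidates
            best_x, best_f = candidate, f
    return best_x, best_f
```

In a calibration setting, `objective` would wrap a model run and return the error between observed and simulated flow for a given parameter vector.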

### Model parameterisation schemes

The model error is measured by the root mean square error (RMSE) between the observed and simulated flows over the *N* days of the calibration period, with *i* denoting the *i*th day. In this study, only RMSE is used as the objective function. Exploration of the effect of different objective functions and their combinations in calibrating rainfall–runoff models can be another research topic to extend the current study (e.g., Jie *et al*. 2015).
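
A minimal sketch of the RMSE objective is given below. The optional `mask` argument is an assumption of this sketch: it restricts the error to the days of a chosen sub-annual period, so that only those days contribute to the objective even though the model has been run over the whole series.

```python
import math


def rmse(observed, simulated, mask=None):
    """Root mean square error over the days where mask is True (all days by default)."""
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have the same length")
    if mask is None:
        mask = [True] * len(observed)
    squared = [(o - s) ** 2 for o, s, keep in zip(observed, simulated, mask) if keep]
    if not squared:
        raise ValueError("no days selected")
    return math.sqrt(sum(squared) / len(squared))
```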

Calibration has been done for three time periods: 1960s, 1970s and 1980s. Normally a warm-up period (e.g., 1 or 2 years) is used during calibration to reduce the influence of the initial values of state variables. In this study, instead of using a warm-up period, we set the initial value of soil moisture as a parameter to be optimised to avoid the warm-up problem (this is fine for calibration, but not suitable for validation in which a warm-up period of 1 year is still needed because the initial value of soil moisture is not a model parameter).
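
The difference between the two schemes can be illustrated with a toy one-store model. In this sketch the serial scheme is a single continuous run in which the parameter switches with the sub-annual group, while the parallel scheme makes one continuous run per group with a fixed parameter and keeps each run only within its own sub-period. The toy model, the grouping of consecutive calendar months starting in January, and all names are assumptions made for illustration.

```python
def group_of_month(month, n_groups):
    """Map a calendar month (1-12) to a group index; n_groups must divide 12."""
    return (month - 1) // (12 // n_groups)


def toy_model(rain, months, params_by_group, n_groups, s0=0.0):
    """One linear store; the release fraction k depends on the group of each day."""
    s = s0
    flow, storage = [], []
    for r, m in zip(rain, months):
        k = params_by_group[group_of_month(m, n_groups)]
        s += r
        q = k * s
        s -= q
        flow.append(q)
        storage.append(s)
    return flow, storage


def run_scs(rain, months, params_by_group, n_groups):
    """Serial scheme: one run, state carried across sub-period boundaries."""
    return toy_model(rain, months, params_by_group, n_groups)


def run_pcs(rain, months, params_by_group, n_groups):
    """Parallel scheme: n separate runs, combined by sub-period afterwards."""
    flow = [0.0] * len(rain)
    storage = [0.0] * len(rain)
    for g in range(n_groups):
        fixed = [params_by_group[g]] * n_groups  # same parameter all year
        q_g, s_g = toy_model(rain, months, fixed, n_groups)
        for i, m in enumerate(months):
            if group_of_month(m, n_groups) == g:
                flow[i] = q_g[i]
                storage[i] = s_g[i]
    return flow, storage
```

When the groups share the same parameter the two schemes coincide. When they differ, the combined storage series of the parallel scheme can jump at a sub-period boundary, whereas the serial scheme keeps the storage continuous.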

### Evaluation of the parameterisation schemes and the optimal calibration period

… *y* is the calibration period, *c* is the type of calibration scheme and *m* is the number of groups (e.g., 12 for monthly calibration).

As previously noted, the initial value of the soil moisture is set as if it is a ‘model parameter’ and is optimised during calibration. However, in reality, the soil moisture is not a hydrological model parameter but a state variable, hence, the optimised initial soil moisture value estimated from the calibration period should not be used in the validation period. If we know the correlation between the soil moisture and the flow, the initial soil moisture value in the validation period could be estimated from the initial flow value. However, since this is not the main point of the study, we just assume that the initial soil moisture value evolves to an appropriate value after 1 year instead of building the soil moisture and flow relationship. Therefore, the first 1 year of the validation period has not been taken into account in evaluating the model efficiency.
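
The evaluation step can be sketched as follows, under stated assumptions. The first function drops a fixed warm-up of 365 days (leap years ignored) before computing the validation RMSE. The second averages validation scores over the calibration periods *y* for each scheme *c* and number of groups *m*, and picks the *m* with the lowest mean. The averaging rule, the function names and the layout of `scores` are assumptions of this sketch, not the exact measure used in this study.

```python
import math
from collections import defaultdict
from statistics import mean


def validation_rmse(observed, simulated, warmup_days=365):
    """RMSE over the validation period, excluding the first warmup_days."""
    obs = observed[warmup_days:]
    sim = simulated[warmup_days:]
    if not obs or len(obs) != len(sim):
        raise ValueError("series too short or of different lengths")
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def optimal_groups(scores):
    """scores maps (y, c, m) to a validation RMSE; returns {c: best m}."""
    by_scheme = defaultdict(lambda: defaultdict(list))
    for (y, c, m), value in scores.items():
        by_scheme[c][m].append(value)
    return {
        c: min(per_m, key=lambda m: mean(per_m[m]))
        for c, per_m in by_scheme.items()
    }
```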

## RESULTS OF MODEL EVALUATION

## DISCUSSION

### Sub-annual calibration and over-parameterisation

… (*a*_{1}, *a*_{2} … *a*_{12}). However, when these 12 separately calibrated models are combined (Figure 10(a)) to evaluate the annual performance, it is apparent that this combined model is different from a single model with 12 parameters (Figure 10(b)), since the model structures of the two are different. Therefore the two models, a combined model and one model with 12 parameters, are not equivalent and should not be confused.

… *μ* is a random sample from a Gaussian distribution. This flow series, generated from a stationary system, is then used to calibrate the hydrological model IHACRES.

Suppose, as a hypothesis, that a model (*m* parameters) which is separately calibrated on a sub-annual basis (*n* groups) is equivalent to a complicated single model with *m* × *n* parameters (Equation (12)). If so, the calibration error should decrease as the number of groups (*n*) increases. This is because, given that the hypothesis is true, the model becomes more complicated (i.e., one model with *m* × *n* parameters) when the number of groups *n* increases, which results in less error due to overfitting (Figure 11). The calibration has been performed on annual, biannual, seasonal, bimonthly and monthly time scales based on SCS. The model performances are presented in Table 3. The RMSE values are similar among the different time scales, which means that the sub-annual calibration scheme does not improve the model performance. Therefore, the hypothesis should be rejected. The reason for the lack of improvement in the complicated model is that the model structures are still the same although the sub-annual calibration has been performed on different time scales. Hence, the sub-annually (*n* groups) calibrated model (*m* parameters) is not equivalent to a single model with *m* × *n* parameters.

Calibration period | Annual | Biannual | Seasonal | Bimonthly | Monthly |
---|---|---|---|---|---|
RMSE (m³/s) | 4.24 | 4.23 | 4.31 | 4.40 | 4.33 |

Another possible concern is that a time-variant parameter is not a parameter any more but a state variable. This should not be a problem because such a time-variant parameter scheme has already been widely adopted in the ‘adaptive control’ (Tao 2003) which is used to control the system with varying parameters.

### Calibration of nonstationary system and the optimal calibration period

As mentioned in the previous section, if the system is stationary, there is no advantage in applying the sub-annual calibration scheme. In reality, however, the system is nonstationary and changes with time. The response of a nonstationary system may differ depending on the catchment change (e.g., seasonal vegetation change, seasonal soil structure change, etc.), which means that different hydrological model parameters may be needed. In such a changing system, the more groups the calibration data are divided into, the smaller the calibration error will be, since the model has more flexibility to fit the observations. Since there is a trade-off between bias and variance, there is an optimal calibration period between the annual and monthly models, as shown in Figure 8. It is understandable that IHACRES (eight parameters) has a shorter optimal calibration period than HYMOD (five parameters), since a model with more parameters can cope better with the change of the system.

### Comparison of soil moisture

From this result, it is found that realistic flow and soil moisture simulations may not both be achieved at the optimal state. Sometimes we may get the right answers (predicting the best flow for practical purposes) for the wrong reasons (unrealistic soil moisture). That is, the flow may be optimum at the expense of soil moisture due to the drawback of the hydrological model itself. There has been relevant research regarding this issue by Zhuo & Han (2016a, 2016b).

### Which calibration scheme is more reliable

We have explored three calibration schemes for hydrological modellers to choose from: (1) the conventional method, which considers inter-annual variations of the data, and (2) SCS and (3) PCS, which consider intra-annual variations of the data. Logically, the SCS should be better than the PCS since the model runs in series, but the results show that the PCS outperforms the SCS. This may be due in part to the difficulty of optimisation in a high-dimensional space and the interdependency of the numerous parameters in SCS.

The question, then, is which method is more reliable. Our recommendation depends on the purpose of the calibration. If one is interested in the flow volume, we recommend the PCS since it gives the best flow simulation and is easy to calibrate due to the small number of parameters, although there are discontinuities in the time series of soil moisture. On the other hand, when one is interested in soil moisture as well as flow, we recommend the SCS, which is more logical, performs only slightly worse in flow and shows continuity in the soil moisture simulation. However, the downside of this method is that it is less efficient and takes longer to calibrate than PCS, since the number of degrees of freedom equals the number of sub-annual periods (e.g., 12 for monthly calibration) times the number of parameters.
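
The size of the search space under each scheme follows from simple arithmetic, sketched below. The parameter counts (eight for IHACRES, five for HYMOD) are those quoted in the discussion above; the function name is hypothetical, and the initial soil moisture value that is also optimised during calibration is left out for simplicity.

```python
def degrees_of_freedom(n_params, n_groups, scheme):
    """Number of values searched in one optimisation run under each scheme."""
    if scheme == "SCS":
        # All sub-period parameter sets are optimised together.
        return n_params * n_groups
    if scheme == "PCS":
        # Each of the n_groups runs searches one parameter set only.
        return n_params
    raise ValueError("scheme must be 'SCS' or 'PCS'")


for model, n_params in (("IHACRES", 8), ("HYMOD", 5)):
    for label, n_groups in (("seasonal", 4), ("monthly", 12)):
        scs = degrees_of_freedom(n_params, n_groups, "SCS")
        pcs = degrees_of_freedom(n_params, n_groups, "PCS")
        print(f"{model} {label}: SCS searches {scs} values at once, "
              f"PCS searches {pcs} values in each of {n_groups} runs")
```

For monthly calibration this gives 96 simultaneous values for IHACRES and 60 for HYMOD under SCS, against 8 and 5 per run under PCS.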

## CONCLUSIONS

This study has compared the hydrological model performances under different sub-annual period calibration schemes using two conceptual models, IHACRES and HYMOD. There are some studies regarding sub-annual period calibration methods, but little attention has been paid to the issue of how to calibrate (e.g., in series or in parallel) the non-continuous time series and what is an optimal time period for sub-annual calibration. It is, therefore, important to investigate reliable calibration schemes for non-continuous sub-annual period calibration. In this study, we have proposed two alternative approaches to calibrate hydrological models by sub-annual calibration schemes in which unique model parameter sets are estimated for each sub-period (annual, biannual, seasonal, bimonthly and monthly). In one approach, unique parameter sets for each sub-period are calibrated simultaneously and parameter values are thus changed for each sub-period – the SCS. In the second approach, the model is calibrated *n* (the number of groups, e.g., four for seasonal) times but only using data from one sub-period at the time. The *n* models are then combined to get the result of a complete time series – the PCS. Then, three-fold cross validation is used to find the optimal sub-annual calibration period; it is possible that this optimal calibration period is related to local catchment change and the purpose of the data usage. Overall, we have found that the model calibrated on the sub-annual period schemes generally performs better than the model calibrated in the conventional way, which implies that it is worth considering intra-annual variations in calibrating hydrological models in a changing environment. For the study catchment, from the flow point of view (predicting flow only for practical purposes), the optimal calibration periods for IHACRES are bimonthly and seasonal for SCS and PCS, respectively. For HYMOD, biannual is the best sub-annual model for both SCS and PCS. 
However, from the soil moisture point of view, dividing the sub-annual calibration period sometimes may not produce a realistic soil moisture pattern, which indicates improvement of the hydrological model structure is needed to achieve both the simulated flow and soil moisture at the optimal state. Therefore, not only the simulated flow but also the soil moisture needs to be considered to find the optimal calibration period. Among the dynamic calibration schemes, PCS performed slightly better than SCS. Since there are pros and cons in both SCS and PCS, we recommend choosing the method depending on the purpose of the sub-period calibration. Although the study catchment is specific to southwest England, the methodology proposed in this study is generic and applicable to other catchments. Since only one catchment is explored in this investigation, it is clear that such a study has not completely solved this problem. We hope this paper will stimulate the hydrological community to explore a variety of sites with different hydrological models so that valuable experiences and knowledge can be gained to improve our understanding about such a complex model calibration issue.

## ACKNOWLEDGEMENTS

The first author is grateful for the financial support from the Government of Republic of Korea in carrying out his PhD study at the University of Bristol. The data used in this study are available upon request from the corresponding author via email (kk12496@bristol.ac.uk).