Conceptual rainfall–runoff models are widely used to understand the hydrologic responses of catchments of interest. Modellers calculate the model performance statistics for the calibration and validation periods to investigate whether these models serve as satisfactory representations of the natural hydrologic phenomenon. Another useful method to investigate model suitability is sensitivity analysis (SA), which investigates structural uncertainty in the models. However, a comprehensive method is needed, which led us to develop a model suitability index (MSI) by combining the results of model performance statistics and SA. Here, we assessed and compared the suitability of three rainfall–runoff models (GR4J, IHACRES and Sacramento model) for seven Korean catchments using MSI. MSI showed that the GR4J and IHACRES models are suitable, having more than 0.5 MSI, whereas the Sacramento has less than 0.5 MSI, representing unsuitability for most of the Korean catchments. The MSI developed in this study is a quantitative measure that can be used for the comparison of rainfall–runoff models for different catchments. It uses the results of existing model performance statistics and sensitivity indices; hence, users can easily apply this index to their models and catchments to investigate suitability.
INTRODUCTION
Conceptual rainfall–runoff models are widely used to understand and predict the hydrologic responses of catchments of interest. In particular, parsimonious conceptual rainfall–runoff models have benefits of ease of use because of a smaller number of input data required, a faster computational time with reasonable results and a lower structural identifiability problem (Shin et al. 2015). The range of study using the conceptual models is broad: parameter calibration and validation (Vansteenkiste et al. 2014b; Willems 2014; Willems et al. 2014), comparison of objective functions (Jie et al. 2016), sensitivity analysis (SA) (Shin et al. 2013; Van Hoey et al. 2014; Massmann & Holzmann 2015), uncertainty analysis (Wagener et al. 2003; Blasone et al. 2008; Clark et al. 2008; Hailegeorgis & Alfredsen 2015; Shin et al. 2015), regionalisation (Li et al. 2014; Singh et al. 2014; Zhang et al. 2014) and climate change (Chiew et al. 2009; Vaze et al. 2010; Vansteenkiste et al. 2014a; Shin et al. 2016). These various studies demonstrate the usefulness of conceptual rainfall–runoff models.
To investigate whether the conceptual models sufficiently replicate the observed hydrographs, a routine method – the split-sampling test (Klemeš 1986) – is applied for the calibration and validation periods, and model performance statistics are calculated for these periods. Modellers assess model performance using various objective functions and performance statistics to decide the suitability of the models to their study catchments. Hydrologic models with more parameters tend to have higher performance statistics compared to parsimonious models, but more complex models could have a serious interaction between parameters, which causes uncertainty in the results (Shin et al. 2015).
Another useful method to investigate and compare the suitability of models is SA. If all or many parameters are sensitive, then the models are structurally sound, which means that modellers can trust the results from these models. SA is an essential step to assess model parameters and, therefore, to investigate the model structure (Shin et al. 2013). The sensitivity of model parameters may vary for different study areas (Shin et al. 2013); hence, every study that makes use of hydrologic models with new catchment data needs SA. SA is categorised into global and local methods, and global SA methods are appropriate if there are interactions between model parameters, which results in a non-linear output (Saltelli & Annoni 2010).
Suitable rainfall–runoff models should have both good model performance and less uncertainty. There were several studies and discussions about the suitability or acceptability of models in terms of model performance and structural uncertainty (Beven & Freer 2001; Vrugt et al. 2003; Wagener 2003); however, these are not enough for users to determine which model should be selected as the suitable one. The aim of this study is to provide a guide for suitable model selection considering both model performance and uncertainty. To provide a clearer and easier guide, we developed the model suitability index (MSI) by coupling the results of model performance statistics and global SA. MSI is a quantitative index; hence, users can compare the different rainfall–runoff models using this measure. We compared three conceptual rainfall–runoff models using MSI for seven Korean catchments to investigate which models are suitable or require caution.
The subjects of each section are as follows: the second section describes the SA method, model performance criteria and MSI; the third section briefly describes rainfall–runoff models, catchments, data, the target functions for SA, the calibration method and the method used for this study; the fourth section shows the results of SA, performance statistics and MSI for the three conceptual rainfall–runoff models; and the final section provides a discussion and the conclusions of this study.
SA METHOD, MODEL PERFORMANCE CRITERIA AND MSI
Sobol method for SA
The Sobol method (Sobol 1993) is a variance-based quantitative global SA algorithm that has been applied to various studies (e.g., Nossent et al. 2011; Yang 2011; Herman et al. 2013; Shin et al. 2013). Through the decomposition of the model output variance, this method estimates the relative contributions of individual parameters and parameter interactions to the output variance. The sensitivity of a given parameter is, as a result, quantified by a ratio ranging from 0 to 1. This method calculates a first-order sensitivity index (FSI) and a total sensitivity index (TSI). FSI measures the sensitivity effect of each parameter on the output, whereas TSI measures the effect of each parameter and its interaction with the other parameters on the output (Saltelli et al. 2008).
We applied Saltelli's scheme (Saltelli 2002) in the R ‘sensitivity’ package (Pujol et al. 2012), which calculates TSI using the reduced number of samples from n(2k + 2) to n(k + 2). Here, n represents the initial sample size used (10,000 in this study), and k is the number of the parameters of the hydrological model. Hence, the total number of samples generated is 60,000 for four parameter models, and 70,000 and 150,000 for five and 13 parameter models, respectively. More details of the sampling method are described in the work by Shin et al. (2013).
Model performance criteria
The other two objective functions are NSElog, which is NSE with log-transformed data and 0.5(NSE + NSElog). The NSE objective function relatively focuses on fitting high flow (Krause et al. 2005), whereas the NSElog objective function relatively focuses on fitting low flow caused by the log transformation of time series data. The objective function of 0.5(NSE + NSElog) focuses on fitting medium flow by giving equal weight on NSE and NSElog.
Model suitability index
We gave the same weight to the averaged SR and PS for the calculation of MSI because we believed that both measures are equally important. The SR is as important as the PS because over-parameterisation due to insensitive parameters is a well-known problem in rainfall–runoff models with large numbers of parameters (van Griensven et al. 2006), and this uncertainty problem could cause the uncertain model simulation results, such as producing equifinal multiple simulation hydrographs for the calibration period (Beven 1993) but producing unequifinal multiple simulation hydrographs for the validation periods by applying the multiple optimal parameter sets from the calibration, which have significantly variable parameter values (Shin et al. 2015). Therefore, the modellers could have problems with decision-making due to this prediction uncertainty. The SR can be coupled with the PS because we used the quantitative SA method of Sobol for SR. Therefore, the calculated MSI values can be used for the comparison of any model with any catchment data. If the MSI value is equal to 1, then it means that all the parameters are sensitive and that the simulated hydrographs are perfectly matched to the observed hydrographs.
RAINFALL–RUNOFF MODELS, CATCHMENTS, DATA, TARGET FUNCTIONS AND THE CALIBRATION METHOD
Rainfall–runoff models
We used three well-known conceptual rainfall–runoff models with different complexities from four to 13 parameters. The GR4J model (Perrin et al. 2003) with four parameters uses two unit hydrographs and two stores for the production and routing of water. The storage of rainfall, evapotranspiration and percolation in the surface soil are controlled by the production store, and the routing of effective rainfall is controlled by the routing store. In the routing store, effective rainfall is separated by the ratio of 0.9 and 0.1, and then, two unit hydrographs route the portioned effective rainfall. Quick flow is generated by 10% of effective rainfall, and slow flow is generated using 90% of effective rainfall (Le Moine et al. 2008). The GR4J model has been applied in various studies (Moussu et al. 2011; Andréassian et al. 2012; Pushpalatha et al. 2012; Shin et al. 2013, 2015).
The IHACRES model has many versions, which are used in various studies (Viney et al. 2009; Vaze et al. 2010; Post et al. 2012; Shin et al. 2013, 2015). We used the catchment moisture deficit (CMD) version (Croke & Jakeman 2004) as it has more physical meaning for the parameters. The CMD version of the IHACRES model calculates the varying catchment moisture at each time step. There are two stores in the IHACRES model: the nonlinear store for the generation of effective rainfall using nonlinear function to deal with the raw rainfall and using accounting equation to calculate the CMD output, and the linear store to convert effective rainfall into quick and slow flow using unit hydrographs. The summation of the quick and slow flow at each time step becomes the total flow. Table 1 shows the parameters of the IHACRES model, and we fixed the ‘e’ parameter as unity as it directly uses the potential evapotranspiration data instead of the temperature data. Hence, five parameters are used for the IHACRES model.
Parameter name . | Parameter no. . | Unit . | Range . | Description . |
---|---|---|---|---|
GR4J | ||||
X1 | 1 | mm | 50–5,000 | Maximum capacity of the production store |
X2 | 2 | mm | −15 to 4 | Groundwater exchange coefficient |
X3 | 3 | mm | 10–1,300 | 1-day-ahead maximum capacity of the routing store |
X4 | 4 | day | 0.5–5 | Time base of the unit hydrograph, UH1 |
IHACRES-CMD | ||||
f | 1 | – | 0.5–1.3 | CMD stress threshold as a proportion of d |
e | – | – | 1 (fixed) | Temperature to potential evapotranspiration (PET) conversion factor |
d | 2 | mm | 50–550 | CMD threshold for producing flow |
τs (tau_s) | 3 | day | 10–1,000 | Time constant for slow flow store |
τq (tau_q) | 4 | day | 0–10 | Time constant for quick flow store |
vs (v_s) | 5 | – | 0–1 | Fractional volume for slow flow |
Sacramento | ||||
UZTWM | 1 | mm | 1–150 | Upper zone tension water maximum capacity |
UZFWM | 2 | mm | 1–150 | Upper zone free water maximum capacity |
UZK | 3 | 1/day | 0.1–0.5 | Upper zone free water lateral depletion rate |
PCTIM | 4 | – | 0.000001–0.1 | Fraction of the impervious area |
ADIMP | 5 | – | 0–0.4 | Fraction of the additional impervious area |
ZPERC | 6 | – | 1–250 | Maximum percolation rate coefficient |
REXP | 7 | – | 0–5 | Exponent of the percolation equation |
LZTWM | 8 | mm | 1–500 | Lower zone tension water maximum capacity |
LZFSM | 9 | mm | 1–1,000 | Lower zone supplementary free water maximum capacity |
LZFPM | 10 | mm | 1–1,000 | Lower zone primary free water maximum capacity |
LZSK | 11 | 1/day | 0.01–0.25 | Lower zone supplementary free water depletion rate |
LZPK | 12 | 1/day | 0.0001–0.25 | Lower zone primary free water depletion rate |
PFREE | 13 | – | 0–0.6 | Direct percolation fraction from the upper to the lower zone free water storage |
SIDE | – | – | 0.0 (fixed) | Fraction of base flow that is draining to areas other than the observed channel |
RSERV | – | – | 0.3 (fixed) | Fraction of the lower zone free water that is unavailable for transpiration purposes |
RIVA | – | – | 0.0 (fixed) | Fraction of the riparian vegetation area |
Parameter name . | Parameter no. . | Unit . | Range . | Description . |
---|---|---|---|---|
GR4J | ||||
X1 | 1 | mm | 50–5,000 | Maximum capacity of the production store |
X2 | 2 | mm | −15 to 4 | Groundwater exchange coefficient |
X3 | 3 | mm | 10–1,300 | 1-day-ahead maximum capacity of the routing store |
X4 | 4 | day | 0.5–5 | Time base of the unit hydrograph, UH1 |
IHACRES-CMD | ||||
f | 1 | – | 0.5–1.3 | CMD stress threshold as a proportion of d |
e | – | – | 1 (fixed) | Temperature to potential evapotranspiration (PET) conversion factor |
d | 2 | mm | 50–550 | CMD threshold for producing flow |
τs (tau_s) | 3 | day | 10–1,000 | Time constant for slow flow store |
τq (tau_q) | 4 | day | 0–10 | Time constant for quick flow store |
vs (v_s) | 5 | – | 0–1 | Fractional volume for slow flow |
Sacramento | ||||
UZTWM | 1 | mm | 1–150 | Upper zone tension water maximum capacity |
UZFWM | 2 | mm | 1–150 | Upper zone free water maximum capacity |
UZK | 3 | 1/day | 0.1–0.5 | Upper zone free water lateral depletion rate |
PCTIM | 4 | – | 0.000001–0.1 | Fraction of the impervious area |
ADIMP | 5 | – | 0–0.4 | Fraction of the additional impervious area |
ZPERC | 6 | – | 1–250 | Maximum percolation rate coefficient |
REXP | 7 | – | 0–5 | Exponent of the percolation equation |
LZTWM | 8 | mm | 1–500 | Lower zone tension water maximum capacity |
LZFSM | 9 | mm | 1–1,000 | Lower zone supplementary free water maximum capacity |
LZFPM | 10 | mm | 1–1,000 | Lower zone primary free water maximum capacity |
LZSK | 11 | 1/day | 0.01–0.25 | Lower zone supplementary free water depletion rate |
LZPK | 12 | 1/day | 0.0001–0.25 | Lower zone primary free water depletion rate |
PFREE | 13 | – | 0–0.6 | Direct percolation fraction from the upper to the lower zone free water storage |
SIDE | – | – | 0.0 (fixed) | Fraction of base flow that is draining to areas other than the observed channel |
RSERV | – | – | 0.3 (fixed) | Fraction of the lower zone free water that is unavailable for transpiration purposes |
RIVA | – | – | 0.0 (fixed) | Fraction of the riparian vegetation area |
We used 13 instead of 16 parameters for the Sacramento model (Burnash et al. 1973), which is the same number of parameters used by Shin et al. (2013). The Sacramento model has five runoff components: direct runoff from an impervious area, surface runoff, interflow, supplementary base flow and primary base flow. For a brief description of the process, excess rainfall becomes runoff through a unit hygrograph, and the rest of the rainfall fills various depths of soil moisture stores, which are interconnected. Loss through evapotranspiration happens at the soil stores, and the rest of the water becomes interflow and groundwater. The summation of the surface and lateral flow become streamflow (Podger 2004). The Sacramento model has been used in many studies including those of Shin et al. (2013, 2015), Vaze et al. (2010), Van Werkhoven et al. (2009), and Petheram et al. (2012).
Table 1 shows the descriptions of the parameters of the three rainfall–runoff models. The parameter ranges for the GR4J and Sacramento models are the same as those used by Shin et al. (2013); hence, readers can refer to Shin et al. (2013) for more description. We used the hydrologic models, objective functions and calibration algorithm in the Hydrological Model Assessment and Development (Hydromad) (Andrews et al. 2011) modelling package, which is an open-source software package in R. The Hydromad package is available at http://hydromad.catchment.org.
Catchments
Data
The daily areal rainfall and potential evapotranspiration data for the seven catchments were used as input data for the three conceptual rainfall–runoff models, and the observed flow data were used for the calibration and validation of the model parameters. The areal rainfall generated by Thiessen polygon method using data from multiple rainfall gauge stations, and the observed flow data were taken from the Water Resources Management Information System (http://www.wamis.go.kr). The potential evapotranspiration data were generated by the Penmann–Monteith method in the ‘SPEI’ package (Beguería & Vicente-Serrano 2015) using the daily point gauge climate data (maximum and minimum temperatures, wind speed and sunshine hours) from the Korea Meteorological Administration (http://www.kma.go.kr). We carefully selected the climate point gauge stations, which contain both long enough periods and good quality of data. The climate point gauge stations used for the Soyanggang, Chungju, Andong, Imha, Yongdam, Seomjingang and Juam catchments were the Chuncheon, Chungju, Yeongju, Andong, Jangsu, Imsil and Gwangju stations, respectively.
The total data periods are different for the seven dam catchments because of the different years of dam construction. To use the data efficiently, the data were split, having the same length of sub-periods for each catchment but different sub-periods for different catchments (Table 2). Hence, the calibration and validation periods were the same for each catchment but different over the seven catchments. The duration of calibration and validation periods were selected for more than 6 years considering the studies of Anctil et al. (2004), Kim et al. (2011) and Yapo et al. (1996). For calibration and validation, 1 year warm-up period was used, that is, the warm-up period for the P1 period of the Soyanggang catchment in Table 2 is January 1, 1974, to December 31, 1974.
Catchment . | Period name . | Data period . |
---|---|---|
Soyanggang | P1 | January 1, 1975–December 31, 1984 |
P2 | January 1, 1985–December 31, 1994 | |
P3 | January 1, 1995–December 31, 2004 | |
P4 | January 1, 2005–December 31, 2014 | |
Chungju | P1 | January 1, 1988–December 31, 1996 |
P2 | January 1, 1997–December 31, 2005 | |
P3 | January 1, 2006–December 31, 2014 | |
Andong | P1 | January 1, 1979–December 31, 1987 |
P2 | January 1, 1988–December 31, 1996 | |
P3 | January 1, 1997–December 31, 2005 | |
P4 | January 1, 2006–December 31, 2014 | |
Imha | P1 | January 1, 1995–December 31, 2004 |
P2 | January 1, 2005–December 31, 2014 | |
Yongdam | P1 | January 1, 2003–December 31, 2008 |
P2 | January 1, 2009–December 31, 2014 | |
Seomjingang | P1 | January 1, 1979–December 31, 1987 |
P2 | January 1, 1988–December 31, 1996 | |
P3 | January 1, 1997–December 31, 2005 | |
P4 | January 1, 2006–December 31, 2014 | |
Juam | P1 | January 1, 1993–December 31, 2003 |
P2 | January 1, 2004–December 31, 2014 |
Catchment . | Period name . | Data period . |
---|---|---|
Soyanggang | P1 | January 1, 1975–December 31, 1984 |
P2 | January 1, 1985–December 31, 1994 | |
P3 | January 1, 1995–December 31, 2004 | |
P4 | January 1, 2005–December 31, 2014 | |
Chungju | P1 | January 1, 1988–December 31, 1996 |
P2 | January 1, 1997–December 31, 2005 | |
P3 | January 1, 2006–December 31, 2014 | |
Andong | P1 | January 1, 1979–December 31, 1987 |
P2 | January 1, 1988–December 31, 1996 | |
P3 | January 1, 1997–December 31, 2005 | |
P4 | January 1, 2006–December 31, 2014 | |
Imha | P1 | January 1, 1995–December 31, 2004 |
P2 | January 1, 2005–December 31, 2014 | |
Yongdam | P1 | January 1, 2003–December 31, 2008 |
P2 | January 1, 2009–December 31, 2014 | |
Seomjingang | P1 | January 1, 1979–December 31, 1987 |
P2 | January 1, 1988–December 31, 1996 | |
P3 | January 1, 1997–December 31, 2005 | |
P4 | January 1, 2006–December 31, 2014 | |
Juam | P1 | January 1, 1993–December 31, 2003 |
P2 | January 1, 2004–December 31, 2014 |
Target functions for SA
Calibration method
The shuffled complex evolution (SCE) (Duan et al. 1992, 1993) global optimisation method was used to calibrate the model parameters. The SCE algorithm is one of the popular methods for parameter calibration (Nicklow et al. 2010), and it has been used for various studies until recently (Serrat-Capdevila et al. 2011; Moreno et al. 2012; Shin et al. 2015, 2016). The SCE algorithm combines the strength of four methods: simplex procedure (Nelder & Mead 1965), controlled random search (Price 1987), competitive evolution (Holland 1975) and the concept of complex shuffling (Duan et al. 1992). A brief description of the procedure of the SCE method is as follows:
Step 1: Select the initial ‘population’ randomly throughout the feasible parameter space.
Step 2: Divide the ‘population’ into ‘complexes’, which represent communities that have parents to generate an offspring.
Step 3: Evolve each complex independently by applying a competitive complex evolution strategy.
Step 4: Shuffle the evolved complexes into a single, whole population to share the information.
Step 5: Iterate Steps 2–4 until the results are converged to the preselected threshold.
Method used
First, we performed SA by applying the Sobol method for the parameters of the three models using the three target functions and the daily data of the seven catchments. Then, we investigated whether each parameter has worked properly for its own purpose and the over-parameterisation problem by counting the number of sensitive parameters.
Second, we calibrated the parameters of the three models by applying the SCE algorithm using the three objective functions and the daily data of the seven catchments. Then, we calculated the model performance statistics for the calibration and validation periods, and compared the three models. To investigate the overall model ability of replication of natural flow and prediction, we generated boxplots using the performance statistics for each model of the seven catchments, and the objective function grouped by calibration and validation periods. A larger dispersion of the boxplot for the validation period represents a lower prediction ability.
Lastly, we assessed the suitability of the rainfall–runoff models by applying MSI. We compared the three conceptual models for the seven catchments using the bar plot of MSI. The bar plot is an easy and quick method to investigate which models are suitable and which models require caution.
RESULTS
SA for the parameters of the three conceptual models
For the GR4J model, parameter no. 4 (X4) related to the unit hydrograph is sensitive to the high flow target function (Figure 3(a)), and parameter no. 2 (X2) related to the groundwater is sensitive to the low flow target function (Figure 3(b)). Parameter no. 3 (X3) related to the routing store of 1-day-ahead is more sensitive to the low flow target function (Figure 3(c)) as it is related to the antecedent soil moisture condition. Parameter no. 1 (X1) is always sensitive regardless of the target functions except for the Imha and Seomjingang catchments for the NSElog* target function (Figure 3(b)) because it is for producing the streamflow. It is worth noting that the NSElog* target function is a statistical measure, which calculates the overall performance by comparison of modelled and observed streamflow, therefore, it sometimes produces ‘type I error’ (incorrectly classifying a sensitive parameter as being insensitive). As the F20 target function is a scalar value calculated from the modelled streamflow time series, it is sometimes more reliable to judge the sensitivity of the low flow parameter compared to the NSElog* target function. Figure 3(c) shows that parameter no. 1 (X1) is sensitive to the F20 target function; hence, this parameter is sensitive to the low flow target function. Also note that this parameter could be less sensitive to the F20 target function if the characteristics of catchment and data are very dry; therefore, there is not enough information in the data for SA as demonstrated by Shin et al. (2013). The seven study catchments and data have wet characteristics as mentioned in the sub-section ‘Catchments’, therefore, these catchments are not cases for less sensitivity for parameter no. 1 (X1). Nos. 3 (for Juam), 1 (for Chungju) and 3 (for Yongdam) parameters for the NSE* target function (Figure 3(a)) are near the diagonal line that represents similar maximum and minimum TSIs for all periods. This model would not have insensitive nos. 1 and 3 parameters for this target function if a different data period for these catchments was used.
For the IHACRES model, parameter no. 5 (vs) related to slow flow is sensitive to all target functions (Figure 4) as this parameter is also directly related to the quick flow calculated as 1 – the volume of quick flow. Parameter no. 4 (τq) related to the quick flow is sensitive to the NSE* target function, and parameter no. 1 (f) related to the evapotranspiration is relatively sensitive to the NSElog* and F20 target functions with respect to the maximum TSI as expected. Parameter no. 2 (d), which is related to the soil depth for flow production, is more sensitive to the NSE* target function with respect to the maximum TSI than the other target functions related to the low flow as the change of high flow magnitude by the change of this parameter is relatively larger than the change of low flow magnitude. Parameter no. 3 (τs) related to the slow flow is more sensitive to the F20 target function (Figure 4(f)) compared to the sensitivity of this parameter to the NSElog* target function (Figure 4(d)) with respect to the maximum TSI. However, the minimum TSI values in Figure 4(d) and 4(f) have much lower TSI values, which means that the IHACRES model could be relatively weak for the low flow simulation. The parameter nos. 1 and 3 for the Juam and Seomjingang catchments are sensitive to the F20 target function in terms of the maximum TSI (Figure 4(e) and 4(f)) but the sensitivities of these parameters are below the threshold for the Chungju catchment (Figure 4(f)). This represents that the parameter sensitivity is dependent on the catchment's characteristics.
For the Sacramento model, parameter nos. 2, 5, 8, 9, 11 and 12 are relatively sensitive (at least one parameter is sensitive) to the NSE* target function with respect to the maximum TSI; parameter nos. 1, 12 and 13, to the NSElog* target function; and parameter nos. 1, 11 and 12, to the F20 target function. Hence, eight parameters (nos. 1, 2, 5, 8, 9, 11, 12 and 13) are relatively sensitive, and five parameters (nos. 3, 4, 6, 7 and 10) are insensitive to both high and low flow target functions. These insensitive parameters could cause an identifiability problem through a serious interaction with the other parameters (Shin et al. 2015). Therefore, the Sacramento model has parametric and structural identifiability problems.
Boxplots to assess the overall sensitivity of the parameters
Comparison of the three model performances for the three objective functions
In Figure 9, for the NSE objective function, the three models have good model performances over the seven catchments except for the P2 periods of the Seomjingang catchment. For the P2 periods, the three models have relatively low model performances for the calibration and validation periods, which also occurs in Figures 10 and 11. It could be due to the significantly lowered runoff ratio from 1990 to 1994 in Figure 2 (5-year moving average) or a significant error in the observed data.
The GR4J model, with only four parameters, dominates the other two models in terms of model performances with the NSE objective function for four catchments (Soyanggang, Andong, Imha and Juam) and has similar model performances for two catchments (Yongdam and Seomjingang) (Figure 9). For the Chungju catchment, the GR4J model has a slightly lower model performance than the other models, but still has a good performance having more than 0.7 NSE. Overall, the Sacramento and IHACRES models follow the GR4J model, with a reasonable model performance for high flow. The GR4J model has a higher or similar model performance for the 0.5(NSE + NSElog) and NSElog objective functions over the seven catchments compared to the other two models.
For a more detailed analysis of the Chungju catchment, the GR4J model has a similar or slightly lower model performance than the other models for the NSE objective function, but this model has a similar model performance for the 0.5(NSE + NSElog) objective function, and a slightly higher or similar model performance for the NSElog objective function. The model performances of the IHACRES model with the NSE objective function is higher than with the NSElog objective function, whereas the model performances of the GR4J are opposite, which implies that the GR4J model is more suitable for the low flow simulation and the IHACRES model is more suitable for the high flow simulation for some catchments. These results complement the study of Shin et al. (2013). For the P3 calibration period and other periods for validation, the Sacramento model has similar or better model performances for the NSE and 0.5(NSE + NSElog) objective functions but a lower model performance for the NSElog objective function than the other two models. It means that more parameters for groundwater in the Sacramento model have not functioned enough for low flow fitting due to the insensitivity discussed in the sub-section ‘Boxplots to assess the overall sensitivity of the parameters’.
Boxplots to assess the overall replicability and predictability of the models
In Figure 12, the GR4J model has a higher or similar model performance overall. The median value of NSE is higher than the other two models. The range of the boxplot for the 0.5(NSE + NSElog) performance statistic is narrower, which represents a more stable medium flow replicability than the other models. The Sacramento model follows the GR4J model (see the median value of the boxplot), and the IHARES model has a relatively lower model performance but has a narrow range over the objective functions, representing a more stable replicability of the catchment response.
In Figure 13, the GR4J model has the highest median value of the boxplot for NSE values, but the range of NSE values is relatively higher than the other models; hence, it has a relatively unstable prediction ability for high flow. However, the difference of the NSE value range between the GR4J model and Sacramento model is less than 0.1, representing little difference. This phenomenon was also found in the results for the 0.5(NSE + NSElog) and NSElog objective functions. The model performances of the Sacramento model follow those of the GR4J model, and the IHACRES model also shows reasonable model performances with more than 0.5 of performance statistics.
Comparison of rainfall–runoff models using MSI
It is hard to judge which model is more suitable when one considers the sensitivity indices and performance statistics from the result of the sub-sections above. For example, the GR4J model had good and clear parameter sensitivities capturing their purposes, and the median values of the model performance statistics were higher than the other models. However, the range of the performance statistics were sometimes wider than the other models; hence, the simulation uncertainty was higher. The IHACRES model had better parameter sensitivities than the Sacramento model but the median values of the model performance statistics were similar to or lower than the Sacramento model. The higher narrowness of the performance range compared to the Sacramento model depended on the objective functions. The Sacramento model had more uncertainty due to possessing more insensitive parameters than the other models, but the median values of the model performance statistics were better than the IHACRES model. The MSI, which comprehensively considers both sensitivity indices and performance statistics, has the strength of ease and clearness to judge the superiority and inferiority of the models compared to the other two methods.
We decided that 0.5 of MSI is the threshold of good MSI as NSE values greater than 0.5 are good (Moriasi et al. 2007). Moreover, the study of van Werkhoven et al. (2009) undertook the Sobol SA of the Sacramento model in order to fix insensitive parameters that have TSIs less than 0.2 threshold while maintaining the same predictive performance to the non-fixed parameter case. As a result, this sensitivity threshold reduced around half of the total number of Sacramento parameters. As more fixed parameters result in the poorer model performances, we believed that the model with more than half of the sensitive parameters for the arbitrary target function is suitable. As described in the sub-section ‘MSI’, we give the same weight to model performance and sensitivity; therefore, the threshold value of good MSI is 0.5.
DISCUSSION AND CONCLUSIONS
MSI considers both model performance and uncertainty quantitatively; hence, it can be applied to any model and any catchment for the purpose of comparison. Furthermore, this index is very flexible as it uses various target functions and objective functions for sensitive analysis and performance statistics, respectively. This reliability of MSI can be improved by adding more sensitivity indices and performance statistics. We used the quantitative SA method of Sobol, but the qualitative SA method, such as the Morris (1991) method, cannot be used for MSI calculation as MSI comparisons between models or catchments are not possible. We gave the same weight to the model performance and sensitivity results, but this weight can be adjusted by users with valid reasons. The analysis of sensitivity to the weights given to these two parts is out of bounds for this study so we have not done this analysis. However, this analysis can be an interesting study because the MSI can be changed with different weights; therefore, this work will be researched in the future.
The parameters of the models used in this study were related to the input rainfall and its response of runoff, therefore, they can be used for the SA, model performance test and, finally, the MSI analysis. However, if some of the parameters from a conceptual model are related to snow accumulation and melt, and are applied to some tropical catchments, then the MSI of the model will be low because the parameters will be insensitive, even if the other parameters of the model capture the runoff reasonably. Therefore, the selection of relevant model parameters is an essential prerequisite step before the MSI analysis.
While this paper focused on three conceptual rainfall–runoff models, the method can be applied to other physically based distributed rainfall–runoff models. The detailed analysis for the sensitivity indices in the sub-section ‘SA for the parameters of the three conceptual models’ can be used to investigate the parametric and structural uncertainty of the model, and the detailed analysis for the model performances in the sub-section ‘Comparison of the three model performances for the three objective functions’ can be used to investigate the strength of the model, i.e., the extent of capturing the dynamics of rainfall–runoff processes. The overall analysis using all the sensitivity indices or model performances in the sub-sections ‘Boxplots to assess the overall sensitivity of the parameters’ and ‘Boxplots to assess the overall replicability and predictability of the models’ can be a useful tool for gaining intuition about model suitability. The MSI analysis in the sub-section ‘Comparison of rainfall–runoff models using MSI’ can be a pragmatic measure to screen out the suitable model.
To summarise, we have investigated the suitability of three conceptual rainfall–runoff models for the Korean catchments through the comparison of the sensitivity of model parameters, model performances and MSI by coupling the results of SA and the model performance statistics. We used three well-known and widely used hydrological models – GR4J, IHACRES and Sacramento – for the seven unregulated dam catchments, which have different periods of data sets from 12 to 40 years. Three target functions for high, low and very low flow were used for SA, and three objective functions for high, medium and low flow were used for calibration. The SCE algorithm was used for parameter estimation, and the split-sample independent data sets were used for the calibration and validation periods.
As a result, the parameters of the GR4J model had good sensitivity, representing well-identified parameters, and this model with only four parameters had the best model performance overall; hence, it had the best MSI values. The IHACRES model had well-identified parameters overall, but had one parameter with a relatively low sensitivity and this parameter could cause a lower performance for the low flow simulation. The IHACRES model ranked as second best in terms of MSI values over the seven catchments. Both GR4J and IHACRES were suitable models, with more than 0.5 of MSI. However, the Sacramento model had MSI values of less than 0.5 for most catchments as it has many insensitive parameters. The Sacramento model had an over-parameterisation problem, and this uncertainty problem could be propagated to the model results as demonstrated by Shin et al. (2015); hence, care needs to be taken. The Sacramento model has relatively more parameters for groundwater simulation compared to the GR4J and IHACRES models, but its model performance for the low flow simulation was not sufficiently better. This implies that the use of more parameters cannot guarantee better model performance, as warned by Jakeman & Hornberger (1993).