There are different views on how complex a hydrological model structure should be for streamflow prediction in ungauged basins. Some studies suggest that complex models outperform simple models because of their greater predictive capability, whereas others favor parsimonious model structures to reduce the risk of over-parameterization. The Xinanjiang (XAJ) model, the most widely used hydrological model in China, has two versions: (1) a simple version with seven parameters (XAJ7) and (2) a complex version with 14 parameters (XAJ14). In this study, the two versions of the XAJ model were comprehensively evaluated for streamflow prediction in ungauged basins based on model efficiency, parameter identifiability, and parameter independence. The results showed that the complex XAJ14 model was more flexible than the simple XAJ7 model in calibration mode, whereas the two models performed similarly in validation and regionalization modes. Lack of parameter identifiability and the presence of parameter interdependence most likely explain why the complex XAJ14 cannot consistently outperform the XAJ7 across modes. Therefore, the simple XAJ7 is a better choice than XAJ14 for streamflow prediction in ungauged basins.
INTRODUCTION
Streamflow observation records are important for hydrological research and practice, such as flood forecasting, hydraulic engineering design, drought risk assessment, and water resources planning and management. However, streamflow observations are unavailable in many parts of the world, and the publicly available streamflow records are often incomplete or very brief (Sivapalan 2003; Wagener et al. 2004a). Missing or incomplete streamflow observations have greatly impeded hydrological research and practice. Several methods can be used to reconstruct streamflow time series, including statistical methods, artificial intelligence-based methods, and process-based hydrological models (Wagener et al. 2004b; Besaw et al. 2010; Benito 2012). However, these methods often fail to reproduce streamflow when the observed data available for calibration are limited. Consequently, reliable streamflow prediction in ungauged basins is a great challenge for hydrological research (Sivapalan et al. 2003; Seibert & Beven 2009). The International Association of Hydrological Sciences (IAHS) launched the Scientific Decade of IAHS (2003–2012), entitled Predictions in Ungauged Basins (PUB), which focuses on improving scientific understanding and simulation of hydrological processes in ungauged basins (Sivapalan et al. 2003; Blöschl et al. 2013; Hrachowitz et al. 2013).
Several methods have been proposed for estimating streamflow in ungauged basins, and most of them involve the use of hydrological models (McIntyre et al. 2005; Castiglioni et al. 2010; Ahiablame et al. 2012; Booker & Snelder 2012; El-Hames 2012). Parameter estimation is a key step of hydrological modeling and largely determines the accuracy of streamflow simulation and prediction (Xu 2003; Lee et al. 2005; McIntyre et al. 2005). Because landscape properties are highly heterogeneous, hydrological model parameters cannot be measured directly at the catchment scale and are usually inferred through calibration (Xu 1999; Vaché & McDonnell 2006; Jin et al. 2009). However, calibration cannot be performed directly in ungauged basins because streamflow observations are lacking. An alternative strategy is to transfer model parameters from gauged basins to ungauged basins, i.e., parameter regionalization (Blöschl 2006; Zhang & Chiew 2009; Pechlivanidis et al. 2010; Kizza et al. 2013). Overviews of parameter regionalization methods have been presented by He et al. (2011) and Parajka et al. (2013).
Model complexity, in addition to goodness-of-fit, is an important criterion for model selection. The complexity of hydrological models, which is commonly quantified by the number of model parameters, has significant impacts on parameter regionalization in ungauged basins (Chiew et al. 1993; Perrin et al. 2001). Calibration and parameter regionalization become less straightforward as the number of model parameters increases (McCabe et al. 2005). A complex model, with more degrees of freedom, is generally expected to perform better than a simple model in the calibration period. However, this is not always the case in the ‘untrained’ validation period (Wheater et al. 1993; Wagener et al. 2001b; Hailegeorgis & Alfredsen 2015). Some studies have advised model users to avoid complex models when estimating streamflow in ungauged basins (Jakeman & Hornberger 1993; Lee et al. 2005; Bárdossy 2007; Skaugen et al. 2015). A complex model may be over-parameterized, which can introduce large uncertainty into streamflow prediction (Jakeman & Hornberger 1993; Bárdossy 2007; Skaugen et al. 2015). Nevertheless, other studies have favored complex models (Vaché & McDonnell 2006; Li et al. 2015).
The Xinanjiang (XAJ) model, as the most widely used hydrological model in China, has two different versions, as follows: (1) the simple version with seven parameters (XAJ7) and (2) the complex version with 14 parameters (XAJ14). The main objective of this study is to evaluate which of these two models, XAJ7 and XAJ14, is more suitable for streamflow prediction in ungauged basins. In this study, model efficiency is not the only criterion of model evaluation, and parameter identifiability and independence are also considered. This paper is structured as follows: the section below describes the two XAJ model versions, study area and data sources; then the next section introduces methods used for model evaluation; followed by a section focusing on the results and discussion; and, finally the summary and conclusions are presented.
XAJ MODEL DESCRIPTION AND STUDY AREA
XAJ model description
The XAJ14 model has been applied more extensively than the XAJ7 model in China, and it has been employed in the China National Flood Forecasting System (WMO 2011; Yao et al. 2014). The XAJ14 model contains the following four modules: runoff generation, three-layer evapotranspiration (ET), separation of runoff components, and runoff routing (Table 1). In general, ET occurs in three soil layers; the actual evapotranspiration (AET) is estimated as a function of potential evapotranspiration (PET) and available soil moisture. AET first occurs in the upper layer at the potential rate until the water storage is exhausted. Then, the water storage in the lower layer begins to supply AET. AET occurs in the deepest soil layer only when the lower-layer storage is reduced to a given proportion of its storage capacity. A similar mechanism is used to describe soil moisture replenishment among the three soil layers. The XAJ14 model uses a single parabolic curve to describe the spatial heterogeneity of the soil moisture storage capacity and assumes that runoff is not produced until the soil moisture storage reaches field capacity (Zhao 1992; Cheng et al. 2006). The generated runoff is separated into three components, namely surface runoff, interflow, and groundwater, according to different free water storage structures. The surface runoff flows directly into the river, whereas the interflow and groundwater are each released slowly to river channels through a single linear reservoir. Finally, the Muskingum routing equation is adopted to calculate the discharge at the watershed outlet.
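The three-layer evapotranspiration scheme described above can be illustrated with a minimal sketch. The branching below follows a commonly cited formulation of the XAJ evapotranspiration module and is an assumption here, since the text does not give the equations; the function name `three_layer_et` and its argument names are hypothetical, and rainfall input is omitted for brevity.

```python
def three_layer_et(pet, wu, wl, wd, lm, c):
    """Split potential ET (mm) among the upper, lower and deepest soil layers.

    pet: potential evapotranspiration for the time step (mm)
    wu, wl, wd: current water storage of the upper, lower and deepest layers (mm)
    lm: storage capacity of the lower layer (mm)
    c: ET coefficient of the deepest layer (dimensionless)
    Returns (eu, el, ed), the evaporation taken from each layer (mm).
    """
    # Upper layer evaporates at the potential rate while it holds enough water.
    if wu >= pet:
        return pet, 0.0, 0.0

    eu = wu                      # upper layer is exhausted
    deficit = pet - eu           # demand not met by the upper layer

    if wl >= c * lm:
        # Lower layer still fairly wet: ET proportional to its relative storage.
        el = min(deficit * wl / lm, wl)
        ed = 0.0
    elif wl >= c * deficit:
        # Lower layer below the threshold c * lm: reduced rate c * deficit.
        el = c * deficit
        ed = 0.0
    else:
        # Lower layer cannot supply c * deficit: the deepest layer makes up the rest.
        el = wl
        ed = min(c * deficit - el, wd)
    return eu, el, ed
```

For example, with `pet=5`, `wu=2`, `wl=60`, `lm=100` and `c=0.15`, the upper layer supplies 2 mm and the lower layer supplies 3 × 60/100 = 1.8 mm, so AET (3.8 mm) falls below PET even though the deepest layer is not yet tapped.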
| Module | Parameter | Description | Range and units |
|---|---|---|---|
| Runoff generation | B<sup>b</sup> | Exponential of the distribution to tension water capacity | 0–1 |
| | IMP | Percentage of impervious and saturated areas in the catchment | 0–0.1 (%) |
| Evapotranspiration | UM<sup>b,c</sup> | Average soil moisture storage capacity of the upper layer | 5–100 (mm) |
| | LM | Average soil moisture storage capacity of the middle layer | 50–300 (mm) |
| | DM<sup>b,c</sup> | Average soil moisture storage capacity of the deepest layer | 5–100 (mm) |
| | C | ET coefficient of the deepest layer | 0.1–0.2 |
| Runoff separation | SM | Areal mean free water capacity of the surface soil layer | 10–60 (mm) |
| | EX | Exponential of the spatial distribution curve of free water storage capacity | 1.0–1.5 |
| | KI | Outflow coefficient of free water storage to the interflow | 0.3–0.7 |
| | KG | Outflow coefficient of free water storage to the groundwater | 0.1–0.2 |
| | FC<sup>a</sup> | Steady recharge constant to groundwater | 0.5–15 (mm/d) |
| Routing | CI | Recession constant of the lower interflow storage | 0.1–0.9 |
| | CG<sup>b</sup> | Recession constant of the lower groundwater storage | 0.95–0.99 |
| | XE<sup>b</sup> | Muskingum coefficient of geometry factor | 0–0.5 |
| | KE<sup>b</sup> | Muskingum coefficient of residence time of water | 1–3 (d) |
<sup>a</sup> Parameter that belongs only to the XAJ7 model.
<sup>b</sup> Parameters shared by the XAJ7 and XAJ14 models.
<sup>c</sup> In XAJ7, the ranges of UM and DM are 10–150 mm and 50–350 mm, respectively.
Note: the parameter ranges are taken from Li et al. (2009) and Zhao et al. (1980).
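The routing step of the model description can be made concrete with a short sketch. It uses the standard finite-difference form of the Muskingum equation, with KE as the residence time and XE as the geometry factor from Table 1, and a single linear reservoir with a recession constant (CI or CG). The function names, the daily time step `dt = 1.0`, and the choice to start the outflow at the first inflow value are illustrative assumptions, not details taken from the study.

```python
def linear_reservoir(recharge, recession, q0=0.0):
    """Route a recharge series through one linear reservoir.

    recession plays the role of CI (interflow) or CG (groundwater):
    q[t] = recession * q[t-1] + (1 - recession) * recharge[t]
    """
    outflow = []
    q_prev = q0
    for r in recharge:
        q_prev = recession * q_prev + (1.0 - recession) * r
        outflow.append(q_prev)
    return outflow


def muskingum_route(inflow, ke, xe, dt=1.0):
    """Route an inflow series with the Muskingum method.

    ke: residence time (same time unit as dt), xe: geometry factor (0-0.5).
    """
    denom = 2.0 * ke * (1.0 - xe) + dt
    c0 = (dt - 2.0 * ke * xe) / denom
    c1 = (dt + 2.0 * ke * xe) / denom
    c2 = (2.0 * ke * (1.0 - xe) - dt) / denom   # c0 + c1 + c2 == 1

    outflow = [inflow[0]]                       # assumed initial condition
    for t in range(1, len(inflow)):
        outflow.append(c0 * inflow[t] + c1 * inflow[t - 1] + c2 * outflow[-1])
    return outflow
```

Note that `c0` becomes negative when `dt < 2 * ke * xe`, which can occur near the upper ends of the KE and XE ranges with a daily step; a practical implementation would typically subdivide the reach or the time step in that case.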
Study area and data used
| Catchment characteristics | Min | Median | Max |
|---|---|---|---|
| Catchment area (km²) | 435 | 1,081 | 3,548 |
| Mean annual rainfall (mm) | 1,341 | 1,544 | 1,940 |
| Aridity index | 0.58 | 0.80 | 0.92 |
| Runoff coefficient | 0.48 | 0.58 | 0.72 |
| Percent forest cover (%) | 48 | 75 | 92 |
| Stream length (km) | 36 | 57 | 99 |
| Mean elevation (m) | 76 | 178 | 348 |
| Catchment slope (‰) | 1.5 | 5.2 | 16.7 |
Daily meteorological data from 42 stations were obtained from the Meteorological Bureau of Jiangxi Province. PET was estimated with the Hargreaves and Samani equation (Hargreaves & Samani 1985). Basin-average rainfall and PET were calculated with the Thiessen polygon method from the available meteorological stations in and around each catchment. Observed streamflow data were taken from China's Hydrological Year Book, published by the Hydrological Bureau of the Ministry of Water Resources, China. All test catchments have 17 years of continuous streamflow data from 1970 to 1986. The period 1971–1978 was used for model calibration (1970 was used for model warm-up), and the period 1979–1986 was used for model validation.
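As an illustration of the forcing preparation described above, the sketch below combines the commonly cited form of the Hargreaves and Samani equation with an area-weighted (Thiessen) basin average. The extraterrestrial radiation is assumed to be supplied already converted to an equivalent evaporation depth (mm/day); the function and argument names are hypothetical and the station values in the usage note are invented for demonstration only.

```python
import math


def hargreaves_samani_pet(tmax_c, tmin_c, ra_mm_per_day):
    """Daily PET (mm/day) from daily max/min air temperature (deg C).

    ra_mm_per_day: extraterrestrial radiation expressed as equivalent
    evaporation (mm/day); computing it from latitude and day of year is
    outside the scope of this sketch.
    """
    tmean = (tmax_c + tmin_c) / 2.0
    temp_range = max(tmax_c - tmin_c, 0.0)   # guard against bad records
    return 0.0023 * ra_mm_per_day * (tmean + 17.8) * math.sqrt(temp_range)


def thiessen_mean(station_values, polygon_areas):
    """Basin average as the polygon-area-weighted mean of station values."""
    total_area = sum(polygon_areas)
    weighted = sum(v * a for v, a in zip(station_values, polygon_areas))
    return weighted / total_area
```

For instance, `thiessen_mean([10, 20], [1, 3])` gives 17.5, because the second station's polygon covers three quarters of the basin.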
METHODOLOGY
Parameter regionalization methodology
Model performance assessment criteria
Parameter sensitivity and identifiability analysis
RESULTS AND DISCUSSION
Model evaluation in calibration, validation, and regionalization modes
Figure 5 shows the model regionalization results based on the physical similarity and spatial proximity methods. The physical similarity method performs slightly better than the spatial proximity method in regionalization mode. Regionalization results are poorer than the calibration and validation results, with median KGE values approximately 0.10 to 0.15 lower than those obtained in calibration and validation. The XAJ14 and XAJ7 models achieve similar regionalization results with both the physical similarity method and the spatial proximity method, indicating that the more complex process representations in the XAJ14 model do not improve streamflow prediction in ungauged basins compared with the simple XAJ7 model.
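The KGE values compared above can be computed as in the following sketch. It assumes the original Kling-Gupta efficiency formulation, which combines the linear correlation, the ratio of standard deviations, and the ratio of means of simulated and observed flows; whether the study used this or a modified variant is not stated in this section, and the function name `kge` is hypothetical.

```python
import math


def kge(simulated, observed):
    """Kling-Gupta efficiency; 1 indicates a perfect match."""
    n = len(observed)
    mean_s = sum(simulated) / n
    mean_o = sum(observed) / n
    sd_s = math.sqrt(sum((s - mean_s) ** 2 for s in simulated) / n)
    sd_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed) / n)
    cov = sum((s - mean_s) * (o - mean_o)
              for s, o in zip(simulated, observed)) / n

    r = cov / (sd_s * sd_o)      # linear correlation
    alpha = sd_s / sd_o          # variability ratio
    beta = mean_s / mean_o       # bias ratio
    return 1.0 - math.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)
```

A simulation that doubles every observed flow keeps the correlation at 1 but has variability and bias ratios of 2, giving a KGE of 1 − √2 ≈ −0.41. This shows how the metric penalizes volume and variability errors even when the timing is perfect.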
Regionalization under high flow and low flow simulations
Parameter identifiability and independence
We also analyzed the independence of model parameters for the two hydrological models. Table 3 shows the correlation coefficients between the calibrated parameters of the XAJ14 and XAJ7 models. The interdependence of the calibrated parameters is weak for XAJ7, with correlation coefficients between model parameters ranging from −0.25 to 0.26. These weak correlations probably stem from the parsimony of the model. Compared with XAJ7, the parameter interdependence in XAJ14 is more pronounced: the correlation coefficients between some parameters are greater than 0.50 or less than −0.50, indicating that some of the parameters in the XAJ14 model covary and have similar effects on the streamflow simulations.
| Model / parameter | B | UM | DM | CG | XE | KE | LM/FC | IMP | C | SM | EX | KG | KI | CI |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| XAJ14 | | | | | | | | | | | | | | |
| B | 1.00 | | | | | | | | | | | | | |
| UM | 0.15 | 1.00 | | | | | | | | | | | | |
| DM | −0.02 | 0.21 | 1.00 | | | | | | | | | | | |
| CG | 0.02 | −0.30 | 0.18 | 1.00 | | | | | | | | | | |
| XE | 0.07 | −0.01 | −0.19 | 0.00 | 1.00 | | | | | | | | | |
| KE | −0.13 | 0.01 | −0.31 | −0.31 | −0.17 | 1.00 | | | | | | | | |
| LM | 0.16 | **−0.50** | **−0.63** | −0.13 | −0.07 | 0.24 | 1.00 | | | | | | | |
| IMP | 0.21 | 0.05 | 0.14 | 0.17 | **−0.70** | 0.24 | 0.06 | 1.00 | | | | | | |
| C | −0.04 | −0.41 | 0.10 | 0.28 | 0.07 | −0.10 | −0.16 | −0.17 | 1.00 | | | | | |
| SM | 0.09 | 0.13 | −0.42 | −0.26 | 0.31 | **0.53** | 0.07 | −0.47 | −0.07 | 1.00 | | | | |
| EX | 0.21 | −0.03 | −0.37 | 0.01 | 0.04 | −0.38 | 0.07 | 0.03 | −0.17 | 0.24 | 1.00 | | | |
| KG | 0.13 | 0.36 | −0.27 | **−0.54** | 0.22 | −0.14 | 0.20 | −0.46 | −0.37 | 0.16 | −0.05 | 1.00 | | |
| KI | 0.04 | 0.42 | −0.03 | −0.44 | −0.32 | −0.04 | −0.07 | 0.24 | **−0.56** | 0.03 | 0.17 | 0.25 | 1.00 | |
| CI | −0.15 | −0.14 | 0.36 | 0.26 | **−0.75** | 0.03 | 0.03 | **0.61** | 0.03 | **−0.54** | −0.23 | **−0.58** | 0.19 | 1.00 |
| XAJ7 | | | | | | | | | | | | | | |
| B | 1.00 | | | | | | | | | | | | | |
| UM | 0.07 | 1.00 | | | | | | | | | | | | |
| DM | −0.01 | −0.15 | 1.00 | | | | | | | | | | | |
| CG | −0.11 | −0.19 | 0.07 | 1.00 | | | | | | | | | | |
| XE | 0.20 | −0.09 | −0.01 | 0.10 | 1.00 | | | | | | | | | |
| KE | −0.24 | 0.05 | −0.20 | −0.21 | −0.25 | 1.00 | | | | | | | | |
| FC | 0.23 | 0.17 | 0.26 | −0.19 | 0.10 | 0.10 | 1.00 | | | | | | | |
Note that the values greater (less) than or equal to 0.50 (−0.50) are shown in bold.
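A parameter independence check of the kind summarized in Table 3 can be sketched as follows. The code computes pairwise Pearson correlations between calibrated parameter values across catchments and lists the pairs whose absolute correlation reaches a threshold of 0.50. The input layout (one list of calibrated values per parameter, in the same catchment order) and the function names are assumptions made for illustration, and the numbers in the usage note are invented, not results from the study.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)   # assumes neither series is constant


def strong_pairs(calibrated, threshold=0.5):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold.

    calibrated: dict mapping parameter name -> calibrated values,
    one value per catchment, in the same catchment order.
    """
    flagged = []
    for name_a, name_b in combinations(calibrated, 2):
        r = pearson(calibrated[name_a], calibrated[name_b])
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged
```

For example, `strong_pairs({"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [1, -1, 1, -1]})` flags only the pair (A, B); both correlations involving C are about −0.45 and fall below the threshold.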
Overall, the parameter identifiability and independence analyses indicate that the parameters of XAJ7 are easier to identify and less correlated with one another than those of the XAJ14 model. Lack of parameter identifiability and parameter interaction could lead to large uncertainties in streamflow prediction (Beven 1993; Wheater et al. 1993). This is likely the reason why the complex XAJ14 model cannot consistently outperform the XAJ7 model in validation and regionalization modes.
Model complexity for prediction in ungauged basins
SUMMARY AND CONCLUSION
The main objective of this study was to determine whether the XAJ7 or XAJ14 model is more suitable for streamflow prediction in ungauged basins. Model evaluation was based not only on model performance but also on parameter identifiability and independence. The results showed that the XAJ14 model benefits from its increased complexity and yields better performance than the simple XAJ7 model in calibration mode. However, this superior performance is not sustained in validation and regionalization modes, where the simple XAJ7 model performed similarly to, or even better than (in low flow simulations), the complex XAJ14 model. The parameter identifiability and independence analyses suggested that the parameters of the simple XAJ7 model are more identifiable and less correlated than those of the complex XAJ14 model, which probably explains why the relative performance of the two models differs between modes. Considering model efficiency, parameter identifiability, and independence, the XAJ7 model is a better choice than the XAJ14 model for streamflow prediction in ungauged basins.
In addition, current hydrological research highlights the development of an integrated model of water-related processes and the development of land surface process models (Foley et al. 1996; Arnold et al. 1998; Liu et al. 2008), in which the hydrological model only serves as a sub-model to simulate water balance. These models without exception have a large number of parameters, which makes reliable parameter estimation a challenging task. Compared with the XAJ14 model, the simple XAJ7 model seems more suitable to be coupled with other water-related models, or to be embedded into a land surface model.
ACKNOWLEDGEMENTS
This research was supported by the Natural Science Foundation of China (41201034, 41330529), the program for ‘Bingwei’ Excellent Talents in Institute of Geographic Sciences and Natural Resources Research, CAS (Project No. 2013RC202), Chinese Academy of Sciences Visiting Professorship for Senior International Scientists (Grant No. 2013T2Z0014) and Natural Sciences Foundation of Jiangsu Province (BK20141059).