Abstract
This study evaluates the performance of the soil and water assessment tool (SWAT), the hydrologiska byråns vattenbalansavdelning (HBV) and the hydrologic engineering center-hydrologic modeling system (HEC-HMS) for modeling rainfall-runoff in the data-scarce Katar catchment, Ethiopia. First, the rainfall-runoff process was simulated using the SWAT, HBV and HEC-HMS models individually. Second, simple average ensemble (SAE), weighted average ensemble (WAE) and neural network ensemble (NNE) techniques were developed by combining the results of individual models to improve overall accuracy. Statistical performance measures and flow duration curves (FDCs) were used to compare and evaluate the performance of the models. The results showed that the SWAT model outperformed the HBV and HEC-HMS models with the coefficient of determination (R2) and Nash–Sutcliffe efficiency (NSE) of 0.857 and 0.83 for calibration and 0.85 and 0.799 for validation, respectively. The ensemble result showed that NNE outperformed the SAE and WAE techniques, with NSE and R2 values of 0.924 and 0.925 for calibration and 0.896 and 0.904 for validation, respectively. The NNE technique improved the performance of SWAT, HBV and HEC-HMS by 12.14, 22.7 and 26.8%, respectively, in the validation phase. Overall, the results showed that ensemble modeling is a promising option for accurate modeling of the rainfall-runoff process.
HIGHLIGHTS
Novel ensemble approach was proposed to simulate the complex and dynamic rainfall-runoff process.
Spatial, hydrological and meteorological data were used as input for the semi-distributed models.
Three ensemble techniques were developed by combining the runoff results of the semi-distributed models to boost the overall efficiency.
The proposed ensemble technique significantly improved the modeling performance.
INTRODUCTION
In recent years, water demand in developing countries has increased significantly due to rapid population growth (Miraji et al. 2019). Operational hydrology and water resources management require reliable predictions of the hydrological process such as rainfall-runoff process, groundwater flow and evapotranspiration (Chathuranika et al. 2022). Accurate modeling of the rainfall-runoff process using evapotranspiration, rainfall and other hydrologic data is an essential task because it provides information for water resources management, flood mitigation and warning, land use and hydrology of a watershed (Kisi et al. 2013). The rainfall-runoff models are the mathematical representation of the physical relationships between the various components of the hydrologic cycle within the defined hydrologic unit (Yang et al. 2020a, 2020b). Modeling the rainfall-runoff process, however, is a challenging task because it is a nonlinear, complex outcome of various hydrologic variables and catchment characteristics and therefore cannot be simulated with a simple model.
So far, different models have been developed to study the nonlinear and complicated relationship between rainfall and runoff, which usually shows a great deal of temporal and spatial variability due to the mixed influence of soil type, land use, weather conditions and the number of variables included in the modeling process. These models are generally categorized as data-driven and physically based models. The physically based hydrological models work by constructing a simplified watershed system and mathematical equations that describe the physical processes involved in the movement and storage of water (Young et al. 2017). These models have components that resemble the physical processes and can account for the uneven distribution of rainfall, evapotranspiration and spatial variation of watershed characteristics in the modeling process (Kisi et al. 2012). In contrast, the data-driven models completely ignore the physics of the process. Physically based hydrological models are extensively used for predicting and simulating the rainfall-runoff process of catchments. Among the different types of physically based models, fully and semi-distributed models are considered to be the best hydrological modeling standard because they can account for the spatial variation of soil, landscape and land use in the catchment as well as atmospheric influences (Yang et al. 2020a, 2020b). In the past decades, various physically based models, such as the soil and water assessment tool (SWAT) (Arnold et al. 1998; Rostamian et al. 2008), the hydrologic engineering center-hydrologic modeling system (HEC-HMS) (Feldman 2000), hydrologiska byråns vattenbalansavdelning (HBV) (Bergström 1992), MIKE-SHE (DHI (Danish Hydraulic Institute) 1999) and GR4J (Perrin et al. 2003), have been used to solve a variety of water resources and environmental problems in different regions (Chathuranika et al. 2022).
The SWAT model is one of the most popular semi-distributed, physically based, watershed-scale hydrological models that can simulate river discharge, suspended sediment and nonpoint source pollutant loads under different climate, land use, management scenarios and soil types (Dash et al. 2020). The SWAT model was used for modeling hydrological processes and showed acceptable results (e.g., Biru & Kumar 2018; Hallouz et al. 2018; Ahmadi et al. 2019; Brighenti et al. 2019; Melaku & Wang 2019; Bizuneh et al. 2021). The HEC-HMS model is a semi-distributed HMS developed by the US Army Corps of Engineers Hydrologic Engineering Center (USACE 2010). The HEC-HMS model has been used in previous studies and shown good performance in hydrological modeling (e.g., Halwatura & Najim 2013; Tassew et al. 2019; Hamdan et al. 2021; Shekar & Vinay 2021). The HBV model is another semi-distributed rainfall-runoff model that works on the daily time step. HBV was used in some hydrological modeling and showed acceptable results (e.g., Ouatiki et al. 2020; Bizuneh et al. 2021; Esmaeili-Gisavandani et al. 2021). None of these studies, however, compared the performance of SWAT, HBV and HEC-HMS models in a semi-arid and subhumid tropical region characterized by highly variable rainfall and topography.
The aforementioned physically based hydrological models are acceptable methods for examining the actual physical process, especially when physical understanding is more important than accurate prediction. However, compared with the data-driven models, the physically based models exhibit a practical limitation in reaching the required accuracy (Young et al. 2017; Nourani et al. 2020). Most catchments in subhumid tropical regions have large variations in daily and seasonal runoff, which contributes to the failure of most hydrologic models to simulate the hydrology of these catchments (Tibangayuka et al. 2022). In addition, previous studies have only evaluated the performance of individual hydrologic models, whose accuracy in a given catchment varies depending on their structure. According to Fenicia et al. (2007), predictive performance can be improved by modifying the already available models through ensemble techniques. Therefore, it is believed that an ensemble method that combines the results of different models could boost the accuracy of the individual models by taking the strengths of the different models. In this context, the study conducted by Bates & Granger (1969) confirmed that combining the results of multiple models by applying different ensemble techniques would produce more accurate results than the individual models. The idea behind the model ensemble is to use the exclusive features of each model in a unique framework to increase the accuracy of modeling (Kiran & Ravi 2008). The study by Cavadias & Morin (1986) is the first in the field of hydrology to use ensemble techniques. Since then, the capability of model combination in improving prediction accuracy has been confirmed in the various hydrological processes (e.g., Noori & Kalin 2016; Esmaeili-Gisavandani et al. 2021; Nourani et al. 2021a, 2021b). 
Thus, the main aim of this study is to evaluate the simulation performance of the SWAT, HBV and HEC-HMS models for rainfall-runoff modeling of the Katar catchment and to develop linear (SAE and WAE) and nonlinear (NNE) ensemble techniques to improve the accuracy of the individual models. The HBV, SWAT and HEC-HMS models were selected because they are freely available, making them useful in developing countries where funding for commercial software is limited. Also, these models are capable of modeling continuous processes in tropical regions (Tibangayuka et al. 2022). The NNE technique was chosen as the nonlinear model combination approach over other techniques because neural networks are popular, compatible and have shown high predictive accuracy in ensemble studies previously published in other fields (Nourani et al. 2020, 2021a, 2021b). The ensemble techniques developed provide watershed managers, decision-makers and researchers with a fast and accurate method for simulating rainfall-runoff processes. The considered models were applied in the Katar catchment, where the main river drains into Lake Ziway. According to Desta & Fetene (2020), this catchment plays an important role in economic growth and food security. The catchment was selected as a case study for this research because of its importance for the community living in it and the availability of data.
MATERIALS AND METHODS
Description of the study area
Data used in the study
Physically based models require both spatial and temporal data. The input data used include soil maps, land use land cover (LULC) maps, digital elevation models (DEM), meteorological data and runoff data. The meteorological data, comprising 20 years of daily rainfall and daily minimum and maximum temperatures from six meteorological stations in the study area, were collected from the Ethiopian National Meteorological Agency. Similarly, 20 years of daily runoff data (at Abura station) needed for calibration and validation of the proposed models were collected from the Ethiopian Ministry of Water, Irrigation and Electricity. The statistics of the daily runoff data at the Abura hydrometry station and the Thiessen polygon average precipitation (from six stations) are shown in Table 1. Of the 20 years of daily data, the first 2 years (1998–1999) were used as the warm-up period, 12 years (1 January 2006 to 31 December 2017) were used for calibration of the models and 6 years (1 January 2000 to 31 December 2005) were used for validation.
Data type . | Period . | Minimum . | Average . | Maximum . | SD . | Coefficient of variation . |
---|---|---|---|---|---|---|
Discharge (m3/s) | Calibration | 0.106 | 12.05 | 152.033 | 20.062 | 1.6 |
Discharge (m3/s) | Validation | 1.188 | 13.4 | 110.624 | 20.01 | 1.4935 |
Discharge (m3/s) | Whole | 0.106 | 12.5 | 152.033 | 19.54 | 1.563 |
Rainfall (mm) | Calibration | 0 | 2.5145 | 71.2 | 4.806 | 1.911 |
Rainfall (mm) | Validation | 0 | 2.835 | 61 | 6.352 | 2.24 |
Rainfall (mm) | Whole | 0 | 2.6215 | 71.2 | 5.3732 | 2.05 |
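As a quick consistency check on Table 1, the coefficient of variation can be recomputed as SD divided by the mean. The sketch below is only illustrative: the dictionary layout and function name are assumptions, while the numbers are copied from the table.

```python
# Mean and SD values copied from Table 1; the dictionary layout is illustrative only.
TABLE1 = {
    ("Discharge (m3/s)", "Calibration"): (12.05, 20.062),
    ("Discharge (m3/s)", "Validation"): (13.4, 20.01),
    ("Discharge (m3/s)", "Whole"): (12.5, 19.54),
    ("Rainfall (mm)", "Calibration"): (2.5145, 4.806),
    ("Rainfall (mm)", "Validation"): (2.835, 6.352),
    ("Rainfall (mm)", "Whole"): (2.6215, 5.3732),
}


def coefficient_of_variation(mean, sd):
    """Coefficient of variation = SD / mean (dimensionless)."""
    return sd / mean


cv = {key: coefficient_of_variation(mean, sd) for key, (mean, sd) in TABLE1.items()}

for (data_type, period), value in cv.items():
    print(f"{data_type:17s} {period:12s} CV = {value:.3f}")
```

Every recomputed value exceeds 1, meaning the standard deviation is larger than the mean in each period, which is consistent with the highly variable daily rainfall and runoff described for the catchment.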
Methodology
Physically based rainfall-runoff models
SWAT model
Land use affects runoff, erosion, infiltration and evapotranspiration in the catchment. In this study, to prepare the land use map of the Katar catchment, a Landsat image was downloaded from the United States Geological Survey website. ArcGIS 10.3 was used to process the satellite image and generate the required land use information (Figure 4). ArcSWAT (2012) was used to set up the model, whereas the sequential uncertainty fitting (SUFI-2) algorithm of the SWAT-CUP software was used for uncertainty analysis and model calibration (Esmaeili-Gisavandani et al. 2021).
HEC-HMS
The HEC-HMS model is one of the most widely applied rainfall-runoff models, in which excess (effective) rainfall in the catchment is determined by considering the connected pervious surface characteristics. The direct streamflow is then formed by combining the near-surface flow and overland flows (Young et al. 2017). In this model, the basin is subdivided into sub-basins connected by channel links. In this study, HEC-HMS 4.7.1 was used, where the model components consist of the terrain data manager, basin model manager, meteorological model manager, time series data manager and control specification manager. The DEM, which is one of the inputs for HEC-HMS, was processed and clipped to the extent of the study area using ArcGIS 10.3. The processed DEM was then imported into HEC-HMS. HEC-HMS 4.7.1 has a GIS component, which was used for sink preprocessing, drainage preprocessing, stream identification and delineating the elements (dividing the catchment into different sub-catchments).
HBV model
Model validation and calibration for single models
The first step in hydrological modeling is to identify the most sensitive parameters that have a significant influence on the model output. According to Ouatiki et al. (2020), sensitivity analysis allows the modeler to identify the influence of each parameter in governing the runoff process. The applied physically based models were calibrated and validated using daily discharge data. The observed data from 1 January 2006 to 31 December 2017 were used for calibration and discharge data from 1 January 2000 to 31 December 2005 were used for validation. In the SWAT model, calibration was performed using the SUFI-2 algorithm of SWAT-CUP software, which iteratively adjusts model parameters until the best fit between observed and predicted runoff is achieved. For the HEC-HMS model, the most sensitive parameters were identified during model optimization using a one-at-a-time approach. In this approach, the value of one parameter was changed (in the range of ±25%) while the others were kept constant, and the NSE value between the observed and simulated discharge at the catchment outlet was compared. This technique has been used for sensitivity analysis in the HEC-HMS model in previous studies (Tassew et al. 2019; Fanta & Sime 2022). For the HBV model, sensitive parameter identification and model calibration were conducted using the Monte Carlo optimization method, with an objective function that generates the optimum parameter values within the predefined parameter ranges (Bizuneh et al. 2021).
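The Monte Carlo calibration idea described for HBV can be sketched as random sampling of parameter sets within predefined ranges, keeping the set with the best objective value. The example below is a minimal sketch and not the HBV software's own routine: `toy_model` is a hypothetical single linear reservoir standing in for the real model, the parameter names `c` and `k` and their ranges are invented for illustration, and NSE is assumed as the objective function.

```python
import random


def nse(obs, sim):
    """Nash-Sutcliffe efficiency between observed and simulated series."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def toy_model(rain, c, k):
    """Hypothetical linear reservoir: c is a runoff coefficient, k a recession rate."""
    storage, flow = 0.0, []
    for p in rain:
        storage += c * p
        q = k * storage
        storage -= q
        flow.append(q)
    return flow


def monte_carlo_calibrate(rain, obs, ranges, n_runs=2000, seed=0):
    """Sample parameters uniformly within their ranges and keep the best NSE."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_runs):
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}
        score = nse(obs, toy_model(rain, **params))
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


rain = [0, 5, 12, 0, 0, 3, 20, 8, 0, 0, 0, 1]
obs = toy_model(rain, c=0.6, k=0.3)  # synthetic "observations" for the demo
best, score = monte_carlo_calibrate(rain, obs, {"k": (0.01, 0.9), "c": (0.1, 1.0)})
print(best, round(score, 3))
```

Plain uniform sampling is simple and easy to parallelize, but the number of runs needed grows quickly with the number of parameters, which is one reason sensitive parameters are identified first.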
Model evaluation criteria
The accuracy of the models was evaluated based on these performance indicators according to the ranges and interpretations recommended by Moriasi et al. (2007, 2015) as shown in Table 2.
Performance rating . | NSE . | R2 . | PBIAS (%) . | RSR . |
---|---|---|---|---|
Unsatisfactory | ≤50 | ≤60 | ≥± 15 | >0.7 |
Satisfactory | 50 < NSE ≤ 70 | 60 < R2 ≤ 75 | ±10 ≤ PBIAS < ±15 | 0.6 < RSR ≤ 0.7 |
Good | 70 < NSE ≤ 80 | 75 < R2 ≤ 85 | ±5 ≤ PBIAS < ±10 | 0.5 < RSR ≤ 0.6 |
Very good | >80 | >85 | <± 5 | 0 < RSR ≤ 0.5 |
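The indicators in Table 2 can be computed directly from paired observed and simulated series. The sketch below assumes the usual definitions (PBIAS following the sign convention of Moriasi et al. (2007), where negative values indicate overestimation; RSR as RMSE divided by the standard deviation of the observations; R2 as the squared Pearson correlation). It also assumes that the NSE thresholds of 50, 70 and 80 in Table 2 correspond to 0.50, 0.70 and 0.80 on the fractional scale used in the results tables. Function names and the toy series are illustrative.

```python
import math


def nse(obs, sim):
    """Nash-Sutcliffe efficiency."""
    mean_obs = sum(obs) / len(obs)
    return 1.0 - sum((o - s) ** 2 for o, s in zip(obs, sim)) / sum(
        (o - mean_obs) ** 2 for o in obs
    )


def r_squared(obs, sim):
    """Squared Pearson correlation between observed and simulated values."""
    mo, ms = sum(obs) / len(obs), sum(sim) / len(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov * cov / (var_o * var_s)


def pbias(obs, sim):
    """Percent bias; negative values mean the model overestimates."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)


def rsr(obs, sim):
    """RMSE divided by the standard deviation of the observations."""
    mean_obs = sum(obs) / len(obs)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim))) / math.sqrt(
        sum((o - mean_obs) ** 2 for o in obs)
    )


def rate_nse(value):
    """Rating by NSE, assuming the Table 2 thresholds on a 0-1 scale."""
    if value > 0.80:
        return "Very good"
    if value > 0.70:
        return "Good"
    if value > 0.50:
        return "Satisfactory"
    return "Unsatisfactory"


obs = [1.0, 2.0, 3.0, 4.0, 5.0]
sim = [1.1 * o for o in obs]  # toy simulation that overestimates by 10%
print(nse(obs, sim), r_squared(obs, sim), pbias(obs, sim), rsr(obs, sim))
```

With these definitions RSR squared equals 1 minus NSE, so the two indicators carry the same information on different scales; PBIAS and R2 add information on systematic bias and linear association.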
Ensemble techniques
Simple average ensemble
Weighted average ensemble
Neural network ensemble
RESULTS AND DISCUSSION
The sensitivity and simulation results of the individual models (i.e., SWAT, HEC-HMS and HBV) and the ensemble techniques (NNE, WAE and SAE) are discussed in the following subsections.
Sensitivity analysis result
A DEM with 30 × 30 m resolution was used to configure the SWAT model. The spatial input data for the model were delineated and predefined using ArcGIS interfaces. HRU analysis was used to define the HRUs from catchment characteristics such as soil data, the land use and land cover map and slope. In the SWAT model, the catchment is divided into sub-catchments, and further into HRUs, to lump the total catchment area into small elements having unique slope, soil and land use characteristics. In this study, the Katar catchment was divided into 17 sub-catchments and 156 HRUs. The division of the catchment into sub-catchments was based on the similarity of watershed characteristics and the optimum HRU number as recommended by Bizuneh et al. (2021).
In the sensitivity analysis of SWAT model using the SUFI-2 algorithm of SWAT-CUP software, the minimum and maximum values of different parameters were obtained from the literature and then the fitted (optimum) values of the parameters were obtained through iteration until the best agreement between observed and simulated runoff was achieved, using R2 as an objective function (Bizuneh et al. 2021). These parameters were ranked based on their t-stat value during global sensitivity analysis. The list of selected parameters used for the SWAT model calibration and their optimal values are presented in Table 3.
Parameter . | Description . | Min value . | Max value . | Fitted value . | Rank . |
---|---|---|---|---|---|
CN2.mgt | SCS runoff CN | 35 | 98 | 61.14 | 1 |
ALPHA_BF.gw | Base flow alpha factor | 0 | 1 | 0.875 | 2 |
GW_DELY.gw | Groundwater delay | 0 | 500 | 262 | 3 |
SOL_K.sol | Saturated hydraulic conductivity | 0 | 2,000 | 41 | 4 |
GW_REVAP.gw | Groundwater ‘revap’ coefficient | 0.02 | 0.2 | 0.18 | 5 |
GWQMN.gw | A threshold minimum depth of water in the shallow aquifer for base flow to occur | 0 | 5,000 | 267.5 | 6 |
SOL_AWC.sol | Available water capacity of the soil layer | 0 | 1 | 0.55 | 7 |
HRU_SLP.hru | Average slope steepness | 0 | 1 | 0.59 | 8 |
SURLAG.bsn | Surface runoff lag time | 0.05 | 24 | 0.845 | 9 |
Table 3 shows that CN2 was the most sensitive parameter and that the base flow alpha factor (ALPHA_BF) and groundwater delay (GW_DELY) were the second and third most sensitive parameters, respectively. According to Fanta & Sime (2022), the sensitivity of CN could be because factors for runoff generation such as slope, LULC and soil type are combined into a single CN value. A reduction in CN leads to a reduction in runoff and vice versa. Moreover, the LULC of the Katar catchment is dominated by improper agricultural practices, which could affect the natural relationship between runoff and the infiltration of the soil. Therefore, it could reasonably be said that runoff in the catchment is highly dependent on the CN of the soil. The high sensitivity of ALPHA_BF and GW_DELY in the SWAT model revealed the importance of groundwater dynamics and aquifer systems in the hydrology of the catchment. A nearly similar result was obtained in the studies by Esmaeili-Gisavandani et al. (2021) and Fanta & Sime (2022). Calibration is a procedure to better parameterize a model for a given set of local conditions in order to reduce predictive uncertainty.
The second physically based model used in this study was the HEC-HMS model. Runoff in the HEC-HMS model is simulated by analysing meteorological data through open channel routing (Shekar & Vinay 2021). The HEC-HMS physically based model was calibrated and optimised using 18 years of daily observed runoff data. Similar to the SWAT model, a sensitivity analysis was performed in the HEC-HMS model to determine the most sensitive parameters affecting the simulation of runoff. In the HEC-HMS model, the most sensitive parameters identified during model optimisation are listed in Table 4. For the HEC-HMS model, the identification of the most sensitive parameters during model optimisation was performed using a one-at-a-time approach. In this method, the value of one parameter was changed, while the others were held constant and the RMSE value was compared between the observed and simulated discharge values at the outlet of the catchment. The optimum values of the parameters vary between the sub-basins.
Parameter . | Description . | Value range . | Rank . |
---|---|---|---|
CN | SCS_Curve Number | 35–99 | 1 |
Tlag | Lag time | 0.1–30,000 | 2 |
Ia | SCS-CN initial abstraction | 0.001–500 | 3 |
K (h) | Flood wave traveling time | 0.005–150 | 4 |
x | Weighted coefficient of discharge | 0.005–0.5 | 5 |
Based on the sensitivity analysis result (Table 4), similar to the SWAT model, CN was the most sensitive parameter of the HEC-HMS model, while Tlag and Ia were the second and third most sensitive parameters, respectively. Varying the values of these parameters during the sensitivity analysis resulted in a significant deviation from the previously predicted runoff value. A similar result was obtained in the sensitivity analysis by Fanta & Sime (2022), Tassew et al. (2019) and Zelelew & Melesse (2018), where CN was identified as the most sensitive parameter and Tlag was second.
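The one-at-a-time procedure used for HEC-HMS can be sketched as follows: each parameter is perturbed by ±25% in turn while the others stay at their base values, and the resulting change in the objective function is recorded and ranked. This is only an illustration of the approach, not HEC-HMS code: `toy_model`, its parameters `c` and `k` and the rainfall series are hypothetical, and NSE is used as the objective here (RMSE could be substituted in the same loop).

```python
def nse(obs, sim):
    """Nash-Sutcliffe efficiency."""
    mean_obs = sum(obs) / len(obs)
    return 1.0 - sum((o - s) ** 2 for o, s in zip(obs, sim)) / sum(
        (o - mean_obs) ** 2 for o in obs
    )


def toy_model(rain, c, k):
    """Hypothetical linear reservoir standing in for the real hydrological model."""
    storage, flow = 0.0, []
    for p in rain:
        storage += c * p
        q = k * storage
        storage -= q
        flow.append(q)
    return flow


def oat_sensitivity(model, rain, obs, base_params, delta=0.25):
    """Perturb one parameter at a time by +/- delta and rank by the NSE change."""
    base_score = nse(obs, model(rain, **base_params))
    effects = {}
    for name, value in base_params.items():
        changes = []
        for factor in (1.0 - delta, 1.0 + delta):
            trial = dict(base_params)
            trial[name] = value * factor  # only this parameter changes
            changes.append(abs(nse(obs, model(rain, **trial)) - base_score))
        effects[name] = max(changes)
    ranking = sorted(effects, key=effects.get, reverse=True)
    return ranking, effects


rain = [0, 5, 12, 0, 0, 3, 20, 8, 0, 0, 0, 1]
obs = toy_model(rain, c=0.55, k=0.35)  # synthetic "observations"
ranking, effects = oat_sensitivity(toy_model, rain, obs, {"c": 0.6, "k": 0.3})
print(ranking, effects)
```

Because only one parameter moves at a time, this approach is cheap but cannot reveal interactions between parameters, unlike the global analysis applied to SWAT.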
The third semi-distributed conceptual model used in this study was the HBV model. The rainfall-runoff simulation using the HBV model requires spatial (e.g., LULC), hydrological (discharge) and climate (temperature and evapotranspiration) data. The climate and hydrological input data for the catchment were prepared in the format of the HBV model. The LULC of the Katar catchment classified using the SWAT model was merged into three LULC types based on the similarity of vegetation characteristics because the model accepts a maximum of three vegetation zones. The HBV model was configured with different parameters, five elevation zones and three vegetation zones of the catchment. In this study, the identification of sensitive parameters for the HBV model was conducted using the Monte Carlo optimization method, with an objective function that generates the optimum values of the parameters within the predefined parameter ranges shown in Table 5. Accordingly, FC, the parameter controlling the contribution of rainfall to runoff (BETA), recession (storage) coefficient 1 (K1) and LP (the soil moisture value above which ETact reaches ETpot) were the most sensitive parameters.
Parameter . | Unit . | Range . | Optimum . | Rank . |
---|---|---|---|---|
FC | mm | 100–550 | 177.77 | 1 |
BETA (β) | – | 1–6 | 2.23 | 2 |
LP | mm | 0.3–1 | 0.893 | 3 |
K1 | day−1 | 0.01–0.2 | 0.0116 | 4 |
UZL | mm | 0–100 | 43.49 | 5 |
PERC | mm/day | 0–4 | 3.4 | 6 |
K0 | day−1 | 0.1–0.5 | 0.353 | 7 |
K2 | day−1 | 0.001–0.1 | 0.063 | 8 |
MAXBAS | | 1–2.5 | 2.26109 | 9 |
Similar findings were obtained in different studies that used HBV for rainfall-runoff simulations (Abebe et al. 2010; Osuch et al. 2015; Parra et al. 2018; Ouatiki et al. 2020; Bizuneh et al. 2021). The maximum water holding capacity of the soil (FC) is one of the parameters in the soil routine that greatly influences the initiation of runoff. Under wet soil conditions, the contribution of FC to runoff could be high and under dry soil conditions, its contribution could be low. According to Abebe et al. (2010), FC plays a key role in the formulation of the HBV model in partitioning effective precipitation into soil moisture (SM) and runoff. A higher FC value means that the water storage capacity of the soil is high and thus a large quantity of water is available for evapotranspiration, and vice versa. This makes FC the most important and sensitive parameter in controlling the contribution of soil to runoff generation. Since BETA (β), the second most sensitive parameter, is the exponent of the ratio between SM and FC (Equation (7)), it affects the partitioning of net precipitation into SM recharge and runoff. The amount of SM available for evapotranspiration therefore depends on this exponent, which explains the strong effect of BETA on the model output. According to Tibangayuka et al. (2022), the sensitivity of BETA and LP also indicates that evaporation rate and precipitation significantly affect the amount of runoff in the catchment.
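The role of FC, BETA and LP can be illustrated with a single step of an HBV-type soil routine. The sketch below assumes the commonly used form in which the fraction of precipitation passed on as recharge is (SM/FC)^BETA and actual evapotranspiration scales with SM/(LP·FC), with LP treated as a fraction of FC; Equation (7) of this study is not reproduced here, so the details may differ. The fitted values are taken from Table 5, while the precipitation, potential evapotranspiration and soil moisture inputs are made up.

```python
FC, BETA, LP = 177.77, 2.23, 0.893  # fitted values from Table 5


def soil_routine_step(precip, pet, sm, fc=FC, beta=BETA, lp=LP):
    """One daily step of an HBV-type soil routine (assumed standard form)."""
    wetness = min(sm / fc, 1.0)
    recharge = precip * wetness ** beta  # share of precipitation leaving the soil box
    sm = sm + precip - recharge  # the rest replenishes soil moisture
    aet = min(pet * min(sm / (lp * fc), 1.0), sm)  # actual ET limited by soil moisture
    sm = sm - aet
    return recharge, aet, sm


# Same rainfall and potential ET on a dry and on a wet soil (illustrative inputs, mm)
recharge_dry, aet_dry, sm_dry = soil_routine_step(precip=10.0, pet=4.0, sm=50.0)
recharge_wet, aet_wet, sm_wet = soil_routine_step(precip=10.0, pet=4.0, sm=150.0)
print(recharge_dry, recharge_wet)
```

For the same rainfall, the wetter soil passes a much larger share on as recharge, and raising BETA suppresses recharge at intermediate wetness, which is why both FC and BETA strongly control simulated runoff.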
Results of single physically based models
The performance was measured based on the hydrological credibility of the model's output during the validation and calibration phases. The obtained outputs of the developed physically based models to simulate the rainfall-runoff process in the Katar catchment are presented in Table 6.
Goodness of fit . | SWAT calibration . | SWAT validation . | HBV calibration . | HBV validation . | HEC-HMS calibration . | HEC-HMS validation . |
---|---|---|---|---|---|---|
NSE | 0.83 | 0.799 | 0.762 | 0.73 | 0.756 | 0.7065 |
RMSE (m3/s) | 7.85 | 8.68 | 9.4 | 10.37 | 9.52 | 10.84 |
MAE (m3/s) | 5.374 | 5.658 | 6.407 | 7.863 | 6.037 | 7.135 |
R2 | 0.857 | 0.85 | 0.777 | 0.835 | 0.779 | 0.762 |
PBIAS (%) | −16.1 | −22.3 | −19.3 | −31.16 | −18.4 | −22.5 |
RSR | 0.407 | 0.434 | 0.488 | 0.518 | 0.494 | 0.542 |
Based on the model performance measures, the SWAT model showed better performance with NSE, R2, RSR, RMSE, MAE and PBIAS values of 0.799, 0.85, 0.434, 8.68 m3/s, 5.658 m3/s and −22.3%, respectively, in the validation period. Similarly, the values of NSE, R2, RSR, RMSE, MAE and PBIAS were 0.83, 0.857, 0.407, 7.85 m3/s, 5.374 m3/s and −16.1%, respectively, during the calibration period. According to the Moriasi et al. (2015) model performance rating system (for the daily time step), the SWAT model showed good performance based on the NSE value and very good performance based on the values of R2 and RSR. According to Tibangayuka et al. (2022), model performance is good when the RMSE value is less than half the SD of the observed data. In this regard, the SWAT model provided good predictive performance. However, it showed unsatisfactory results with regard to its PBIAS values in both the calibration and validation phases.
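The "RMSE below half the SD" rule can be checked against the tabulated values. The sketch below takes the observed discharge SD from Table 1 and the RMSE values from Table 6; the helper names are illustrative, and the RSR is assumed here to be RMSE divided by the SD of the observations for the same period.

```python
SD_OBS = {"calibration": 20.062, "validation": 20.01}  # Table 1, discharge (m3/s)
RMSE = {  # Table 6 (m3/s)
    "SWAT": {"calibration": 7.85, "validation": 8.68},
    "HBV": {"calibration": 9.4, "validation": 10.37},
    "HEC-HMS": {"calibration": 9.52, "validation": 10.84},
}


def rsr(rmse, sd_obs):
    """RMSE-observations standard deviation ratio."""
    return rmse / sd_obs


def below_half_sd(rmse, sd_obs):
    """Rule of thumb: RMSE should be less than half the SD of the observations."""
    return rmse < 0.5 * sd_obs


summary = {
    model: {
        period: (round(rsr(value, SD_OBS[period]), 3), below_half_sd(value, SD_OBS[period]))
        for period, value in periods.items()
    }
    for model, periods in RMSE.items()
}

for model, periods in summary.items():
    print(model, periods)
```

With these inputs, SWAT meets the rule in both periods, whereas HBV and HEC-HMS meet it only in calibration. The validation ratios match the RSR row of Table 6; the calibration ratios come out slightly lower than reported, so a different SD may have been used there.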
The HBV model, for its part, simulated the daily runoff well in both the calibration and validation phases. As shown in Table 6, the HBV model provided an NSE value of 0.732, an RSR value of 0.518 and an R2 value of 0.835 in the validation phase, which is a good performance based on the Moriasi et al. (2015) performance rating criteria. However, this model gave unsatisfactory results based on its PBIAS value. The third semi-distributed model, HEC-HMS, provided a good performance with NSE values of 0.756 and 0.7065, RSR values of 0.494 and 0.542 and R2 values of 0.779 and 0.762 in the calibration and validation phases, respectively. Similar to the HBV model, the HEC-HMS model led to unsatisfactory results with regard to the PBIAS value (−22.5%). Table 6 shows that all three models yielded negative PBIAS, indicating overestimation according to Moriasi et al. (2007). In this study, it was found that the SWAT, HEC-HMS and HBV models are suitable for simulating rainfall-runoff in the catchment. However, in all of the statistical performance measures, the SWAT model led to the best result in simulating the rainfall-runoff process over the Katar catchment compared with the HBV and HEC-HMS models (Table 6). The better performance of the SWAT model could be due to its ability to better discretize the study catchment into more detailed sub-catchments with uniform hydrologic and spatial characteristics (HRUs).
The findings of this study were compared with the results obtained by Bizuneh et al. (2021), Shekar & Vinay (2021) and Temesgen Ayalew (2019) and were found to be fairly similar. For example, Shekar & Vinay (2021) applied the SWAT and HEC-HMS models in their study of the subhumid tropical Hemavathi catchment in India. They reported that the SWAT model performed better, with R2 values of 0.81 and 0.85 and NSE values of 0.76 and 0.82 in the calibration and validation phases, respectively. Bizuneh et al. (2021) also compared the performances of the SWAT and HBV models in three watersheds of the Upper Blue Nile River basin in Ethiopia and reported that the SWAT model provided NSE values of 0.81 and 0.8 and R2 values of 0.85 and 0.81 in the calibration and validation phases, respectively. The same study reported that the HBV model provided NSE values of up to 0.81 and 0.63 and R2 values of 0.82 and 0.72 in the calibration and validation phases, respectively. The statistical performance indices of HBV obtained in this study were similar to those obtained by Temesgen Ayalew (2019) and Tibangayuka et al. (2022), who reported the suitability of the HBV model in tropical catchments.
The HEC-HMS model (Figure 10(a)) predicted a peak of 119.9 m3/s, overestimating the maximum observed runoff in the validation phase by 8.4%. Similarly, in the calibration phase, the maximum runoff (152.033 m3/s) occurred on 12 August 2014, while HEC-HMS predicted 177.188 m3/s, overestimating the value by 16.55%. According to Fanta & Sime (2022), using the SCS-CN loss method of the HEC-HMS model in tropical regions could lead to an overestimation of peak runoff. Moreover, the Katar catchment is predominantly agricultural, which may result in the simulated runoff being higher than the observed runoff value. The overestimation of runoff by the HEC-HMS model has been reported in many studies (e.g., Abushandi & Merkel 2013; Bhuiyan et al. 2017; Fanta & Sime 2022).
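The peak overestimation percentages quoted above follow from the relative difference between simulated and observed peaks. A minimal sketch, with the observed validation maximum (110.624 m3/s) taken from Table 1 and an illustrative function name:

```python
def percent_overestimation(simulated_peak, observed_peak):
    """Relative peak error in percent; positive values mean overestimation."""
    return 100.0 * (simulated_peak - observed_peak) / observed_peak


# Peaks in m3/s: simulated values from the text, observed maxima from Table 1
validation = percent_overestimation(119.9, 110.624)
calibration = percent_overestimation(177.188, 152.033)
print(round(validation, 1), round(calibration, 2))
```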
In addition to the time series, the performances of the SWAT (Figure 10(f)), HBV (Figure 10(d)) and HEC-HMS (Figure 10(b)) models for rainfall-runoff simulation were compared using scatter plots. From the figures, the best agreement with the observed runoff is seen for the SWAT and HBV models, as the points are less spread and closer to the 1:1 line. The HEC-HMS model shows the worst performance, as it has the highest RMSE (10.84 m3/s) and the lowest NSE (0.7065), and the points in Figure 10(b) are also more scattered and farther from the 1:1 line.
A detailed analysis of model performance based on the RMSE for each segment of the hydrograph was also performed, as shown in Table 7. The RMSE values indicate that the SWAT model was better at simulating very low and high flows, the HEC-HMS model was better at simulating low and medium flows and both the HBV and SWAT models gave almost the same performance in simulating the very high flow range.
Hydrograph phase | Range (m3/s) | HEC-HMS RMSE (m3/s) | SWAT RMSE (m3/s) | HBV RMSE (m3/s) |
---|---|---|---|---|
Very low flow | 1.188–1.559 | 0.233 | 0.193 | 0.854 |
Low flow | 1.641–2.415 | 0.342 | 0.828 | 1.556 |
Medium flow | 2.526–18.788 | 5.49 | 5.622 | 9.001 |
High flow | 19.244–60.032 | 9.353 | 8.954 | 10.001 |
Very high flow | 61.048–110.624 | 5.371 | 4.508 | 4.601 |
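The segment-wise RMSE analysis in Table 7 can be sketched as follows. This is a minimal illustration, not the authors' code. The Weibull plotting position and the bounds for the medium and low segments (Q20–Q70 and Q70–Q95) are assumptions, since the paper names only <Q5, Q5–Q20 and >Q95. All function names are hypothetical.

```python
import math

def exceedance_probabilities(observed):
    """Exceedance probability (%) of each observed flow (Weibull plotting position, assumed)."""
    n = len(observed)
    # Rank 1 is the largest flow; ties receive consecutive ranks in sort order.
    order = sorted(range(n), key=lambda i: observed[i], reverse=True)
    prob = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        prob[idx] = 100.0 * rank / (n + 1)
    return prob

# Segment bounds in % exceedance. The medium and low bounds are hypothetical.
SEGMENTS = {
    "very high": (0.0, 5.0),
    "high": (5.0, 20.0),
    "medium": (20.0, 70.0),
    "low": (70.0, 95.0),
    "very low": (95.0, 100.0),
}

def segment_rmse(observed, simulated, segments=SEGMENTS):
    """RMSE between observed and simulated flows within each FDC segment."""
    prob = exceedance_probabilities(observed)
    result = {}
    for name, (lower, upper) in segments.items():
        squared_errors = [
            (s - o) ** 2
            for o, s, p in zip(observed, simulated, prob)
            if lower <= p < upper
        ]
        # A segment with no observations yields NaN rather than raising an error.
        result[name] = (
            math.sqrt(sum(squared_errors) / len(squared_errors))
            if squared_errors
            else float("nan")
        )
    return result
```

Calling `segment_rmse(observed, simulated)` once per model gives one column of a table like Table 7. The segments are defined from the observed series so that all models are scored on the same time steps.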
As shown in Table 6, the SWAT model outperformed the HEC-HMS and HBV models based on the statistical performance measures. The hydrographs also show that the HEC-HMS model better simulates medium flows, whereas the SWAT model better simulates very low and high flows. The time series plots and the selected FDC segments show that different models may perform differently at different points of the time series. A more accurate prediction of runoff could therefore be achieved by combining the outputs of the individual physically based models, and three ensemble techniques for rainfall-runoff modeling were developed in this study for that purpose.
Results of ensemble techniques
To improve on the prediction efficiency of the single physically based models, three ensemble techniques were developed using the results of the SWAT, HEC-HMS and HBV models as inputs. The results of the linear and nonlinear ensemble techniques for runoff simulation are shown in Table 8. The SAE value was obtained by taking the arithmetic average of the physically based models' outputs. In the WAE technique, by contrast, each model's output was assigned a weight based on its NSE value in the validation phase. The resulting weights for the SWAT, HBV and HEC-HMS models were 0.3575, 0.3251 and 0.314, respectively.
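The two linear ensembles can be sketched as below. The weighting rule, each model's validation NSE divided by the sum of the three NSE values, is an assumption, as are the function names. This normalization gives about 0.357, 0.327 and 0.316 for SWAT, HBV and HEC-HMS. That is close to the reported weights but not identical, so the authors' exact formula may differ.

```python
def simple_average(outputs):
    """SAE: arithmetic mean of the model outputs at each time step.

    outputs maps a model name to its list of simulated flows.
    """
    series = list(outputs.values())
    return [sum(values) / len(values) for values in zip(*series)]

def nse_weights(nse_by_model):
    """Assumed weighting rule: each NSE divided by the sum of all NSE values."""
    total = sum(nse_by_model.values())
    return {model: value / total for model, value in nse_by_model.items()}

def weighted_average(outputs, weights):
    """WAE: weighted sum of the model outputs at each time step."""
    models = list(outputs)
    n_steps = len(outputs[models[0]])
    return [
        sum(weights[m] * outputs[m][t] for m in models)
        for t in range(n_steps)
    ]

# Validation-phase NSE values of the single models, as reported in the paper.
validation_nse = {"SWAT": 0.799, "HBV": 0.73, "HEC-HMS": 0.7065}
weights = nse_weights(validation_nse)
```

Because the three NSE values are close to each other, the weights are nearly equal. This is consistent with the small difference between the SAE and WAE results in Table 8.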
Goodness of fit | SAE (calibration) | SAE (validation) | WAE (calibration) | WAE (validation) | NNE (calibration) | NNE (validation) |
---|---|---|---|---|---|---|
NSE | 0.829 | 0.792 | 0.846 | 0.818 | 0.924 | 0.896 |
R2 | 0.918 | 0.877 | 0.919 | 0.878 | 0.925 | 0.904 |
MAE (m3/s) | 5.807 | 6.008 | 4.204 | 5.985 | 3.244 | 3.726 |
RSR | 0.398 | 0.426 | 0.3766 | 0.4225 | 0.2717 | 0.3216 |
RMSE (m3/s) | 7.97 | 8.53 | 7.54 | 8.46 | 5.44 | 6.44 |
PBIAS (%) | −22 | −34 | −17.9 | −33.9 | −3.94 | −9.1 |
The third technique, NNE, was used as the nonlinear ensemble technique because of its simplicity and popularity and, most importantly, its better performance in model combination studies reported in other fields. For example, NNE has been applied to vehicular traffic noise (Nourani et al. 2020) and wastewater treatment (Nourani et al. 2018), and both studies reported its high ability to increase the accuracy of the single models. In this study, the best NNE result was achieved with six hidden neurons, as shown in Table 8.
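A neural network ensemble of this kind can be illustrated with a small feed-forward network that takes the three simulated flows as inputs and returns one combined flow. The sketch below uses six hidden neurons as in the study. Everything else is an assumption for illustration: the tanh activation, batch gradient descent, learning rate, epoch count and function names. Inputs and targets are assumed to be scaled to roughly [0, 1] beforehand.

```python
import math
import random

def train_nne(inputs, targets, hidden=6, epochs=2000, lr=0.05, seed=0):
    """Train a one-hidden-layer FFNN (tanh hidden units, linear output).

    inputs: list of [swat, hbv, hec_hms] values per time step (scaled).
    targets: list of observed flows (scaled). Returns (w1, b1, w2, b2).
    """
    rng = random.Random(seed)
    n_in = len(inputs[0])
    n = len(inputs)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0
    for _ in range(epochs):
        # Accumulate gradients of the mean of 0.5 * error^2 over all samples.
        gw1 = [[0.0] * n_in for _ in range(hidden)]
        gb1 = [0.0] * hidden
        gw2 = [0.0] * hidden
        gb2 = 0.0
        for x, y in zip(inputs, targets):
            h = [
                math.tanh(sum(w * xi for w, xi in zip(w1[j], x)) + b1[j])
                for j in range(hidden)
            ]
            err = sum(w2[j] * h[j] for j in range(hidden)) + b2 - y
            gb2 += err
            for j in range(hidden):
                gw2[j] += err * h[j]
                delta = err * w2[j] * (1.0 - h[j] ** 2)  # backpropagate through tanh
                gb1[j] += delta
                for i in range(n_in):
                    gw1[j][i] += delta * x[i]
        # One gradient-descent step using the averaged gradients.
        b2 -= lr * gb2 / n
        for j in range(hidden):
            w2[j] -= lr * gw2[j] / n
            b1[j] -= lr * gb1[j] / n
            for i in range(n_in):
                w1[j][i] -= lr * gw1[j][i] / n
    return w1, b1, w2, b2

def predict_nne(params, x):
    """Combined (scaled) flow for one time step of single-model outputs."""
    w1, b1, w2, b2 = params
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(w1, b1)]
    return sum(w * hj for w, hj in zip(w2, h)) + b2
```

In practice the network would be fitted on the calibration period and evaluated on the validation period. The number of hidden neurons would be chosen by trial and error, which is how a value such as six is typically arrived at.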
As shown in Table 8, all developed ensemble techniques showed very good performance in both the calibration and validation phases in terms of their NSE, R2, RSR and RMSE values, according to the criteria of Moriasi et al. (2007, 2015). Similar to the single models, the linear ensemble techniques (WAE and SAE) showed unsatisfactory performance in terms of the PBIAS value. Based on the NSE value, the SAE technique improved the performance of the HEC-HMS model by 9.65 and 12.1% and the HBV model by 8.8 and 8.5% in the calibration and validation phases, respectively. The SAE technique thus improved on every individual model except SWAT. According to Nourani et al. (2021a, 2021b), this could be because an arithmetic average always lies between the lowest and highest of the values being averaged. The results of the ensemble techniques (Table 9) show that the difference in performance between the linear ensemble models (SAE and WAE) is small for most statistical performance measures, although the WAE technique was slightly better than SAE. This could be because WAE weights its inputs according to their relative importance. The WAE technique increased the NSE values of the HEC-HMS, SWAT and HBV models by 15.75, 2.4 and 12%, respectively, in the validation phase.
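For reference, the goodness-of-fit measures in Table 8 can be computed as in the sketch below. The formulas follow the common definitions used with the Moriasi et al. criteria. The PBIAS sign convention (negative values indicate overestimation) is an assumption that is consistent with the negative PBIAS values and the overestimation reported here. The function names are hypothetical.

```python
import math

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus error variance over observed variance."""
    mean_obs = sum(obs) / len(obs)
    numerator = sum((o - s) ** 2 for o, s in zip(obs, sim))
    denominator = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - numerator / denominator

def rmse(obs, sim):
    """Root mean square error, in the units of the flows."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def mae(obs, sim):
    """Mean absolute error, in the units of the flows."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)

def rsr(obs, sim):
    """RMSE divided by the (population) standard deviation of the observations."""
    mean_obs = sum(obs) / len(obs)
    sd_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs) / len(obs))
    return rmse(obs, sim) / sd_obs

def pbias(obs, sim):
    """Percent bias; negative means the model overestimates (assumed convention)."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)
```

With these definitions, RSR squared equals 1 minus NSE when both are computed on the same series, which is a quick consistency check on reported values.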
Hydrograph phase | Range (m3/s) | SAE RMSE (m3/s) | WAE RMSE (m3/s) | NNE RMSE (m3/s) |
---|---|---|---|---|
Very low flow | 1.188–1.559 | 0.279 | 0.278 | 0.643 |
Low flow | 1.641–2.415 | 1.163 | 1.152 | 0.538 |
Medium flow | 2.526–18.788 | 6.712 | 6.69 | 2.371 |
High flow | 19.244–60.032 | 9.866 | 9.839 | 6.515 |
Very high flow | 61.048–110.624 | 6.44 | 6.585 | 8.43 |
The best ensemble technique (NNE) improved the performances of SWAT, HBV and HEC-HMS by 12.14, 22.7 and 26.8%, respectively, in the validation phase, based on the NSE value. The RMSE and MAE values of the NNE technique were lower than those of the WAE, SAE and single models (HEC-HMS, SWAT and HBV) in both the calibration and validation phases, indicating the error reduction achieved by the NNE technique. The NNE technique reduced the RMSE of the SWAT, HEC-HMS and HBV models by 25.81, 40.59 and 37.9%, respectively, during the validation phase. Among the ensemble techniques used in this study, the NNE technique provided the best result, with very good performance on all statistical performance measures. This could be due to the ability of the FFNN-based ensemble (NNE) to handle the nonlinear, complex and dynamic nature of the rainfall-runoff process. According to Sharghi et al. (2018), the use of FFNN as a nonlinear ensemble technique simulates the nonlinear behavior of the phenomenon more accurately than SAE and WAE. The findings for the three ensemble techniques in this study are fairly similar to the results obtained by Nourani et al. (2021a, 2021b), Elkiran et al. (2019) and Sharghi et al. (2018).
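The reported improvements can be checked from the validation-phase values given in the paper. The sketch below assumes that improvement is measured as the relative change with respect to the single model's value. The helper names are hypothetical.

```python
def percent_increase(single, ensemble):
    """Relative increase of a score (e.g. NSE) over the single model, in %."""
    return 100.0 * (ensemble - single) / single

def percent_reduction(single, ensemble):
    """Relative reduction of an error (e.g. RMSE) versus the single model, in %."""
    return 100.0 * (single - ensemble) / single

# Validation-phase values reported in the paper.
single_nse = {"SWAT": 0.799, "HBV": 0.73, "HEC-HMS": 0.7065}
single_rmse = {"SWAT": 8.68, "HBV": 10.37, "HEC-HMS": 10.84}  # m3/s
nne_nse, nne_rmse = 0.896, 6.44

nse_gain = {m: round(percent_increase(v, nne_nse), 2) for m, v in single_nse.items()}
rmse_cut = {m: round(percent_reduction(v, nne_rmse), 2) for m, v in single_rmse.items()}
# nse_gain -> SWAT 12.14, HBV 22.74, HEC-HMS 26.82
# rmse_cut -> SWAT 25.81, HBV 37.9, HEC-HMS 40.59
```

Under this assumption the computed values match the percentages quoted in the text to the reported precision.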
In addition, the RMSE of the developed ensemble techniques was evaluated for different segments of the FDCs in the validation phase (Table 9). The RMSE values for very high flow were 6.585, 8.43 and 6.44 m3/s for WAE, NNE and SAE, respectively. For the NNE technique, the RMSE values for high, medium, low and very low flows were 6.515, 2.371, 0.538 and 0.643 m3/s, respectively. Similarly, the RMSE values from WAE for high, medium, low and very low flows were 9.839, 6.69, 1.152 and 0.278 m3/s, respectively. The other linear ensemble technique, SAE, yielded RMSE values of 9.866, 6.712, 1.163 and 0.279 m3/s for high, medium, low and very low flows, respectively. The NNE technique performs better in low, medium and high flows because its RMSE values are smaller than those of the WAE and SAE techniques. The SAE technique outperformed the other techniques in the very high flow segment, while the WAE technique provided the best simulation performance in the very low flow segment of the FDC.
The Taylor diagram, first introduced by Taylor (2001), graphically visualizes different performance measures, such as the standard deviation (SD) and the correlation coefficient (r), to evaluate the accuracy of the developed models. Its value is that it combines several model performance indicators in a single graph and statistically quantifies the similarity between the simulated and observed values. Figure 14(a) shows that the best physically based model is SWAT, with r = 0.926 and SD = 1.75, while HEC-HMS is the worst model. The Taylor diagram was also used to evaluate the performance of the model combination techniques, as shown in Figure 14(b). According to Yaseen et al. (2018), a predictive model is perfect when its r value is 1. In Figure 14, based on these performance indicators (r = 0.964 and SD = 1.98), NNE shows the most accurate predictive ability of the ensemble techniques.
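The statistics plotted on a Taylor diagram can be computed as in the following sketch. It uses population standard deviations and the centered RMS difference, which are related by the law of cosines that underlies the diagram. The function name is hypothetical, and any normalization applied to the SD values in Figure 14 is not reproduced here.

```python
import math

def taylor_statistics(obs, sim):
    """Return (sd_obs, sd_sim, r, centered_rms_difference) for one model."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_sim = sum(sim) / n
    sd_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs) / n)
    sd_sim = math.sqrt(sum((s - mean_sim) ** 2 for s in sim) / n)
    covariance = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim)) / n
    r = covariance / (sd_obs * sd_sim)
    # Centered RMS difference: RMS error after removing each series' mean.
    crmsd = math.sqrt(
        sum(((s - mean_sim) - (o - mean_obs)) ** 2 for o, s in zip(obs, sim)) / n
    )
    return sd_obs, sd_sim, r, crmsd
```

The four quantities satisfy crmsd² = sd_sim² + sd_obs² − 2·sd_sim·sd_obs·r, which is why a single point on the diagram encodes all of them.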
CONCLUSIONS
The current study evaluated the performance of the SWAT, HBV and HEC-HMS models for rainfall-runoff simulation in the Katar catchment, Ethiopia. To further improve the simulation performance, the outputs of the single models were combined using the NNE, SAE and WAE techniques. The hydrologic models were calibrated and validated using 18 years of runoff records (2000–2017) at Abura station. Model performance was then evaluated using statistical performance measures (NSE, R2, PBIAS, MAE, RSR and RMSE), time series of observed and simulated discharges and FDCs. The results show that the SWAT model performed better than the HEC-HMS and HBV models, with NSE, R2, RMSE, RSR, MAE and PBIAS values of 0.799, 0.85, 8.68 m3/s, 0.434, 5.658 m3/s and −22.3%, respectively, in the validation phase. The FDC also indicated that the SWAT model could simulate very low (>Q95), high (Q5–Q20) and very high (<Q5) flows better than the HEC-HMS and HBV models. The second best performing single model was HBV, which provided NSE, RMSE, MAE, R2, PBIAS and RSR values of 0.73, 10.37 m3/s, 7.863 m3/s, 0.835, −31.16% and 0.518, respectively, in the validation phase. However, it was outperformed by the weakest overall model (HEC-HMS) in simulating medium and low flows. The ensemble modeling results show that the NNE technique significantly improved on the individual models in both the calibration and validation phases. It improved the performances of SWAT, HBV and HEC-HMS by increasing the NSE values by 12.14, 22.74 and 26.8% and decreasing the RMSE values by 25.81, 37.9 and 40.59%, respectively, in the validation phase. The high ability of NNE to deal with the noise and nonlinear behavior of the rainfall-runoff process could be the reason for its superior performance.
Therefore, this technique is recommended for the Katar catchment to improve on the results of the individual models, and for accurate and reliable water resource assessment in similar regions. This study used only the outputs of physically based models in the ensemble unit. To improve the prediction efficiency and account for the effects of catchment characteristics, future studies should test the inclusion of both artificial intelligence and physically based models in the ensemble framework.
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.
CONFLICT OF INTEREST
The authors declare there is no conflict.