This study evaluates the performance of the soil and water assessment tool (SWAT), the hydrologiska byråns vattenbalansavdelning (HBV) and the hydrologic engineering center-hydrologic modeling system (HEC-HMS) for modeling rainfall-runoff in the data-scarce Katar catchment, Ethiopia. First, the rainfall-runoff process was simulated using the SWAT, HBV and HEC-HMS models individually. Second, simple average ensemble (SAE), weighted average ensemble (WAE) and neural network ensemble (NNE) techniques were developed by combining the results of individual models to improve overall accuracy. Statistical performance measures and flow duration curves (FDCs) were used to compare and evaluate the performance of the models. The results showed that the SWAT model outperformed the HBV and HEC-HMS models with the coefficient of determination (R2) and Nash–Sutcliffe efficiency (NSE) of 0.857 and 0.83 for calibration and 0.85 and 0.799 for validation, respectively. The ensemble result showed that NNE outperformed the SAE and WAE techniques, with NSE and R2 values of 0.924 and 0.925 for calibration and 0.896 and 0.904 for validation, respectively. The NNE technique improved the performance of SWAT, HBV and HEC-HMS by 12.14, 22.7 and 26.8%, respectively, in the validation phase. Overall, the results showed that ensemble modeling is a promising option for accurate modeling of the rainfall-runoff process.

  • Novel ensemble approach was proposed to simulate the complex and dynamic rainfall-runoff process.

  • Spatial, hydrological and meteorological data were used as input for the semi-distributed models.

  • Three ensemble techniques were developed by combining the runoff results of the semi-distributed models to boost the overall efficiency.

  • The proposed ensemble technique significantly improved the modeling performance.

Graphical Abstract


In recent years, water demand in developing countries has increased significantly due to rapid population growth (Miraji et al. 2019). Operational hydrology and water resources management require reliable predictions of the hydrological process such as rainfall-runoff process, groundwater flow and evapotranspiration (Chathuranika et al. 2022). Accurate modeling of the rainfall-runoff process using evapotranspiration, rainfall and other hydrologic data is an essential task because it provides information for water resources management, flood mitigation and warning, land use and hydrology of a watershed (Kisi et al. 2013). The rainfall-runoff models are the mathematical representation of the physical relationships between the various components of the hydrologic cycle within the defined hydrologic unit (Yang et al. 2020a, 2020b). Modeling the rainfall-runoff process, however, is a challenging task because it is a nonlinear, complex outcome of various hydrologic variables and catchment characteristics and therefore cannot be simulated with a simple model.

So far, different models have been developed to study the nonlinear and complicated relationship between rainfall and runoff, which usually shows a great deal of temporal and spatial variability due to the mixed influence of soil type, land use, weather conditions and the number of variables included in the modeling process. These models are generally categorized as data-driven and physically based models. The physically based hydrological models work by constructing a simplified watershed system and mathematical equations that describe the physical processes involved in the movement and storage of water (Young et al. 2017). These models have components that resemble the physical processes and can consider the uneven distribution of rainfall, evapotranspiration and the spatial variation of watershed characteristics in the modeling process (Kisi et al. 2012). In contrast, the data-driven models completely ignore the physics of the process. Physically based hydrological models are extensively used for predicting and simulating the rainfall-runoff process of catchments. Among the different types of physically based models, fully and semi-distributed models are considered to be the best hydrological modeling standard because they can account for the spatial variation of soil, landscape and land use in the catchment, as well as atmospheric influences (Yang et al. 2020a, 2020b). In the past decades, various physically based models, such as the soil and water assessment tool (SWAT) (Arnold et al. 1998; Rostamian et al. 2008), hydrologic engineering center-hydrologic modeling system (HEC-HMS) (Feldman 2000), hydrologiska byråns vattenbalansavdelning (HBV) (Bergström 1992), MIKE-SHE (DHI (Danish Hydraulic Institute) 1999) and GR4J (Perrin et al. 2003), have been used to solve a variety of water resources and environmental problems in different regions (Chathuranika et al. 2022).

The SWAT model is one of the most popular semi-distributed, physically based, watershed-scale hydrological models that can simulate river discharge, suspended sediment and nonpoint source pollutant loads under different climate, land use, management scenarios and soil types (Dash et al. 2020). The SWAT model was used for modeling hydrological processes and showed acceptable results (e.g., Biru & Kumar 2018; Hallouz et al. 2018; Ahmadi et al. 2019; Brighenti et al. 2019; Melaku & Wang 2019; Bizuneh et al. 2021). The HEC-HMS model is a semi-distributed HMS developed by the US Army Corps of Engineers Hydrologic Engineering Center (USACE 2010). The HEC-HMS model has been used in previous studies and shown good performance in hydrological modeling (e.g., Halwatura & Najim 2013; Tassew et al. 2019; Hamdan et al. 2021; Shekar & Vinay 2021). The HBV model is another semi-distributed rainfall-runoff model that works on the daily time step. HBV was used in some hydrological modeling and showed acceptable results (e.g., Ouatiki et al. 2020; Bizuneh et al. 2021; Esmaeili-Gisavandani et al. 2021). None of these studies, however, compared the performance of SWAT, HBV and HEC-HMS models in a semi-arid and subhumid tropical region characterized by highly variable rainfall and topography.

The aforementioned physically based hydrological models are acceptable methods for examining the actual physical process, especially when physical understanding is more important than accurate prediction. However, compared with the data-driven models, the physically based models exhibit a practical limitation in reaching the required accuracy (Young et al. 2017; Nourani et al. 2020). Most catchments in subhumid tropical regions have large variations in daily and seasonal runoff, which contributes to the failure of most hydrologic models to simulate the hydrology of these catchments (Tibangayuka et al. 2022). In addition, previous studies have only evaluated the performance of individual hydrologic models, whose accuracy in a given catchment varies depending on their structure. According to Fenicia et al. (2007), predictive performance can be improved by modifying the already available models through ensemble techniques. Therefore, it is believed that an ensemble method that combines the results of different models could boost the accuracy of the individual models by taking the strengths of the different models. In this context, the study conducted by Bates & Granger (1969) confirmed that combining the results of multiple models by applying different ensemble techniques would produce more accurate results than the individual models. The idea behind the model ensemble is to use the exclusive features of each model in a unique framework to increase the accuracy of modeling (Kiran & Ravi 2008). The study by Cavadias & Morin (1986) is the first in the field of hydrology to use ensemble techniques. Since then, the capability of model combination in improving prediction accuracy has been confirmed in the various hydrological processes (e.g., Noori & Kalin 2016; Esmaeili-Gisavandani et al. 2021; Nourani et al. 2021a, 2021b). 
Thus, the main aim of this study is to evaluate the simulation performance of the SWAT, HBV and HEC-HMS models for rainfall-runoff modeling of the Katar catchment and to develop linear (SAE and WAE) and nonlinear (NNE) ensemble techniques to improve the accuracy of the individual models. The HBV, SWAT and HEC-HMS models were selected because they are freely available, making them useful in developing countries where funding for commercial software is limited. Also, these models are capable of modeling continuous processes in tropical regions (Tibangayuka et al. 2022). The NNE technique was chosen as the nonlinear model combination approach because it is popular and compatible and has shown high predictive accuracy in ensemble studies previously published in other fields (Nourani et al. 2020, 2021a, 2021b). The ensemble techniques developed provide watershed managers, decision-makers and researchers with a fast and accurate method for simulating rainfall-runoff processes. The considered models were applied in the Katar catchment, where the main river drains into Lake Ziway. According to Desta & Fetene (2020), this catchment plays an important role in economic growth and food security. The catchment was selected as a case study for this research because of its importance for the community living in it and the availability of data.

Description of the study area

The study was conducted in the Katar catchment, which is one of the sub-catchments in the Ethiopian Central Rift Valley basin. It is located in the Oromia regional state of Ethiopia. The catchment area includes the Katar River, which flows into Lake Ziway. This river and its tributaries originate in the Lalema and Chilalo mountains and flow into Lake Ziway. The catchment has a complex topography with elevations ranging from 1,635 m (around the gauging station) to 4,167 m above mean sea level. The catchment covers a total area of 3,350 km2. Geographically, the Katar catchment lies between 7.359° and 8.165° North latitude and 38.899° and 39.41° East longitude (Figure 1). The climatic condition of the Katar catchment is characterised by a semi-arid to subhumid climate with an average annual temperature that varies between 16 and 20 °C. Based on 20 years of data, the Katar catchment has the maximum and minimum annual rainfall of 1,231.7 and 729.6 mm, respectively. The catchment attains its minimum discharge of 0.106 m3/s in January and its maximum discharge of 152.033 m3/s in August. There are six meteorological stations within the catchment: Assela, Sagure, Bekoji, Arata, Kulumsa and Ogolcho. The predominant land use in the catchment is smallholder, fragmented agriculture. The rainy season occurs from June to September (contributing about 70% of rainfalls) and the dry period extends from October to May. Similarly, there are six dominant soil types in the catchment (Figure 4).
Figure 1

Map of the study area.


Data used in the study

Physically based models require both spatial and temporal data. The input data used include soil maps, land use land cover (LULC) maps, digital elevation models (DEM), meteorological data and runoff data. The meteorological data used for this study, namely 20 years of daily minimum and maximum temperatures and rainfall from six meteorological stations in the study area, were collected from the Ethiopian National Meteorological Agency. Similarly, 20 years of daily runoff data (at Abura station) needed for calibration and validation of the proposed models were collected from the Ethiopian Ministry of Water, Irrigation and Electricity. The statistics of the daily runoff data at the Abura hydrometry station and the Thiessen polygon average precipitation (from six stations) are shown in Table 1. Of the 20 years of daily data, the first 2 years (1998–1999) were used for model warm-up, 12 years (from 1 January 2006 to 31 December 2017) were used for calibration of the models and 6 years (from 1 January 2000 to 31 December 2005) were used for validation.

Table 1

Descriptive statistics of daily runoff and rainfall data

| Data type | Period | Minimum | Average | Maximum | SD | Coefficient of variation |
|---|---|---|---|---|---|---|
| Discharge (m3/s) | Calibration | 0.106 | 12.05 | 152.033 | 20.062 | 1.6 |
| Discharge (m3/s) | Validation | 1.188 | 13.4 | 110.624 | 20.01 | 1.4935 |
| Discharge (m3/s) | Whole | 0.106 | 12.5 | 152.033 | 19.54 | 1.563 |
| Rainfall (mm) | Calibration | | 2.5145 | 71.2 | 4.806 | 1.911 |
| Rainfall (mm) | Validation | | 2.835 | 61 | 6.352 | 2.24 |
| Rainfall (mm) | Whole | | 2.6215 | 71.2 | 5.3732 | 2.05 |

Methodology

In this study, three physically based models, namely SWAT, HBV and HEC-HMS, were used to simulate the rainfall-runoff process. The study was carried out in two steps (Figure 2). First, the rainfall-runoff process was modeled using the three physically based models. In the second step, three ensemble techniques, namely the neural network ensemble (NNE), weighted average ensemble (WAE) and simple average ensemble (SAE), were developed. The runoff results of each physically based model obtained in the first step were used as inputs to the ensemble techniques. The runoff obtained from the ensemble techniques in the second step was then compared with the results of the physically based models from the first step.
Figure 2

Schematic of the proposed methodology.


Physically based rainfall-runoff models

SWAT model
SWAT is a semi-distributed, physically based hydrological model developed by the USDA Agricultural Research Service (USDA-ARS) to examine land management practice impacts on sediment, agricultural chemicals and water supplies in large river basins. It is a conceptual model, working at sub-daily and daily time steps. To represent the spatial heterogeneity, the SWAT model divides the watershed into sub-watersheds and the sub-watersheds are further divided into hydrological response units (HRUs), which are land areas with unique slopes, soil types and land use combinations. Surface runoff is then estimated separately for each sub-basin and routed to quantify the total surface runoff of the basin. The inputs used for the SWAT model include DEM, LULC maps, soil maps, climate data and runoff data. The general structure of the SWAT model is shown in Figure 3.
Figure 3

Schematic of SWAT model simulation.

Figure 4

LULC map of the Katar catchment.

The major components of SWAT are hydrology, sedimentation, climate, nutrients, crop growth, pesticides, agricultural management and soil temperature. The study by Srinivasan et al. (1998) described these components in detail. The hydrologic component (hydrologic cycle) is expressed in terms of the water balance as follows:
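$$SW_t = SW_0 + \sum_{i=1}^{t}\left(R_{day} - Q_{surf} - E_a - W_{seep} - Q_{gw}\right) \tag{1}$$
where $SW_t$, $SW_0$, $R_{day}$, $Q_{surf}$, $E_a$, $W_{seep}$ and $Q_{gw}$ are the final soil water content, initial soil water content, precipitation, surface runoff, evapotranspiration, percolation and return flow, respectively, on day $i$ (all in mm).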
In the SWAT model, the surface runoff component can be computed by the Green–Ampt infiltration equation or the soil conservation service (SCS)-curve number (CN). For this particular study, the SCS-CN method was used to calculate surface runoff as follows:
$$Q = \frac{(R - 0.2S)^2}{R + 0.8S}, \quad R > 0.2S \tag{2}$$
where $Q$, $R$ and $S$ are the daily runoff depth (mm), rainfall (mm) and retention parameter (mm), respectively.
The retention parameter (S) is related to CN as follows:
$$S = 25.4\left(\frac{1000}{CN} - 10\right) \tag{3}$$
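The SCS-CN calculation in Equations (2) and (3) can be illustrated with a minimal Python sketch. The function names are hypothetical, and the initial abstraction is assumed to be 0.2 S, which is the conventional SCS-CN choice; this is an illustration of the method, not the SWAT source code.

```python
def retention_mm(cn: float) -> float:
    """Retention parameter S (mm) from the curve number, as in Equation (3)."""
    if not 0 < cn <= 100:
        raise ValueError("CN must lie in (0, 100]")
    return 25.4 * (1000.0 / cn - 10.0)


def scs_cn_runoff_mm(rain_mm: float, cn: float) -> float:
    """Daily runoff depth (mm) from daily rainfall depth (mm), as in Equation (2)."""
    s = retention_mm(cn)
    ia = 0.2 * s  # assumed initial abstraction (0.2 S)
    if rain_mm <= ia:
        return 0.0  # all rainfall is abstracted, no runoff is generated
    return (rain_mm - ia) ** 2 / (rain_mm + 0.8 * s)


if __name__ == "__main__":
    # CN = 61.14 is the fitted CN2 value reported in Table 3;
    # the rainfall depths are arbitrary illustration values.
    for rain in (10.0, 40.0, 71.2):
        print(rain, round(scs_cn_runoff_mm(rain, 61.14), 2))
```

A lower CN gives a larger S and therefore less runoff for the same rainfall, which is why CN is typically a highly sensitive calibration parameter.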

Land use affects runoff, erosion, infiltration and evapotranspiration in the catchment. In this study, to prepare the land use map of the Katar catchment, a Landsat image was downloaded from the United States Geological Survey website. ArcGIS 10.3 was used to process the satellite image and generate the required land use information (Figure 4). ArcSWAT (2012) was used to set up the model, whereas the sequential uncertainty fitting (SUFI-2) algorithm of the SWAT-CUP software was used for uncertainty analysis and model calibration (Esmaeili-Gisavandani et al. 2021).

Another input required for the SWAT model is a soil map, which was downloaded from the Food and Agriculture Organization (FAO) database. The downloaded global-scale FAO soil map was clipped to the Katar catchment area, as shown in Figure 5.
Figure 5

Soil map of the study area.

HEC-HMS

The HEC-HMS model is the most widely applicable rainfall-runoff model in which excess (effective) rainfall in the catchment is determined by considering the connected pervious surface characteristics. The direct streamflow is then formed by combining the near-surface flow and overland flows (Young et al. 2017). In this method, the basin is subdivided into sub-basins connected by channel links. In this study, HEC-HMS 4.7.1 was used, where the model components consist of the terrain data manager, basin model manager, meteorological model manager, time series data manager and control specification manager. The DEM, which is one of the inputs for HEC-HMS was processed and clipped to the size of the study area using ArcGIS 10.3. The processed DEM is then imported into HEC-HMS. HEC-HMS 4.7.1 version has a GIS component and it was used for sink preprocessing, drainage preprocessing, stream identification and delineating the elements (dividing the catchment into different sub-catchments).

In this method, runoff is computed by subtracting the volume of water that is transpired or evaporated, intercepted, stored and infiltrated from the total rainfall. HEC-HMS offers about 12 loss methods, some of which are designed for continuous simulation while the others are intended for event simulation. Some of these methods are complex and require inputs that are not easily available. Therefore, the SCS-CN method was used in this study to estimate direct runoff. This method was selected because it is simple, is well supported by empirical data, provides good results and requires only a few variables. According to Mishra & Singh (2004), the SCS-CN method takes into account most runoff-generating characteristics of the watershed, such as land use, soil type, antecedent moisture and hydrologic conditions, using Equation (4).
$$P_e = \frac{(P - I_a)^2}{P - I_a + S} \tag{4}$$
where Pe, Ia, P and S are excess precipitation, initial abstraction, accumulated rainfall depth and maximum potential retention. S is calculated from CN via Equation (3).
In the HEC-HMS model, the transform method converts excess rainfall in the catchment into direct runoff. Among the different transformation methods, the SCS unit hydrograph was used in this study. This method was chosen because it requires only one variable (lag time) as input. Lag time is the time from the center of rainfall excess to the peak of the hydrograph and is calculated for each sub-catchment based on the time of concentration (Tc), as shown in Equation (5).
$$T_{lag} = 0.6\,T_c \tag{5}$$
where Tlag and Tc are lag time and time of concentration, respectively both in minutes.
In the HEC-HMS model, there are various routing methods available. In this study, the Muskingum method was used for streamflow routing because it is a simple and straightforward technique that has been extensively applied in river engineering (Tewolde & Smithers 2006). Also, only two parameters are needed for the Muskingum method: the travel time (K) of the flood wave through the routing reach and the dimensionless weight (X) corresponding to the attenuation of the flood wave on its way through the reach (Tewolde & Smithers 2006; Tassew et al. 2019). These routing parameters in the HEC-HMS model are derived by calibrating the measured discharge hydrograph as follows:
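$$S = K\left[X I + (1 - X)\,O\right] \tag{6}$$
where $S$ is the storage in the routing reach, $I$ and $O$ are the inflow to and outflow from the reach, respectively, and the value of $X$ is in the range of 0–0.5.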
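As an illustration of how the two Muskingum parameters act on a hydrograph, the following minimal Python sketch routes an inflow series through a single reach using the standard finite-difference form of the Muskingum method. The function name, the inflow values and the values of K, X and the time step are hypothetical and are not the calibrated values of this study.

```python
def muskingum_route(inflow, k_hours, x, dt_hours, initial_outflow=None):
    """Route an inflow hydrograph through one reach with the Muskingum method.

    k_hours is the travel time K, x the dimensionless weight X (0 to 0.5)
    and dt_hours the computational time step.
    """
    if not 0.0 <= x <= 0.5:
        raise ValueError("X must lie between 0 and 0.5")
    denom = 2.0 * k_hours * (1.0 - x) + dt_hours
    c0 = (dt_hours - 2.0 * k_hours * x) / denom
    c1 = (dt_hours + 2.0 * k_hours * x) / denom
    c2 = (2.0 * k_hours * (1.0 - x) - dt_hours) / denom  # c0 + c1 + c2 = 1
    outflow = [inflow[0] if initial_outflow is None else initial_outflow]
    for t in range(1, len(inflow)):
        outflow.append(c0 * inflow[t] + c1 * inflow[t - 1] + c2 * outflow[-1])
    return outflow


if __name__ == "__main__":
    # Hypothetical daily inflow hydrograph (m3/s) and parameter values.
    inflow = [10.0, 20.0, 50.0, 30.0, 15.0, 10.0, 10.0, 10.0]
    print([round(q, 2) for q in muskingum_route(inflow, 24.0, 0.2, 24.0)])
```

Because the three coefficients sum to one, a steady inflow passes through unchanged, while a flood peak is delayed and attenuated; smaller X values give stronger attenuation.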
The general structure of the HEC-HMS model for rainfall-runoff modeling is shown in Figure 6.
Figure 6

Schematic of the HEC-HMS model setup and run.

HBV model
HBV is a semi-distributed rainfall-runoff model developed at the Swedish Meteorological and Hydrological Institute (Bergström & Forsman 1973). The model has a simple structure and requires few input variables to simulate the runoff produced in the catchment, as shown in Figure 7. The inputs used in this study were daily temperature, evapotranspiration and precipitation. HBV light, an updated version of the original model, was used. The catchment is divided into sub-catchments based on elevation and vegetation zones; to account for spatial heterogeneity, HBV light allows a maximum of 20 elevation zones and 3 vegetation zones. In this study, the catchment was classified into five elevation zones and three vegetation zones. The automatic Monte Carlo method, which generates random parameter values within a predefined range of the model parameters, was used to determine the most sensitive parameters and the parameter set that gives the best objective function value. HBV can represent the influence of groundwater on runoff and uses delay parameters to represent the catchment response.
Figure 7

Schematic of the HBV model (Esmaeili-Gisavandani et al. 2021).

As shown in Figure 7, the HBV light model consists of four routines: soil moisture (SM) routine (BETA, evapotranspiration limit (LP) and field capacity (FC)), response routine (K0, PERC, UZL, K1, K2), snow routine and routing routine parameter (MAXBAS). The snow routine represents the contribution of snowmelt to runoff formation and is not relevant in this study because there is no snow in the study catchment. The soil routine is based on the parameters such as BETA (β), FC and LP. BETA in the soil routine controls the relative contribution of rainfall to runoff. The soil routine controls the change in groundwater recharge and SM content based on FC and quantity of water derived from the earlier routine (P) as:
$$\frac{\text{recharge}}{P} = \left(\frac{SM}{FC}\right)^{BETA} \tag{7}$$
According to Esmaeili-Gisavandani et al. (2021), if SM/FC is greater than LP, the actual evaporation (ETact) is equal to the potential evaporation (ETpot). Otherwise, ETact is reduced linearly as follows:
$$ET_{act} = ET_{pot} \cdot \min\left(\frac{SM}{FC \cdot LP},\, 1\right) \tag{8}$$
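One daily step of the soil routine described by Equations (7) and (8) can be sketched in Python as follows. The function name and the handling of soil moisture above FC (the excess is passed on as recharge) are simplifying assumptions made for this illustration and do not reproduce the HBV light implementation in detail.

```python
def soil_routine_step(sm, precip, et_pot, fc, lp, beta):
    """Advance soil moisture by one day (all depths in mm).

    Returns the updated soil moisture, the recharge and the actual evaporation.
    """
    # Equation (7): share of incoming water that becomes recharge
    recharge = precip * (sm / fc) ** beta
    # Equation (8): actual evaporation is reduced linearly below LP * FC
    et_act = et_pot * min(sm / (fc * lp), 1.0)
    sm_new = sm + precip - recharge - et_act
    if sm_new > fc:
        # Assumption of this sketch: water above field capacity becomes recharge
        recharge += sm_new - fc
        sm_new = fc
    sm_new = max(sm_new, 0.0)
    return sm_new, recharge, et_act


if __name__ == "__main__":
    # Hypothetical parameter values for illustration only
    print(soil_routine_step(sm=120.0, precip=15.0, et_pot=4.0,
                            fc=200.0, lp=0.7, beta=2.0))
```

With a larger BETA, less of the incoming water becomes recharge while the soil is dry, which is how BETA controls the relative contribution of rainfall to runoff.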
In the HBV model, the catchment under study is represented by a two-reservoir model, and surface runoff is calculated by subtracting evapotranspiration and infiltration from rainfall, which represents the inflow to the first reservoir. The outflows of the first reservoir are the intermediate flow (Q1) and the fast flow (Q0), while the groundwater flow (Q2) is the outflow of the second reservoir. The total groundwater flow (Qgw) is calculated by adding two or three outflows, depending on whether the upper zone storage (SUZ) is below or above the threshold (UZL), as follows:
$$Q_{gw} = K_2 \cdot SLZ + K_1 \cdot SUZ + K_0 \cdot \max(SUZ - UZL,\, 0) \tag{9}$$
where $SLZ$ is the lower zone storage and $K_2$, $K_1$ and $K_0$ are the recession coefficients for Q2, Q1 and Q0, respectively.
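Finally, the simulated runoff (Qsim) is computed using MAXBAS and a triangular weighting function as follows:
$$Q_{sim}(t) = \sum_{i=1}^{MAXBAS} C(i)\, Q_{gw}(t - i + 1) \tag{10}$$
where C(i) is calculated as follows:
$$C(i) = \int_{i-1}^{i} \left[\frac{2}{MAXBAS} - \left|u - \frac{MAXBAS}{2}\right| \frac{4}{MAXBAS^2}\right] du \tag{11}$$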
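The triangular weighting of the routing routine can be sketched in Python by integrating the triangle analytically over each one-day interval. The function names are hypothetical, and the closed-form cumulative function is simply the integral of the triangular weight; this is a sketch of the technique rather than the HBV light code.

```python
import math


def maxbas_weights(maxbas):
    """Triangular weights C(i), i = 1..ceil(MAXBAS); they sum to one."""

    def cumulative(u):
        # Integral of the triangular weighting function from 0 to u
        u = min(max(u, 0.0), maxbas)
        if u <= maxbas / 2.0:
            return 2.0 * u ** 2 / maxbas ** 2
        return 1.0 - 2.0 * (maxbas - u) ** 2 / maxbas ** 2

    n = math.ceil(maxbas)
    return [cumulative(i) - cumulative(i - 1) for i in range(1, n + 1)]


def route_runoff(q_gw, maxbas):
    """Simulated runoff as the weighted sum of current and earlier Qgw values."""
    weights = maxbas_weights(maxbas)
    q_sim = []
    for t in range(len(q_gw)):
        total = 0.0
        for i, c in enumerate(weights, start=1):
            if t - i + 1 >= 0:  # skip days before the start of the series
                total += c * q_gw[t - i + 1]
        q_sim.append(total)
    return q_sim


if __name__ == "__main__":
    # Hypothetical Qgw series and MAXBAS value
    print(maxbas_weights(2.5))
    print(route_runoff([0.0, 6.0, 2.0, 1.0, 0.5], 2.5))
```

With MAXBAS = 1 the single weight equals one and the runoff is not delayed; larger values spread each day's response over several days.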

Model validation and calibration for single models

The first step in hydrological modeling is to identify the most sensitive parameters that have a significant influence on the model output. According to Ouatiki et al. (2020), sensitivity analysis allows the modeler to identify the influence of each parameter in governing the runoff process. The applied physically based models were calibrated and validated using daily discharge data. The observed data from 1 January 2006 to 31 December 2017 were used for calibration and the discharge data from 1 January 2000 to 31 December 2005 were used for validation. In the SWAT model, calibration was performed using the SUFI-2 algorithm of the SWAT-CUP software, which adjusts the model parameters iteratively until the best fit between observed and predicted runoff is achieved. For the HEC-HMS model, the most sensitive parameters were identified during model optimization using a one-at-a-time approach. In this approach, the value of one parameter is changed (in the range of ±25%) while the others are kept constant, and the NSE values between the observed and simulated discharge at the catchment outlet are compared. This technique has been used for sensitivity analysis of the HEC-HMS model in previous studies (Tassew et al. 2019; Fanta & Sime 2022). For the HBV model, sensitive parameter identification and model calibration were conducted using the Monte Carlo optimization method, with an objective function set to find the optimum parameter values within the predefined ranges of the parameters (Bizuneh et al. 2021).
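The one-at-a-time procedure described above can be sketched in Python as follows. Here `toy_model` and its parameters `runoff_coeff` and `baseflow` are hypothetical stand-ins for a full HEC-HMS run, and the ranking by the largest absolute change in NSE is an assumption of this sketch.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency between observed and simulated series."""
    mean_obs = sum(observed) / len(observed)
    num = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    den = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - num / den


def one_at_a_time(model, base_params, observed, change=0.25):
    """Rank parameters by the largest |change in NSE| for a +/-25% perturbation."""
    base_nse = nse(observed, model(base_params))
    sensitivity = {}
    for name, value in base_params.items():
        deltas = []
        for factor in (1.0 - change, 1.0 + change):
            trial = dict(base_params)  # all other parameters stay constant
            trial[name] = value * factor
            deltas.append(abs(base_nse - nse(observed, model(trial))))
        sensitivity[name] = max(deltas)
    return dict(sorted(sensitivity.items(), key=lambda kv: kv[1], reverse=True))


def toy_model(params):
    """Hypothetical stand-in for a hydrological model run."""
    rain = [0.0, 12.0, 30.0, 5.0, 0.0, 18.0, 2.0, 0.0]
    return [params["runoff_coeff"] * r + params["baseflow"] for r in rain]


if __name__ == "__main__":
    base = {"runoff_coeff": 0.4, "baseflow": 1.0}
    observed = toy_model(base)  # synthetic "observations" for the demo
    print(one_at_a_time(toy_model, base, observed))
```

Because each parameter is varied in isolation, this approach is cheap but does not capture interactions between parameters, unlike the global analysis used for SWAT.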

Model evaluation criteria

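The accuracy of hydrological predictive models must be evaluated in both the calibration and validation phases. Dawson et al. (2007) explained and discussed 20 performance evaluation criteria commonly used in hydrological process forecasting. According to Nourani et al. (2018), at least one goodness-of-fit measure and one absolute error measure should be used to sufficiently evaluate model performance. In this study, the performance of the models was evaluated using the root mean square error (RMSE), percent bias (PBIAS), mean absolute error (MAE), determination coefficient (R2), RMSE-observations standard deviation ratio (RSR) and Nash–Sutcliffe efficiency (NSE), as recommended by Moriasi et al. (2007). These performance indices were calculated as follows:
$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(Q_{ob,i} - Q_{pre,i}\right)^2} \tag{12}$$
$$PBIAS = \frac{\sum_{i=1}^{n}\left(Q_{ob,i} - Q_{pre,i}\right)}{\sum_{i=1}^{n} Q_{ob,i}} \times 100 \tag{13}$$
$$MAE = \frac{1}{n}\sum_{i=1}^{n}\left|Q_{ob,i} - Q_{pre,i}\right| \tag{14}$$
$$R^2 = \frac{\left[\sum_{i=1}^{n}\left(Q_{ob,i} - \bar{Q}_{ob}\right)\left(Q_{pre,i} - \bar{Q}_{pre}\right)\right]^2}{\sum_{i=1}^{n}\left(Q_{ob,i} - \bar{Q}_{ob}\right)^2 \sum_{i=1}^{n}\left(Q_{pre,i} - \bar{Q}_{pre}\right)^2} \tag{15}$$
$$RSR = \frac{\sqrt{\sum_{i=1}^{n}\left(Q_{ob,i} - Q_{pre,i}\right)^2}}{\sqrt{\sum_{i=1}^{n}\left(Q_{ob,i} - \bar{Q}_{ob}\right)^2}} \tag{16}$$
$$NSE = 1 - \frac{\sum_{i=1}^{n}\left(Q_{ob,i} - Q_{pre,i}\right)^2}{\sum_{i=1}^{n}\left(Q_{ob,i} - \bar{Q}_{ob}\right)^2} \tag{17}$$
where $Q_{ob}$, $\bar{Q}_{ob}$, $Q_{pre}$ and $n$ are the observed, average observed and predicted runoff values and the number of observations, respectively, and $\bar{Q}_{pre}$ is the average predicted runoff.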
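The six performance indices can be computed with a few lines of Python. This is a minimal sketch using only the standard library; the function names are hypothetical, R2 is taken as the squared Pearson correlation, and PBIAS follows the sign convention of Moriasi et al. (2007), in which positive values indicate underestimation.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def rmse(obs, pre):
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pre)) / len(obs))


def mae(obs, pre):
    return sum(abs(o - p) for o, p in zip(obs, pre)) / len(obs)


def pbias(obs, pre):
    return 100.0 * sum(o - p for o, p in zip(obs, pre)) / sum(obs)


def r2(obs, pre):
    mo, mp = _mean(obs), _mean(pre)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pre))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pre)
    return cov ** 2 / (var_o * var_p)


def rsr(obs, pre):
    mo = _mean(obs)
    err = sum((o - p) ** 2 for o, p in zip(obs, pre))
    return math.sqrt(err) / math.sqrt(sum((o - mo) ** 2 for o in obs))


def nse(obs, pre):
    mo = _mean(obs)
    err = sum((o - p) ** 2 for o, p in zip(obs, pre))
    return 1.0 - err / sum((o - mo) ** 2 for o in obs)


if __name__ == "__main__":
    # Hypothetical observed and predicted discharge values (m3/s)
    observed = [1.0, 2.0, 3.0, 4.0]
    predicted = [2.0, 3.0, 4.0, 5.0]
    for metric in (rmse, mae, pbias, r2, rsr, nse):
        print(metric.__name__, metric(observed, predicted))
```

Note that NSE and RSR are linked by NSE = 1 − RSR², and that a constant offset leaves R2 at 1 while lowering NSE, which is why the two are reported together.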

The accuracy of the models was evaluated based on these performance indicators according to the ranges and interpretations recommended by Moriasi et al. (2007, 2015) as shown in Table 2.

Table 2

The statistical evaluations used to assess the accuracy of the hydrologic models (Moriasi et al. 2015)

| Performance rating | NSE (%) | R2 (%) | PBIAS (%) | RSR |
|---|---|---|---|---|
| Unsatisfactory | ≤ 50 | ≤ 60 | ≥ ±15 | > 0.7 |
| Satisfactory | 50 < NSE ≤ 70 | 60 < R2 ≤ 75 | ±10 ≤ PBIAS < ±15 | 0.6 < RSR ≤ 0.7 |
| Good | 70 < NSE ≤ 80 | 75 < R2 ≤ 85 | ±5 ≤ PBIAS < ±10 | 0.5 < RSR ≤ 0.6 |
| Very good | > 80 | > 85 | < ±5 | 0 < RSR ≤ 0.5 |

Ensemble techniques

It is well known that for a given data set, the performance of one model may be better than the other and if the data set is changed, an opposite result may be obtained. To benefit from the advantages of all the applied models without losing generality, an ensemble technique can be developed using the results of the individual models (Kiran & Ravi 2008). The ensemble technique is a machine learning method that combines the results of different models to increase the overall accuracy of the modeling (Sharghi et al. 2018). According to Kiran & Ravi (2008), ensemble methods are classified into linear (e.g., simple and weighted averaging) and nonlinear ensembles (e.g., artificial neural network (ANN) trained to obtain ensemble results). In this study, due to the simplicity of the technique, nonlinear and linear ensemble techniques were applied using NNE, SAE and WAE. The inputs for the ensemble techniques were the runoff outputs of the individual physically based models. The nonlinear ensemble was created by training the data-driven feed-forward neural network (FFNN) models with the runoff values obtained from each of the physically based models. The general ensemble procedure is described in Figure 8.
Figure 8

Schematic of the ensemble process.


Simple average ensemble

In the SAE method, the arithmetic average of the results of physically based SWAT, HBV and HEC-HMS models is computed as the final predicted runoff values as follows:
$$Q = \frac{1}{N}\sum_{i=1}^{N} Q_i \tag{18}$$
where Q, Qi and N are the output of the SAE, the output of the ith single model (SWAT, HBV and HEC-HMS) and the numbers of single models (N = 3), respectively.

Weighted average ensemble

In WAE modeling, different weights are assigned to the individual models’ outputs based on their relative importance, as follows:
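$$Q = \sum_{i=1}^{N} w_i\, Q_i \tag{19}$$
where $w_i$ is the weight on the output of the $i$th model, which is calculated as
$$w_i = \frac{NSE_i}{\sum_{i=1}^{N} NSE_i} \tag{20}$$
where $NSE_i$ is the NSE of the $i$th model.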
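The two linear ensembles can be sketched in Python as follows. The function names, the runoff values and the NSE scores in the demo are hypothetical illustration values and are not results of this study.

```python
def simple_average(predictions):
    """SAE: arithmetic mean of the model outputs at each time step."""
    n_models = len(predictions)
    return [sum(values) / n_models for values in zip(*predictions)]


def nse_weights(nse_scores):
    """WAE weights: each model's NSE divided by the sum of all NSE values."""
    total = sum(nse_scores)
    return [score / total for score in nse_scores]


def weighted_average(predictions, weights):
    """WAE: weighted sum of the model outputs at each time step."""
    return [sum(w * v for w, v in zip(weights, values))
            for values in zip(*predictions)]


if __name__ == "__main__":
    # Hypothetical runoff series (m3/s) from three single models
    model_a = [10.0, 35.0, 80.0, 22.0]
    model_b = [12.0, 30.0, 70.0, 25.0]
    model_c = [9.0, 40.0, 65.0, 20.0]
    outputs = [model_a, model_b, model_c]
    weights = nse_weights([0.80, 0.70, 0.65])  # hypothetical NSE scores
    print(simple_average(outputs))
    print(weighted_average(outputs, weights))
```

The weights sum to one by construction, so WAE reduces to SAE when all models have the same NSE; the more the NSE values differ, the more WAE leans toward the better model.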

Neural network ensemble

The third model combination technique used in this study was NNE. An ANN is an effective method for processing large amounts of dynamic, nonlinear and noisy data, especially when the physical relationship between input and output is not fully understood. An FFNN was trained with a backpropagation algorithm because it is the most common approach in neural networks used for solving hydrologic problems. Among the different ANN training techniques, this study used Levenberg–Marquardt (LM) due to its fast convergence, as reported by Sahoo et al. (2005). The tangent sigmoid activation function was used for both the hidden and output layers. In NNE, the FFNN was trained using the outputs of the SWAT, HEC-HMS and HBV models as inputs, and the optimum numbers of hidden neurons and epochs were determined by trial and error. The general structure of the ANN is shown in Figure 9. As shown in the figure, the FFNN structure contains input, hidden and output layers. The output of each single model was fed to the input layer and the network was trained using the LM algorithm.
Figure 9

Structure of FFNN.
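A minimal sketch of such a nonlinear ensemble is given below, assuming NumPy and SciPy are available. It fits a one-hidden-layer FFNN with tangent sigmoid (tanh) activations using the Levenberg–Marquardt solver of `scipy.optimize.least_squares`. The synthetic data, the function names, the four hidden neurons and the scaling of runoff to the range 0.1–0.9 are assumptions of this sketch and do not reproduce the configuration used in this study.

```python
import numpy as np
from scipy.optimize import least_squares


def forward(theta, x, n_hidden):
    """One-hidden-layer FFNN with tanh on both the hidden and output layers."""
    n_in = x.shape[1]
    split = n_in * n_hidden
    w1 = theta[:split].reshape(n_in, n_hidden)          # input-to-hidden weights
    b1 = theta[split:split + n_hidden]                  # hidden biases
    w2 = theta[split + n_hidden:split + 2 * n_hidden]   # hidden-to-output weights
    b2 = theta[-1]                                      # output bias
    return np.tanh(np.tanh(x @ w1 + b1) @ w2 + b2)


def train_nne(x, y, n_hidden=4, seed=0):
    """Fit the network weights by Levenberg-Marquardt least squares."""
    rng = np.random.default_rng(seed)
    n_params = x.shape[1] * n_hidden + 2 * n_hidden + 1
    theta0 = rng.normal(scale=0.5, size=n_params)
    fit = least_squares(lambda th: forward(th, x, n_hidden) - y,
                        theta0, method="lm")
    return fit.x


def make_synthetic(n=200, seed=1):
    """Synthetic scaled 'observed' runoff and three imperfect model outputs."""
    rng = np.random.default_rng(seed)
    observed = 0.1 + 0.8 * rng.random(n)
    models = np.column_stack([
        observed + rng.normal(scale=0.05, size=n),
        0.9 * observed + rng.normal(scale=0.08, size=n),
        1.1 * observed + rng.normal(scale=0.10, size=n),
    ])
    return models, observed


if __name__ == "__main__":
    x, y = make_synthetic()
    theta = train_nne(x, y)
    ensemble = forward(theta, x, n_hidden=4)
    print("training MSE:", float(np.mean((ensemble - y) ** 2)))
```

Because the output layer uses tanh, the target runoff has to be scaled into the range of that function before training and rescaled afterwards; in practice, the number of hidden neurons would be varied by trial and error and the fit checked on the validation period.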


The sensitivity and simulation results of the individual models (i.e., SWAT, HEC-HMS and HBV) and the ensemble technique (NNE, WAE and SAE) are discussed accordingly in the following subsections.

Sensitivity analysis result

A DEM with 30 × 30 m resolution was used to configure the SWAT model. The spatial input data for the model were delineated and predefined using the ArcGIS interface. HRUs were defined from catchment characteristics such as soil data, the land use and land cover map and slope. In the SWAT model, the catchment is divided into sub-catchments, which are further divided into HRUs, small elements having unique slope, soil and land use characteristics. In this study, the Katar catchment was divided into 17 sub-catchments and 156 HRUs. The division of the catchment into sub-catchments was based on the similarity of watershed characteristics and the optimum number of HRUs, as recommended by Bizuneh et al. (2021).

In the sensitivity analysis of SWAT model using the SUFI-2 algorithm of SWAT-CUP software, the minimum and maximum values of different parameters were obtained from the literature and then the fitted (optimum) values of the parameters were obtained through iteration until the best agreement between observed and simulated runoff was achieved, using R2 as an objective function (Bizuneh et al. 2021). These parameters were ranked based on their t-stat value during global sensitivity analysis. The list of selected parameters used for the SWAT model calibration and their optimal values are presented in Table 3.

Table 3

Most sensitive parameters optimized value and rank for the SWAT model

| Parameter | Description | Min value | Max value | Fitted value | Rank |
|---|---|---|---|---|---|
| CN2.mgt | SCS runoff CN | 35 | 98 | 61.14 | 1 |
| ALPHA_BF.gw | Base flow alpha factor | | | 0.875 | 2 |
| GW_DELY.gw | Groundwater delay | | 500 | 262 | 3 |
| SOL_K.sol | Saturated hydraulic conductivity | | 2,000 | 41 | |
| GW_REVAP.gw | Groundwater ‘revap’ coefficient | 0.02 | 0.2 | 0.18 | |
| GWQMN.gw | A threshold minimum depth of water in the shallow aquifer for base flow to occur | | 5,000 | 267.5 | |
| SOL_AWC.sol | Available water capacity of the soil layer | | | 0.55 | |
| HRU_SLP.hru | Average slope steepness | | | 0.59 | |
| SURLAG.bsn | Surface runoff lag time | 0.05 | 24 | 0.845 | |

Table 3 shows that CN2 was the most sensitive parameter and that the base flow alpha factor (ALPHA_BF) and groundwater delay (GW_DELY) were the second and third most sensitive parameters, respectively. According to Fanta & Sime (2022), the sensitivity of CN could arise because the factors that govern runoff generation, such as slope, LULC and soil type, are combined into a single CN value. A reduction in CN leads to a reduction in runoff and vice versa. Moreover, the Katar catchment is predominantly agricultural land under improper farming practices, which could affect the natural relationship between runoff and soil infiltration. Therefore, it can reasonably be said that runoff in the catchment is highly dependent on the CN. The high sensitivity of ALPHA_BF and GW_DELY in the SWAT model reveals the importance of groundwater dynamics and aquifer systems in the hydrology of the catchment. Calibration is a procedure to better parameterize a model for a given set of local conditions in order to reduce predictive uncertainty. Similar results were obtained by Esmaeili-Gisavandani et al. (2021) and Fanta & Sime (2022).
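The dependence of runoff on CN can be made concrete with the standard SCS curve number relation. The sketch below assumes the usual metric form, S = 25400/CN − 254 (mm), with initial abstraction Ia = 0.2S; this section does not state the exact form used in the study, and the 50 mm rainfall depth and the function name are hypothetical.

```python
def scs_cn_runoff(p_mm, cn, ia_ratio=0.2):
    """Direct runoff depth (mm) from the SCS curve number method."""
    s = 25400.0 / cn - 254.0      # potential maximum retention (mm)
    ia = ia_ratio * s             # initial abstraction (mm)
    if p_mm <= ia:
        return 0.0                # all rainfall is abstracted, no runoff
    return (p_mm - ia) ** 2 / (p_mm - ia + s)


rain = 50.0  # hypothetical daily rainfall depth (mm)
for cn in (55.0, 61.14, 70.0, 85.0):   # 61.14 is the fitted CN2 in Table 3
    print(cn, round(scs_cn_runoff(rain, cn), 2))
```

For a fixed rainfall depth the computed runoff rises monotonically with CN, which is the behaviour described above.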

The second physically based model used in this study was the HEC-HMS model. Runoff in the HEC-HMS model is simulated by analysing meteorological data through open channel routing (Shekar & Vinay 2021). The model was calibrated and optimised using 18 years of daily observed runoff data. As with the SWAT model, a sensitivity analysis was performed to determine the parameters that most affect the simulated runoff; these are listed in Table 4. The sensitive parameters were identified during model optimisation using a one-at-a-time approach: the value of one parameter was changed while the others were held constant, and the RMSE between the observed and simulated discharge at the catchment outlet was compared. The optimum values of the parameters vary between the sub-basins.

Table 4

Parameter sensitivity rank for the HEC-HMS model

| Parameter | Description | Value range | Rank |
|---|---|---|---|
| CN | SCS_Curve Number | 35–99 | |
| Tlag | Lag time | 0.1–30,000 | |
| Ia | SCS-CN initial abstraction | 0.001–500 | |
| K (h) | Flood wave traveling time | 0.005–150 | |
| x | Weighted coefficient of discharge | 0.005–0.5 | |

Based on the sensitivity analysis result (Table 4), similar to the SWAT model, CN was the most sensitive parameter of the HEC-HMS model, while Tlag and Ia were the second and third most sensitive parameters, respectively. Varying the values of these parameters during the sensitivity analysis resulted in a significant deviation from the previously predicted runoff value. A similar result was obtained in the sensitivity analysis by Fanta & Sime (2022), Tassew et al. (2019) and Zelelew & Melesse (2018), where CN was identified as the most sensitive parameter and Tlag was second.
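The one-at-a-time procedure described for HEC-HMS can be sketched as follows. The toy linear-reservoir model, its two parameters, the perturbation factors and the rainfall series are all hypothetical stand-ins used only to show the mechanics; they are not HEC-HMS components or values from the study.

```python
import math


def rmse(obs, sim):
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def toy_model(rain, params):
    """Hypothetical stand-in for a rainfall-runoff model (not HEC-HMS):
    a linear reservoir fed by a fixed fraction of rainfall."""
    q, out = 0.0, []
    for p in rain:
        q = (1.0 - params["k"]) * q + params["k"] * params["c"] * p
        out.append(q)
    return out


def oat_sensitivity(model, rain, obs, base, factors=(0.8, 0.9, 1.1, 1.2)):
    """Vary one parameter at a time, keep the rest at their base values,
    and record the largest change in RMSE against the observations."""
    base_rmse = rmse(obs, model(rain, base))
    spread = {}
    for name in base:
        changes = []
        for f in factors:
            trial = dict(base)            # copy so the other parameters stay fixed
            trial[name] = base[name] * f
            changes.append(abs(rmse(obs, model(rain, trial)) - base_rmse))
        spread[name] = max(changes)
    # Most sensitive parameter first
    return sorted(spread.items(), key=lambda kv: kv[1], reverse=True)


rain = [0, 12, 30, 5, 0, 0, 18, 25, 3, 0]   # made-up rainfall (mm)
base = {"c": 0.4, "k": 0.5}                 # made-up base parameter values
obs = toy_model(rain, base)                 # synthetic "observations"
ranking = oat_sensitivity(toy_model, rain, obs, base)
print(ranking)
```

A one-at-a-time ranking of this kind ignores interactions between parameters, which is one reason global methods such as the one used for SWAT are often preferred when the computational budget allows.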

The third model used in this study was the semi-distributed conceptual HBV model. Rainfall-runoff simulation using the HBV model requires spatial (e.g., LULC), hydrological (discharge) and climate (temperature and evapotranspiration) data. The climate and hydrological input data for the catchment were prepared in the format required by the HBV model. The LULC classes of the Katar catchment used in the SWAT model were merged into three LULC types according to the similarity of their vegetation characteristics, because the model accepts a maximum of three vegetation zones. The catchment was thus represented in the HBV model by five elevation zones and three vegetation zones. In this study, the sensitive parameters of the HBV model were identified using the Monte Carlo optimization method, with an objective function that generates the optimum parameter values within the predefined range of each parameter, as shown in Table 5. Accordingly, FC, the parameter controlling the contribution of rainfall to runoff (BETA), recession (storage) coefficient 1 (K1) and LP (the soil moisture value above which ETact reaches ETpot) were the most sensitive parameters.

Table 5

Sensitivity result of the HBV model

| Parameter | Unit | Range | Optimum | Rank |
|---|---|---|---|---|
| FC | mm | 100–550 | 177.77 | |
| BETA (β) | – | 1–6 | 2.23 | |
| LP | mm | 0.3–1 | 0.893 | |
| K1 | day−1 | 0.01–0.2 | 0.0116 | |
| UZL | mm | 0–100 | 43.49 | |
| PERC | mm/day | 0–4 | 3.4 | |
| K0 | day−1 | 0.1–0.5 | 0.353 | |
| K2 | day−1 | 0.001–0.1 | 0.063 | |
| MAXBAS | | 1–2.5 | 2.26109 | |

Similar findings were obtained in other studies that used HBV for rainfall-runoff simulation (Abebe et al. 2010; Osuch et al. 2015; Parra et al. 2018; Ouatiki et al. 2020; Bizuneh et al. 2021). The maximum water holding capacity of the soil (FC) is one of the parameters in the soil routine that greatly influences the initiation of runoff. Under wet soil conditions, the contribution of FC to runoff could be high and under dry soil conditions, its contribution could be low. According to Abebe et al. (2010), FC plays a key role in the formulation of the HBV model in partitioning effective precipitation into SM and runoff. A higher FC value means that the water storage capacity of the soil is high and thus a large quantity of water is available for evapotranspiration, and vice versa. This makes FC the most important and sensitive parameter in controlling the contribution of soil to runoff generation. BETA (β), the second most sensitive parameter, is the exponent of the ratio between SM and FC (Equation (7)) and therefore affects the partitioning of net precipitation into SM recharge and runoff. The amount of SM available for evapotranspiration thus depends on this exponent, which explains the strong effect of BETA on the model output. According to Tibangayuka et al. (2022), the sensitivity of BETA and LP also indicates that evaporation rate and precipitation significantly affect the amount of runoff in the catchment.
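The roles of FC, BETA and LP can be illustrated with a single daily step of a simplified soil-moisture routine. The sketch assumes the commonly used HBV form, in which the share of incoming water that becomes recharge is (SM/FC)^BETA and actual evapotranspiration grows linearly with SM until SM reaches LP × FC (LP is treated here as a fraction of FC). The function name, the antecedent soil moisture values and the forcing are hypothetical, and the code is not the HBV implementation used in the study.

```python
def hbv_soil_step(precip, et_pot, sm, fc, beta, lp):
    """One daily step of a simplified HBV-type soil-moisture routine.

    Returns (recharge, et_act, new_sm), all in mm.
    """
    # Share of incoming water that bypasses the soil store
    frac = min(sm / fc, 1.0) ** beta
    recharge = precip * frac
    sm = sm + precip - recharge
    # Any water above field capacity also leaves as recharge
    if sm > fc:
        recharge += sm - fc
        sm = fc
    # Actual ET rises linearly with SM until SM reaches LP * FC
    et_act = et_pot * min(sm / (lp * fc), 1.0)
    et_act = min(et_act, sm)
    return recharge, et_act, sm - et_act


FC, BETA, LP = 177.77, 2.23, 0.893    # optimum values from Table 5
for sm0 in (40.0, 90.0, 160.0):        # hypothetical antecedent soil moisture (mm)
    rech, et, sm1 = hbv_soil_step(precip=20.0, et_pot=4.0, sm=sm0,
                                  fc=FC, beta=BETA, lp=LP)
    print(sm0, round(rech, 2), round(et, 2), round(sm1, 2))
```

With the same 20 mm of precipitation, a wetter soil passes a much larger share on as recharge, and a larger BETA reduces that share for any SM below FC, which is consistent with the sensitivity of both parameters.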

Results of single physically based models

The performance was measured based on the hydrological credibility of the model's output during the validation and calibration phases. The obtained outputs of the developed physically based models to simulate the rainfall-runoff process in the Katar catchment are presented in Table 6.

Table 6

The results of physically based models for rainfall-runoff modeling

| Goodness of fit | SWAT calibration | SWAT validation | HBV calibration | HBV validation | HEC-HMS calibration | HEC-HMS validation |
|---|---|---|---|---|---|---|
| NSE | 0.83 | 0.799 | 0.762 | 0.73 | 0.756 | 0.7065 |
| RMSE (m3/s) | 7.85 | 8.68 | 9.4 | 10.37 | 9.52 | 10.84 |
| MAE (m3/s) | 5.374 | 5.658 | 6.407 | 7.863 | 6.037 | 7.135 |
| R2 | 0.857 | 0.85 | 0.777 | 0.835 | 0.779 | 0.762 |
| PBIAS (%) | −16.1 | −22.3 | −19.3 | −31.16 | −18.4 | −22.5 |
| RSR | 0.407 | 0.434 | 0.488 | 0.518 | 0.494 | 0.542 |

Based on the model performance measures, the SWAT model showed the best performance, with NSE, R2, RSR, RMSE, MAE and PBIAS values of 0.799, 0.85, 0.434, 8.68 m3/s, 5.658 m3/s and −22.3%, respectively, in the validation period. The corresponding values during the calibration period were 0.83, 0.857, 0.407, 7.85 m3/s, 5.374 m3/s and −16.1%, respectively. According to the Moriasi et al. (2015) model performance rating system (for the daily time step), the SWAT model showed a good performance based on the NSE value and a very good performance based on the values of R2 and RSR. According to Tibangayuka et al. (2022), model performance is good when the RMSE value is less than half the SD of the observed data. In this regard, the SWAT model provided good predictive performance. However, it showed unsatisfactory results with regard to its PBIAS values in both the validation and calibration phases.
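For reference, the goodness-of-fit measures reported in Table 6 can be computed as in the sketch below. It assumes the commonly used definitions, with PBIAS following the Moriasi et al. (2007) sign convention (negative values mean overestimation) and RSR taken as RMSE divided by the population standard deviation of the observations; the short observed and simulated series are made up.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def nse(obs, sim):
    """Nash-Sutcliffe efficiency."""
    m = mean(obs)
    return 1.0 - (sum((o - s) ** 2 for o, s in zip(obs, sim))
                  / sum((o - m) ** 2 for o in obs))


def rmse(obs, sim):
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def mae(obs, sim):
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)


def pbias(obs, sim):
    """Percent bias; negative values indicate overestimation."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)


def rsr(obs, sim):
    """RMSE divided by the standard deviation of the observations."""
    m = mean(obs)
    sd_obs = math.sqrt(sum((o - m) ** 2 for o in obs) / len(obs))
    return rmse(obs, sim) / sd_obs


def r2(obs, sim):
    """Coefficient of determination as the squared Pearson correlation."""
    mo, ms = mean(obs), mean(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov ** 2 / (var_o * var_s)


obs = [2.0, 4.0, 6.0, 8.0, 10.0]    # made-up observed runoff (m3/s)
sim = [3.0, 5.0, 6.0, 9.0, 11.0]    # made-up simulated runoff (m3/s)
for f in (nse, rmse, mae, pbias, rsr, r2):
    print(f.__name__, round(f(obs, sim), 4))
```

With these definitions RSR² = 1 − NSE, so the two measures carry the same information; for example, the HEC-HMS calibration NSE of 0.756 corresponds to an RSR of about 0.494, as listed in Table 6.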

The HBV model, for its part, simulated the daily runoff well in both the calibration and validation phases. As shown in Table 6, the HBV model provided an NSE value of 0.732, an RSR value of 0.518 and an R2 value of 0.835 in the validation phase, which is a good performance based on the Moriasi et al. (2015) performance rating criteria. However, this model gave unsatisfactory results based on its PBIAS value. The third semi-distributed model, HEC-HMS, provided a good performance with NSE values of 0.756 and 0.7065, RSR values of 0.494 and 0.542 and R2 values of 0.779 and 0.762 in the calibration and validation phases, respectively. Similar to the HBV model, the HEC-HMS model led to unsatisfactory results with regard to the PBIAS value (−22.5% in the validation phase). Table 6 shows that all three models yielded negative PBIAS values, indicating overestimation according to Moriasi et al. (2007). In this study, it was found that the SWAT, HEC-HMS and HBV models are all suitable for simulating rainfall-runoff in the catchment. However, on all of the statistical performance measures, the SWAT model gave the best result in simulating the rainfall-runoff process over the Katar catchment compared with the HBV and HEC-HMS models (Table 6). The better performance of the SWAT model could be due to its ability to discretize the study catchment into more detailed sub-catchments with uniform hydrologic and spatial characteristics (HRUs).

The findings of this study were compared with the results obtained by Bizuneh et al. (2021), Shekar & Vinay (2021) and Temesgen Ayalew (2019) and were found to be fairly similar. For example, Shekar & Vinay (2021) applied the SWAT and HEC-HMS models in their study of the subhumid tropical Hemavathi catchment in India. They reported that the SWAT model performed better, with R2 values of 0.81 and 0.85 and NSE values of 0.76 and 0.82 in the calibration and validation phases, respectively. Bizuneh et al. (2021) also compared the performances of the SWAT and HBV models in three watersheds of the Upper Blue Nile River basin in Ethiopia and reported that the SWAT model provided NSE values of 0.81 and 0.8 and R2 values of 0.85 and 0.81 in the calibration and validation phases, respectively. That study also reported that the HBV model provided NSE values of up to 0.81 and 0.63 and R2 values of 0.82 and 0.72 in the calibration and validation phases, respectively. The statistical performance indices of HBV obtained in this study were similar to those obtained by Temesgen Ayalew (2019) and Tibangayuka et al. (2022), who reported the suitability of the HBV model in tropical catchments.

Different researchers have used different model performance measures, including graphical, statistical or a combination of both. According to Harmel et al. (2014), using both statistical and graphical performance measures is important for a robust assessment of model performance. Some statistical performance measures (e.g., NSE) are insensitive to systematic error and can score well even when low values are poorly fitted (Moriasi et al. 2015). In this case, the graphical measures provide additional evidence as to where the model's performance is inadequate. Thus, the performances of the SWAT, HBV and HEC-HMS models in simulating the rainfall-runoff process of the Katar catchment were evaluated using scatter plots and time series, as shown in Figure 10. The time series of daily observed and simulated runoff for the validation period (2000–2005) using the HEC-HMS, HBV and SWAT models are shown in Figure 10(a), 10(c) and 10(e), respectively. From the figures, it is clear that the physically based models (especially the SWAT model) were able to simulate the pattern of the observed hydrograph quite well, although with some deviations. As can be seen in Figure 10, the highest observed daily runoff (110.624 m3/s) at the Abura station during the validation phase (2000–2005) occurred on 9 September 2004, while the runoff estimated by the SWAT (Figure 10(e)) and HBV (Figure 10(c)) models was 103.158 and 72.69 m3/s, respectively. Thus, the SWAT and HBV models underestimated the peak discharge by 6.75 and 34.3%, respectively. In the calibration phase, the SWAT and HBV models underestimated peak flow by 4.86 and 11.58%, respectively. Although it could not reproduce the peak exactly, the SWAT model provided a value very close to the observed peak runoff. Several studies have also reported underestimation of daily peak flows by the SWAT model (Chathuranika et al. 2022; Fanta & Sime 2022) and the HBV model (Bizuneh et al. 2021; Tibangayuka et al. 2022). According to Tibangayuka et al. (2022), one possible reason for the underestimation of peak runoff by a hydrological model is that the computed areal precipitation may not represent the catchment well, since most stations in the catchment are unevenly distributed.
Figure 10

Time series of observed runoff versus (a) HEC-HMS; (c) HBV and (e) SWAT; and scatter plot of observed runoff versus (b) HEC-HMS; (d) HBV and (f) SWAT models in the validation phase.


The HEC-HMS model (Figure 10(a)) simulated a peak of 119.9 m3/s, overestimating the maximum observed runoff in the validation phase by 8.4%. Similarly, in the calibration phase, the maximum runoff (152.033 m3/s) occurred on 12 August 2014, while HEC-HMS predicted 177.188 m3/s, an overestimate of 16.55%. According to Fanta & Sime (2022), using the SCS-CN loss method of the HEC-HMS model in tropical regions could lead to an overestimation of peak runoff. Moreover, the Katar catchment is predominantly agricultural, which may result in the simulated runoff being higher than the observed runoff. The overestimation of runoff by the HEC-HMS model has been reported in many studies (e.g., Abushandi & Merkel 2013; Bhuiyan et al. 2017; Fanta & Sime 2022).

In addition to the time series, the performances of the SWAT (Figure 10(f)), HBV (Figure 10(d)) and HEC-HMS (Figure 10(b)) models for rainfall-runoff simulation were compared using scatter plots. The best agreement with the observed runoff is seen for the SWAT and HBV models, as their points are less spread and lie close to the 1:1 line. The HEC-HMS model shows the worst performance: it has the highest RMSE (10.84 m3/s) and the lowest NSE (0.7065), and its points in Figure 10(b) are more scattered and farther from the 1:1 line.

According to Jimeno-Sáez et al. (2018), statistical model performance indices and time series cannot provide an explicit comparison of model accuracy across different value intervals. This problem can be addressed using the flow duration curve (FDC) by analysing flows in different segments. Thus, in this study, in addition to the statistical measures, the performance of the applied models was also analysed using the FDC by dividing the hydrograph into five segments: very low flow (<Q95), low flow (Q70–Q95), medium flow (Q20–Q70), high flow (Q5–Q20) and very high flow (>Q5), as recommended by Jimeno-Sáez et al. (2018), where Qi represents the flow (Q) with an exceedance probability of i%. The performances of the HEC-HMS, SWAT and HBV models in the different segments of the hydrograph in the validation phase are shown in Figure 11. The FDCs in Figure 11 show that the SWAT model generally performs better in the high and very low flow segments and the HEC-HMS model performs better in the medium flow segment.
Figure 11

FDCs of observed and simulated runoff in the validation phase.


A detailed analysis of model performance based on the RMSE for each segment of the hydrograph was also performed, as shown in Table 7. The RMSE values indicate that the SWAT model was better at simulating very low and high flows, the HEC-HMS model was better at simulating low and medium flows and both the HBV and SWAT models gave almost the same performance in simulating the very high flow range.
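A segment-wise RMSE of this kind can be sketched as follows. The code assigns each day to a segment according to the exceedance probability of its observed flow and then computes the RMSE over the days in each segment. The Weibull plotting position, the handling of segment boundaries and the sample data are assumptions made for illustration; the study's exact procedure is not given in this section.

```python
import math

# (label, lower and upper exceedance probability in %)
SEGMENTS = [
    ("very high", 0.0, 5.0),
    ("high", 5.0, 20.0),
    ("medium", 20.0, 70.0),
    ("low", 70.0, 95.0),
    ("very low", 95.0, 100.0),
]


def exceedance_probability(obs):
    """Weibull plotting position (%) of each observation, ranked from largest."""
    n = len(obs)
    order = sorted(range(n), key=lambda i: obs[i], reverse=True)
    prob = [0.0] * n
    for rank, i in enumerate(order, start=1):
        prob[i] = 100.0 * rank / (n + 1)
    return prob


def segment_rmse(obs, sim):
    """RMSE of sim against obs within each flow-duration segment of obs."""
    prob = exceedance_probability(obs)
    result = {}
    for label, lo, hi in SEGMENTS:
        pairs = [(o, s) for o, s, p in zip(obs, sim, prob) if lo < p <= hi]
        if pairs:  # skip segments that contain no days
            result[label] = math.sqrt(
                sum((o - s) ** 2 for o, s in pairs) / len(pairs))
    return result


# Made-up daily flows (m3/s) and a simulation that is 10% too low everywhere
obs = [1.2, 1.4, 1.5, 1.7, 1.9, 2.2, 2.4, 3.0, 4.5, 6.0,
       8.0, 11.0, 15.0, 18.0, 22.0, 30.0, 41.0, 58.0, 75.0, 110.0]
sim = [0.9 * o for o in obs]
for label, value in segment_rmse(obs, sim).items():
    print(label, round(value, 3))
```

Because RMSE is expressed in flow units, values from different segments are not directly comparable with one another; the comparison is meaningful across models within the same segment.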

Table 7

RMSE values (m3/s) of the physically based models in each segment of the hydrograph

| Hydrograph phase | Range (m3/s) | HEC-HMS | SWAT | HBV |
|---|---|---|---|---|
| Very low flow | 1.188–1.559 | 0.233 | 0.193 | 0.854 |
| Low flow | 1.641–2.415 | 0.342 | 0.828 | 1.556 |
| Medium flow | 2.526–18.788 | 5.49 | 5.622 | 9.001 |
| High flow | 19.244–60.032 | 9.353 | 8.954 | 10.001 |
| Very high flow | 61.048–110.624 | 5.371 | 4.508 | 4.601 |

As shown in Table 6, the SWAT model outperformed the HEC-HMS and HBV models based on statistical performance measures. The hydrographs also show that the HEC-HMS model better simulates medium flows and the SWAT model better simulates very low and high flows, based on the pattern of observed hydrographs. The time series plots and the selected FDC segments show that different models may provide different prediction performances at different points of the time series. A more accurate prediction of runoff could therefore be achieved by combining the outputs of the individual physically based models through ensemble techniques. Accordingly, three ensemble techniques for rainfall-runoff modeling were developed in this study to boost the overall modeling accuracy.

Results of ensemble techniques

To improve the prediction efficiency of the single physically based models, three ensemble techniques were developed using the results of the SWAT, HEC-HMS and HBV models as inputs. The results of the nonlinear and linear ensemble techniques for runoff simulation are shown in Table 8. The SAE value was obtained by simply taking the arithmetic average of the physically based models’ outputs. In the WAE technique, on the other hand, each single model's output was assigned a weight based on its NSE value in the validation phase. The weights assigned to the SWAT, HBV and HEC-HMS models were 0.3575, 0.3251 and 0.314, respectively.
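The two linear ensembles can be sketched in a few lines. The weighting rule below, which makes each weight proportional to the model's NSE and normalises the weights to sum to one, is an assumption about how NSE-based weights might be formed; the three-day runoff series are made up and the function names are illustrative.

```python
def simple_average(preds):
    """SAE: arithmetic mean of the model outputs at each time step."""
    n_models = len(preds)
    return [sum(step) / n_models for step in zip(*preds)]


def nse_weights(nse_values):
    """Weights proportional to each model's NSE, normalised to sum to 1."""
    total = sum(nse_values)
    return [v / total for v in nse_values]


def weighted_average(preds, weights):
    """WAE: weighted sum of the model outputs at each time step."""
    return [sum(w * q for w, q in zip(weights, step)) for step in zip(*preds)]


# Validation-phase NSE values from Table 6 (SWAT, HBV, HEC-HMS)
weights = nse_weights([0.799, 0.73, 0.7065])

# Made-up simulated runoff (m3/s) for three days from each model
swat = [10.0, 35.0, 80.0]
hbv = [12.0, 30.0, 60.0]
hec_hms = [11.0, 40.0, 95.0]

sae = simple_average([swat, hbv, hec_hms])
wae = weighted_average([swat, hbv, hec_hms], weights)
print([round(w, 4) for w in weights])
print([round(q, 2) for q in sae])
print([round(q, 2) for q in wae])
```

Normalising the Table 6 NSE values in this way gives weights of roughly 0.357, 0.327 and 0.316, which are close to but not identical with the weights reported above, so the authors' exact weighting formula may differ slightly from this sketch.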

Table 8

Results of the proposed ensemble techniques for rainfall-runoff modeling

| Goodness of fit | SAE calibration | SAE validation | WAE calibration | WAE validation | NNE calibration | NNE validation |
|---|---|---|---|---|---|---|
| NSE | 0.829 | 0.792 | 0.846 | 0.818 | 0.924 | 0.896 |
| R2 | 0.918 | 0.877 | 0.919 | 0.878 | 0.925 | 0.904 |
| MAE (m3/s) | 5.807 | 6.008 | 4.204 | 5.985 | 3.244 | 3.726 |
| RSR | 0.398 | 0.426 | 0.3766 | 0.4225 | 0.2717 | 0.3216 |
| RMSE (m3/s) | 7.97 | 8.53 | 7.54 | 8.46 | 5.44 | 6.44 |
| PBIAS (%) | −22 | −34 | −17.9 | −33.9 | −3.94 | −9.1 |

The NNE technique was the third ensemble technique used in this study. It was chosen as the nonlinear ensemble technique because of its simplicity and popularity and, most importantly, its better performance in model combination studies reported in other fields. For example, NNE has been used for vehicular traffic noise (Nourani et al. 2020) and wastewater treatment (Nourani et al. 2018), where it was reported to increase the accuracy of the single models considerably. In this study, the best NNE result (Table 8) was achieved with six hidden neurons.

As shown in Table 8, all developed ensemble techniques showed very good performance in both the calibration and validation phases in terms of their NSE, R2, RSR and RMSE values according to Moriasi et al. (2007, 2015). Similar to the single models, the linear ensemble techniques (WAE and SAE) showed unsatisfactory performance in terms of the PBIAS value. Based on the NSE value, the SAE technique improved the performance of the HEC-HMS model by 9.65 and 12.1% and that of the HBV model by 8.8 and 8.5% in the calibration and validation phases, respectively. The SAE technique thus improved the performance of the individual models except for the SWAT model. According to Nourani et al. (2021a, 2021b), this could be because an arithmetic average always lies between the lowest and highest values in the dataset being averaged. The results of the ensemble techniques (Table 8) show that the difference in performance between the linear ensemble models (SAE and WAE) is not large for most statistical performance measures, but the WAE technique was slightly better than SAE. This could be due to the weighting of the inputs of this technique according to their relative importance. The WAE technique improved the HEC-HMS, SWAT and HBV models by increasing the NSE values by 15.75, 2.4 and 12%, respectively, in the validation phase.

Table 9

RMSE values (m3/s) of the ensemble technique in each segment of the hydrograph

| Hydrograph phase | Range (m3/s) | SAE | WAE | NNE |
|---|---|---|---|---|
| Very low flow | 1.188–1.559 | 0.279 | 0.278 | 0.643 |
| Low flow | 1.641–2.415 | 1.163 | 1.152 | 0.538 |
| Medium flow | 2.526–18.788 | 6.712 | 6.69 | 2.371 |
| High flow | 19.244–60.032 | 9.866 | 9.839 | 6.515 |
| Very high flow | 61.048–110.624 | 6.44 | 6.585 | 8.43 |

The best ensemble technique (NNE) improved the performances of SWAT, HBV and HEC-HMS by 12.14, 22.7 and 26.8%, respectively, in the validation phase, based on the NSE value. The RMSE and MAE values of the NNE technique were lower than those of the WAE, SAE and single models (HEC-HMS, SWAT and HBV) in both the calibration and validation phases, indicating the error reduction achieved by the NNE technique. The NNE technique reduced the RMSE of the SWAT, HEC-HMS and HBV models by 25.81, 40.59 and 37.9%, respectively, during the validation phase. Among the ensemble techniques used in this study, the NNE technique provided the best result, with very good performance based on all statistical performance measures. This could be due to the ability of the FFNN-based ensemble (NNE) to handle the nonlinear, complex and dynamic nature of the rainfall-runoff process. According to Sharghi et al. (2018), the use of FFNN as a nonlinear ensemble technique simulates the nonlinear behavior of the phenomenon more accurately than the SAE and WAE. The findings for the three ensemble techniques in this study were compared with the results obtained by Nourani et al. (2021a, 2021b), Elkiran et al. (2019) and Sharghi et al. (2018) and were found to be fairly similar.
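The improvement percentages quoted above follow from the validation-phase values in Tables 6 and 8, as the short check below shows. The relative-change formulas (increase in NSE and reduction in RMSE, both relative to the single model) are the assumed definitions, and the function names are illustrative.

```python
def pct_increase(old, new):
    """Relative increase of new over old, in percent."""
    return 100.0 * (new - old) / old


def pct_reduction(old, new):
    """Relative reduction from old to new, in percent."""
    return 100.0 * (old - new) / old


nne_nse, nne_rmse = 0.896, 6.44   # NNE, validation phase (Table 8)
singles = {                       # validation NSE and RMSE in m3/s (Table 6)
    "SWAT": (0.799, 8.68),
    "HBV": (0.73, 10.37),
    "HEC-HMS": (0.7065, 10.84),
}
for name, (nse, rmse) in singles.items():
    print(name,
          round(pct_increase(nse, nne_nse), 2),
          round(pct_reduction(rmse, nne_rmse), 2))
```

The computed values agree with the reported NSE gains and RMSE reductions to within rounding.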

Furthermore, the time series plots, another means of visually comparing the observed runoff with the runoff simulated by the ensemble techniques, are shown in Figure 12. From the figure, it can be seen that the linear ensembles (SAE and WAE) led to less accurate results than the nonlinear ensemble. The NNE method agrees better with the observed data, while greater deviations between simulated and observed runoff are seen for the SAE and WAE methods. The NNE technique also reproduces the pattern of the observed hydrograph better than the linear ensemble techniques. As can be seen in Figure 12, however, the NNE technique was able to capture a few peak discharges but failed for most. The limited ability of ANN-based models to simulate peak discharges was also reported by Tibangayuka et al. (2022). The peak flows were better simulated using the WAE and SAE techniques.
Figure 12

Time series plot of observed and ensemble techniques in the validation phase.

Peak flows were not adequately captured by either the single models or the ensemble techniques. As for the single models, FDCs were also used to evaluate the performance of the ensemble techniques and their errors in the different segments of the hydrograph in the validation phase (Figure 13). As shown in the figure, the NNE provided better performance in the high, medium and low flow segments of the hydrograph. For example, the Q5 simulated by the SAE, WAE and NNE differed from the observed Q5 by 7.07, 6.65 and 0.57%, respectively. The simulated Q20 from NNE, WAE and SAE deviated from the observed Q20 by 29.7, 39.57 and 39.08%, respectively. The simulated Q70 from NNE, WAE and SAE differed from the observed Q70 by 19.2, 43.97 and 44.32%, respectively.
Figure 13

FDCs of the observed runoff and simulated runoff by ensemble techniques in the validation phase.


In addition, the RMSE of the developed ensemble techniques was evaluated for the different segments of the FDCs in the validation phase (Table 9). The RMSE values for very high flow were 6.585, 8.43 and 6.44 m3/s for WAE, NNE and SAE, respectively. For the NNE technique, the RMSE values for high, medium, low and very low flows were 6.515, 2.371, 0.538 and 0.643 m3/s, respectively. Similarly, the RMSE values from WAE for high, medium, low and very low flows were 9.839, 6.69, 1.152 and 0.278 m3/s, respectively. The other linear ensemble technique, SAE, yielded RMSE values of 9.866, 6.712, 1.163 and 0.279 m3/s for high, medium, low and very low flows, respectively. The NNE technique provides better performance in low, medium and high flows because its RMSE values are smaller than those of the WAE and SAE techniques. The SAE technique outperformed the other techniques in the very high flow segment, and the WAE technique provided the best simulation performance in the very low flow segment of the FDC.

As an alternative to the statistical model performance measures (e.g., NSE and RMSE) and the other graphical comparison methods, the Taylor diagram was used in this study to evaluate the performances of the single physically based models and the ensemble techniques, as shown in Figure 14.
Figure 14

Taylor diagram of (a) single models and (b) ensemble technique.


The Taylor diagram, first introduced by Taylor (2001), graphically visualises different performance measures such as SD and correlation coefficient (r) to evaluate the accuracy of the models developed. The importance of the Taylor diagram is that it combines various model performance indicators in a single graph and statistically quantifies the similarity between the simulated and actual values. Figure 14(a) shows that the best physically based model is SWAT, with r = 0.926 and SD = 1.75, while HEC-HMS is the worst. The Taylor diagram was also used to evaluate the performance of the model combination techniques, as shown in Figure 14(b). According to Yaseen et al. (2018), a predictive model is perfect when its r value is 1. In Figure 14(b), based on the mentioned performance indicators (r = 0.964 and SD = 1.98), NNE shows the most accurate predictive ability among the ensemble techniques.
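The statistics plotted on a Taylor diagram can be computed as in the sketch below, which also checks the geometric relation from Taylor (2001) that lets them share one plot: E'² = σ_obs² + σ_sim² − 2 σ_obs σ_sim r, where E' is the centred root-mean-square difference. The sample series and the function name are made up for illustration.

```python
import math


def taylor_stats(obs, sim):
    """Return (sd_obs, sd_sim, r, centred_rmsd) for a Taylor diagram."""
    n = len(obs)
    mo, ms = sum(obs) / n, sum(sim) / n
    sd_o = math.sqrt(sum((o - mo) ** 2 for o in obs) / n)
    sd_s = math.sqrt(sum((s - ms) ** 2 for s in sim) / n)
    r = sum((o - mo) * (s - ms) for o, s in zip(obs, sim)) / (n * sd_o * sd_s)
    # RMS difference after removing the mean of each series
    crmsd = math.sqrt(
        sum(((s - ms) - (o - mo)) ** 2 for o, s in zip(obs, sim)) / n)
    return sd_o, sd_s, r, crmsd


obs = [3.0, 8.0, 20.0, 45.0, 30.0, 12.0]   # made-up observed runoff (m3/s)
sim = [4.0, 7.0, 24.0, 38.0, 33.0, 10.0]   # made-up simulated runoff (m3/s)
sd_o, sd_s, r, e = taylor_stats(obs, sim)

# Law-of-cosines identity that underlies the diagram's geometry
lhs = e ** 2
rhs = sd_o ** 2 + sd_s ** 2 - 2.0 * sd_o * sd_s * r
print(round(r, 3), round(sd_o, 3), round(sd_s, 3), round(e, 3))
```

A model plots closest to the observed point when its correlation is high and its standard deviation is close to that of the observations, so a high r alone does not guarantee a good match.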

The current study evaluated the performances of the SWAT, HBV and HEC-HMS models for rainfall-runoff simulation in the Katar catchment in Ethiopia. To further improve the simulation performance, the outputs of the single models were combined using the NNE, SAE and WAE techniques. Calibration and validation of the developed hydrologic models were performed using 18-year runoff records (2000–2017) at Abura station. The performance of the models was then evaluated using statistical performance measures (NSE, R2, PBIAS, MAE, RSR and RMSE), time series of observed and simulated discharges and FDCs. The results show that the SWAT model performed better than the HEC-HMS and HBV models, with NSE, R2, RMSE, RSR, MAE and PBIAS values of 0.799, 0.85, 8.68 m3/s, 0.434, 5.658 m3/s and −22.3%, respectively, in the validation phase. The FDC also indicated that the SWAT model could simulate very low (<Q95), high (Q5–Q20) and very high (>Q5) flows better than the HEC-HMS and HBV models. The second best performing single model was HBV, which provided NSE, RMSE, MAE, R2, PBIAS and RSR values of 0.73, 10.37 m3/s, 7.863 m3/s, 0.835, −31.16% and 0.518, respectively, in the validation phase. However, this model was outperformed by the poorest-performing model overall (HEC-HMS) in simulating the medium and low flows. The results of ensemble modeling show that the NNE technique significantly improved on the individual models in both the validation and calibration phases. It improved the performances of SWAT, HBV and HEC-HMS by increasing the NSE values by 12.14, 22.74 and 26.8% and decreasing the RMSE values by 25.81, 37.9 and 40.59%, respectively, in the validation phase. The high ability of NNE to deal with the noise and nonlinear behavior of the rainfall-runoff process could be the reason for its superior performance. Therefore, this technique is recommended in the Katar catchment to improve the results of the individual models, as well as for accurate and reliable water resource assessment in similar regions. In addition, this study used only the outputs of physically based models in the ensemble unit. To improve the prediction efficiency and consider the effects of catchment characteristics, future studies should test the inclusion of both artificial intelligence and physically based models in the ensemble framework.

All relevant data are included in the paper or its Supplementary Information.

The authors declare there is no conflict.

Abebe N. A., Ogden F. L. & Pradhan N. R. 2010 Sensitivity and uncertainty analysis of the conceptual HBV rainfall-runoff model: implications for parameter estimation. Journal of Hydrology 389 (3–4), 301–310. https://doi.org/10.1016/j.jhydrol.2010.06.007.
Abushandi E. & Merkel B. 2013 Modelling rainfall runoff relations using HEC-HMS and IHACRES for a single rain event in an arid region of Jordan. Water Resources Management 27 (7), 2391–2409. https://doi.org/10.1007/s11269-013-0293-4.
Ahmadi M., Moeini A., Ahmadi H., Motamedvaziri B. & Zehtabiyan G. R. 2019 Comparison of the performance of SWAT, IHACRES and artificial neural networks models in rainfall-runoff simulation (case study: Kan watershed, Iran). Physics and Chemistry of the Earth 111, 65–77. https://doi.org/10.1016/j.pce.2019.05.002.
Arnold J. G., Srinivasan R., Muttiah R. S. & Williams J. R. 1998 Large area hydrologic modeling and assessment – part 1: model development. Journal of the American Water Resources Association 34 (1), 73–89.
Bates J. M. & Granger C. W. J. 1969 The combination of forecasts. Journal of the Operational Research Society 20, 451–468.
Bergstorm S. & Forsman A. 1973 Development of a conceptual deterministic rainfall-runoff model. Nordic Hydrology 4 (3), 17–27.
Bergström S. 1992 The HBV model: its structure and applications, Swedish Meteorological and Hydrological Institute (SMHI). Hydrology, Norrköping 1–33.
Bhuiyan H. A. K. M., McNairn H., Powers J. & Merzouki A. 2017 Application of HEC-HMS in a cold region watershed and use of RADARSAT-2 soil moisture in initializing the model. Hydrology 4 (1), 1–19. https://doi.org/10.3390/hydrology4010009.
Biru Z. & Kumar D. 2018 Calibration and validation of SWAT model using stream flow and sediment load for Mojo watershed, Ethiopia. Sustainable Water Resources Management 4 (4), 937–949. https://doi.org/10.1007/s40899-017-0189-1.
Bizuneh B. B., Moges M. A., Sinshaw B. G. & Kerebih M. S. 2021 SWAT and HBV models’ response to streamflow estimation in the upper Blue Nile Basin, Ethiopia. Water-Energy Nexus 4, 41–53. https://doi.org/10.1016/j.wen.2021.03.001.
Brighenti T. M., Bonumá N. B., Grison F., Mota A. de A., Kobiyama M. & Chaffe P. L. B. 2019 Two calibration methods for modeling streamflow and suspended sediment with the SWAT model. Ecological Engineering 127 (November 2018), 103–113. https://doi.org/10.1016/j.ecoleng.2018.11.007.
Chathuranika
I. M.
,
Gunathilake
M. B.
,
Baddewela
P. K.
,
Sachinthanie
E.
,
Babel
M. S.
,
Shrestha
S.
,
Jha
M. K.
&
Rathnayake
U. S.
2022
Comparison of two hydrological models, HEC-HMS and SWAT in runoff estimation: application to Huai Bang Sai Tropical Watershed, Thailand
.
Fluids
7
(
8
),
267
.
https://doi.org/10.3390/fluids7080267.
Dash
S. S.
,
Sahoo
B.
&
Raghuwanshi
N. S.
2020
A novel embedded pothole module for Soil and Water Assessment Tool (SWAT) improving streamflow estimation in paddy-dominated catchments
.
Journal of Hydrology
588
,
125103
.
https://doi.org/10.1016/j.jhydrol.2020.125103
.
Dawson
C. W.
,
Abrahart
R. J.
&
See
L. M.
2007
Hydrotest: a web-based toolbox of evaluation metrics for the standardised assessment of hydrological forecasts
.
Environmental Modelling and Software
22
(
7
),
1034
1052
.
https://doi.org/10.1016/j.envsoft.2006.06.008
.
Desta
H.
&
Fetene
A.
2020
Land-use and land-cover change in Lake Ziway watershed of the Ethiopian Central Rift Valley Region and its environmental impacts
.
Land Use Policy
96
,
104682
.
https://doi.org/10.1016/j.landusepol.2020.104682
.
DHI
1999
MIKE SHE Water Movement: User Manual
Danish Hydraulic Institute
,
Hørsholm, Denmark
.
Elkiran
G.
,
Nourani
V.
&
Abba
S.
2019
Multi-step ahead modelling of river water quality parameters using ensemble artificial intelligence-based approach
.
Journal of Hydrology
577
(
June
),
123962
.
https://doi.org/10.1016/j.jhydrol.2019.123962
.
Esmaeili-Gisavandani
H.
,
Morteza
L.
,
Sofla
M. S. D.
&
Ashrafzadeh
A.
2021
Improving the performance of rainfall-runoff models using the gene expression programming approach
.
Journal of Water and Climate Change
12
(
7
),
3308
3329
.
https://doi.org/10.2166/wcc.2021.064
.
Fanta
S. S.
&
Sime
C. H.
2022
Performance assessment of SWAT and HEC-HMS model for runoff simulation of Toba watershed, Ethiopia
.
Sustainable Water Resources Management
8
(
8
).
https://doi.org/10.1007/s40899-021-00596-8.
Feldman
A. D.
2000
Hydrologic Modeling System HEC-HMS, Technical Reference Manual. U.S. Army Corps of Engineers
.
Hydrologic Engineering Center, HEC
,
Davis, CA
,
USA
.
Fenicia
F.
,
Solomatine
D. P.
,
Savenije
H. H. G.
&
Matgen
P.
2007
Soft combination of local models in a multi-objective framework
.
Hydrology and Earth System Sciences
11
(
6
),
1797
1809
.
https://doi.org/10.5194/hess-11-1797-2007.
Hallouz
F.
,
Meddi
M.
,
Mahé
G.
,
Alirahmani
S.
&
Keddar
A.
2018
Modeling of discharge and sediment transport through the SWAT model in the basin of Harraza (Northwest of Algeria)
.
Water Science
32
(
1
),
79
88
.
https://doi.org/10.1016/j.wsj.2017.12.004.
Halwatura
D.
&
Najim
M. M. M.
2013
Application of the HEC-HMS model for runoff simulation in a tropical catchment
.
Environmental Modelling and Software
46
,
155
162
.
https://doi.org/10.1016/j.envsoft.2013.03.006
.
Hamdan
A. N. A.
,
Almuktar
S.
&
Scholz
M.
2021
Rainfall-runoff modeling using the HEC-HMS model for the Al-Adhaim River catchment, Northern Iraq
.
Hydrology
8
(
2
).
https://doi.org/10.3390/hydrology8020058.
Harmel
R. D.
,
Smith
P. K.
,
Migliaccio
K. W.
,
Chaubey
I.
,
Douglas-Mankin
K. R.
,
Benham
B.
,
Shukla
S.
,
Muñoz-Carpena
R.
&
Robson
B. J.
2014
Evaluating, interpreting, and communicating performance of hydrologic/water quality models considering intended use: a review and recommendations
.
Environmental Modelling and Software
57
,
40
51
.
https://doi.org/10.1016/j.envsoft.2014.02.013
.
Jimeno-Sáez
P.
,
Senent-Aparicio
J.
,
Pérez-Sánchez
J.
&
Pulido-Velazquez
D.
2018
A comparison of SWAT and ANN models for daily runoff simulation in different climatic zones of peninsular Spain
.
Water (Switzerland)
10
(
2
).
https://doi.org/10.3390/w10020192
Kiran
R. N.
&
Ravi
V.
2008
Software reliability prediction by soft computing techniques
.
Journal of Systems and Software
81
(
4
),
576
583
.
https://doi.org/10.1016/j.jss.2007.05.005
.
Kisi
O.
,
Dailr
A. H.
,
Cimen
M.
&
Shiri
J.
2012
Suspended sediment modeling using genetic programming and soft computing techniques
.
Journal of Hydrology
450–451
,
48
58
.
https://doi.org/10.1016/j.jhydrol.2012.05.031
.
Kisi
O.
,
Shiri
J.
&
Tombul
M.
2013
Modeling rainfall-runoff process using soft computing techniques
.
Computers and Geosciences
51
,
108
117
.
https://doi.org/10.1016/j.cageo.2012.07.001
.
Melaku
N. D.
&
Wang
J.
2019
A modified SWAT module for estimating groundwater table at Lethbridge and Barons, Alberta, Canada
.
Journal of Hydrology
575
(
May
),
420
431
.
https://doi.org/10.1016/j.jhydrol.2019.05.052.
Miraji
M.
,
Liu
J.
&
Zheng
C.
2019
The impacts of water demand and its implications for future surface water resource management
.
Water (Switzerland)
11
(
8
),
2
11
.
Mishra
S. K.
&
Singh
V. P.
2004
Long-term hydrological simulation based on the Soil Conservation Service curve number
.
Hydrological Processes
18
(
7
),
1291
1313
.
https://doi.org/10.1002/hyp.1344
.
Moriasi
D. N.
,
Arnold
J. G.
,
Van Liew
M. W.
,
Bingner
R. L.
,
Harmel
R. D.
&
Veith
T. L.
2007
Model evaluation guidelines for systematic quantification of accuracy in watershed simulations
.
Transactions of the ASABE
50
(
3
),
885
900
.
https://doi.org/10.13031/2013.23153
.
Moriasi
D. N.
,
Gitau
M. W.
,
Pai
N.
&
Daggupati
P.
2015
Hydrologic and water quality models: performance measures and evaluation criteria
.
Transactions of the ASABE
58
(
6
),
1763
1785
.
https://doi.org/10.13031/trans.58.10715
.
Noori
N.
&
Kalin
L.
2016
Coupling SWAT and ANN models for enhanced daily streamflow prediction
.
Journal of Hydrology
533
,
141
151
.
https://doi.org/10.1016/j.jhydrol.2015.11.050
.
Nourani
V.
,
Elkiran
G.
&
Abba
S. I.
2018
Wastewater treatment plant performance analysis using artificial intelligence [ndash] an ensemble approach
.
Water Science and Technology
78
(
10
),
2064
2076
.
https://doi.org/10.2166/wst.2018.477.
Nourani
V.
,
Gökçekuş
H.
,
Umar
I. K.
,
Nourani
V.
,
Gökçekuş
H.
&
Umar
I. K.
2020
Artificial intelligence based ensemble model for prediction of vehicular traffic noise
.
Environmental Research
180
(
October 2019
),
108852
.
https://doi.org/10.1016/j.envres.2019.108852
.
Nourani
V.
,
Gokcekus
H.
&
Gelete
G.
2021a
Estimation of suspended sediment load using artificial intelligence-based ensemble model
.
Complexity
2021
(
Article ID 6633760
),
19
.
https://doi.org/10.1155/2021/6633760
.
Nourani
V.
,
Gökçekuş
H.
&
Gichamo
T.
2021b
Ensemble data-driven rainfall-runoff modeling using multi-source satellite and gauge rainfall data input fusion
.
Earth Science Informatics
14
(
4
),
1787
1808
.
https://doi.org/10.1007/s12145-021-00615-4
.
Osuch
M.
,
Romanowicz
R. J.
&
Booij
M. J.
2015
Influence de l'incertitude paramétrique sur les relations entre les paramètres du modèle par le VHB et les caractéristiques climatiques
.
Hydrological Sciences Journal
60
(
7–8
),
1299
1316
.
https://doi.org/10.1080/02626667.2014.967694
.
Ouatiki
H.
,
Boudhar
A.
,
Ouhinou
A.
,
Beljadid
A.
,
Leblanc
M.
&
Chehbouni
A.
2020
Sensitivity and interdependency analysis of the HBV conceptual model parameters in a semi-arid
.
Water
12
(
9
),
2440
.
https://doi.org/https://dx.doi.org/10.3390/w12092440
.
Parra
V.
,
Fuentes-Aguilera
P.
&
Muñoz
E.
2018
Identifying advantages and drawbacks of two hydrological models based on a sensitivity analysis: a study in two Chilean Watersheds
.
Hydrological Sciences Journal
63
(
12
),
1831
1843
.
https://doi.org/10.1080/02626667.2018.1538593
.
Perrin
C.
,
Michel
C.
&
Andréassian
V.
2003
Improvement of a parsimonious model for streamflow simulation
.
Journal of Hydrology
279
(
1–4
),
275
289
.
https://doi.org/10.1016/S0022-1694(03)00225-7
.
Rostamian
R.
,
Jaleh
A.
,
Afyuni
M.
,
Mousavi
S. F.
,
Heidarpour
M.
,
Jalalian
A.
&
Abbaspour
K. C.
2008
Application of a SWAT model for estimating runoff and sediment in two mountainous basins in central Iran
.
Hydrological Sciences Journal
53
(
5
),
977
988
.
https://doi.org/10.1623/hysj.53.5.977
.
Sahoo
G. B.
,
Ray
C.
&
Wade
H. F.
2005
Pesticide prediction in ground water in North Carolina domestic wells using artificial neural networks
.
Ecological Modelling
183
(
1
),
29
46
.
https://doi.org/10.1016/j.ecolmodel.2004.07.021
.
Sharghi
E.
,
Nourani
V.
&
Behfar
N.
2018
Earthfill dam seepage analysis using ensemble artificial intelligence based modeling
.
Journal of Hydroinformatics
20
(
5
),
1071
1084
.
https://doi.org/10.2166/hydro.2018.151.
Shekar
S. N. C.
&
Vinay
D. C.
2021
Performance of HEC-HMS and SWAT to simulate streamflow in the sub-humid tropical Hemavathi Catchment
.
Journal of Water and Climate Change
12
(
7
),
3005
3017
.
https://doi.org/10.2166/wcc.2021.072
.
Srinivasan
R.
,
Ramanarayanan
T. S.
,
Arnold
J. G.
&
Bednarz
S. T.
1998
Large area hydrologic modeling and assessment part II: model application
.
Journal of the American Water Resources Association
34
(
1
),
91
101
.
https://doi.org/10.1111/j.1752-1688.1998.tb05962.x.
Tassew
B. G.
,
Belete
M. A.
&
Miegel
K.
2019
Application of HEC-HMS model for flow simulation in the Lake Tana Basin: the case of Gilgel Abay Catchment, Upper Blue Nile Basin, Ethiopia
.
Hydrology
6
(
21
),
1
17
.
https://doi.org/10.3390/hydrology6010021
.
Taylor
K. E.
2001
Summarizing multiple aspects of model performance in a single diagram
.
Journal of Geophysical Research
106
(
D7
),
7183
7192
.
https://doi.org/10.1029/2000JD900719
.
Temesgen Ayalew
A.
2019
Rainfall–runoff modeling: a comparative analyses: semi-distributed HBV light and SWAT models in Geba Catchment, Upper Tekeze Basin, Ethiopia
.
American Journal of Science, Engineering and Technology
4
(
2
),
34
.
https://doi.org/10.11648/j.ajset.20190402.12
.
Tewolde
M. H.
&
Smithers
J. C.
2006
Flood routing in ungauged catchments using Muskingum methods
.
Water SA
32
(
3
),
379
388
.
https://doi.org/10.4314/wsa.v32i3.5263
.
Tibangayuka
N.
,
Mulungu
D. M. M.
&
Izdori
F.
2022
Evaluating the performance of HBV, HEC-HMS and ANN models in simulating streamflow for a data scarce high-humid tropical catchment in Tanzania
.
Hydrological Sciences Journal
67
(
14
),
1
14
.
https://doi.org/10.1080/02626667.2022.2137417
.
USACE
2010
Hydrologic Modeling System HEC-HMS User's Manual Version 3.5
.
US Army Corps of Engineers, Hydrologic Engineering Center
,
Davis, CA
, p.
318
.
Yang
S.
,
Yang
D.
,
Chen
J.
,
Santisirisomboon
J.
,
Lu
W.
&
Zhao
B.
2020a
A physical process and machine learning combined hydrological model for daily streamflow simulations of large watersheds with limited observation data
.
Journal of Hydrology
590
,
125206
.
https://doi.org/10.1016/j.jhydrol.2020.125206
.
Yang
X.
,
Magnusson
J.
,
Huang
S.
,
Beldring
S.
&
Xu
C. Y.
2020b
Dependence of regionalization methods on the complexity of hydrological models in multiple climatic regions
.
Journal of Hydrology
582
(
April 2019
),
124357
.
https://doi.org/10.1016/j.jhydrol.2019.124357
.
Yaseen
Z. M.
,
Deo
R. C.
,
Hilal
A.
,
Abd
A. M.
,
Bueno
L. C.
,
Salcedo-Sanz
S.
&
Nehdi
M. L.
2018
Predicting compressive strength of lightweight foamed concrete using extreme learning machine model
.
Advances in Engineering Software
115
(
April 2017
),
112
125
.
https://doi.org/10.1016/j.advengsoft.2017.09.004
.
Young
C. C.
,
Liu
W. C.
&
Wu
M. C.
2017
A physically based and machine learning hybrid approach for accurate rainfall-runoff modeling during extreme typhoon events
.
Applied Soft Computing Journal
53
,
205
216
.
https://doi.org/10.1016/j.asoc.2016.12.052
.
Zelelew
D. G.
&
Melesse
A. M.
2018
Applicability of a spatially semi-distributed hydrological model for watershed scale runoff estimation in Northwest Ethiopia
.
Water (Switzerland)
10
(
7
),
10
12
.
https://doi.org/10.3390/w10070923
.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).