ABSTRACT
This study investigated the performance of the adaptive neuro-fuzzy inference system (ANFIS), feed forward neural network (FFNN), Soil and Water Assessment Tool (SWAT), Hydrologic Engineering Center's Hydrologic Modeling System (HEC-HMS), Hydrologiska Byråns Vattenbalansavdelning (HBV), and support vector regression (SVR) models for rainfall–runoff modeling using gauged and satellite rainfall, and their fusions, in the Gilgel Abay watershed, Ethiopia. Afterward, simple average ensemble (SAE), weighted average ensemble (WAE), and neural network ensemble (NNE) techniques were applied to combine the outputs of individual models under three scenarios. The performance of the models was evaluated using Nash–Sutcliffe efficiency (NSE) and root mean square error (RMSE). The results demonstrated that the ANFIS model outperformed all the other single models with validation stage NSE values of 0.864 and 0.875, and RMSE values of 23.58 and 21.84 m3/s for gauge and fusion rainfall data, respectively. Among the physical-based models, SWAT gave better modeling performance with validation stage NSE values of 0.81 and 0.821 for gauge and fusion rainfall data, respectively. Moreover, an ensemble of artificial intelligence and physical-based models greatly improved the overall modeling performance. The NNE improved the performance of single models by up to 15.7 and 21.2% for fusion and satellite-based rainfall modeling, respectively.
HIGHLIGHTS
The study integrated machine learning models with process-based models.
The performance of the models was enhanced by the fusion of ground and satellite-based rainfall data for the first time.
Afterwards, the outputs of individual models were ensembled through three scenarios.
The ensemble technique improved the rainfall fusion-based and satellite-based modeling accuracy by 15.7 and 21.2%, respectively.
INTRODUCTION
Rainfall–runoff modeling is essential for decision-making in the water sector, such as flood mitigation and warning, reservoir planning, and hydraulic structure design (Shamseldin 2010). Rainfall–runoff models are mathematical representations of the physical relations between diverse hydrologic cycle components within a defined hydrological unit (Yang et al. 2020). Nevertheless, the process is challenging to model because its physics changes in space and time and shows random and nonlinear characteristics. Rainfall–runoff models are generally categorized into black-box, conceptual, and physical-based models (Zhang & Savenije 2005). The physical-based semi-distributed models, for instance, the Soil and Water Assessment Tool (SWAT) and the Hydrologic Engineering Center's Hydrologic Modeling System (HEC-HMS), represent the physical relationships of the components and simulate rainfall–runoff realistically; however, they require large spatial and temporal datasets (Nourani et al. 2021b). On the other hand, the black-box artificial intelligence (AI)-based models such as artificial neural network (ANN), support vector regression (SVR), and adaptive neuro-fuzzy inference system (ANFIS) are effective in capturing nonlinear and non-stationary characteristics of the hydrological processes (Zakizadeh et al. 2020).
SWAT is a physically based and semi-distributed model that can effectively model rainfall–runoff relations (Arnold et al. 1998). The SWAT model has been verified as a credible and effective tool not only for runoff simulation but also for flood prediction and warning, nutrient transport, soil erosion, and land use pattern change modeling (Busico et al. 2020). HEC-HMS is another semi-distributed model for rainfall–runoff modeling of dendritic watersheds (Feldman 2000). Hydrologiska Byråns Vattenbalansavdelning (HBV) is a conceptual semi-distributed hydrologic model based on a simple continuity equation to model streamflow considering the physical processes of the components (Ciupak et al. 2019). The model divides the watershed into smaller sub-watersheds. HBV has been applied in numerous hydrological studies, such as rainfall–runoff, climatic variability, and water level forecasting (Pervin et al. 2021).
AI-based models are gaining popularity in rainfall–runoff modeling because they are easy to apply and can provide accurate results with a short convergence time using few input datasets. ANN is a nonlinear black-box model with a fast, flexible architecture, self-learning and adaptive characteristics, and simple mathematical relations (Shamseldin 2010). ANN can detect and handle the nonlinear and non-stationary behavior of the inputs in rainfall–runoff modeling; conversely, it cannot map the physical relationships of the watershed (Govindaraju 2000), and overtraining is one of its limitations. ANFIS is a hybrid model that combines the rapid learning capacity of ANN with fuzzy logic (FL) (Jang 1993), capturing the advantages of both in a single topology (Chang et al. 2015). The ANN model has been effectively applied for rainfall–runoff modeling (Yaseen et al. 2017).
SVR is a nonlinear regression model derived from the support vector machine (SVM). Its basic idea is to map datasets into a higher-dimensional space via nonlinear techniques in order to capture the relations between inputs and outputs. The objective function of SVR is an operational risk that is used to diminish the inaccuracy between observed and predicted variables (Wen et al. 2015). Compared to the other two AI models, the SVR model has particular advantages: it can significantly reduce overfitting, provide a global optimum solution, and support parallel distributed processing. The limitation of SVR is its complex computational process, since it uses quadratic equations for computing the regression (Wang & Hu 2005).
Overall, the mentioned physically based and AI models could provide reliable rainfall–runoff outputs. Nevertheless, for the same problem, different models might offer different results. Thus, combining individual model outputs via an ensemble technique could enhance the modeling efficiency by reducing the error variance compared to single models (Shamseldin & O'Connor 1999; Sharghi et al. 2018). The ensemble technique could capture the distinctive characteristics of every model and dataset, thus increasing the overall accuracy of the modeling.
For any hydrological model, the results can be accurate only if the rainfall dataset is of good quality and well distributed in space and time. Conventional rain gauges, radars, and satellite-based rainfall measurements are the usual methods of gathering rainfall data for rainfall–runoff simulation. Nevertheless, ground stations for rainfall measurement are often sparse and unevenly distributed in developing countries (Worqlul et al. 2017). In the past few decades, satellite rainfall datasets have been identified as cheap and consistent data sources that are available at several temporal and spatial resolutions, and they have been attracting the interest of hydrologists, particularly in regions where conventional gauging stations are unavailable or sparsely distributed (Tapiador et al. 2012). Several satellite missions and products have been developed for estimating precipitation, for instance, the Climate Prediction Center morphing (CMORPH) method, the Tropical Rainfall Measuring Mission (TRMM), and the Global Precipitation Measurement Core Observatory. CMORPH was set in motion in 1998 to record precipitation products as near-real-time datasets and to retrieve precipitation records with better temporal and spatial resolution from more accurate passive microwave sensors. The TRMM provides rainfall data in real-time (3B42RT) and post-real-time research (3B42) versions (Joyce et al. 2004). The TRMM combines the relative merits of rainfall information from multi-satellite sources and provides more consistent and accurate precipitation over the specified grids (Prakash et al. 2018). TRMM is well suited to precipitation measurement with suitable spatial and temporal resolution because it carries an appropriate set of measurement devices and is situated in a low orbit with a suitable angle of inclination.
Although satellite-based rainfall datasets are suitable inputs for rainfall–runoff simulation of ungauged watersheds, each satellite product has its own strengths and limitations, since the consistency of rainfall estimates in space and time is strongly affected by elevation and atmospheric influences (Tang & Hossain 2012). Hence, combining precipitation datasets from several satellite sources as an input fusion might improve rainfall–runoff modeling in the calibration stage.
Uncertainties in rainfall–runoff modeling that might arise from the input data are practically managed through calibration and assimilation of the input data (Kumar et al. 2015). In catchments where rain gauging stations cannot adequately represent the area, fusion of gauge and satellite rainfall data from multiple sources has proved to be highly effective in rainfall–runoff modeling. The distortion in satellite rainfall estimates could be corrected by the gauge rainfall during the fusion. Not all models perform equally well in rainfall–runoff modeling; each has merits in one aspect and shortcomings in another. Hence, ensemble techniques could improve the modeling by combining the advantages of each model in the calibration phase and enhance the overall efficiency of rainfall–runoff modeling.
This study aimed to ensemble rainfall–runoff models, both physically based (SWAT, HEC-HMS, and HBV) and AI-based (FFNN, ANFIS, and SVR), employing the gauge and satellite rainfall datasets. HEC-HMS, SWAT, and HBV were chosen among many hydrological models due to their free availability and reliability in different catchments. This makes these models useful, particularly in poor nations where there is limited funding for commercial software. Additionally, these models have shown reliable modeling results for continuous processes in tropical regions (Tibangayuka et al. 2022; Gelete et al. 2023a). This study is the first to ensemble physical-based and AI-based models in different scenarios using a fusion of gauge and satellite rainfall data sources. The current study was conducted in the Gilgel Abay catchment, Ethiopia, where the rainfall observation gauges are inadequate and unevenly distributed, and their records are short. Furthermore, the topography of the area ranges from hills to plains, so the rainfall datasets could be subject to orographic influences that cause bias and improper areal representation (Gebre 2015). The study catchment plays an important role in supporting the livelihood of the people living in it and hence contributes to economic development. This study is also the first to try to enhance the rainfall–runoff modeling of the Gilgel Abay catchment by applying rainfall fusion from satellite and gauge-based sources. Therefore, the output of this research may assist water managers and researchers in choosing an enhanced model for runoff simulation in the area. The result can also be utilized as a pioneering step for studies in different watersheds.
MATERIALS AND METHODS
Description of the study area
The map of the Gilgel Abay watershed with grid lines of satellite rainfall estimation.
Datasets
Several temporal and spatial datasets from multiple sources were utilized in the proposed methodology to simulate runoff for the Gilgel Abay catchment. Daily time series of rainfall (from both gauge and satellite sources), maximum and minimum temperature, and discharge covering 12 years (2007–2018) were utilized as inputs for rainfall–runoff simulation. The first 2 years of daily data (2007–2008) were utilized as a warm-up period for the physically based models. The remaining data were divided into 70% (2009–2015) for calibrating and 30% (2016–2018) for validating the developed AI and physical-based models. The dataset series were tested for homogeneity using the double mass curve method. The source and statistics of meteorological and hydrological data used for this study are shown in Table 1.
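The split described above can be sketched in a few lines. This is only an illustration: the function name `assign_period` is hypothetical, and the sketch assumes the records cover complete calendar years from 1 January 2007 to 31 December 2018, which the text does not state explicitly.

```python
from datetime import date, timedelta

def assign_period(day):
    """Label a date as warm-up, calibration, or validation (years from the text)."""
    if day.year <= 2008:
        return "warm-up"
    if day.year <= 2015:
        return "calibration"
    return "validation"

# Assumed record span: full calendar years 2007-2018.
start, end = date(2007, 1, 1), date(2018, 12, 31)
counts = {"warm-up": 0, "calibration": 0, "validation": 0}
day = start
while day <= end:
    counts[assign_period(day)] += 1
    day += timedelta(days=1)

# Share of the post-warm-up record used for calibration (about 70%).
modeled_days = counts["calibration"] + counts["validation"]
calibration_share = counts["calibration"] / modeled_days
print(counts, round(calibration_share, 3))
```

Splitting by whole years keeps each hydrological season intact in both subsets, which is why the shares are only approximately 70% and 30% once leap days are counted.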
The descriptive statistics and source of the time series dataset
Variable | Station | Minimum | Average | Maximum | Standard deviation | Source |
---|---|---|---|---|---|---|
Gauge rainfall (mm/day) | Wetet Abay | 0 | 5.32 | 71 | 9.89 | Ethiopian National Meteorology Agency |
Gauge rainfall (mm/day) | Adet | 0 | 3.50 | 71.2 | 7.29 | Ethiopian National Meteorology Agency |
Gauge rainfall (mm/day) | Sekela | 0 | 6.05 | 73.1 | 9.92 | Ethiopian National Meteorology Agency |
Gauge rainfall (mm/day) | Gundil | 0 | 6.74 | 90.8 | 11.81 | Ethiopian National Meteorology Agency |
Gauge rainfall (mm/day) | Dangila | 0 | 4.42 | 62 | 8.61 | Ethiopian National Meteorology Agency |
Satellite rainfall (mm/day) | CMORPH | 0 | 5.82 | 71.5 | 9.40 | Satellite data |
Satellite rainfall (mm/day) | 3B42RT | 0 | 3.73 | 55.10 | 6.60 | Satellite data |
Satellite rainfall (mm/day) | 3B42 | 0 | 3.54 | 49.32 | 6.12 | Satellite data |
Discharge (m3/s) | | 4.71 | 58.4 | 406.846 | 61.01 | Ethiopian Ministry of Water and Energy |
In addition to the hydrometeorological data listed in Table 1, the physically based models (SWAT, HBV, and HEC-HMS) require spatial data as input. Thus, the fundamental spatial inputs for delineating the watershed into sub-basins and hydrological response units (HRUs) were the digital elevation model (DEM), land use land cover (LULC) map, and soil map. The DEM of the Gilgel Abay watershed with a spatial resolution of 12.5 × 12.5 m was downloaded from the Alaska Satellite Facility Service (https://search.asf.alaska.edu/#/). The LULC map for the year 2017 was downloaded from ESA's Sentinel-2 imagery, which is high-resolution data (10 × 10 m grid) and includes detailed land use classes.
A satellite-based precipitation dataset is an alternative source of rainfall for ungauged and sparsely gauged watersheds like the Gilgel Abay catchment. In the current study, two TRMM version 7 products (3B42RT and 3B42) and the CMORPH satellite precipitation product were used for rainfall–runoff modeling. The satellite rainfall datasets applied in this study were available on the daily time scale with a spatial resolution of 0.25° × 0.25°. The entire watershed was represented by eight satellite precipitation grids (Figure 1). These three satellite precipitation products were used because they have shown good performance at daily temporal resolution in previous studies (Bitew et al. 2012; Nourani et al. 2013).
Proposed methodology
Schematic diagram of proposed methodology (Ptg is the gauge rainfall and Qtob is the observed discharge).
Used rainfall–runoff models
In this study, three semi-distributed (SWAT, HBV, and HEC-HMS) and three AI-based (ANFIS, FFNN, and SVR) models were employed for rainfall–runoff modeling.
Physically based models
(a) LULC, (b) soil map, (c) slope, and (d) HRUs of the Gilgel Abay watershed.
HBV is another conceptual and semi-distributed hydrological model for runoff simulation, flood planning, and climate change predictions developed by the Swedish Meteorological and Hydrological Institute (Lindström et al. 1997). This model simulates daily runoff using daily rainfall, evapotranspiration, temperature, vegetation, and elevation zones.
The runoff components are computed by three linear reservoir equations: Q0 (direct runoff component), Q1 (intermediate runoff component), and Q2 (base runoff component), using recession coefficients K0, K1, and K2, respectively. The soil moisture (SM) subroutine is based on the parameters beta (β) (shape coefficient for nonlinear storage properties of the soil zone), maximum soil storage (field capacity (FC)), and limit of potential evaporation (LP). Beta controls the influence of precipitation on the response function. MAXBAS (length of the weighting function) is used as a transformation function to compute outflow from the catchment.
The snow routine represents snowmelt processes and their contribution to streamflow; it is not considered in this study because snow does not occur in the catchment area. The variations in SM and groundwater contribution are computed by the SM routine based on the quantity of water arriving from the preceding routine (P) and FC. The outflow from groundwater is then determined as a summation of two or three outflows, depending on whether the upper zone storage (SUZ) lies below or above the threshold (upper zone limit (UZL)).
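The soil moisture routine and the three linear reservoirs described above can be illustrated with a single daily time step. This is a minimal sketch that follows the commonly published HBV formulation rather than the exact configuration of this study: the percolation parameter `PERC`, the lower zone storage `slz`, the function name `hbv_step`, and all parameter values are assumptions introduced for illustration, and the MAXBAS routing and the snow routine are left out.

```python
def hbv_step(p, ep, state, par):
    """One daily step of a simplified HBV soil moisture and response routine.

    p: rainfall (mm/day); ep: potential evaporation (mm/day);
    state: (sm, suz, slz) storages in mm; par: parameter dict.
    """
    sm, suz, slz = state

    # Soil moisture routine: beta shapes how much rainfall recharges the
    # response function versus how much stays in the soil.
    recharge = p * (sm / par["FC"]) ** par["BETA"]
    sm = sm + p - recharge
    # Actual evaporation reaches the potential rate once sm exceeds LP * FC.
    ea = min(ep * min(sm / (par["LP"] * par["FC"]), 1.0), sm)
    sm -= ea

    # Response function: percolation (assumed parameter PERC) feeds the lower zone.
    suz += recharge
    perc = min(par["PERC"], suz)
    suz -= perc
    slz += perc

    # Three linear reservoirs; Q0 is active only when SUZ exceeds UZL.
    q0 = par["K0"] * max(suz - par["UZL"], 0.0)
    q1 = par["K1"] * suz
    q2 = par["K2"] * slz
    suz -= q0 + q1
    slz -= q2
    return q0 + q1 + q2, ea, (sm, suz, slz)

# Hypothetical parameter values and initial storages, for illustration only.
params = {"FC": 200.0, "BETA": 2.0, "LP": 0.7, "PERC": 1.0,
          "UZL": 15.0, "K0": 0.3, "K1": 0.1, "K2": 0.05}
state = (100.0, 10.0, 20.0)
q, ea, new_state = hbv_step(10.0, 3.0, state, params)
print(round(q, 3), round(ea, 3), new_state)
```

In this example SUZ stays below UZL, so the outflow is the sum of two components (Q1 and Q2); a wetter upper zone would add Q0 as the third.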
AI-based models
The AI-based models were trained using lagged runoff and rainfall data as input. ANN is an AI-based ‘black-box’ model that comprises multiple nonlinear artificial neurons that operate in parallel and can be arranged in one or several layers. It is explicitly defined by network architecture, training and verification algorithms, and activation functions (Tongal & Booij 2018). ANN provides a very effective approach to dealing with noisy, nonlinear, and non-stationary data, especially in cases where the data do not conform to basic physical relationships, making ANN an appropriate model for time series prediction. The most common ANN structure in rainfall–runoff modeling is the multi-layer perceptron, which is trained with the backpropagation (BP) algorithm and comprises input, hidden, and output layers. From the different forms of ANN, this study utilized FFNN to map between the input features and the target variable (runoff). The FFNN trained with the BP algorithm is a widely used ANN architecture for hydrologic prediction and has yielded reliable outcomes (Nourani et al. 2021a; Gelete et al. 2023c). An FFNN topology containing three layers (input, hidden, and output) was utilized in the current study. The parameters of FFNN, such as its bias, number of hidden neurons, and epoch, were determined by trial and error until the best fit between the measured and modeled values was reached. The best results for gauge rainfall, CMORPH, 3B42RT, and 3B42 were achieved when the numbers of hidden neurons were 10, 9, 13, and 16, respectively.
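Training on lagged runoff and rainfall means turning the two time series into input rows and targets. The sketch below shows one way to build such samples; the function name `make_lagged_samples` and the chosen lags are hypothetical, since the text does not list the specific lags used.

```python
def make_lagged_samples(rain, flow, rain_lags, flow_lags):
    """Build (inputs, target) pairs for predicting flow[t].

    rain_lags may include 0 (same-day rainfall); flow_lags should be >= 1
    so that the target itself is never used as an input.
    """
    max_lag = max(list(rain_lags) + list(flow_lags))
    inputs, targets = [], []
    for t in range(max_lag, len(flow)):
        row = [rain[t - k] for k in rain_lags] + [flow[t - k] for k in flow_lags]
        inputs.append(row)
        targets.append(flow[t])
    return inputs, targets

# Tiny illustrative series (not data from the study).
rain = [1.0, 2.0, 3.0, 4.0]
flow = [10.0, 20.0, 30.0, 40.0]
X, y = make_lagged_samples(rain, flow, rain_lags=[0, 1], flow_lags=[1])
print(X)
print(y)
```

Each row of `X` would be fed to the input layer of the FFNN (or to ANFIS and SVR), with the matching entry of `y` as the runoff target.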
The third AI-based model, SVR, was developed based on the idea of the support vector machine (SVM), which has been applied to nonlinear regression and classification problems (Nourani et al. 2020). SVR is a supervised learning method with a two-layered structure. Unlike the other black-box forecasting models, it minimizes operational risk as the objective function rather than only decreasing the error between the observed and simulated variables. The weights in the first layer of the SVR are nonlinear, while in the second layer they are linear. Initially, the model creates a linear regression between input and target variables; then the outputs are passed to nonlinear kernels to handle the nonlinear behavior of the input datasets (Wang et al. 2013). The model uses the modified alternate loss function, which includes a distance measure and accurately characterizes the regression relationship between the variables. In this study, different kernels of the SVR model were calibrated using different methods, and the best result was obtained with the radial basis function (RBF) kernel. The detailed description, including the mathematical expression of the SVR model, is found in Wang et al. (2013).
Sensitivity analysis, model calibration, and validation
The most influential parameters of the watershed and inputs to the runoff simulation were determined for each of the proposed models during an early phase of model calibration and validation. Sensitivity analysis is an important process in any modeling because it identifies the critical variables, and their level of importance, that are used for model calibration and verification. In modeling with SWAT, the simulation was run after delineating the watershed, HRU specification, and weather definition. Afterward, the simulated result of the SWAT model was imported into the SWAT Calibration and Uncertainty Program (SWAT-CUP) software. The Sequential Uncertainty Fitting (SUFI-2) algorithm of SWAT-CUP was then used for sensitivity analysis, model calibration, and validation (Abbaspour 2015). Global sensitivity analysis using the SUFI-2 algorithm was used to identify and rank the most important input parameters influencing runoff in the study area based on the largest absolute t-stat and smallest p-value. For the HEC-HMS model, the best-known approach to sensitivity analysis is partial differentiation, that is, changing each parameter in turn (Hamby 1994). One-at-a-time perturbation of parameter values between ±25 and ±30% with an interval of 5% has shown a good performance of the model (Zelelew & Melesse 2018; Bitew et al. 2012). For the current study, the sensitivity of each parameter was investigated by altering the parameter between −25 and 25% at 5% intervals until the measured and simulated datasets closely matched. In this technique, a single parameter is tested and optimized at a time, while the other parameters are kept constant. The parameters used for HEC-HMS model sensitivity analysis were the curve number (CN), the initial abstraction, the Muskingum method (k and x) coefficients, and the base flow.
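The one-at-a-time procedure described for HEC-HMS can be sketched as a loop over parameters and perturbation levels. This is an illustration under stated assumptions: `toy_model` is a hypothetical two-parameter stand-in for a hydrological model, and using the spread of NSE across perturbations as the sensitivity score is one simple choice, not necessarily the measure used in the study.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency; 1.0 indicates a perfect fit."""
    mean_obs = sum(observed) / len(observed)
    num = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    den = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - num / den

def one_at_a_time(model, base_params, observed, low=-25, high=25, step=5):
    """Perturb each parameter alone from low% to high% and record the NSE."""
    results = {}
    for name, base_value in base_params.items():
        scores = []
        for pct in range(low, high + 1, step):
            trial = dict(base_params)  # all other parameters stay at base values
            trial[name] = base_value * (1 + pct / 100)
            scores.append((pct, nse(observed, model(trial))))
        results[name] = scores
    return results

def nse_spread(scores):
    """Range of NSE over the perturbations; larger means more sensitive."""
    values = [score for _, score in scores]
    return max(values) - min(values)

# Hypothetical stand-in model: runoff = a * rainfall + b.
RAIN = [0.0, 5.0, 10.0, 20.0]

def toy_model(params):
    return [params["a"] * r + params["b"] for r in RAIN]

base = {"a": 2.0, "b": 1.0}
observed = toy_model(base)
results = one_at_a_time(toy_model, base, observed)
for name, scores in results.items():
    print(name, round(nse_spread(scores), 4))
```

With a real model, `toy_model` would be replaced by a function that runs the simulation for a given parameter set and returns the simulated discharge series.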
The automatic Monte Carlo technique was applied to detect the sensitive parameters of the HBV model by sampling random parameter sets within predefined ranges and evaluating them against the objective function. Before calibration, the upper and lower limits of the parameters representing the watershed characteristics were established. Each parameter was calibrated within the predefined range until the objective function (maximum NSE) was optimized.
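Monte Carlo calibration of this kind can be sketched as uniform random sampling within the parameter bounds while keeping the set with the highest NSE. The sketch is illustrative only: `toy_model`, the bounds, and the number of runs are hypothetical, and a real HBV calibration would call the model itself in place of the stand-in.

```python
import random

def nse(observed, simulated):
    """Nash-Sutcliffe efficiency; 1.0 indicates a perfect fit."""
    mean_obs = sum(observed) / len(observed)
    num = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    den = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - num / den

def monte_carlo_calibrate(model, bounds, observed, n_runs, seed=0):
    """Sample parameter sets uniformly within bounds; keep the best NSE."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_runs):
        trial = {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}
        score = nse(observed, model(trial))
        if score > best_score:
            best_params, best_score = trial, score
    return best_params, best_score

# Hypothetical stand-in model: runoff = a * rainfall + b.
RAIN = [0.0, 5.0, 10.0, 20.0]

def toy_model(params):
    return [params["a"] * r + params["b"] for r in RAIN]

observed = toy_model({"a": 2.0, "b": 1.0})
bounds = {"a": (0.0, 4.0), "b": (0.0, 5.0)}
best_params, best_score = monte_carlo_calibrate(toy_model, bounds, observed, n_runs=2000)
print(best_params, round(best_score, 4))
```

Storing every sampled set together with its NSE, rather than only the best one, would also allow parameters to be ranked by how strongly the NSE responds to them.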
Similarly, the performance of AI-based models is greatly influenced by the selection of important features. Incorporating too many input parameters may increase the modeling complexity, cause overfitting, and hence lead to unrealistic modeling outcomes. According to Nourani et al. (2021a, 2021b), the most common input selection methods are the partial derivative, Pearson correlation, and single input–output approach using neural networks. However, the Pearson correlation approach has been criticized for not accurately capturing the nonlinear associations between input and target variables (Nourani et al. 2020). Therefore, the current study utilized FFNN-based sensitivity analysis to choose and rank the input features for the AI-based algorithms. In this approach, each input variable (lagged discharge and rainfall) was fed to the input layer of the FFNN model one at a time and the model was run. Afterward, the runoff predicted by the FFNN from each input separately was compared with the observed runoff. The input that led to the highest NSE value was considered the most significant variable, and inputs with lower NSE values were ranked lower.
Modeling performance evaluation
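The two evaluation measures named in this study, NSE and RMSE, can be computed as in the sketch below, which uses their standard definitions: NSE is one minus the ratio of the squared simulation errors to the squared deviations of the observations from their mean, and RMSE is the square root of the mean squared error. The function names and sample values are illustrative.

```python
import math

def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((o - s)^2) / sum((o - mean_o)^2)."""
    mean_obs = sum(observed) / len(observed)
    num = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    den = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - num / den

def rmse(observed, simulated):
    """Root mean square error, in the units of the data (m3/s for discharge)."""
    squared = [(o - s) ** 2 for o, s in zip(observed, simulated)]
    return math.sqrt(sum(squared) / len(squared))

# Illustrative discharge values only (not data from the study).
obs = [10.0, 20.0, 30.0, 40.0]
sim = [12.0, 18.0, 33.0, 39.0]
print(round(nse(obs, sim), 4), round(rmse(obs, sim), 4))
```

An NSE of 1 means a perfect match, while an NSE of 0 means the model is no better than predicting the observed mean; RMSE is zero for a perfect match and grows with the size of the errors.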

Ensemble modeling
For the same datasets, one model could outperform the others, and when multiple input datasets are used, the results of each model would differ. To take advantage of every model without neglecting the overall characteristics of the datasets, ensemble modeling has been developed. It uses the outputs of the individual models as inputs and assigns each model a certain level of importance through a combining function to produce the final output (Kiran & Ravi 2008). The accuracy of an ensemble of outputs from diverse single models is typically superior to that of the best single model (Shamseldin & O'Connor 1999). Moreover, in ensemble modeling, the output of every single model could represent information from the input datasets that might be unique to that specific model, and combining all that information could make better use of the input datasets. For example, AI-based models (ANFIS, ANN, and SVR) could effectively map the nonlinear relationship between rainfall and runoff with fast convergence. However, their structures cannot accurately capture the physical relation between rainfall and runoff in the catchment. The physically based models, on the other hand, incorporate the influence of groundwater, catchment characteristics, and meteorological factors in runoff generation. However, their predictive accuracy is often lower than that of the AI-based models. Thus, the motivation of ensemble modeling is to benefit from the strength of every single model and boost the overall modeling result. Therefore, in this study, one nonlinear (neural network) and two linear (simple average and weighted average) ensemble models were applied to boost the rainfall–runoff modeling efficiency of the proposed individual models.
In this study, three different ensemble scenarios were considered: an ensemble of only AI-based models (scenario 1), an ensemble of only physically based models (scenario 2), and an ensemble of all models (both the physical and AI-based models) (scenario 3) using a fusion of different input sources. In scenario 1, the outputs of the three AI-based models (FFNN, ANFIS, and SVR) were used as input to predict runoff using the different ensemble techniques (NNE, SAE, and WAE). In scenario 2, the outputs of the physically based models (SWAT, HEC-HMS, and HBV) were used as input for the ensemble techniques. Similarly, in scenario 3, the outputs of all the models (SWAT, HEC-HMS, HBV, FFNN, ANFIS, and SVR) were used as input for runoff simulation using the different ensemble techniques.
Nonlinear NNE
In the NNE technique, the outputs of the individual models are used as inputs of the NNE; each is allocated to a neuron of the input layer. The procedure of NNE modeling is the same as for FFNN, where the best model architecture and number of training iterations should be found by trial and error, and the sigmoid can be used as the hidden and output activation function. Other nonlinear ensemble kernels, such as gene expression programming and ANFIS, can also be used, but FFNN was chosen for this study because of its simplicity, rapid trainability, and results comparable to other nonlinear ensemble techniques.
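The structure of such an ensemble network can be illustrated with a forward pass through a 3-6-1 network with sigmoid activations, the shape that Table 2 lists for scenario 1. This is a sketch only: the weights are random placeholders rather than trained values, training by backpropagation is omitted, and the inputs are assumed to be single-model outputs already normalized to the range 0 to 1.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def init_network(n_inputs, n_hidden, seed=0):
    """Random placeholder weights; the last entry of each list is the bias."""
    rng = random.Random(seed)
    hidden = [[rng.uniform(-1, 1) for _ in range(n_inputs + 1)] for _ in range(n_hidden)]
    output = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    return hidden, output

def forward(network, inputs):
    """Forward pass: one input neuron per single model, sigmoid hidden and output."""
    hidden_weights, output_weights = network
    hidden = [
        sigmoid(sum(w * x for w, x in zip(ws[:-1], inputs)) + ws[-1])
        for ws in hidden_weights
    ]
    return sigmoid(sum(w * h for w, h in zip(output_weights[:-1], hidden)) + output_weights[-1])

# Assumed normalized runoff outputs of three single models at one time step.
network = init_network(n_inputs=3, n_hidden=6)
ensemble_output = forward(network, [0.42, 0.45, 0.40])
print(ensemble_output)
```

Because the output activation is a sigmoid, the ensemble output lies between 0 and 1 and has to be rescaled back to discharge units after prediction.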
Simple average ensemble
Weighted average ensemble
NSEi is the performance measure (e.g., NSE coefficient) of the ith single model.
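The two linear ensembles can be sketched as follows. The simple average gives every model the same weight. For the weighted average, the sketch assumes that each weight is the model's NSE divided by the sum of the NSE values of all models, which is a common choice consistent with the definition of NSEi above but is not confirmed by the text; the function names and the sample values are illustrative.

```python
def simple_average(outputs):
    """Average the single-model series time step by time step."""
    n_models = len(outputs)
    return [sum(values) / n_models for values in zip(*outputs)]

def weighted_average(outputs, scores):
    """Weight each model by its score (assumed: NSE_i / sum of all NSE)."""
    total = sum(scores)
    weights = [score / total for score in scores]
    return [sum(w * v for w, v in zip(weights, values)) for values in zip(*outputs)]

# Illustrative runoff series from three single models (not data from the study).
model_outputs = [
    [10.0, 20.0],
    [12.0, 22.0],
    [20.0, 30.0],
]
nse_scores = [0.8, 0.8, 0.4]  # hypothetical calibration NSE of each model

print(simple_average(model_outputs))
print(weighted_average(model_outputs, nse_scores))
```

With these scores the third model receives half the weight of the other two, so the weighted result moves toward the better-performing models compared with the simple average.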
The input, output, and structure of each ensemble technique in different scenarios are summarized in Table 2.
The summary of the input, structure, and output for each ensemble method
Scenario | Ensemble technique | Inputs | Structure | Output |
---|---|---|---|---|
1 | SAE | Outputs of AI-based models | 3-1 | Runoff (Q) |
1 | WAE | Outputs of AI-based models | 3-1 | Runoff (Q) |
1 | NNE | Outputs of AI-based models | 3-6-1 | Runoff (Q) |
2 | SAE | Outputs of physical-based models | 3-1 | Runoff (Q) |
2 | WAE | Outputs of physical-based models | 3-1 | Runoff (Q) |
2 | NNE | Outputs of physical-based models | 3-5-1 | Runoff (Q) |
3 | SAE | Outputs of both AI and physical-based models | 6-1 | Runoff (Q) |
3 | WAE | Outputs of both AI and physical-based models | 6-1 | Runoff (Q) |
3 | NNE | Outputs of both AI and physical-based models | 6-9-1 | Runoff (Q) |
Note: In SAE and WAE structures, a-b represents the numbers of inputs and output, respectively, while the x–y–z in the NNE structure represents the numbers of input, hidden neurons, and outputs, respectively.
RESULTS AND DISCUSSION
The current study was conducted in two major stages. First, all the proposed models were used to simulate the rainfall–runoff process using the gauge and satellite rainfall datasets separately, as well as the fusion of both datasets. Second, the outputs of the models were used as inputs for ensemble modeling by NNE, SAE, and WAE, aimed at improving the overall accuracy of the modeling. The ensemble modeling was carried out for three different scenarios for each rainfall data source: (i) ensemble of AI-based models, (ii) ensemble of physical-based models, and (iii) ensemble of all proposed models using the fusion rainfall data. The results of the sensitivity analysis, the individual models' results using inputs from diverse sources, and the ensemble techniques are discussed in the following sub-sections.
Result of sensitivity analysis
In the SWAT model, the sensitive parameters were determined using the SUFI-2 uncertainty algorithm as discussed earlier. In this procedure, the upper and lower bounds of each parameter were adjusted and optimized until the best calibration result was obtained. Nine parameters are ranked based on p-value and t-stat as shown in Table 3. As shown in the table, the parameter with the largest absolute t-stat and the smallest p-value was considered the most sensitive parameter.
The sensitivity analysis result of different parameters of SWAT modeling
Parameter | Description | Lower bound | Upper bound | Optimal value | p-value | t-stat | Rank |
---|---|---|---|---|---|---|---|
CN2 | Initial SCS runoff CN for moisture condition II | −0.2 | 0.2 | 0.04 | 0 | −56 | 1 |
ALPHA_BF | Base flow alpha factor-base flow recession constant | 0.01 | 1.2 | 0.68 | 0 | 11 | 2 |
SOL_K | Saturated hydraulic conductivity | 0.7 | 0.8 | 0.72 | 0 | 6.2 | 3 |
HRU_SLP | Average slope steepness | 0 | 1 | 0.18 | 0 | 3.72 | 4 |
SOL_BD | Moist bulk density | 0.9 | 2.5 | 0.12 | 0 | −3.62 | 5 |
SOL_AWC | Available water capacity of soil layer (mm H2O/mm soil) | −0.8 | 0.8 | 0.16 | 0.24 | −1.3 | 6 |
GW_REVAP | Groundwater ‘revap’ coefficient | 0.02 | 0.4 | 0.05 | 0.3 | −1.25 | 7 |
GW_DELAY | Groundwater delay time (days) | 35 | 450 | 45 | 0.41 | −0.95 | 8 |
Surlag | Surface runoff lag coefficient | 1 | 24 | 1.21 | 0.55 | 0.62 | 9 |
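The ranking in Table 3 can be reproduced by sorting the parameters by the absolute value of their t-stat. The short sketch below uses the t-stat values from the table; the variable names are illustrative.

```python
# t-stat values copied from Table 3.
T_STATS = {
    "CN2": -56.0,
    "ALPHA_BF": 11.0,
    "SOL_K": 6.2,
    "HRU_SLP": 3.72,
    "SOL_BD": -3.62,
    "SOL_AWC": -1.3,
    "GW_REVAP": -1.25,
    "GW_DELAY": -0.95,
    "Surlag": 0.62,
}

# A larger absolute t-stat means a more sensitive parameter; the sign only
# indicates the direction of the effect.
ranked = sorted(T_STATS, key=lambda name: abs(T_STATS[name]), reverse=True)
for rank, name in enumerate(ranked, start=1):
    print(rank, name, T_STATS[name])
```

Sorting by the signed t-stat instead would place CN2 last, which is why the absolute value is the relevant quantity here.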
The sensitivity analysis for the SWAT model shown in Table 3 revealed that CN is the most sensitive and the surface runoff lag coefficient (SURLAG) is the least sensitive parameter. The high sensitivity of CN could be due to the fact that it is a function of the key flow-controlling parameters such as land use, hydrologic conditions, topography, and soil hydrologic group (Fanta & Sime 2022; Gelete et al. 2023b). Moreover, the land use pattern in the Gilgel Abay watershed is dominated by destructive agricultural practices that could affect the natural infiltration and runoff relationship of the soil. Therefore, it is logical to conclude that runoff in the catchment is highly dependent on the CN of the soil. The SUFI-2 global sensitivity analysis also identified ALPHA_BF and SOL_K as the second and third most sensitive parameters, respectively. This shows the importance of groundwater aquifers in the hydrology of the study watershed. The first and second most sensitive parameters were similar to those in other findings (Aliye et al. 2020; Fanta & Sime 2022; Gelete et al. 2023b).
The one-at-a-time sensitivity analysis of HEC-HMS showed that the soil CN, lag time (Tlag), initial abstraction (Ia), and Muskingum k and x coefficients were the most sensitive parameters, in that order. Altering the values of these parameters caused a significant deviation of the predicted runoff from its earlier value. These results agree with the findings of Zelelew & Melesse (2018) and Gelete et al. (2023b).
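The one-at-a-time idea can be illustrated with a much smaller model than HEC-HMS. The sketch below perturbs one parameter at a time, holds the others fixed, and records the relative change in output, using the SCS curve number runoff equation as a stand-in for the full model. The 10% perturbation, the 50 mm storm depth, and the baseline parameter values are assumptions chosen for illustration, not values from this study.

```python
def scs_runoff(p_mm, cn, ia_ratio=0.2):
    """Direct runoff depth (mm) from the SCS curve number method."""
    s = 25400.0 / cn - 254.0   # potential maximum retention (mm)
    ia = ia_ratio * s          # initial abstraction (mm)
    if p_mm <= ia:
        return 0.0
    return (p_mm - ia) ** 2 / (p_mm - ia + s)


def oat_sensitivity(model, base, delta=0.10):
    """Relative output change when each parameter alone is raised by `delta`."""
    q0 = model(**base)
    changes = {}
    for name in base:
        perturbed = dict(base)
        perturbed[name] = base[name] * (1.0 + delta)
        changes[name] = (model(**perturbed) - q0) / q0
    return changes


# Hypothetical baseline: 50 mm storm, CN = 75, Ia = 0.2 S.
base = {"cn": 75.0, "ia_ratio": 0.2}
sens = oat_sensitivity(lambda cn, ia_ratio: scs_runoff(50.0, cn, ia_ratio), base)
print(sens)
```

In this toy setting, raising CN increases runoff far more than the same relative change in the initial abstraction ratio reduces it, which is in line with CN ranking first.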
The third model employed in this study, the conceptual HBV model, used Monte Carlo optimization to identify and rank the parameters according to their importance. The maximum and minimum bounds for each parameter were defined, and the model was set to perform 700,000 runs for the catchment. The sensitive parameters of the HBV model are presented in Table 4. As indicated in the table, FC, K2, β, and LP were the most sensitive parameters, in that order. FC has a significant influence on the production of flow (Gelete et al. 2023b) and contributes immensely to runoff in moist soil, as the water is readily available for flow production. Conversely, its impact on runoff formation in dry soil is minimal, as the soil stores less water, resulting in little flow production. FC also plays a central role in the formulation of HBV by splitting the effective precipitation into runoff and soil water (Abebe et al. 2010), which could be another reason for its high sensitivity in the HBV model. The second most important HBV parameter was K2, while β and LP were the third and fourth most important parameters. The significance of β, the exponent of the ratio of SM to FC, is clear, as it determines how much water recharges the soil and how much produces runoff. Tibangayuka et al. (2022) stated that the high sensitivity of LP and β indicates the influence of evaporation and rainfall on runoff generation in the study watershed. Numerous rainfall–runoff simulation studies in different watersheds found similar results (Ouatiki et al. 2020; Bizuneh et al. 2021).
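The roles of FC, β, and LP described above can be made concrete with a textbook form of the HBV soil-moisture routine. This is a minimal sketch of the standard formulation, not the exact code of the HBV version used in the study, and the input values are hypothetical.

```python
def hbv_soil_step(sm, precip, pet, fc, beta, lp):
    """One daily step of a textbook HBV soil-moisture routine (depths in mm).

    `lp` is the fraction of `fc` above which evaporation runs at its
    potential rate.
    """
    # Share of precipitation that leaves the soil box and produces runoff.
    recharge = precip * (sm / fc) ** beta
    sm = sm + precip - recharge
    # Actual evapotranspiration is reduced when soil moisture is below LP * FC.
    aet = min(pet * min(sm / (lp * fc), 1.0), sm)
    sm = sm - aet
    return sm, recharge, aet


# Hypothetical case: the same 20 mm of rain on a wet and on a dry soil.
_, wet_recharge, _ = hbv_soil_step(sm=270.0, precip=20.0, pet=3.0,
                                   fc=300.0, beta=2.0, lp=0.7)
_, dry_recharge, _ = hbv_soil_step(sm=60.0, precip=20.0, pet=3.0,
                                   fc=300.0, beta=2.0, lp=0.7)
print(wet_recharge, dry_recharge)
```

With these assumed values, most of the rain on the wet soil produces runoff, while almost all of the rain on the dry soil is retained, which matches the behavior attributed to FC and β in the text.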
Calibration parameters of the HBV model and their ranks
Parameter | Explanation | Unit | Lower | Upper | Rank |
---|---|---|---|---|---|
FC | SM storage | mm | 100 | 1,000 | 1 |
K2 | Recession coefficient (lower storage) | Day−1 | 0.001 | 0.1 | 2 |
β | Shape coefficient | – | 1 | 6 | 3 |
LP | SM threshold for evaporation reduction | – | 0.4 | 0.7 | 4 |
UZL | Reservoir threshold | mm | 0 | 100 | 5 |
PERC | Percolation to groundwater | mm/d | 0 | 0.25 | 6 |
K0 | Recession coefficient | Day−1 | 0.05 | 0.5 | 7 |
K1 | Recession coefficient (upper storage) | Day−1 | 0.01 | 0.1 | 8 |
MAXBAS | Base of the weight function | Day | 13 | 24 | 9 |
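The Monte Carlo procedure behind Table 4 can be sketched as uniform sampling within the tabulated bounds, keeping the parameter set that scores best. In the sketch below, the bounds are copied from Table 4, while `toy_score` is a placeholder for "run HBV and compute NSE"; the run count and seed are likewise assumptions for illustration.

```python
import random

# Lower and upper bounds copied from Table 4.
BOUNDS = {
    "FC": (100.0, 1000.0), "K2": (0.001, 0.1), "BETA": (1.0, 6.0),
    "LP": (0.4, 0.7), "UZL": (0.0, 100.0), "PERC": (0.0, 0.25),
    "K0": (0.05, 0.5), "K1": (0.01, 0.1), "MAXBAS": (13.0, 24.0),
}


def sample_parameters(rng):
    """Draw one parameter set uniformly within the bounds."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in BOUNDS.items()}


def monte_carlo_search(score, n_runs, seed=0):
    """Keep the sampled parameter set with the highest score (e.g. NSE)."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_runs):
        params = sample_parameters(rng)
        value = score(params)
        if value > best_score:
            best_params, best_score = params, value
    return best_params, best_score


def toy_score(params):
    """Placeholder objective; a real run would simulate flow and return NSE."""
    return -abs(params["FC"] - 400.0) / 900.0


best, best_value = monte_carlo_search(toy_score, n_runs=2000)
print(best["FC"], best_value)
```

Ranking the parameters by importance would additionally require relating the sampled values to the scores, which this sketch leaves out.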
For the AI-based models (FFNN, ANFIS, and SVR), the NNE has proven to be an effective tool for evaluating the sensitivity of the output to the input datasets, as it can capture the nonlinear characteristics and dimensionality of hydrologic and meteorological datasets (Nourani & Sayyah Fard 2012). In this study, the key inputs for runoff were identified using the ANN-based sensitivity assessment technique. The discharge (Q), rainfall (P), and temperature (T), each with up to four lag times, were used to forecast runoff using FFNN. The performance of each input was then evaluated by the NSE goodness of fit and ranked according to its significance for rainfall–runoff modeling (Table 5). Accordingly, discharge and rainfall, each with four lags, were found relevant to rainfall–runoff modeling and were considered as inputs. Afterward, different input combinations were used to train the AI-based models, and only the best results are discussed. The best result of the AI-based models was obtained using the input combination Qt−1, Qt−2, Pt, Pt−1, and Qt−3.
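The selected input combination (Qt−1, Qt−2, Qt−3, Pt, and Pt−1) can be assembled from the raw series as shown in the sketch below. The function name and the short example series are hypothetical; only the choice of lags comes from the text.

```python
def build_lagged_inputs(q, p, q_lags=(1, 2, 3), p_lags=(0, 1)):
    """Return (rows, targets): lagged Q and P as inputs, Q_t as the target."""
    max_lag = max(max(q_lags), max(p_lags))
    rows, targets = [], []
    for t in range(max_lag, len(q)):
        # Column order: Q(t-1), Q(t-2), Q(t-3), P(t), P(t-1).
        row = [q[t - k] for k in q_lags] + [p[t - k] for k in p_lags]
        rows.append(row)
        targets.append(q[t])
    return rows, targets


# Hypothetical short series, for illustration only.
q = [10, 12, 15, 20, 18, 16, 14]
p = [0, 5, 9, 3, 0, 0, 1]
rows, targets = build_lagged_inputs(q, p)
print(rows[0], targets[0])
```

The first `max_lag` time steps are dropped because their lagged values are not available, so each row has five inputs, matching the five-input FFNN structures reported for the single-source models.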
Input sensitivity for AI-based models
Input parameter | Mean NSE | Rank |
---|---|---|
Qt−1 | 0.8317 | 1 |
Qt−2 | 0.8242 | 2 |
Pt | 0.7826 | 3 |
Pt−1 | 0.7724 | 4 |
Qt−3 | 0.7704 | 5 |
Qt−4 | 0.7698 | 6 |
Pt−2 | 0.7549 | 7 |
Pt−3 | 0.7421 | 8 |
Pt−4 | 0.7392 | 9 |
Tt | 0.1877 | 10 |
Tt−1 | 0.1855 | 11 |
Tt−2 | 0.1824 | 12 |
Tt−3 | 0.1786 | 13 |
Tt−4 | 0.1755 | 14 |
Results of single models
The six applied models (SWAT, HEC-HMS, HBV, ANFIS, FFNN, and SVR) were used separately to simulate runoff from gauge- and satellite-based rainfall datasets. The results obtained for the models using both data sources are presented in Table 6. Among the process-based models, SWAT driven by either satellite or gauge rainfall performed more reliably than HBV and HEC-HMS in daily runoff simulation; however, it performed less well than the AI-based models. At the validation stage, SWAT yielded NSE values of 0.81 and 0.784 and RMSE values of 26.818 and 29.742 m3/s for the gauge and CMORPH rainfall products, respectively. Among the satellite rainfall products, the CMORPH-based rainfall–runoff model outperformed the 3B42RT- and 3B42-based models. The gauge rainfall-based models are more accurate than the satellite rainfall-based models, which could be because of the bias introduced when the satellites retrieve rainfall information. As presented by Nourani et al. (2021a, 2021b), 3B42 and 3B42RT underestimate most of the peak rainfall, which could reduce the performance of rainfall–runoff modeling. As indicated in Table 6, the performance of the SWAT model for both datasets is well above the acceptability threshold (NSE = 0.5). Hence, it is reasonable to conclude that SWAT can effectively simulate runoff using satellite-driven rainfall products. A semi-distributed physically based hydrologic model is more appropriate for a medium-sized watershed with moderate to hilly topographic conditions (Wittwer 2013), and the Gilgel Abay watershed is medium-sized with heterogeneous physiographic characteristics. Based on the guidelines set by Moriasi et al. (2015), the SWAT model gave very good performance (NSE > 0.8) for the gauged rainfall dataset and good performance (NSE > 0.7) for all the satellite rainfall data.
The results of this study show that SWAT reproduces most of the observed discharges and is superior to the other physically based models. Therefore, it is worth mentioning that the SWAT model is suitable for modeling the rainfall runoff of the Gilgel Abay catchment.
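For reference, the two evaluation metrics and the rating thresholds quoted above can be computed as in the following sketch. The thresholds are the ones cited in the text (0.8, 0.7, and 0.5); the label used for the 0.5 to 0.7 band and the short series are assumptions for illustration.

```python
import math


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean."""
    mean_obs = sum(observed) / len(observed)
    num = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    den = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - num / den


def rmse(observed, simulated):
    """Root mean square error, in the units of the data (here m3/s)."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)


def rating(nse_value):
    """Performance class using the NSE thresholds quoted in the text."""
    if nse_value > 0.8:
        return "very good"
    if nse_value > 0.7:
        return "good"
    if nse_value > 0.5:
        return "satisfactory"
    return "unsatisfactory"


# Hypothetical flows (m3/s), for illustration only.
obs = [10.0, 20.0, 30.0, 40.0]
sim = [12.0, 18.0, 33.0, 37.0]
print(nse(obs, sim), rmse(obs, sim), rating(0.81), rating(0.784))
```

Applied to the validation NSE values of SWAT, this gives "very good" for the gauge input (0.81) and "good" for CMORPH (0.784).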
Results of applied single models
Model | Rainfall source | Structure | NSE (calibration) | NSE (validation) | RMSE, m3/s (calibration) | RMSE, m3/s (validation) |
---|---|---|---|---|---|---|
SWAT | Gauge | – | 0.872 | 0.81 | 23.267 | 26.818 |
SWAT | CMORPH | – | 0.836 | 0.784 | 25.334 | 29.742 |
SWAT | 3B42RT | – | 0.814 | 0.762 | 26.52 | 31.345 |
SWAT | 3B42 | – | 0.805 | 0.752 | 28.42 | 32.68 |
HEC-HMS | Gauge | – | 0.837 | 0.776 | 25.837 | 29.701 |
HEC-HMS | CMORPH | – | 0.809 | 0.759 | 26.84 | 32.067 |
HEC-HMS | 3B42RT | – | 0.782 | 0.744 | 27.452 | 33.462 |
HEC-HMS | 3B42 | – | 0.7724 | 0.734 | 29.24 | 35.547 |
HBV | Gauge | – | 0.842 | 0.792 | 25.045 | 31.271 |
HBV | CMORPH | – | 0.812 | 0.763 | 26.31 | 32.35 |
HBV | 3B42RT | – | 0.805 | 0.754 | 27.621 | 33.24 |
HBV | 3B42 | – | 0.786 | 0.748 | 28.54 | 34.465 |
ANFIS | Gauge | Gaussian | 0.913 | 0.864 | 20.147 | 23.588 |
ANFIS | CMORPH | Triangular | 0.885 | 0.838 | 22.734 | 25.82 |
ANFIS | 3B42RT | Gaussian | 0.862 | 0.812 | 24.52 | 27.52 |
ANFIS | 3B42 | Gaussian | 0.846 | 0.794 | 26.52 | 29.25 |
FFNN | Gauge | 5-10-1 | 0.857 | 0.816 | 23.454 | 27.814 |
FFNN | CMORPH | 5-9-1 | 0.839 | 0.805 | 24.7 | 29.57 |
FFNN | 3B42RT | 5-13-1 | 0.816 | 0.782 | 25.65 | 31.42 |
FFNN | 3B42 | 5-16-1 | 0.805 | 0.772 | 27.22 | 32.52 |
SVR | Gauge | RBF | 0.848 | 0.809 | 23.871 | 28.67 |
SVR | CMORPH | RBF | 0.831 | 0.790 | 24.927 | 31.333 |
SVR | 3B42RT | RBF | 0.808 | 0.772 | 26.34 | 32.52 |
SVR | 3B42 | RBF | 0.785 | 0.764 | 28.42 | 34.21 |
Note: a-b-c in the FFNN structure (in Table 6) represents the number of inputs, number of hidden neurons, and number of outputs, respectively. Also, the best results of the ANFIS, SVR, and FFNN models were achieved using an input combination of Qt−1, Qt−2, Pt, Pt−1, and Qt−3.
After SWAT, the HBV model gave the second-best results among the process-based models, outperforming HEC-HMS. It yielded NSE values of 0.842 and 0.792 and RMSE values of 25.045 and 31.271 m3/s in the calibration and validation phases, respectively, when the gauge rainfall was used. For the CMORPH rainfall source, the HBV model provided NSE values of 0.812 and 0.763 and RMSE values of 26.31 and 32.35 m3/s in the calibration and validation phases, respectively (Table 6). Lower prediction performance was obtained using the 3B42RT and 3B42 rainfall data, with validation NSE values of 0.754 and 0.748, respectively. Based on the performance rating of Moriasi et al. (2015), HBV still gave good predictions with the 3B42RT and 3B42 rainfall data. The lowest NSE among the process-based models was obtained using HEC-HMS, with validation NSE and RMSE values of 0.776 and 29.701 m3/s, respectively, for the gauge rainfall data. For the CMORPH, 3B42RT, and 3B42 rainfall sources, HEC-HMS yielded validation NSE values of 0.759, 0.744, and 0.734, respectively. Based on the performance metrics shown in Table 6, the SWAT model outperformed both the HEC-HMS and HBV models for all data sources. According to Gelete et al. (2023b), the better performance of the SWAT model could be due to its ability to divide the study watershed into sub-watersheds with uniform hydrologic conditions.
For the AI-based modeling, a Sugeno-type ANFIS was used, and the MFs were determined by hybrid optimization algorithms. The Gaussian, triangular, and trapezoidal types of MFs were tried iteratively until the best runoff result and ANFIS architecture were obtained. The ANFIS structure and MFs that reproduced the best rainfall–runoff result with the optimal epoch are given in Table 6 for both gauge and satellite datasets. The ANFIS model generated the best prediction result, outperforming all the physical and other AI-based models with validation NSE values of 0.864 and 0.838 for the gauge and CMORPH rainfall products, respectively. It also led to smaller RMSE values than the other models. The better performance of the ANFIS model could be due to its ability to capture nonlinear and complex problems, as it combines the learning capacity of ANN with the reasoning of a fuzzy inference system (FIS) (Nourani et al. 2021a, 2021b).
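The three membership function shapes tried for ANFIS have simple closed forms. The sketch below gives their standard definitions; the parameter names and the evaluation points are illustrative and are not taken from the fitted ANFIS models.

```python
import math


def gaussian_mf(x, center, sigma):
    """Gaussian membership: 1 at the center, decaying with distance."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def triangular_mf(x, a, b, c):
    """Triangular membership with feet at a and c and peak at b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def trapezoidal_mf(x, a, b, c, d):
    """Trapezoidal membership with feet at a, d and plateau on [b, c]."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


print(gaussian_mf(5.0, 5.0, 2.0),
      triangular_mf(2.5, 0.0, 5.0, 10.0),
      trapezoidal_mf(7.0, 0.0, 2.0, 6.0, 8.0))
```

In an ANFIS, the parameters of these functions (centers, widths, and break points) are the premise parameters tuned during training.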
The second-best prediction results were obtained using the FFNN model. An FFNN with one hidden layer and a variable number of hidden neurons, trained via the BP and Levenberg-Marquardt (LM) algorithms, was used to model rainfall–runoff. The ideal number of hidden neurons was determined by trial and error in the range of 9–16 for both the satellite and gauge rainfall data. In the validation phase, the FFNN trained with gauge data (NSE of 0.816) was superior to the satellite-driven rainfall–runoff models. In FFNN modeling, it is practical to prefer rainfall datasets that require fewer hidden neurons. Among the satellite datasets, the FFNN modeling based on the CMORPH rainfall dataset was superior to the modeling based on the 3B42RT and 3B42 rainfall datasets. Hence, FFNN modeling with the CMORPH rainfall dataset could model runoff with a short running time and minimal cost. Moreover, FFNN reproduced low flows accurately but was less accurate at modeling peak flows.
The RBF kernel was used to build SVR rainfall–runoff models based on both satellite and gauge rainfall datasets. SVR can also use other kernels, such as polynomial and sigmoidal kernels, but RBF was selected because it requires fewer tuning parameters and has already been shown to outperform the other kernels (Sharghi et al. 2018). The SVR model yielded the third most accurate modeling result, surpassing all the physical-based models. The SVR results for gauge and satellite rainfall datasets are presented in Table 6. The NSE values of SVR rainfall–runoff modeling obtained in the validation phase are 0.809 and 0.79 for the gauge and CMORPH precipitation datasets, respectively. SVR reproduced low flows well but underestimated high flows for both rainfall datasets.
Based on the statistical performance metrics shown in Table 6, the accuracy of the employed predictive algorithms was in the order of ANFIS > FFNN > SVR > SWAT > HBV > HEC-HMS. Based on Table 6, the application of the best AI model (i.e., ANFIS) boosted the accuracy of HEC-HMS, HBV, and the SWAT models using the gauge rainfall by 20.6, 16.16, and 12.86%, respectively, based on the NSE value in the validation phase.
Scatter plot of observed versus simulated flows: (a) SWAT, (b) HEC-HMS, (c) HBV, (d) ANFIS, (e) FFNN, and (f) SVR using the gauge rainfall data.
Scatter plot of observed versus simulated flows: (a) SWAT, (b) HEC-HMS, (c) HBV, (d) ANFIS, (e) FFNN, and (f) SVR using the CMORPH rainfall data.
Boxplot of observed versus simulated runoff by SWAT and ANFIS using gauge rainfall and satellite rainfall at the validation phase.
Results of modeling by rainfall data fusion
To improve the modeling performance, gauge and satellite-driven rainfall products were combined to simulate runoff, and the effects of rainfall fusion on runoff simulation were evaluated (Table 7). The fused rainfall datasets from the three satellite sources and the gauges were fed into the models to simulate runoff. Before satellite-derived rainfall data are used for hydrologic modeling, they should usually be ‘bias-corrected’ through a statistical relationship to gauge precipitation data (Bitew et al. 2012). The most common bias-correction approaches follow two steps. First, the bias of the satellite rainfall dataset is calculated as the ratio of the average satellite rainfall on the specific grid covered by the gauging station to the corresponding gauge precipitation records. Second, the original satellite rainfall dataset is divided by the bias obtained in step 1 so that the bias is removed. In the present study, however, the raw satellite dataset was fed into the models along with the measured data, which could serve as a strategy to correct the bias of the satellite data. For HBV, the weighted average of the rainfall records from both data sources was used as the combined input because HBV accepts only a single climate file. The rainfall–runoff results of rainfall fusion (Table 7) showed that fusion could meaningfully enhance the modeling performance compared with modeling based on individual satellite datasets, but it only marginally improved the gauge rainfall-based modeling results (see Table 6). In the specific case of satellite rainfall datasets, the improvement could be related to the bias correction achieved by the rainfall fusion mechanism. The gauge rainfall data reproduced runoff more accurately than the satellite rainfall products.
The reason could be that gauge-based rainfall captures the most accurate and valid hydrological information representing physical processes at the watershed level. The quality of satellite-estimated rainfall depends on several factors, such as the cloud cover, the revisit time of the rainfall-measuring spacecraft, and their orbital locations. The unsteady characteristics of these factors could cause bias in the rainfall estimation process. The fusion of rainfall products from several sources enhanced the robustness of the models compared with the results of single-source rainfall data. Therefore, it is worth mentioning that gauge rainfall data could correct the bias of satellite rainfall datasets and enhance the rainfall–runoff simulation capability of the models.
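The common two-step bias correction described above (which this study did not apply, since the raw satellite data were used together with the gauge data) can be sketched as follows. With the bias defined as the ratio of mean satellite rainfall to mean gauge rainfall, dividing the satellite series by that ratio (equivalently, multiplying by its inverse) brings its mean into line with the gauge mean. The rainfall values below are hypothetical.

```python
def mean_ratio_bias(satellite, gauge):
    """Bias as the ratio of mean satellite rainfall to mean gauge rainfall."""
    return (sum(satellite) / len(satellite)) / (sum(gauge) / len(gauge))


def bias_correct(satellite, gauge):
    """Rescale satellite rainfall so that its mean matches the gauge mean."""
    bias = mean_ratio_bias(satellite, gauge)
    return [value / bias for value in satellite]


# Hypothetical daily rainfall (mm) at one gauge and its overlying grid cell.
gauge = [0.0, 12.0, 30.0, 6.0]
satellite = [0.0, 9.0, 21.0, 6.0]

bias = mean_ratio_bias(satellite, gauge)
corrected = bias_correct(satellite, gauge)
print(bias, corrected)
```

A single multiplicative factor corrects the mean only; it does not change the timing of rainfall events or fix days on which the satellite missed rain entirely.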
Modeling performance for rainfall datasets fusion
Model | Structure | NSE (calibration) | NSE (validation) | RMSE, m3/s (calibration) | RMSE, m3/s (validation) |
---|---|---|---|---|---|
SWAT | – | 0.884 | 0.821 | 21.846 | 26.075 |
HEC-HMS | – | 0.843 | 0.785 | 24.561 | 28.643 |
HBV | – | 0.855 | 0.814 | 23.526 | 28.042 |
ANFIS | Gaussian | 0.925 | 0.875 | 18.426 | 21.84 |
FFNN | 8-12-1 | 0.875 | 0.827 | 21.547 | 25.258 |
SVR | RBF | 0.858 | 0.818 | 22.542 | 27.681 |
The performance of all the applied models was generally good in both the calibration and validation stages. Nevertheless, the satellite rainfall-based models deviated slightly from the observed runoff; thus, their modeling performance was lower. Possible causes of the deviations are the limited precision with which satellite sensors retrieve rainfall information and the topographic heterogeneity of the study area. Gebremichael et al. (2014) stressed that the CMORPH, 3B42RT, and 3B42 satellite rainfall sources might overestimate daily rainfall in lowland areas and underestimate it in plateau areas. The topography of the Gilgel Abay watershed is heterogeneous, with elevations ranging between 1,866 and 3,543 m above sea level (see Figure 5); therefore, the rainfall is more prone to topographic influence. The accuracy of satellite rainfall estimation could also be influenced by the algorithms used to transform the retrieved information into rainfall. In this regard, sensors that use microwave-based algorithms are superior to those using infrared-based algorithms (Bitew & Gebremichael 2010). As indicated in Table 6, modeling based on the microwave-based CMORPH product reproduced runoff more accurately than modeling based on the combined infrared–microwave 3B42RT and the infrared-based 3B42 rainfall products.
The runoff simulation for the period of consideration indicates that the applied models can reproduce the observed runoff in the catchment for both individual rainfall data and rainfall fusion data (see Tables 6 and 7). However, the models did not perform equally in rainfall–runoff modeling, and not every model could fully capture the physical processes of the watershed. For example, the performance of ANFIS surpassed the other models at both the calibration and validation phases for both rainfall datasets, as shown in the scatter plots (Figures 7 and 8) and the box plot (Figure 9).
The findings for the physically based models in this study were compared with the results of Bizuneh et al. (2021), Chathuranika et al. (2022), Gelete et al. (2023a), and Shekar & Vinay (2021) and were found to be similar. For example, Gelete et al. (2023a) compared HBV, HEC-HMS, and SWAT models for runoff simulation in an agricultural catchment and reported the superiority of the SWAT model, with NSE values of 0.847 and 0.809 for the calibration and validation sets, respectively. Bizuneh et al. (2021) compared HBV and SWAT models, and Chathuranika et al. (2022) compared HEC-HMS and SWAT models. In both studies, the SWAT model outperformed the competing physical models, with validation NSE values of 0.8 and 0.82, respectively. Similarly, a fair agreement was observed between this study and other studies that compared AI-based and process-based models and reported the superiority of AI-based models (Young et al. 2017; Senent-Aparicio et al. 2019).
Simulated runoff using gauge rainfall via SWAT, HEC-HMS, HBV, ANFIS, FFNN, and SVR (a) at the validation step and (b) details for the year 2017.
As shown in Figure 10, the runoff simulated by HBV and SVR is closer to the observed runoff in dry seasons, whereas HEC-HMS underestimated low flows. ANFIS overestimated low flows but captured peak flows well. SWAT and FFNN were only fairly good at simulating low flows, but they captured average flows well.
The results on the selected dates show that each model could give different results at different times of the year. Therefore, combining the results of different models through an ensemble method could enhance the simulation capacity of the modeling and provide more accurate results. To this end, the results of each model were used as inputs to two linear (SAE and WAE) ensemble models and one nonlinear (NNE) ensemble model, as described in the following section.
Results of ensemble techniques
Ensemble modeling can improve the rainfall–runoff modeling capacity of the individual models (SWAT, HEC-HMS, HBV, ANFIS, FFNN, and SVR). To further improve the efficiency of rainfall–runoff modeling, the ensemble technique was performed in three different scenarios, as explained in Section 2.4. The weights of WAE were determined by NSE at the validation stage according to Equation (8). The NNE technique was built as an FFNN with one hidden layer, trained via BP with a variable number of neurons until the optimal epoch was reached. The results of ensemble modeling are given in Table 8. Among the employed ensemble models, NNE surpassed the SAE and WAE techniques in both the calibration and validation phases for all ensemble scenarios. NNE modeling in scenario 1 improved the modeling accuracy of the lowest-performing (3B42) satellite rainfall-based modeling by 17.4, 14.16, and 16.5% for SVR, ANFIS, and FFNN, respectively, at the validation phase (see Tables 6 and 8). Likewise, NNE in scenario 1 improved the fusion modeling of SVR, ANFIS, and FFNN by 11.6, 5.4, and 10.6%, respectively, at the validation step (see Tables 7 and 8). The superior performance of the nonlinear ensemble (NNE) method over the linear methods could be due to its ability to handle the nonlinear and complex hydrologic process effectively (Nourani et al. 2020). The linear ensemble methods, particularly the SAE, performed better than the least-performing individual models used in each scenario but less well than the best individual models. According to Umar et al. (2021), the possible reason is that arithmetic averaging gives values between the lowest and highest values in the set. Also, in all the scenarios, the WAE was slightly better than the SAE, which could be due to the weight assigned to each individual model based on its importance.
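The two linear ensembles can be sketched in a few lines. Equation (8) is not reproduced in this section, so the weighting used below (each weight proportional to the model's validation NSE, normalized to sum to one) is an assumption about its form. The NSE scores are the fusion values of the three AI-based models from Table 7, and the runoff values are hypothetical.

```python
def simple_average(predictions):
    """SAE: arithmetic mean of the model outputs at each time step."""
    n_models = len(predictions)
    return [sum(values) / n_models for values in zip(*predictions.values())]


def weighted_average(predictions, scores):
    """WAE with weights proportional to each model's score (assumed form)."""
    total = sum(scores[name] for name in predictions)
    weights = {name: scores[name] / total for name in predictions}
    n_steps = len(next(iter(predictions.values())))
    return [
        sum(weights[name] * predictions[name][t] for name in predictions)
        for t in range(n_steps)
    ]


# Validation NSE for the fusion inputs (Table 7).
scores = {"SVR": 0.818, "ANFIS": 0.875, "FFNN": 0.827}
# Hypothetical simulated runoff (m3/s) at two time steps.
predictions = {
    "SVR": [40.0, 90.0],
    "ANFIS": [46.0, 102.0],
    "FFNN": [43.0, 96.0],
}

sae = simple_average(predictions)
wae = weighted_average(predictions, scores)
print(sae, wae)
```

Both outputs necessarily lie between the lowest and highest member predictions, which is why a linear ensemble cannot exceed its best member at every time step; the NNE replaces the fixed weights with a trained network.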
Results of ensemble techniques for different scenarios
Scenario | Ensemble model | NSE (calibration) | NSE (validation) | RMSE, m3/s (calibration) | RMSE, m3/s (validation) |
---|---|---|---|---|---|
Scenario 1 | SAE | 0.884 | 0.848 | 20.422 | 24.564 |
Scenario 1 | WAE | 0.894 | 0.862 | 19.256 | 23.724 |
Scenario 1 | NNE | 0.957 | 0.925 | 15.452 | 18.265 |
Scenario 2 | SAE | 0.862 | 0.801 | 23.413 | 27.767 |
Scenario 2 | WAE | 0.874 | 0.812 | 22.414 | 26.522 |
Scenario 2 | NNE | 0.926 | 0.884 | 16.405 | 19.547 |
Scenario 3 | SAE | 0.935 | 0.881 | 19.221 | 22.467 |
Scenario 3 | WAE | 0.948 | 0.891 | 18.054 | 21.623 |
Scenario 3 | NNE | 0.964 | 0.931 | 14.306 | 17.468 |
In scenario 2 ensemble modeling, the NNE technique improved 3B42 rainfall-based models by 17, 15.4, and 14.9% for HEC-HMS, HBV, and SWAT, respectively, at the validation phase (Tables 6 and 8). Similarly, it enhanced the performance of fusion modeling by 11.2, 8, and 7.1% for HEC-HMS, HBV, and SWAT, respectively, at the validation phase (Tables 7 and 8).
Based on NSE values, ensemble modeling using the fusion of rainfall data could meaningfully enhance the rainfall–runoff modeling performance of single models when it is compared with modeling using separate rainfall from each source and rainfall data fusion (Tables 6–8). The NNE technique in scenario 3 improved the accuracy of the low-performing modeling of HEC-HMS, HBV, SWAT, SVR, FFNN, and ANFIS in the validation phase by up to 21.2, 19.7, 19.2, 18, 17, and 14.7%, respectively. It also improved rainfall fusion-based modeling accuracy of HEC-HMS, HBV, SWAT, SVR, FFNN, and ANFIS by 15.7, 12.6, 11.8, 12.1, 11.2, and 6%, respectively, in the validation phase.
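The improvement percentages reported above appear to be consistent with expressing the NSE gain relative to the ensemble NSE, that is, 100 × (NSE_ensemble − NSE_single) / NSE_ensemble. This definition is inferred from the tabulated values rather than stated in the text. The sketch below recomputes the scenario 3 figures for the fusion inputs from Tables 7 and 8.

```python
def improvement_percent(nse_single, nse_ensemble):
    """NSE gain of the ensemble over a single model, relative to the ensemble."""
    return 100.0 * (nse_ensemble - nse_single) / nse_ensemble


NNE_SCENARIO_3 = 0.931  # validation NSE of NNE in scenario 3 (Table 8)

# Validation NSE of the single models with fused rainfall (Table 7).
fusion_nse = {
    "HEC-HMS": 0.785, "HBV": 0.814, "SWAT": 0.821,
    "SVR": 0.818, "FFNN": 0.827, "ANFIS": 0.875,
}

gains = {
    model: round(improvement_percent(value, NNE_SCENARIO_3), 1)
    for model, value in fusion_nse.items()
}
print(gains)
```

The same function applied to the 3B42-based validation NSE of HEC-HMS (0.734, Table 6) gives about 21.2%, matching the largest improvement quoted for the satellite-based models.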
Comparing the accuracy of the three scenarios of the ensemble techniques considered in this study, it is noted that the ensemble technique of scenario 3 was more accurate than scenarios 1 and 2 in both the calibration and validation phases. In this scenario, the ensemble technique was able to take advantage of the individual models involved and the input rainfall datasets. It can be concluded that the fusion of precipitation data allows the models to capture the most important features of rainfall from each source. The physically based models are generally able to understand the physics of hydrologic parameters involved in the rainfall–runoff process. AI-based models are also good at representing nonlinear relationships among watershed parameters. The ensemble technique in scenario 3 combines the outputs of both groups of models and therefore could further improve modeling efficiency by combining the benefits of all models involved in rainfall–runoff modeling. Based on the results, it is worth noting that the ensemble of runoff modeling could meaningfully increase the simulation performance of individual models using separate rainfall data from diverse sources.
Scatter plots for (a) ANFIS-gauge, (b) ANFIS-CMORPH, (c) ANFIS-fusion, and (d) NNE fusion ensemble in the validation phase.
The superiority of NNE over SAE and WAE might lie in the ability of nonlinear models to capture the nonlinear and complicated physical relationship between rainfall and runoff. Unlike the NNE technique, the SAE and WAE techniques are linear and might work well only when the relationship between the inputs and outputs of the models is direct. Because these techniques combine the outputs of the individual models linearly, the drawbacks of the individual models could propagate and compound through the linear ensemble models.
Taylor diagram indicating the performances of SAE, WAE, and NNE techniques for fusion of rainfall data.
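A Taylor diagram summarizes each model with three statistics: the standard deviations of the observed and simulated series, their correlation coefficient, and the centered root mean square difference. The sketch below computes them and relies on the law-of-cosines relation that makes the diagram possible; the function name and the example flows are illustrative only.

```python
import math


def taylor_statistics(observed, simulated):
    """Return (std_obs, std_sim, correlation, centered RMS difference)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    std_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed) / n)
    std_s = math.sqrt(sum((s - mean_s) ** 2 for s in simulated) / n)
    cov = sum((o - mean_o) * (s - mean_s)
              for o, s in zip(observed, simulated)) / n
    corr = cov / (std_o * std_s)
    crmsd = math.sqrt(sum(((s - mean_s) - (o - mean_o)) ** 2
                          for o, s in zip(observed, simulated)) / n)
    return std_o, std_s, corr, crmsd


# Hypothetical flows (m3/s), for illustration only.
obs = [10.0, 20.0, 30.0, 40.0, 25.0]
sim = [12.0, 18.0, 33.0, 37.0, 27.0]
std_o, std_s, corr, crmsd = taylor_statistics(obs, sim)
print(std_o, std_s, corr, crmsd)
```

The three quantities satisfy crmsd² = std_o² + std_s² − 2 · std_o · std_s · corr, so a single point on the diagram encodes all of them.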
CONCLUSIONS
This study aimed at rainfall–runoff modeling by SWAT, HEC-HMS, HBV, ANFIS, FFNN, and SVR using five gauging stations and three satellite-based rainfall datasets in the Gilgel Abay catchment in the Upper Blue Nile Basin, Ethiopia. For this purpose, 10 years (2009–2018) of daily hydro-climatological data, in addition to spatial data, were utilized. The most sensitive parameters of rainfall–runoff modeling for each model were investigated, and CN2 and ALPHA_BF for SWAT, the initial abstraction (Ia) and lag time (Tlag) for HEC-HMS, and FC and K2 for HBV were obtained as the first and second most sensitive parameters, respectively. The relevant input sets for the AI-based models were also identified using the nonlinear sensitivity analysis method; accordingly, discharge and rainfall values at four time lags were found to be sensitive modeling inputs.
The performances of the employed models were assessed using RMSE and NSE values. The findings of this study demonstrated that the ANFIS model showed superior performance in all input scenarios with NSE values of 0.864, 0.838, and 0.875 for gauge, satellite, and fusion rainfall data, respectively, for the validation set. In all the employed individual models, the fusion of rainfall datasets from diverse sources showed a significant improvement over the rainfall–runoff modeling results using only satellite rainfall datasets, and it also revealed a slight improvement over gauge rainfall-based runoff modeling.
Afterward, two linear (SAE and WAE) and one nonlinear (NNE) ensemble techniques were applied in three different scenarios. In all ensemble scenarios, NNE improved the performance of modeling in both the calibration and validation phases. The NNE technique in scenario 3 improved the accuracy of the low-performing modeling of HEC-HMS, HBV, SWAT, SVR, FFNN, and ANFIS in the validation phase by up to 21.2, 19.7, 19.2, 18, 17, and 14.7%, respectively. It also improved the rainfall fusion-based modeling accuracy of HEC-HMS, HBV, SWAT, SVR, FFNN, and ANFIS by 15.7, 12.6, 11.8, 12.1, 11.2, and 6%, respectively, in the validation phase. Among the ensemble techniques used, NNE was a powerful and accurate ensemble method for rainfall–runoff simulation because the model was able to explore the nonlinear relationships of the hydrologic process. Overall, the outcome of the current study would be a groundbreaking step toward utilizing rainfall datasets combined from multiple satellite sources in data-sparse, ungauged, and unevenly gauged catchments. This could be a good option for obtaining reliable rainfall datasets in developing countries. Furthermore, future studies should focus on validating satellite rainfall products using local rain-gauge time series as the reference data, so that the relevance and credibility of satellite datasets can be better verified. In addition, future studies should investigate the effects of regional and global climate change on the rainfall–runoff process using a satellite rainfall dataset.
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.
CONFLICT OF INTEREST
The authors declare there is no conflict.