Modeling the rainfall–runoff (RR) process is a key component of water resources projects, planning, and management, for which both conceptual and data-driven modeling techniques are used. Each technique has its own benefits, and its performance needs to be explored in real basins. In this paper, five conceptual models (namely, AWBM, Sacramento, SimHyd, SMAR, and TANK) and an artificial neural network (ANN) model have been developed, and their performances have been assessed using rainfall, runoff, and other climatic data derived from the Bird Creek Basin, USA. The results suggest that SimHyd performed the best among the conceptual models during testing. The ANN model performed the best among all the models developed in this study during testing, with the highest values of R = 0.941, E = 0.994, and all threshold statistics, and the lowest values of AARE = 38.9 and NRMSE = 0.031. Overall, it can be concluded that although the conceptual models are highly comprehensible, the ANN model is able to simulate the flow more accurately than any of the conceptual models developed in this study.

  • Five conceptual models with a varied number of parameters are developed for the rainfall–runoff (RR) process in a basin.

  • An artificial neural network (ANN) model is developed for the same data set.

  • The performances are evaluated using standard statistical measures.

  • The ANN RR model was found to outperform all the conceptual RR models but is less comprehensible than its conceptual counterparts.

The rainfall–runoff (RR) process plays a crucial role in the majority of water resource management activities. Two main approaches are used to model the RR process: conceptual models and data-driven models. Conceptual models involve simple mathematical equations of water movement based on mass and energy conservation principles, which were first introduced by Mulvany in 1850. Later, many researchers proposed conceptual RR models of varying complexity: the Soil Moisture and Accounting Model (SMAR) (O'Connell et al. 1970), SACRAMENTO (Buchtele 1993), SimHyd (Chiew & McMahon 1994), AWBM (Boughton 1996), and PREVIS (Birikundavyi et al. 2002). Conceptual RR modeling is a prevalent method for simulating hydrological phenomena within catchment areas, and apart from these models, the HEC-HMS and SWAT models have been explored by many researchers (Pandey et al. 2017). Recent studies have also developed new models to better represent the hydrological processes in heterogeneous catchments (Soni et al. 2022; Vidyarthi & Jain 2023a, b). One such model is the Variable Contributing Area (D-VCA) model proposed by Zeng & Chu (2021), which considers the spatial variations of soil characteristics and land use to simulate the variability of runoff generation across a catchment. Although the conceptual models are comprehensive, their requirement for a large amount of basin information and their performance limit their applicability in operational use. Moreover, the choice of modeling technique for a specific catchment depends upon data availability, ease of calibration, and performance in modeling the RR process.

For these reasons, data-driven techniques emerged, which offer higher accuracy with lower data requirements in model development. Statistical, stochastic, and machine learning methods all fall under data-driven techniques. Mohammadi et al. (2019) used cross-wavelet–linear programming-Kalman filter and GIUH methods in RR modeling. Ghorbani et al. (2010) investigated the application of catastrophe theory to examine its suitability for identifying possible discontinuities in the RR process. The artificial neural network (ANN), one of the data-driven techniques, has gained popularity within hydrology and water resources engineering, including open channel hydraulics (Ayaz et al. 2023; Shekhar et al. 2023; Baranwal & Das 2024), the RR process, groundwater (GW) modeling, and drought forecasting (Azmathullah et al. 2005; See et al. 2009; Kashani et al. 2016; Vidyarthi & Jain 2020; Rahimzad et al. 2021; Vidyarthi & Jain 2023a, b), and a few researchers have reported the superiority of ANNs over conceptual techniques in modeling the RR process (Demirel et al. 2009). Nevertheless, despite the superior accuracy of ANNs in comparison to conceptual models, the opaque nature of ANNs creates skepticism among field operators and policymakers, limiting their adoption in water resource projects, planning, and management (Vidyarthi & Jain 2023a, b). Thus, it is essential to develop models for simulating the RR process in a basin using both conceptual and ANN techniques and to evaluate their performance and comprehensibility to establish their efficacy in operational use. Studies comparing conceptual models with the ANN model are few and far between, and only a few conceptual models have been compared with the ANN model. Therefore, it is of interest to compare the popular and established conceptual models with the ANN technique.

This study aims to compare the performance of five popular conceptual models with the performance of the ANN model in simulating the RR process. The specific objectives of the study are to (1) develop conceptual AWBM, Sacramento, SimHyd, SMAR, and TANK models for RR process simulation; (2) develop an ANN model for RR process simulation; (3) compare the performance of conceptual models; (4) compare the best conceptual model with the ANN model in simulating RR process. The average rainfall and stream flow data on a daily timescale derived from the Bird Creek River basin have been employed to develop all the models in this study.

Several standard statistical performance metrics have been employed to validate all the developed models. The paper begins with a concise overview of the conceptual models and the ANN, followed by a comprehensive explanation of the methodology used in model development. The results and discussion are then presented, and the paper ends with concluding remarks.

In this study, five different conceptual models, which differ in their number of parameters and calibration process, are developed for simulating the RR process. The models are compared on the basis of their performance in capturing low, medium, and high runoff (peak discharge). For the same basin, an ANN model is also developed, and its performance is compared with that of the conceptual models. The methods and approaches used in this study are discussed briefly in the subsequent sections.

Conceptual modeling technique

A conceptual rainfall–runoff (CRR) model does not describe the movement of water through the full mass, energy, and momentum equations; rather, it adopts a simplified conceptual representation of the physics involved in the process. This representation typically comprises interconnected storage units and simplified water-budgeting schemes that maintain a complete mass balance across all inflows, outflows, and changes in storage. In this study, five CRR models are developed, namely, the AWBM, Sacramento, SimHyd, SMAR, and TANK models. A brief discussion of each is given in the following subsections.

Australian water balance model

The Australian Water Balance Model (AWBM; Boughton 1996) is a conceptual RR model whose schematic diagram, including the equations and corresponding parameters, is shown in Figure 1.
Figure 1

Schematic diagram of the AWBM model.

The AWBM model consists of three surface storage elements that simulate partial runoff areas, and the water balance (WB) is calculated separately for each of them. The model works with eight parameters: A1, A2, and A3, the partial areas of the catchment represented by the three surface stores; C1, C2, and C3, the corresponding surface storage capacities; and the surface flow recession (KS) and base flow recession (KB) constants. The runoff at the outlet is thus a combination of base flow and routed surface runoff.
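The water balance described above can be illustrated with a minimal Python sketch. It is not the RRL implementation: the function name `awbm_simulate`, the empty initial stores, the assumption that A3 is the remainder 1 − A1 − A2, and the use of a base flow index (BFI, which appears in Table 3) to split the excess between the base flow and surface stores are all assumptions made for illustration, and the parameter values in the usage lines are hypothetical.

```python
def awbm_simulate(rain, evap, A1, A2, C1, C2, C3, BFI, KB, KS):
    """Simplified daily AWBM-style water balance; returns runoff in mm/day."""
    areas = (A1, A2, 1.0 - A1 - A2)   # assumption: A3 is the remaining area
    caps = (C1, C2, C3)
    stores = [0.0, 0.0, 0.0]          # assumption: surface stores start empty
    base_store = 0.0
    surf_store = 0.0
    runoff = []
    for p, e in zip(rain, evap):
        excess = 0.0
        for i in range(3):
            # add rainfall, remove evaporation, never drop below zero
            level = max(stores[i] + p - e, 0.0)
            # water above the store capacity spills as rainfall excess
            spill = max(level - caps[i], 0.0)
            stores[i] = level - spill
            excess += areas[i] * spill
        # split the area-weighted excess between base flow and surface stores
        base_store += BFI * excess
        surf_store += (1.0 - BFI) * excess
        # linear recession of both routing stores
        base_flow = (1.0 - KB) * base_store
        surf_flow = (1.0 - KS) * surf_store
        base_store -= base_flow
        surf_store -= surf_flow
        runoff.append(base_flow + surf_flow)
    return runoff


# Hypothetical inputs: one large storm followed by dry days
q = awbm_simulate(
    rain=[200.0] + [0.0] * 50, evap=[0.0] * 51,
    A1=0.2, A2=0.4, C1=10.0, C2=50.0, C3=100.0, BFI=0.3, KB=0.9, KS=0.5,
)
```

Because each store only spills what exceeds its capacity, the smallest store generates runoff first, which is how the partial-area behaviour arises in this sketch.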

Sacramento model

The Sacramento model is a hydrological model designed to simulate daily stream flow in response to rainfall and evaporation (E) data. Its structure, illustrated in Figure 2, uses soil moisture accounting to reproduce the WB of the catchment area. Rainfall adds to the soil moisture storage, which is depleted by evaporation and outflow.
Figure 2

Structure of the Sacramento model.

The size and saturation of these storage compartments dictate how much rain is absorbed, the actual evapotranspiration (ET), and the lateral and vertical water movement from the storage. Rainfall in excess of what is absorbed becomes runoff and is routed through an empirical unit hydrograph or a similar mechanism. The model also accounts for lateral water movement from the soil moisture stores, which influences the resulting stream flow. The Sacramento model consists of five distinct storage components: upper zone tension water (UZTW), upper zone free water (UZFW), lower zone tension water (LZTW), lower zone primary free water (LZFWP), and lower zone supplementary free water (LZFWS). The model has 16 parameters in total. Of these, five define the sizes of the soil moisture reservoirs; three each are responsible for computing the lateral outflow rates, the percolation from the upper to the lower soil moisture reservoirs, and the overall system losses; and the remaining two handle the calculation of direct runoff.

SimHyd model

The SimHyd model is a CRR model designed to predict runoff from rainfall and potential evapotranspiration (PET) data on a daily timescale, using a total of seven parameters. The model structure is depicted in Figure 3. In the SimHyd framework, daily rainfall first fills the interception store, which is depleted by evaporation. Any remaining rainfall is subjected to an infiltration function that determines the infiltration capacity, and rainfall that exceeds the infiltration capacity becomes runoff. The moisture that infiltrates the soil is processed by a soil moisture function that directs it along different routes: interflow, GW storage, and soil moisture storage. Interflow is estimated from the ratio of the soil moisture level to the soil moisture capacity. The GW recharge is likewise estimated as a linear function of soil moisture, and any remaining moisture is directed into the soil moisture storage. The ET from this store is calculated from the soil moisture; however, it is capped so that it does not exceed the rate of areal PET determined by atmospheric conditions. Any surplus in the soil moisture storage overflows into the GW storage. Base flow is modeled as a linear recession from the GW storage. Consequently, the model generates runoff from three main sources: infiltration excess runoff, interflow (which includes saturation excess runoff), and base flow.
Figure 3

Structure of the SimHyd model.

Soil Moisture and Accounting Model

The SMAR is a lumped CRR model based on soil moisture conditions, the schematic diagram for which is shown in Figure 4.
Figure 4

Structure of the SMAR model.

The model provides daily estimates of various hydrological components for the entire catchment, including surface runoff, GW discharge, ET, and soil profile leakage. The surface runoff encompasses different forms, such as overland flow, saturation excess runoff, and saturated through-flow from perched GW areas, all characterized by rapid response times. The SMAR model comprises two consecutive components: a WB component and a routing component. It relies on time series data of rainfall and E to simulate the discharge at the catchment outlet (Tan & O'Connor 1996).

TANK model

The TANK model features a straightforward design, consisting of four vertically arranged tanks in a series configuration, as depicted in Figure 5.
Figure 5

Structure of the Tank model.

In this model, precipitation is added to the top tank, and E is deducted progressively from the top tank downward. When a tank is depleted, the remaining E deficit is drawn from the next tank in sequence until all the tanks are emptied. The lateral outlets of the tanks generate the calculated runoff: the output of the uppermost tank represents surface runoff, that of the second tank intermediate runoff, that of the third tank sub-base runoff, and that of the fourth tank base flow. The TANK model is used to estimate daily discharge from daily precipitation and E inputs.

All the conceptual models in this study have been developed using the Rainfall–Runoff Library (RRL). The RRL was created by the Cooperative Research Centre for Catchment Hydrology (CRC), Australia, in 2004. It is designed to simulate catchment runoff from daily precipitation, runoff, and ET/PET data and is suitable for catchments with areas ranging from 10 to 10,000 km². The library currently contains the five RR models explained above, eight optimizers for parameter calibration, and a choice of ten objective functions. A detailed description of all these models is available on the RRL website and in its user manual, http://www.toolkit.net.au/Tools/RRL/documentation. The sum of squared errors (SSE) and the Genetic Algorithm (GA) are used as the objective function and the optimizer, respectively, in developing all the conceptual models in this study. A detailed description of the GA optimization technique can be found in Deb (1999).
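The idea of calibrating a model by minimizing the SSE with a GA can be sketched in a few lines of Python. This is a toy illustration and not the RRL optimizer: a hypothetical single-parameter linear reservoir stands in for the conceptual models, and the selection, crossover, and mutation schemes (truncation selection, arithmetic crossover, uniform re-sampling) are assumptions chosen for brevity.

```python
import numpy as np


def linear_reservoir(rain, k):
    """Toy stand-in model: outflow is a fraction (1 - k) of the store each day."""
    store, out = 0.0, []
    for p in rain:
        store += p
        q = (1.0 - k) * store
        store -= q
        out.append(q)
    return np.array(out)


def sse(obs, sim):
    """Sum of squared errors between observed and simulated runoff."""
    return float(np.sum((np.asarray(obs) - np.asarray(sim)) ** 2))


def ga_calibrate(rain, obs, pop_size=40, iterations=60, p_mut=0.1, seed=0):
    """Search for the recession constant k in [0, 1) that minimizes the SSE."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(0.0, 1.0, pop_size)
    for _ in range(iterations):
        fitness = np.array([sse(obs, linear_reservoir(rain, k)) for k in pop])
        # truncation selection: keep the better half as parents (elitist)
        parents = pop[np.argsort(fitness)[: pop_size // 2]]
        # arithmetic crossover between each parent and a shuffled mate
        mates = rng.permutation(parents)
        w = rng.uniform(0.0, 1.0, parents.size)
        children = w * parents + (1.0 - w) * mates
        # mutation: re-sample a small fraction of the children at random
        mutate = rng.uniform(0.0, 1.0, children.size) < p_mut
        children[mutate] = rng.uniform(0.0, 1.0, int(mutate.sum()))
        pop = np.concatenate([parents, children])
    fitness = np.array([sse(obs, linear_reservoir(rain, k)) for k in pop])
    return float(pop[np.argmin(fitness)])


# Synthetic check: recover a known recession constant from noise-free data
rain_demo = np.random.default_rng(42).exponential(5.0, 100)
obs_demo = linear_reservoir(rain_demo, 0.8)
k_hat = ga_calibrate(rain_demo, obs_demo)
```

The arguments `iterations`, `p_mut`, and `pop_size` play the same role as the number of iterations, mutation probability, and population size that are tuned by trial and error for the actual models.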

Artificial neural network

An ANN is a computational system inspired by the functioning of biological neural networks. A typical multilayer perceptron (MLP), the structure of which is illustrated in Figure 6, consists of three layers of neurons, one each of input, hidden, and output (it is termed two-layered when only the hidden and output layers are counted), and is able to approximate any non-linear system with excellent accuracy. The strength of each connection between neurons of adjacent layers is known as a ‘connection weight’. In the first step, known as the feed-forward step, the weighted sum of the inputs to a neuron passes through a non-linear activation function to give the input for the subsequent layer in the forward direction. In the back-propagation step, the network is trained by minimizing, generally, the mean square error (MSE) with the gradient-descent optimization technique to find the best set of connection weights. In general, ANNs may be trained using either a supervised or an unsupervised training mechanism.
Figure 6

Three-layered ANN structure.

The main objective of supervised training involves reducing the output layer error by finding connection strengths that make the ANN outputs match or come closer to the desired targets. A widely used approach for supervised training is the back-propagation training algorithm, which is commonly employed in various engineering applications (Zurada 1994).
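The feed-forward and back-propagation steps can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the function names, the weight initialization, and the synthetic data are hypothetical, and plain batch gradient descent with momentum on the MSE is used with a logistic sigmoid in both the hidden and output layers.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def predict(X, params):
    """Feed-forward step: input -> hidden -> output, sigmoid in both layers."""
    W1, b1, W2, b2 = params
    hidden = sigmoid(X @ W1 + b1)
    return sigmoid(hidden @ W2 + b2).ravel()


def train_mlp(X, y, n_hidden, lr, momentum, epochs, seed=0):
    """Batch back-propagation with momentum; returns weights and MSE history."""
    rng = np.random.default_rng(seed)
    n_in = X.shape[1]
    # assumption: small random initial weights, zero biases
    W1 = rng.normal(0.0, 0.5, (n_in, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.5, (n_hidden, 1))
    b2 = np.zeros(1)
    params = [W1, b1, W2, b2]
    velocity = [np.zeros_like(p) for p in params]
    target = y.reshape(-1, 1)
    history = []
    for _ in range(epochs):
        # feed-forward step
        hidden = sigmoid(X @ W1 + b1)
        out = sigmoid(hidden @ W2 + b2)
        err = out - target
        history.append(float(np.mean(err ** 2)))
        # back-propagation of the MSE gradient through both sigmoid layers
        d_out = 2.0 * err * out * (1.0 - out) / len(X)
        d_hidden = (d_out @ W2.T) * hidden * (1.0 - hidden)
        grads = [X.T @ d_hidden, d_hidden.sum(axis=0),
                 hidden.T @ d_out, d_out.sum(axis=0)]
        # gradient descent with momentum, updating the arrays in place
        for p, v, g in zip(params, velocity, grads):
            v *= momentum
            v -= lr * g
            p += v
    return params, history


# Synthetic data already scaled to the 0.1-0.9 range (hypothetical example)
rng_demo = np.random.default_rng(1)
X_demo = rng_demo.uniform(0.1, 0.9, (200, 5))
y_demo = 0.1 + 0.8 * X_demo.mean(axis=1)
params_demo, mse_history = train_mlp(X_demo, y_demo, n_hidden=14,
                                     lr=0.5, momentum=0.5, epochs=500)
```

The learning rate and momentum in the usage lines are demonstration values chosen so that the small example converges quickly; any other values can be passed through the `lr` and `momentum` arguments.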

The effectiveness of the models created in this study is assessed using four established statistical measures, namely, average absolute relative error (AARE), normalized root mean square error (NRMSE), Pearson correlation coefficient (R), and threshold statistics (TS). These error metrics have been commonly utilized in the assessment of hydrological models, and their comprehensive explanation can be found in Vidyarthi et al. (2020).
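For readers who want to reproduce such statistics, the sketch below computes the four measures with NumPy. The exact definitions used in the study are given in the cited reference and are not repeated here, so the forms below are assumptions based on common usage: AARE as the mean absolute relative error in percent, NRMSE as the root mean square error divided by the observed range, and TSx as the percentage of points whose absolute relative error is below x%.

```python
import numpy as np


def performance_metrics(obs, sim, thresholds=(10, 50, 75, 100)):
    """Return AARE, NRMSE, R, and threshold statistics as a dictionary."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    err = obs - sim
    # absolute relative error in percent (observed values must be non-zero)
    are = np.abs(err / obs) * 100.0
    metrics = {
        "AARE": float(are.mean()),
        # assumption: RMSE normalized by the observed range
        "NRMSE": float(np.sqrt(np.mean(err ** 2)) / (obs.max() - obs.min())),
        "R": float(np.corrcoef(obs, sim)[0, 1]),
    }
    for x in thresholds:
        # share of points with absolute relative error below x percent
        metrics[f"TS{x}"] = float(100.0 * np.mean(are < x))
    return metrics


# Hypothetical three-point example
example = performance_metrics([1.0, 2.0, 4.0], [1.05, 2.0, 2.0])
```

Other normalizations of the RMSE (for example, by the mean observed flow) are also in use, so values computed this way are only comparable with results that use the same convention.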

This section provides a comprehensive discussion of the study area, data, and the procedural approach taken in developing both the conceptual and ANN models for the modeling of the RR process.

Study area and data

The study area is the Bird Creek basin in Oklahoma, USA. The basin outlet at which the runoff is simulated is near Sperry, approximately 10 km north of the city of Tulsa, and the drainage area is approximately 2,344 km², as shown in Figure 7. The Bird Creek Basin has a humid climate with significant rainfall throughout the year. The basin is a frequent site of flooding in the Tulsa area, and the flooding became devastating following population growth to the east and north of downtown Tulsa. Bird Creek has many major tributaries, including Birch Creek, Hominy Creek, and Mingo Creek, and many minor tributaries that contribute to the historical flooding problems in the Tulsa area. The estimation of runoff, peak discharge, and design flood is essential in this basin because the runoff, which results in floods in many parts of the basin, is contributed by many small to medium-sized tributary subbasins, and because the population along the river is rising. Bird Creek joins the Verdigris River near Catoosa as one of its major tributaries. Spring and summer are the wettest seasons, with rain arriving in the form of convective showers with thunderstorms. The rainfall and other climatic data acquired from the basin are employed in this study. The PET is calculated using Penman's equation, and the actual evapotranspiration (ET) is calculated by comparison with the total daily rainfall, as given by Haan (1972), as follows:
(1)
Figure 7

Map of the Bird Creek Basin.

The training (or calibration) set covers the period 01 March 1995 to 25 May 2003, and the testing (or validation) set covers the period 26 May 2003 to 31 November 2008. The basic statistics of the data are outlined in Table 1. The maximum runoff is 24.11 mm/day and the average is 0.75 mm/day; the runoff data are right-skewed and highly peaked.

Table 1

Basic statistics of the data

| Statistics | Rainfall (mm/day) | Runoff (mm/day) | PET (mm/day) |
|---|---|---|---|
| Maximum | 135.4 | 24.11 | 9.15 |
| Minimum | | 0.024 | 0.12 |
| Average | 2.8 | 0.75 | 3.62 |
| Std. dev. | 9.16 | 1.71 | 1.99 |
| Skewness | 5.43 | 5.24 | 0.15 |
| Kurtosis | 39.83 | 37.99 | −0.97 |

Conceptual model development

All the conceptual models (AWBM, Sacramento, SimHyd, SMAR, and TANK) are developed with the RRL toolkit using the rainfall, runoff, and PET/ET data for the above-specified period. Of the eight optimizers built into the RRL toolkit, the GA optimizer has been adopted in the present study. Different combinations of mutation probability, population size, and number of iterations were tried over many trials. The best set of GA parameters obtained from this trial-and-error procedure is given in Table 2.

Table 2

Best GA parameters for different models

| Model | Iterations | Prob. mutation | Nb. points (Population) | Trapez. PDF |
|---|---|---|---|---|
| AWBM | 500 | 0.01 | 72 | 1.5 |
| Sacramento | 300 | 0.01 | 144 | 1.5 |
| SimHyd | 300 | 0.01 | 63 | 1.5 |
| SMAR | 300 | 0.01 | 81 | 1.5 |
| TANK | 300 | 0.01 | 162 | 1.5 |

The calibrated values of all the parameters of the five conceptual models are presented in Table 3.

Table 3

Calibrated parameter values of various conceptual models

| Model | No. of parameters | Parameter values |
|---|---|---|
| AWBM | | A1 = 0.134, A2 = 0.433, BFI = 0.3686, C1 = 49.4, C2 = 141.96, C3 = 178.43, KB = 0.898, KS = 0.9843 |
| Sacramento | 16 | Adimp = 0, Lzfpm = 22.549, Lzfsm = 36.078, Lzpk = 0.0196, Lzsk = 0.239, Lztwm = 200.78, Pctim = 0, Pfree = 0.847, Rexp = 1.2, Rserv = 0.3, Sarva = 0.0099, Side = 0, Ssout = 0.001, Uzfwm = 66.51, Uzk = 0.086, Uztwm = 43.137, Zperc = 51.45 |
| SimHyd | | Baseflow Coeff = 0.1176, Impervious Threshold = 1.882, Infiltr Coeff = 398.43, Infilt Shape = 0.392, Interflow Coeff = 0.0117, Pervious Fraction = 0.984, Rainfall Interception Store Capacity = 0.725, Recharge Coeff = 0.50, Soil Moisture Store Capacity = 253.43 |
| SMAR | | C = 0.494, G = 0.827, H = 0.149, Kg = 0.207, N = 1.35, NK = 0.8369, T = 0.9176, Y = 3,941.176, Z = 921.5686 |
| TANK | 18 | H11 = 278.43, a11 = 0.74, a12 = 0.05, a21 = 0.20, a31 = 0.19, a41 = 0.48, alpha = 0.039, b1 = 0.0196, b2 = 1, b3 = 0.854, C1 = 85.098, C2 = 90.98, C3 = 37.64, C4 = 87.84, H12 = 54.59, H21 = 92.55, H31 = 0, H41 = 18.03 |

After calibration, the simulated runoff from each model developed in this study is compared with the observed runoff using the statistical measures for both the training and testing periods.

ANN model development

A two-layered MLP architecture is employed for the development of all the ANN models in this study. Each ANN model for the basin uses lagged-in-time rainfall and runoff as inputs and produces the runoff at time step t. The ANN forecast formulation is as follows:
$$Q(t) = f\left(P(t-1), P(t-2), \ldots, P(t-m),\; Q(t-1), Q(t-2), \ldots, Q(t-n)\right) \qquad (2)$$
The optimal lags, m for precipitation and n for runoff, were obtained from the auto-correlation function (ACF), cross-correlation function (CCF), and partial auto-correlation function (PACF) values given in Table 4 and shown in Figure 8. The flow and rainfall at the lags with ACF values greater than 0.45 and CCF values greater than 0.3 are taken as input variables.
Table 4

Correlation coefficients of Q(t)

| Flow (ACF) | ACF value | Rainfall (CCF) | CCF value | Flow (PACF) | PACF value |
|---|---|---|---|---|---|
| Q (t–1) | 0.76 | P (t–1) | 0.49 | Q (t–1) | 0.76 |
| Q (t–2) | 0.53 | P (t–2) | 0.34 | Q (t–2) | −0.13 |
| Q (t–3) | 0.46 | P (t–3) | 0.18 | Q (t–3) | 0.25 |
| Q (t–4) | 0.43 | P (t–4) | 0.14 | Q (t–4) | 0.01 |
| Q (t–5) | 0.40 | P (t–5) | 0.13 | Q (t–5) | 0.12 |
| Q (t–6) | 0.40 | P (t–6) | 0.13 | Q (t–6) | 0.07 |
| Q (t–7) | 0.39 | P (t–7) | 0.12 | Q (t–7) | 0.07 |
| Q (t–8) | 0.38 | P (t–8) | 0.12 | Q (t–8) | 0.05 |
| Q (t–9) | 0.37 | P (t–9) | 0.12 | Q (t–9) | 0.04 |
| Q (t–10) | 0.34 | P (t–10) | 0.12 | Q (t–10) | −0.02 |

Figure 8

Plot of ACF, CCF, and PACF.
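The threshold-based choice of lags can be checked directly from the values in Table 4. The short Python sketch below applies the ACF and CCF thresholds and then assembles the lagged input matrix; the helper names `select_lags` and `build_lagged_inputs` and the short demonstration series are hypothetical.

```python
import numpy as np

# Correlations of Q(t) with Q(t-k) and P(t-k) for k = 1..10, taken from Table 4
ACF = [0.76, 0.53, 0.46, 0.43, 0.40, 0.40, 0.39, 0.38, 0.37, 0.34]
CCF = [0.49, 0.34, 0.18, 0.14, 0.13, 0.13, 0.12, 0.12, 0.12, 0.12]


def select_lags(values, threshold):
    """Return the 1-based lags whose correlation exceeds the threshold."""
    return [k for k, v in enumerate(values, start=1) if v > threshold]


def build_lagged_inputs(P, Q, rain_lags, flow_lags):
    """Stack P(t-k) and Q(t-k) columns as inputs, with Q(t) as the target."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    start = max(rain_lags + flow_lags)   # first t with all lags available
    cols = [P[start - k: len(P) - k] for k in rain_lags]
    cols += [Q[start - k: len(Q) - k] for k in flow_lags]
    return np.column_stack(cols), Q[start:]


flow_lags = select_lags(ACF, 0.45)   # lags of runoff (n)
rain_lags = select_lags(CCF, 0.30)   # lags of rainfall (m)

# Hypothetical short series, only to show the shape of the input matrix
P_demo = np.arange(10.0)
Q_demo = np.arange(10.0) * 10.0
X_demo, y_demo = build_lagged_inputs(P_demo, Q_demo, rain_lags, flow_lags)
```

Each row of `X_demo` holds the rainfall lags followed by the runoff lags for one time step, so the number of columns equals the number of input nodes of the network.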

Based on the above analysis, a total of five significant input variables are identified, with m = 2 and n = 3. The neuron in the output layer provides the flow Q(t) at time step t, which is the target to be modeled. The data were normalized to the range 0.1 to 0.9 using the following min-max formula:
$$X_n = a + (b - a)\,\frac{X_0 - X_{\min}}{X_{\max} - X_{\min}} \qquad (3)$$
In the normalization process, Xn represents the normalized value, ‘a’ and ‘b’ are the lower and upper limits of the normalized range, and X0 is the value to be normalized. Xmax and Xmin are the maximum and minimum values in the dataset to be normalized. The values a = 0.1 and b = 0.9 are used in this study. The same formula was applied in the inverse direction to obtain the denormalized flow at the output neuron. After the input and output variables were selected, an ANN architecture with five input nodes, a variable number of hidden nodes, N, and one output node was adopted, with the optimal value of N determined by an iterative trial-and-error approach. The logistic sigmoid activation function is used as the transfer function for both the hidden and output layers. The ANN architectures were trained with the well-known Feed-Forward Back-Propagation (FFBP) algorithm, using batch learning with a momentum factor. Several trial values of the learning rate and momentum constant were tested; values of 0.005 and 0.075, respectively, were found to be the best and were applied consistently in training all the architectures. The best value of N was sought by varying N from 1 to 20. The Levenberg–Marquardt (LM) algorithm was employed to minimize the MSE at the output layer. Training was stopped either when the MSE reached 0.0001 or after 50,000 iterations. The optimal ANN architecture is chosen based on the error statistics illustrated in Figure 9.
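The min-max scaling and its inverse can be written as two small Python functions. This is a sketch assuming the usual linear mapping of [Xmin, Xmax] onto [a, b]; the function names and the bounds used in the example are hypothetical.

```python
import numpy as np


def normalize(x, x_min, x_max, a=0.1, b=0.9):
    """Map values linearly from [x_min, x_max] onto [a, b]."""
    return a + (b - a) * (x - x_min) / (x_max - x_min)


def denormalize(xn, x_min, x_max, a=0.1, b=0.9):
    """Inverse mapping from [a, b] back to the original units."""
    return x_min + (xn - a) * (x_max - x_min) / (b - a)


# Hypothetical flow values and bounds (mm/day)
flows = np.array([0.0, 5.0, 12.5, 25.0])
scaled = normalize(flows, x_min=0.0, x_max=25.0)
restored = denormalize(scaled, x_min=0.0, x_max=25.0)
```

Keeping the targets inside 0.1 to 0.9 rather than 0 to 1 keeps them away from the saturated ends of the logistic sigmoid, whose output only approaches 0 and 1 asymptotically.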
Figure 9

Plot of error statistics vs. hidden neurons during training.

Based on all the error statistics shown in Figure 9, the optimum number of hidden neurons is determined to be 14; therefore, 5–14–1 is found to be the best architecture for simulating the RR process in the study area.

First, the results obtained from the conceptual models are discussed, and the best conceptual model so obtained is then compared with the ANN model in the simulation of the RR process.

Results from conceptual models

The outcomes, presented in Table 5, cover diverse performance metrics for all the developed conceptual models. During training, the SimHyd model outperforms the other conceptual models developed in this study, with the highest R, E, TS10, and TS100 values of 0.75, 0.776, 6.2, and 79.1, respectively, and the smallest NRMSE, MSE, and AARE values of 0.057, 1.344, and 114.9, respectively. During testing, it again gives the highest R, E, and TS100 values of 0.703, 0.464, and 75.1, respectively, and the smallest NRMSE and MSE values of 0.049 and 1.402, respectively. Figures 10 and 11 present the scatter plots of all the conceptual models during calibration and validation, respectively. Figure 12 presents the time series plots of all the conceptual models during validation. The scatter plots and the time series plots also show that the SimHyd model mimics the observed runoff better than the other conceptual models developed in this study.
Table 5

Performance statistics of different conceptual models

| Model | E | R | NRMSE | MSE | AARE | TS10 | TS50 | TS75 | TS100 |
|---|---|---|---|---|---|---|---|---|---|
| During training/calibration | | | | | | | | | |
| AWBM | 0.524 | 0.711 | 0.061 | 1.539 | 126.1 | 6.2 | 32.5 | 57.4 | 75.3 |
| Sacramento | 0.511 | 0.731 | 0.059 | 1.432 | 161.7 | 5.7 | 32.0 | 48.4 | 62.4 |
| SimHyd | 0.776 | 0.750 | 0.057 | 1.344 | 114.9 | 6.2 | 32.1 | 54.2 | 79.1 |
| SMAR | 0.641 | 0.626 | 0.073 | 2.181 | 116.5 | 1.0 | 4.4 | 7.8 | 22.2 |
| TANK | 0.774 | 0.578 | 0.075 | 2.333 | 299.4 | 5.2 | 27.9 | 42.7 | 56.1 |
| During testing/validation | | | | | | | | | |
| AWBM | 0.216 | 0.626 | 0.061 | 2.164 | 112.1 | 6.4 | 37.4 | 60.4 | 74.5 |
| Sacramento | 0.463 | 0.621 | 0.049 | 1.416 | 156.4 | 5.9 | 31.3 | 47.5 | 61.1 |
| SimHyd | 0.464 | 0.703 | 0.049 | 1.402 | 130.6 | 6.0 | 32.4 | 51.1 | 75.1 |
| SMAR | 0.148 | 0.653 | 0.061 | 2.182 | 100.1 | 0.8 | 4.1 | 8.2 | 12.8 |
| TANK | 0.311 | 0.548 | 0.058 | 1.932 | 216.3 | 5.8 | 32.0 | 44.0 | 52.2 |

Figure 10

Scatter plots between observed and calculated flows during calibration of conceptual models: (a) AWBM, (b) Sacramento, (c) SimHyd, (d) SMAR, and (e) TANK.

Figure 11

Scatter plots between observed and calculated flows during validation of conceptual models: (a) AWBM, (b) Sacramento, (c) SimHyd, (d) SMAR, and (e) TANK.

Figure 12

Time series plots between observed and calculated flow during validation of conceptual models: (a) AWBM, (b) Sacramento, (c) SimHyd, (d) SMAR, and (e) TANK.

To assess the efficacy of the conceptual models, their capabilities in modeling low, medium, and high flows are examined using the time series plots presented in Figure 13. Figure 13 shows that the SimHyd model, which is the simplest model (having only seven parameters for calibration), captured the flow dynamics of the basin by mimicking high flows extremely well, medium flows well, and low flows reasonably well, showing its robustness in simulating the RR process among the conceptual models developed in this study. This may be because the SimHyd model uses the maximum value of evapotranspiration (i.e., PET) in RR simulation.
Figure 13

Time series plot of observed and calculated flow model for (a) low, (b) medium and (c) high flows during testing: (i) AWBM, (ii) Sacramento, (iii) SimHyd, (iv) SMAR, and (v) TANK.

Comparison results from best conceptual model and ANN model

The performance of the best conceptual model, i.e., SimHyd, is compared with that of the ANN model for simulating the RR process in this sub-section. The values of the performance statistics obtained from the SimHyd and ANN models are given in Table 6. Table 6 shows that the ANN model performs far better than the SimHyd conceptual model. During training, the ANN gives higher values of R (0.885), E (0.960), TS10 (38), TS50 (83.2), TS75 (90.1), and TS100 (93.7) and lower values of MSE (0.450) and AARE (34.9); its training NRMSE (0.074) is, however, higher than that of SimHyd (0.057) in Table 6. During testing, the ANN gives higher values of R (0.941), E (0.994), TS10 (11.7), TS50 (80.4), TS75 (90.4), and TS100 (91.7) and lower values of NRMSE (0.031), MSE (0.079), and AARE (38.9). To assess model performance visually, the scatter plots are presented in Figure 14(a) and (b) for the training and testing periods, respectively. The time series plots during the testing period for the SimHyd and ANN models are presented in Figure 15. Figures 14 and 15 show clearly that the ANN model captures the RR process better than SimHyd. Thus, overall, the ANN model is the best among all the models developed in this study.
Table 6

Performance statistics of SimHyd and ANN models

ModelERNRMSEMSEAARETS10TS50TS75TS100
During training/calibration 
 SimHyd 0.776 0.750 0.057 1.344 114.9 6.2 32.1 54.2 79.1 
ANN (5-14-1) 0.960 0.885 0.074 0.450 34.9 38.0 83.2 90.1 93.7 
During testing/validation 
 SimHyd 0.464 0.703 0.049 1.402 130.6 6.0 32.4 51.1 75.1 
ANN (5-14-1) 0.994 0.941 0.031 0.079 38.9 11.7 80.4 90.4 91.7 
ModelERNRMSEMSEAARETS10TS50TS75TS100
During training/calibration 
 SimHyd 0.776 0.750 0.057 1.344 114.9 6.2 32.1 54.2 79.1 
ANN (5-14-1) 0.960 0.885 0.074 0.450 34.9 38.0 83.2 90.1 93.7 
During testing/validation 
 SimHyd 0.464 0.703 0.049 1.402 130.6 6.0 32.4 51.1 75.1 
ANN (5-14-1) 0.994 0.941 0.031 0.079 38.9 11.7 80.4 90.4 91.7 
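
As an illustration of how statistics of the kind listed in Table 6 can be computed from paired observed and calculated flows, a minimal Python sketch is given here. It is not the code used in this study. The function name `rr_statistics` and the formulas are assumptions based on commonly used definitions: E is taken as the Nash–Sutcliffe efficiency, R as the Pearson correlation coefficient, NRMSE as the RMSE divided by the mean observed flow, AARE as the mean absolute relative error in percent, and TSx as the percentage of points whose absolute relative error is below x%.

```python
import numpy as np

def rr_statistics(obs, sim, thresholds=(10, 50, 75, 100)):
    """Assumed (commonly used) definitions of E, R, NRMSE, MSE, AARE and TSx."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    err = obs - sim

    mse = np.mean(err ** 2)
    # Nash-Sutcliffe efficiency: 1 - SSE / variance of observations about their mean
    e = 1.0 - np.sum(err ** 2) / np.sum((obs - obs.mean()) ** 2)
    # Pearson correlation between observed and calculated flows
    r = np.corrcoef(obs, sim)[0, 1]
    # RMSE normalised by the mean observed flow (assumed normalisation)
    nrmse = np.sqrt(mse) / obs.mean()
    # Absolute relative error in percent; observed flows must be non-zero
    are = 100.0 * np.abs(err) / obs
    aare = are.mean()

    stats = {"E": e, "R": r, "NRMSE": nrmse, "MSE": mse, "AARE": aare}
    for x in thresholds:
        # Threshold statistic: share of points with relative error below x %
        stats[f"TS{x}"] = 100.0 * np.mean(are < x)
    return stats

if __name__ == "__main__":
    # Illustrative numbers only, not data from the Bird Creek basin
    observed = [1.2, 3.4, 10.5, 7.1, 2.2]
    calculated = [1.0, 3.9, 9.8, 7.5, 2.9]
    for name, value in rr_statistics(observed, calculated).items():
        print(f"{name}: {value:.3f}")
```

Because AARE and TSx are built on relative errors, small observed flows weigh heavily in them, which is one reason a model can score well on MSE and still show a large AARE.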
Figure 14

Scatter plots between observed and calculated flows (a) during calibration/training and (b) during validation/testing.

Figure 15

Time series plots of observed and calculated flows during validation/testing: (a) SimHyd and (b) ANN.

To assess how the SimHyd and ANN models simulate flows of different magnitudes, time series plots are drawn separately for the identified high, medium, and low flows during the validation/testing period and presented in Figure 16. Figure 16 shows that the ANN model captures the low, medium and high flows accurately, whereas the SimHyd model mostly overestimates the low flows in the basin. A better representation of low flows in the input set might improve the accuracy in modeling them.
Figure 16

Time series plots of observed and calculated flows from the best model in each category for (a) low, (b) medium, and (c) high flows during testing.
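
The visual comparison by flow magnitude can also be expressed numerically. The Python sketch below splits a test series into low, medium and high flows and reports, for each class, the mean error and the share of time steps that are overestimated. The thresholds used in this study to identify the three classes are not stated here, so the tercile split (`low_q`, `high_q`) and the function name `bias_by_flow_regime` are hypothetical choices made only for illustration.

```python
import numpy as np

def bias_by_flow_regime(obs, sim, low_q=1/3, high_q=2/3):
    """Per-regime bias; the quantile thresholds are assumed, not from the study."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)

    # Class boundaries taken from quantiles of the observed flows
    lo, hi = np.quantile(obs, [low_q, high_q])
    masks = {
        "low": obs <= lo,
        "medium": (obs > lo) & (obs <= hi),
        "high": obs > hi,
    }

    summary = {}
    for name, mask in masks.items():
        diff = sim[mask] - obs[mask]  # positive values mean overestimation
        summary[name] = {
            "n": int(mask.sum()),
            "mean_error": float(diff.mean()),
            "pct_overestimated": float(100.0 * np.mean(diff > 0)),
        }
    return summary

if __name__ == "__main__":
    # Illustrative numbers only, not data from the Bird Creek basin
    observed = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    calculated = [2.0, 3.0, 3.0, 4.0, 5.0, 5.0]
    for regime, values in bias_by_flow_regime(observed, calculated).items():
        print(regime, values)
```

A positive mean error and a high overestimated share in the low class would correspond to the behaviour described above for a model that mostly overestimates low flows.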


Modeling the RR process is of significant importance in water resources project management. Runoff is estimated using various methods that fall broadly into two categories: conceptual techniques and data-driven techniques. Each method has its own limitations and accuracy in runoff estimation, so the methods must be compared before they are put to operational use. Although a few studies compare a small number of conceptual models with data-driven techniques, a comparison of many conceptual models with an ANN is still missing. This study therefore developed five conceptual models, namely AWBM, Sacramento, SimHyd, SMAR, and TANK, along with an ANN model, for the Bird Creek basin in Oklahoma, USA, and compared their performance in runoff estimation. The conceptual models were developed using the RRL toolkit, while the ANN model is a feed-forward neural network with one input layer, one hidden layer, and one output layer, trained using the LM algorithm in the back-propagation step. The inputs were selected on the basis of the cross-correlation, auto-correlation, and partial auto-correlation functions. Model performance was evaluated using five standard statistical measures, namely AARE, R, E, NRMSE, and TS, and the MSE was used as the objective function. The following conclusions are drawn from this study:

  • Conceptual models perform reasonably in simulating the RR process.

  • Of the five conceptual models developed in this study (AWBM, Sacramento, SimHyd, SMAR, and TANK), SimHyd performed the best. The SMAR and TANK models were not able to simulate the runoff of the basin.

  • The SimHyd and ANN models both demonstrate the capability to capture low, medium, and high flows. However, the ANN model outperformed all the conceptual models, including the SimHyd model, in simulating the RR process.

The conclusions drawn from this study are based on models developed using data from a single basin. These findings could be established more firmly by carrying out analogous research on diverse catchments with varying hydro-climatic conditions. This study compared the performance of five conceptual models with that of one data-driven technique, the ANN; several other conceptual and data-driven models exist, and assessing their performance may be instrumental in improving the planning, design, operation, and management of various water resource activities.

The data used in this study were obtained freely from the USGS (https://waterdata.usgs.gov/nwis), which is duly acknowledged.

There was no funding received to carry out this research.

All relevant data are available from an online repository at https://waterdata.usgs.gov/nwis.

The authors declare there is no conflict.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).