ABSTRACT
Effective water resource management in gauged catchments relies on accurate runoff prediction. For ungauged catchments, empirical models are used due to limited data availability. This study applied artificial neural networks (ANNs) and empirical models to predict runoff in the Bhima River basin. Among the tested models, the ANN-5 model, which used rainfall, one-day lagged rainfall, and evaporation as inputs, demonstrated superior performance with minimal error and high efficiency. Statistical results for the ANN-5 model were excellent during both training (R = 0.95, NSE = 0.89, RMSE = 17.39, MAE = 0.12, d = 0.97, MBE = 0.12) and testing (R = 0.94, NSE = 0.88, RMSE = 11.47, MAE = 0.03, d = 0.97, MBE = 0.03). Among empirical models, the Coutagine model was the most accurate, with R = 0.82, MBE = 74.36, NSE = 0.94, d = 0.82, KGE = 0.76, MAE = 70.01, MAPE = 20.6%, NRMSE = 0.22, RMSE = 87.4, and DRV = −9.2. In contrast, Khosla's formula (KF) significantly overestimated runoff. The close correlation between observed and ANN-predicted runoff data underscores the model's utility for decision-makers in inflow forecasting, water resource planning, management, and flood forecasting.
HIGHLIGHTS
A novel model (ANN) is proposed for monthly runoff prediction.
Four empirical models for rainfall-runoff modeling were compared with the ANN model for monthly runoff prediction.
Rainfall (P), evaporation (E), and temperature (T) were selected as inputs for forecasting monthly runoff.
The proposed model and Coutagine relationship (COR) achieve satisfactory forecasting results.
NOTATION/ACRONYMS
- °C
degrees Celsius
- ANFIS
adaptive-network-based fuzzy inference system
- ANN
artificial neural network
- cm
centimetre
- CN
curve number
- COR
Coutagine relationship
- d
index of agreement
- DII
Department of Irrigation, India
- DRV
deviation of runoff volume
- E
evaporation
- Et−1
one-day lagged evaporation
- GoM
Government of Maharashtra
- IDE
Inglis and DeSouza's approach
- KF
Khosla's formula
- KGE
Kling–Gupta efficiency
- Lm
monthly losses
- MAE
mean absolute error
- MAPE
mean absolute percentage error
- MBE
mean biased error
- MCM
million cubic meters
- mm
millimetre
- NRMSE
normalized root mean square error
- NSE
Nash–Sutcliffe model efficiency coefficient
- P
precipitation/rainfall
- Pm
monthly precipitation/rainfall
- Pt−1
one-day lagged rainfall
- Q
runoff
- R
Pearson's correlation coefficient
- R2
coefficient of determination
- Rm
monthly runoff of watershed
- RMSE
root mean squared error
- SCS
Soil Conservation Service
- T
temperature
- Tm
mean monthly temperature
- Tt−1
one-day lagged temperature
- t–1
one day lag
- USA
United States of America
- WRD
Water Resources Department
INTRODUCTION
Climate change, driven by an increase in the greenhouse effect, is set to redistribute temperatures across time and space, impacting key hydrological processes such as precipitation. This will lead to significant shifts in the spatial and temporal availability of water resources within river basins (Berlemann & Steinhardt 2017). Changes in runoff volume and distribution are expected due to climate change, making its assessment vital for water resource management. This includes planning for hydropower, irrigation, and infrastructure such as dams and irrigation systems to adapt to these changes effectively.
However, accurate runoff estimation is challenging in hydrological planning and engineering, yet it is essential for managing and efficiently utilizing water resources (Kumar et al. 2021). Determining runoff in catchments is critical for mitigating droughts and floods, maintaining the health of aquatic ecosystems, and monitoring the water quality of surface water reservoirs (Lane et al. 2022). Ineffective management of runoff makes water resources unavailable, in space and time, for various applications. Since it is impossible to gauge every watershed in a developing nation like India, an indirect method of quantifying runoff generation is required. The best way to deal with such ungauged watersheds is through modeling. The relationship between precipitation and runoff within a catchment is highly unpredictable and is influenced by several characteristics of the precipitation, the basin, and the drainage system (Hamdan et al. 2021).
The most complex aspect of hydrological research is forecasting runoff for specific rainfall in specific regions (Li et al. 2015). There are a variety of hydrological models readily accessible for estimating runoff, most of which are sophisticated and require substantial inputs. For rainfall-runoff modeling, scientists across the regions globally have created multiple models (Leitzke & Adamatti 2021). These models are divided into two types: conceptual models and physically based models. Simple mathematical representations of the hydrological processes that occur in a watershed are conceptual models. The Soil Conservation Service (SCS) curve number (CN) model, the rational method, and other conceptual models are examples. Physically based models are more sophisticated and rely on a thorough understanding of the physical processes that take place in a watershed. Physical and conceptual models need a thorough understanding and knowledge of the water cycle.
Artificial intelligence may forecast runoff more precisely than conventional techniques (Van et al. 2020). Data-driven approaches are more accurate, precise, and more versatile (Achite et al. 2023; Elbeltagi et al. 2023a, 2023b; Markuna et al. 2023). Artificial neural network (ANN) has been frequently used in water resource assessments in recent years because of its strength in dealing with non-linear and non-stationary data issues (Araghinejad 2013). It is a promising tool for accurate modeling of complicated processes and for generating insight from the learned relationships, both of which would help the developer understand the process being studied and assess the model (Kumar et al. 2022). Several ANN designs have been successfully used to simulate and forecast hydrological and weather variables such as rainfall, runoff, and sediment loads (Saroughi et al. 2023). In several experiments, ANN outperformed traditional statistical modeling approaches (Elbeltagi et al. 2023c).
Recent studies show that many scientists have employed ANNs to describe complicated and non-linear interactions (Di Franco & Santurro 2021). Loyeh & Jamnani (2017) assessed the effectiveness of different rainfall-runoff models for the Liqvan watershed in Iran. The findings demonstrated that the ANN technique offered a viable substitute for conceptual models for simulation and forecasting in watershed modeling. Numerous investigations have used ANNs, which are recognized as a black box technique, for rainfall-runoff modeling (Roy & Singh 2020; Turhan 2021). The adaptive-network-based fuzzy inference system (ANFIS) model combines the inference process of fuzzy mathematics with the connectionist capability of ANN and has been applied to rainfall-runoff modeling for the past two decades (Chang et al. 2018). This model has been used effectively in numerous rainfall-runoff modeling studies (Chang & Chen 2018). El-Shafie et al. (2011) applied the ANN approach to rainfall-runoff modeling of the Ourika basin at Aghbalou station in Morocco and found promising and satisfactory results, with coefficients of determination (R2) of 0.948 and 0.917 for the calibration and validation datasets, respectively. Tokar & Johnson (1999) developed rainfall-runoff models using ANN, considering temperature, snowmelt equivalent, evaporation, or stream flow at previous periods as input variables. The reported ANN model provides a more systematic approach to these input variables. Gholami et al. (2010) applied ANN to simulate the rainfall-runoff process using data from field sampling plots in conjunction with rainfall and hydrometric data (initial loss, soil antecedent moisture condition (AMC), and the time to peak of the basin) and reported favorable results (training R2 = 0.96, cross-validation R2 = 0.95, and test R2 = 0.81).
Pramanik & Panda (2009) investigated two machine learning (ML) algorithms (ANFIS and ANN) by utilizing daily upstream flow data to forecast daily downstream flows. The study demonstrated that the coupled neural gradient network outperforms the Levenberg–Marquardt and gradient descent algorithms, and ANFIS showed that its runoff estimation in outlier data conditions is more precise. The study concluded that the ANFIS algorithm could more accurately predict barrage outflow than the ANN model.
The model analyzed the numerous configurations of lag times in streamflow time-series data and chose the most appropriate input variables for the modeling procedure using ML techniques. Madhusoodhanan et al. (2012) concluded that the commonly used empirical methodologies are inefficient for precisely estimating basin yield in the Western Ghats rivers of Kerala. This indicates the need for soft computing techniques to better predict basin yield and achieve sustainable development of the watersheds. Rawat et al. (2021) assessed annual runoff in an ungauged agricultural watershed using the SCS-CN and empirical mathematical methods. Results showed that the Inglis and DeSouza (IDE) model can simulate annual runoff as closely as the SCS-CN model; it had the lowest RMSE value of 7.75 and ranked first, ahead of the other eight models.
Empirical models complement conceptual models. Because of their simplicity, they are convenient for rainfall-runoff modeling (Jaiswal et al. 2020). These empirical models establish a fixed link between input and output functions without considering watershed characteristics. There is a strong association between rainfall, runoff, and temperature for assessing runoff flow in several watersheds in India (Reddy et al. 2020). Numerous empirical models, such as Coutagine, Turc, and Khosla, can be employed to estimate runoff (Chakravarti et al. 2015). Runoff estimation is highly desirable for the sustainable development and management of water resources. It requires gauging stations in the catchment, which are not feasible in all watersheds. Under such data-scarce conditions, runoff prediction could be feasible by employing empirical approaches and soft computing techniques such as ANN.
From the country's perspective, reliable runoff predictions can guide governments in developing effective water management policies, allocating resources for flood control and drought mitigation infrastructure, and enhancing regional water security. Accurate assessments of future water availability can inform national and regional adaptation strategies, helping communities prepare for potential water scarcity or flooding. Such predictions can also guide the preparation of natural disaster management plans to mitigate and minimize impacts. Accurate assessments of water and runoff availability under future climate scenarios can guide policymakers in setting realistic and achievable targets for water conservation, emission reduction, and climate adaptation. Moreover, from a research perspective, this kind of study may help in better understanding complex hydrological phenomena, runoff dynamics, and environmental processes. Accurate runoff predictions through ANN-based models can contribute to sustainable water management, reducing water-related issues, and improving sanitation access globally. Therefore, the present study was carried out to explore the capability of soft computing and empirical models in predicting monthly runoff. The primary goal of this investigation was to evaluate the effectiveness of the ANN model in predicting runoff from the Bhima watershed in Western Maharashtra and to compare its predictions with those of empirical models that have minimal input data requirements.
MATERIALS AND METHODS
Study area description
Dataset
Monthly descriptive statistics of runoff and weather parameters
Measures | Q (MCM) | P (mm) | T (°C) | E (mm) |
---|---|---|---|---|
Mean | 32.832 | 259.821 | 31.973 | 149.202 |
Std. deviation | 63.346 | 443.125 | 3.290 | 68.701 |
Coefficient of variation | 1.929 | 1.705 | 0.103 | 0.460 |
Variance | 4,012.688 | 196,359.347 | 10.827 | 4,719.884 |
Skewness | 2.349 | 1.864 | 0.581 | 0.925 |
Kurtosis | 5.137 | 2.757 | −0.759 | −0.439 |
Shapiro–Wilk | 0.592 | 0.656 | 0.926 | 0.862 |
P-value of Shapiro–Wilk | <0.001 | <0.001 | <0.001 | <0.001 |
Range | 287.890 | 2,069.000 | 12.560 | 264.770 |
Minimum | 0.000 | 0.000 | 26.480 | 52.430 |
Maximum | 287.890 | 2,069.000 | 39.040 | 317.200 |
Rainfall, runoff, and mean temperature data for Bhima River catchment.
Probability distribution function of 15-year daily rainfall from 2000 to 2014 for Bhima River catchment.
ANN approach
An ANN is a fundamental element of artificial intelligence methodologies, designed to emulate how the human brain analyzes and processes information through an intricate network of interconnected neurons. An ANN can comprise hundreds of thousands of artificial neurons and uses nodes as processing units, organized into input, hidden, and output units. The input units receive information, which is weighted according to an internal weight system. The hidden layer then processes the weighted information to generate an output. ANNs are useful for modeling and forecasting complicated and unpredictable processes. With an ANN, it is possible to build a predictive system from historical datasets and previous findings to forecast future phenomena, even without a comprehensive grasp of the physical parameters that affect both current and future behavior. The workflow typically involves the following steps:
Data partitioning
To evaluate the effectiveness of the ANN models, the weather and runoff data were split into two distinct sets: the calibration (training) set, comprising 70% of the total data, and the validation (testing) set, comprising the remaining 30%. This procedure has been used in previous research (Chakravarti et al. 2015). The ANN was first trained using the calibration dataset. The validation data were then used to evaluate the newly developed neural network model. Cross-validation is a strategy frequently used in ANN modeling, and how the available data are partitioned strongly influences the results (Vabalas et al. 2019). It can be used to determine when to stop training and to evaluate the generalization capacity of various models. For example, to make an accurate prediction of the runoff in the Bhima watershed, the output from the training data was cross-verified with the validation datasets.
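The 70/30 partition described above can be sketched as follows. This is a minimal illustration and not the authors' code: the function name, the choice to keep the record in chronological order, and the synthetic 180-sample record (15 years of monthly values) are all assumptions made for the example.

```python
import numpy as np

def split_calibration_validation(features, target, calibration_fraction=0.70):
    """Split time-ordered samples into calibration and validation sets.

    The first `calibration_fraction` of the record is used for calibration
    (training) and the remainder for validation (testing). Preserving the
    chronological order is an assumption of this sketch.
    """
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    if len(features) != len(target):
        raise ValueError("features and target must have the same length")
    n_cal = int(round(calibration_fraction * len(target)))
    calibration = (features[:n_cal], target[:n_cal])
    validation = (features[n_cal:], target[n_cal:])
    return calibration, validation

# Hypothetical record: 180 samples with three inputs (for example P, Pt-1, E).
rng = np.random.default_rng(0)
X = rng.random((180, 3))
y = rng.random(180)
(X_cal, y_cal), (X_val, y_val) = split_calibration_validation(X, y)
print(len(y_cal), len(y_val))
```

A random shuffle before splitting would be an alternative design; the chronological split is used here only because the inputs include lagged values.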
ANN architecture development
Input layer neurons
The number of neurons in the input layer is determined by the number of parameters utilized for runoff estimation. As input layer neurons in the current work, various combinations of input climate variables like rainfall, evaporation, and maximum temperature were investigated. The logistic sigmoid transfer function is used in both the hidden and output layers. The performance of created models was tested using various combinations of input layers.
Hidden layer neurons
The number of hidden layer neurons depends on several parameters, including the numbers of input and output neurons and the training procedure. In the current study, the hidden layer contained twice as many neurons as the input layer.
Output layer neurons
The target variables determine the number of neurons in the output layer. A single output neuron, representing runoff, was used in the current modeling investigation. During calibration and validation of the ANN model, the target values for this output neuron were supplied to the network alongside the inputs.
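The layer layout described above (one input neuron per predictor, a hidden layer with twice as many neurons, a single output neuron, and logistic sigmoid transfer functions in the hidden and output layers) can be sketched as a forward pass. This is only an illustration: the weight initialization, the function names, and the random inputs are assumptions, and the training algorithm is omitted because it is not specified in this section. Since the sigmoid output lies between 0 and 1, the sketch also assumes the runoff target would be scaled to that range before training.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid transfer function."""
    return 1.0 / (1.0 + np.exp(-z))

def init_network(n_inputs, seed=0):
    """Create random (untrained) weights for an n -> 2n -> 1 network."""
    rng = np.random.default_rng(seed)
    n_hidden = 2 * n_inputs  # hidden layer has twice the input neurons
    return {
        "W1": rng.normal(scale=0.5, size=(n_inputs, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(scale=0.5, size=(n_hidden, 1)),
        "b2": np.zeros(1),
    }

def forward(params, x):
    """Propagate inputs through the hidden layer to the single output neuron."""
    hidden = sigmoid(x @ params["W1"] + params["b1"])
    output = sigmoid(hidden @ params["W2"] + params["b2"])
    return output.ravel()

# Hypothetical example: three scaled inputs (for example P, Pt-1, E), five samples.
params = init_network(n_inputs=3)
x = np.random.default_rng(1).random((5, 3))
scaled_runoff = forward(params, x)
print(scaled_runoff.shape)
```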
Empirical models for rainfall-runoff modeling
Watershed development and water resource planning depend critically on estimating runoff from watersheds, which is a difficult task in an ungauged watershed. Research in several watersheds has produced a number of empirical equations. Because of their ease of use, accuracy, and limited data requirements, many studies have used these empirical equations for hydrological investigations and for estimating yearly or monthly runoff in watersheds. The following empirical equations have been used to estimate runoff from watersheds in various regions of India and worldwide.
Inglis and DeSouza's approach (IDE)
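The equation for this approach is not reproduced in the text. As a hedged illustration, the commonly cited textbook forms of the Inglis and DeSouza relationships are R = 0.85P − 30.5 for Ghat regions and R = P(P − 17.8)/254 for the Deccan plateau, with annual runoff R and annual rainfall P in cm. Which form the study used, and how it was applied at the monthly scale, is not stated here, so the function name and the `region` switch below are assumptions.

```python
def inglis_desouza_runoff_cm(rainfall_cm, region="ghat"):
    """Annual runoff (cm) from annual rainfall (cm), textbook Inglis-DeSouza forms.

    region="ghat":    R = 0.85 * P - 30.5
    region="plateau": R = P * (P - 17.8) / 254
    Negative results are clipped to zero (an assumption of this sketch).
    """
    if rainfall_cm < 0:
        raise ValueError("rainfall must be non-negative")
    if region == "ghat":
        runoff = 0.85 * rainfall_cm - 30.5
    elif region == "plateau":
        runoff = rainfall_cm * (rainfall_cm - 17.8) / 254.0
    else:
        raise ValueError("region must be 'ghat' or 'plateau'")
    return max(runoff, 0.0)

# Illustrative rainfall depth only; not a value taken from the study.
ghat_runoff = inglis_desouza_runoff_cm(100.0, region="ghat")
plateau_runoff = inglis_desouza_runoff_cm(100.0, region="plateau")
print(ghat_runoff, plateau_runoff)
```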
Khosla's formula (KF)
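The equation for Khosla's formula is not reproduced in the text. In its commonly cited textbook form, monthly runoff is Rm = Pm − Lm, with monthly losses Lm = 0.48 Tm when the mean monthly temperature Tm exceeds 4.5 °C (Rm, Pm, and Lm in cm; Tm in °C). The sketch below assumes that form. The function name is hypothetical, and colder months are rejected rather than handled because the monthly temperatures reported for the catchment range from 26.48 to 39.04 °C.

```python
def khosla_monthly_runoff_cm(monthly_rainfall_cm, mean_monthly_temp_c):
    """Monthly runoff Rm (cm) as Pm - Lm, with Lm = 0.48 * Tm for Tm > 4.5 deg C."""
    if mean_monthly_temp_c <= 4.5:
        # The textbook formula uses tabulated losses here; not covered in this sketch.
        raise ValueError("this sketch only covers Tm > 4.5 deg C")
    monthly_losses_cm = 0.48 * mean_monthly_temp_c
    # Runoff cannot be negative: months where losses exceed rainfall give zero.
    return max(monthly_rainfall_cm - monthly_losses_cm, 0.0)

# Illustrative values only; not taken from the study's dataset.
wet_month = khosla_monthly_runoff_cm(30.0, 30.0)  # losses = 14.4 cm
dry_month = khosla_monthly_runoff_cm(5.0, 30.0)   # losses exceed rainfall
print(wet_month, dry_month)
```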
Coutagine relationship (COR)

Department of Irrigation, India (DII)
Performance evaluation
To assess the effectiveness of the models developed in this study, a comprehensive range of standard statistical evaluation measures was utilized. Specifically, ten distinct statistical performance indices were employed: Pearson's correlation coefficient (R), mean absolute error (MAE), root mean squared error (RMSE), mean bias error (MBE), mean absolute percentage error (MAPE), normalized root mean square error (NRMSE), deviation of runoff volume (DRV), index of agreement (d), Nash–Sutcliffe model efficiency coefficient (NSE), and Kling–Gupta efficiency (KGE). The ten indices are listed below:
No. | Statistical index |
---|---|
1 | Pearson's correlation coefficient (R) |
2 | Index of agreement (d) |
3 | Mean absolute error (MAE) |
4 | Mean biased error (MBE) |
5 | Root mean squared error (RMSE) |
6 | Nash–Sutcliffe model efficiency coefficient (NSE) |
7 | Normalized root mean square error (NRMSE) |
8 | Mean absolute percentage error (MAPE) |
9 | Kling–Gupta efficiency (KGE) |
10 | Deviation of runoff volume (DRV) |
These indices compare the observed and predicted runoff at the ith data point over the N runoff data points, together with the mean observed runoff and the mean predicted runoff. R is Pearson's correlation coefficient, rm is the average of the observed values, cm is the average of the predicted values, rd is the standard deviation of the observed values, and cd is the standard deviation of the predicted values.
In addition to these indicators, radar diagrams, box and whisker plots, and Taylor charts were used to represent the findings graphically. Finally, a comprehensive statistical analysis was conducted to compare the empirical model results with the observed data. The ideal model has the least error, with MAE, MBE, RMSE, NRMSE, MAPE, and DRV values close to zero and R, d, NSE, and KGE values close to 1, where a value of 1 indicates a perfect fit.
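The ten indices can be computed as in the sketch below, which uses the standard textbook definitions rather than expressions confirmed by the source. Several conventions are assumptions: MBE is taken as predicted minus observed, NRMSE is normalized by the mean observed runoff, DRV is written so that a negative value means overestimation (matching how DRV is interpreted in the results), and months with zero observed runoff are skipped in MAPE. The variable names rm, cm, rd, and cd follow the definitions given above.

```python
import numpy as np

def evaluate(observed, predicted):
    """Return the ten performance indices for paired runoff series."""
    o = np.asarray(observed, dtype=float)
    p = np.asarray(predicted, dtype=float)
    err = p - o                      # positive error = overestimation
    rm, cm = o.mean(), p.mean()      # means of observed and predicted values
    rd, cd = o.std(), p.std()        # standard deviations of observed and predicted
    r = np.corrcoef(o, p)[0, 1]
    rmse = np.sqrt(np.mean(err ** 2))
    nonzero = o != 0                 # MAPE is undefined where observed runoff is zero
    return {
        "R": r,
        "d": 1 - np.sum(err ** 2) / np.sum((np.abs(p - rm) + np.abs(o - rm)) ** 2),
        "MAE": np.mean(np.abs(err)),
        "MBE": np.mean(err),
        "RMSE": rmse,
        "NSE": 1 - np.sum(err ** 2) / np.sum((o - rm) ** 2),
        "NRMSE": rmse / rm,
        "MAPE": 100 * np.mean(np.abs(err[nonzero] / o[nonzero])),
        "KGE": 1 - np.sqrt((r - 1) ** 2 + (cd / rd - 1) ** 2 + (cm / rm - 1) ** 2),
        "DRV": 100 * (o.sum() - p.sum()) / o.sum(),
    }

# Illustrative series only: predictions 10% above the observations.
scores = evaluate([10.0, 20.0, 30.0, 40.0], [11.0, 22.0, 33.0, 44.0])
perfect = evaluate([10.0, 20.0, 30.0, 40.0], [10.0, 20.0, 30.0, 40.0])
```

With identical observed and predicted series, the error indices are zero and R, d, NSE, and KGE equal 1, which corresponds to the ideal model described above.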
RESULTS AND DISCUSSION
This study utilized volumetric runoff data from the Bhima River basin and weather dynamics from 2000 to 2014 in Maharashtra, India, to evaluate the predictive capabilities of the ANN soft computing technique and empirical models. The dataset was divided into calibration data for model training and validation data for testing. Performance evaluation involved both qualitative assessment through visual representation and quantitative analysis using various statistical model performance indices.
Input feature selection
Selecting meteorological factors that influence runoff generation is crucial as the initial step in model development. However, runoff generation within a catchment is a multifaceted process influenced not only by meteorological variables but also by the catchment's geomorphological behaviors. In catchments where geomorphological characteristics remain unchanged, runoff is primarily regulated by weather variables such as rainfall, temperature, and evaporation. Therefore, understanding the interplay between meteorological variables is essential for developing predictive models. The best input feature selection approach was used to select the input variables. Based on the values of the statistical indices, the combination of rainfall (P), one-day lagged rainfall (Pt−1), evaporation (E), one-day lagged evaporation (Et−1), and one-day lagged temperature (Tt−1) was found to be the best set of features explaining the variability of the monthly runoff data (Table 2). This combination gave the lowest MSE value of 550.84. The other statistical indices, namely coefficient of determination (R2), adjusted R2, Mallows' Cp, Akaike's AIC, Schwarz's SBC, and Amemiya's PC, were 0.87, 0.86, 5.11, 1,141.96, 1,161.11, and 0.14, respectively, for this combination, which is highlighted in Table 2. The values of the input selection indices for all other variable combinations are also given in Table 2.
The best subset of input features for runoff prediction
No. of variables | Variables | MSE | R² | Adjusted R² | Mallows’ Cp | Akaike's AIC | Schwarz's SBC | Amemiya's PC |
---|---|---|---|---|---|---|---|---|
1 | Pt−1 | 720.79 | 0.82 | 0.82 | 55.73 | 1,186.45 | 1,192.84 | 0.18 |
2 | P, Pt−1 | 598.24 | 0.85 | 0.85 | 17.25 | 1,153.89 | 1,163.47 | 0.15 |
3 | P, Pt−1, E | 580.80 | 0.86 | 0.85 | 12.63 | 1,149.55 | 1,162.32 | 0.15 |
4 | P, Pt−1, E, Et−1 | 552.18 | 0.86 | 0.86 | 4.53 | 1,141.43 | 1,157.39 | 0.14 |
5 | P, Pt−1, E, Et−1, Tt−1 | 550.84 | 0.87 | 0.86 | 5.11 | 1,141.96 | 1,161.11 | 0.14 |
6 | P, Pt−1, E, Et−1, T, Tt−1 | 553.65 | 0.87 | 0.86 | 7.00 | 1,143.84 | 1,166.19 | 0.14 |
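A best-subset search of the kind summarized in Table 2 can be sketched as follows: every combination of candidate inputs is fitted by ordinary least squares and scored by MSE, R², and adjusted R². This is an illustration under stated assumptions, not the procedure used in the study. The linear model, the MSE definition (residual sum of squares divided by n − k − 1), the function name, and the synthetic data are all hypothetical, and the information criteria in Table 2 are omitted because their exact definitions are not given in the text.

```python
from itertools import combinations
import numpy as np

def score_subsets(candidates, target):
    """Fit least squares on every subset of inputs; return rows sorted by MSE."""
    y = np.asarray(target, dtype=float)
    n = len(y)
    sst = np.sum((y - y.mean()) ** 2)
    names = list(candidates)
    rows = []
    for k in range(1, len(names) + 1):
        for subset in combinations(names, k):
            # Design matrix: intercept column plus the selected inputs.
            X = np.column_stack([np.ones(n)] + [candidates[v] for v in subset])
            coef, *_ = np.linalg.lstsq(X, y, rcond=None)
            sse = np.sum((y - X @ coef) ** 2)
            r2 = 1 - sse / sst
            adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
            mse = sse / (n - k - 1)
            rows.append((subset, mse, r2, adj_r2))
    return sorted(rows, key=lambda row: row[1])

# Synthetic series standing in for rainfall (P) and evaporation (E).
rng = np.random.default_rng(1)
P = rng.gamma(2.0, 100.0, size=180)
E = rng.uniform(50.0, 300.0, size=180)
# Lagged inputs: pair each value with the previous one, dropping the first sample.
candidates = {"P": P[1:], "Pt-1": P[:-1], "E": E[1:], "Et-1": E[:-1]}
Q = 0.1 * P[1:] + 0.05 * P[:-1] + rng.normal(0.0, 1.0, size=179)
ranking = score_subsets(candidates, Q)
best_subset, best_mse, best_r2, best_adj_r2 = ranking[0]
print(best_subset, round(best_r2, 3))
```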
Performance evaluation of the ANN model
The statistical model evaluation metrics R, RMSE, NSE, MAE, MBE, and d were used to assess the performance of the ANN models for the Bhima River watershed. Table 3 shows that ANN model 5 (ANN-5), with rainfall, one-day lagged rainfall, and evaporation as inputs, performed well in both calibration and validation. During training, the ANN-5 model obtained R, NSE, RMSE, MAE, d, and MBE values of 0.95, 0.89, 17.39, 0.12, 0.97, and 0.12, respectively. During testing, the corresponding values were 0.94, 0.88, 11.47, 0.03, 0.97, and 0.03. In contrast, model 2 (rainfall and evaporation as inputs) had the weakest testing statistics (R = 0.90, RMSE = 14.74, NSE = 0.80, and d = 0.94). This outcome suggests that rainfall and evaporation alone are insufficient to predict runoff from the watershed. The model with the input combination of rainfall, evaporation, and temperature (model 8) also performed well during the training period (R = 0.95, RMSE = 16.58, NSE = 0.90).
Performance of ANN models in predicting monthly runoff
Period | Model | Input | R | NSE | RMSE | MAE | d | MBE |
---|---|---|---|---|---|---|---|---|
Training | M1 | P | 0.89 | 0.77 | 25.74 | 2.43 | 0.92 | −2.43 |
Training | M2 | P, E | 0.94 | 0.85 | 21.01 | 2.98 | 0.95 | −2.98 |
Training | M3 | P, Pt−1 | 0.95 | 0.89 | 17.39 | 0.24 | 0.97 | 0.24 |
Training | M4 | P, Pt−1, E, Et−1 | 0.95 | 0.90 | 17.31 | 0.21 | 0.97 | 0.21 |
Training | M5 | P, Pt−1, E | 0.95 | 0.89 | 17.39 | 0.12 | 0.97 | 0.12 |
Training | M6 | P, Pt−1, E, T | 0.93 | 0.87 | 19.37 | 1.41 | 0.96 | 1.41 |
Training | M7 | P, Pt−1, E, Et−1, T, Tt−1 | 0.94 | 0.88 | 18.72 | 5.48 | 0.97 | 5.48 |
Training | M8 | P, E, T | 0.95 | 0.90 | 16.58 | 0.00 | 0.97 | 0.00 |
Testing | M1 | P | 0.91 | 0.83 | 13.52 | 0.14 | 0.95 | 0.14 |
Testing | M2 | P, E | 0.90 | 0.80 | 14.74 | 0.19 | 0.94 | 0.19 |
Testing | M3 | P, Pt−1 | 0.94 | 0.88 | 11.64 | 0.42 | 0.96 | −0.42 |
Testing | M4 | P, Pt−1, E, Et−1 | 0.91 | 0.82 | 14.05 | 0.06 | 0.95 | −0.06 |
Testing | M5 | P, Pt−1, E | 0.94 | 0.88 | 11.47 | 0.03 | 0.97 | 0.03 |
Testing | M6 | P, Pt−1, E, T | 0.91 | 0.82 | 14.07 | 0.71 | 0.95 | 0.71 |
Testing | M7 | P, Pt−1, E, Et−1, T, Tt−1 | 0.94 | 0.86 | 12.22 | 1.75 | 0.96 | 1.75 |
Testing | M8 | P, E, T | 0.93 | 0.87 | 11.89 | 0.37 | 0.96 | −0.37 |
Line diagram and scattered plot of observed and ANN-predicted runoff during calibration.
Line diagram and scattered plot of observed and ANN-predicted runoff during validation.
Performance assessment of empirical models
The most accurate method for assessing runoff involves utilizing hydro-meteorological data within a catchment area. However, in ungauged catchments with limited hydro-meteorological data, determining runoff can be challenging. In such cases, empirical models offer a feasible solution for runoff assessment. In this study, the calculated runoff from empirical models was compared with observed runoff data from the catchment. Statistical analysis results of the various methods employed in this investigation are presented in Table 4.
Performance assessment of empirical models
Statistical indices | IDE | COR | DII | KF |
---|---|---|---|---|
R | 0.84 | 0.82 | 0.81 | 0.79 |
MBE | 70.4 | 74.36 | 36.2 | 51.09 |
NSE | 0.93 | 0.94 | 0.95 | 0.89 |
d | 0.89 | 0.82 | 0.91 | 0.79 |
KGE | 0.67 | 0.76 | 0.53 | 0.75 |
MAE | 80.78 | 70.01 | 83.66 | 85.98 |
MAPE (%) | 25.7 | 20.6 | 25.6 | 25.8 |
NRMSE | 0.25 | 0.22 | 0.26 | 0.26 |
RMSE | 101.61 | 87.4 | 103.14 | 101.67 |
DRV | −18 | −9.2 | −13 | −19 |
Results revealed that the COR model performed best overall among the four empirical models, with the highest KGE and the lowest MAE, MAPE, NRMSE, RMSE, and runoff volume deviation. The COR model deviated least from the observed runoff volume, with a DRV value of −9.2 (overestimation), followed by the DII, IDE, and KF models with DRV values of −13, −18, and −19, respectively. All the empirical models overestimated runoff volume compared with the observed runoff volume. The statistical performance indicators of the COR model were R = 0.82, MBE = 74.36, NSE = 0.94, d = 0.82, KGE = 0.76, MAE = 70.01, MAPE = 20.6%, NRMSE = 0.22, RMSE = 87.4, and DRV = −9.2. These results are consistent with the findings of Khopade & Oak (2014). The KF model overestimated runoff the most.
Variation between predicted and observed runoff by various approaches.
Radar chart shows the model performance of the selected empirical and ANN-5 model.
Taylor diagram of ANN-5, IDE, COR, DII, and KF models during the testing period at the Bhima River catchment.
The findings of this study are highly promising and encouraging. In comparison to empirical equations, the ANN models demonstrated superior performance in modeling rainfall runoff. It was evident that the proposed ANN models yielded robust runoff predictions based on their statistical performance. This study suggests that empirical mathematical models hold potential for estimating annual runoff from ungauged watersheds. While differing perspectives exist regarding the application of empirical models in runoff modeling studies, advancements in artificial intelligence techniques and the accessibility of various soft computing algorithms have led to more accurate rainfall-runoff modeling. Consequently, such approaches may be prioritized in ungauged watersheds to enhance estimation accuracy.
The ANN method is more suitable for predicting runoff than classical empirical models. This result indicates that conventional modeling approaches have difficulty producing a reliable model because of the inherent non-linearity of the rainfall-runoff relationship and the complexity of the hydrologic process. Therefore, the proposed approach for calculating rainfall-runoff relationships can be a handy and efficient tool. However, because the architecture of the ANN differs from that of other AI models in the literature, the comparative analysis in the current study may not be sufficient. Future research could therefore extend the current work with other AI models so that they can be compared with the ANN model within a reasonable time frame.
The findings of the current study may have applications as real-time rainfall and water level data can be used as inputs to ANN models, enabling real-time flood forecasting and issuing timely warnings to vulnerable communities downstream. Integrating real-time runoff predictions with groundwater resource assessments can help optimize water use across surface and groundwater sources. Accurate water demand forecasts based on ANN models can inform water pricing policies and incentivize water conservation practices in agriculture, industries, and domestic use. Floodplain zoning regulations and infrastructure design standards should consider predicted flood risks based on ANN models. Encouraging rainwater harvesting at individual and community levels can help mitigate water scarcity during dry periods. Water allocation shares, pricing mechanisms, and environmental regulations should be informed by reliable runoff predictions to ensure equitable and sustainable water use. The Bhima River basin is shared by multiple states. Effective water management requires collaborative efforts and data sharing among riparian states. By incorporating the insights gained from improved ANN-based runoff prediction models, policymakers can develop more informed and effective strategies for managing water resources, mitigating flood risks, and adapting to climate change in the Bhima River basin.
Potential sources of error in ANN-based runoff prediction include an insufficient number of data points, overfitting, underfitting, and biased data that may not adequately represent the diversity of real-world scenarios. In addition, different ML models need to be assessed to determine whether they improve the accuracy of runoff prediction under various climate parameters. ANN model performance can also be improved by increasing the dataset size, using data augmentation, optimizing the model architecture, applying regularization, fine-tuning learning rates during training, exploring ensemble methods, and considering transfer learning. These options need to be taken into consideration to improve the accuracy, generalization, and predictability of runoff estimates from the model.
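One simple way to screen for the overfitting mentioned above is to compare goodness-of-fit statistics between the training and testing periods. The sketch below computes the Nash-Sutcliffe efficiency (NSE), root mean square error (RMSE), and mean bias error (MBE) using their standard definitions and reports the drop in NSE from training to testing. The function names, the sign convention for MBE (simulated minus observed), and the sample values are assumptions for illustration and do not reproduce the study's data or code.

```python
import math


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / spread


def rmse(obs, sim):
    """Root mean square error between observed and simulated values."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def mbe(obs, sim):
    """Mean bias error; positive means overestimation (assumed convention)."""
    return sum(s - o for o, s in zip(obs, sim)) / len(obs)


def nse_gap(train_obs, train_sim, test_obs, test_sim):
    """Drop in NSE from training to testing; a large positive gap
    suggests the model may be overfitting the training period."""
    return nse(train_obs, train_sim) - nse(test_obs, test_sim)


# Illustrative values only, not data from the study.
train_obs, train_sim = [2.0, 4.0, 6.0, 8.0], [3.0, 4.0, 6.0, 7.0]
test_obs, test_sim = [2.0, 4.0, 6.0, 8.0], [4.0, 4.0, 6.0, 6.0]
gap = nse_gap(train_obs, train_sim, test_obs, test_sim)
```

No fixed threshold on the gap is implied here; what counts as an acceptable difference between training and testing performance depends on the catchment and on the length of the record.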
CONCLUSIONS
This study evaluated the capability and accuracy of various empirical and ANN models in predicting the runoff volume in the Bhima catchment. The ANN model demonstrated superior performance compared with the empirical models, successfully simulating runoff in other sub-basins with similar characteristics. This underscores the importance of predictive models in efficient water resource management and future planning in any catchment area or drainage basin. The study revealed that among the empirical approaches, the COR provided the best results, followed by the Inglis and DeSouza formula. In data-limited situations, these empirical approaches can effectively assess runoff. The ANN model's estimations closely matched observed runoff data, aiding decision-makers in inflow forecasting, water resources planning, management, and flood forecasting. However, the study is limited by constrained data sources. Additionally, the empirical models do not account for other catchment morphological characteristics affecting runoff generation. Given the site-specific nature of empirical equations, their performance may vary under different conditions, necessitating tailored coefficients for specific catchments. Furthermore, the incorporation of ensemble models or Bayesian techniques could enhance model robustness, enabling thorough performance evaluation and probabilistic forecasts, thus improving runoff prediction reliability in the study region.
ACKNOWLEDGEMENTS
The authors are thankful to the India Meteorological Department, India, and the Water Resources Department, Government of Maharashtra, for sharing the data used to conduct this investigation.
AUTHOR CONTRIBUTIONS
Conceptualization, P.S. and S.R.B.; methodology, P.S. and S.R.B.; software, P.D., J.R., and V.G.; validation, P.D., J.R., and V.G.; formal analysis, P.S. and S.R.B.; investigation, S.R.B.; resources, S.R.B.; data curation, P.S.; writing – original draft preparation, P.S. and S.R.B.; writing – review and editing, P.D., J.R., V.G., A.S., R.K.T., and D.K.V.; visualization, P.D., D.K.V., and J.R.; supervision, S.R.B.; project administration, S.R.B. All authors have read and agreed to the published version of the manuscript.
FUNDING
This research received no external funding.
DATA AVAILABILITY STATEMENT
Data cannot be made publicly available; readers should contact the corresponding author for details.
CONFLICT OF INTEREST
The authors declare there is no conflict.