Extreme weather conditions such as floods and droughts call for careful planning and management of water resources in order to prevent fatalities and other negative effects. Modern soft computing and machine learning approaches make it possible to simulate these hydrological phenomena despite their complexity and non-linear character, which depend on diverse parameters. Distributed or semi-distributed models for large watersheds with geographical irregularity and heterogeneity require a substantial amount of high-quality spatial data. This research uses 40 years (1981–2020) of daily rainfall-runoff data to illustrate the application of two data-driven models, random forest regression (RFR) and feed-forward neural network (FFNN), to a large, semi-arid watershed. To understand the effect of input data, different input–output combinations were considered, giving eight rainfall-runoff models. The results show that both RFR and FFNN models performed well, with RFR performing best at correlation coefficient values of 0.9928 (M6) and 0.9926 (M1).

  • Spatially well-distributed, low rain-gauge density data can provide good results.

  • RFR model performance for peak-flow estimation is better than FFNN.

  • A data-driven model proves a better option for the large watershed with low rain-gauge density for flood warning systems.

Among all extreme weather events, floods and heavy rains have caused not only the highest number of casualties worldwide but also extensive loss of property, infrastructure, and agriculture. Heavier precipitation, more frequent hurricanes, and the recent flash flooding at Death Valley National Park are some of the key ways in which climate change has increased flood risk. Record-breaking heavy rainfall events are projected to increase along with temperatures through the 21st century. As the frequency of extreme weather events is increasing at an alarming rate, early warning around 3–6 h before an event, supported by technology, advanced models, observations, and relevant computing, is needed. Not only heavy rain in a short time but even a moderate amount of rainfall can cause severe damage because of man-made changes such as increased urbanisation, alterations to natural drainage systems, and changing land use. On the other hand, water scarcity due to droughts and overuse is also a major challenge for societies and countries. This calls for sustainable water management that helps to determine future water requirements for livelihood and irrigation purposes. Consequently, extensive research has been carried out on rainfall-runoff analysis, and more is still required. A variety of models have been used to develop the rainfall-runoff relationship, including empirical, conceptual, deterministic, stochastic, physical, and data-driven models. The use of soft computing and machine learning techniques such as neural networks, genetic algorithms, and fuzzy logic, which are able to model complex or unknown relationships, is becoming increasingly useful and feasible (Chandwani et al. 2015).

The artificial neural network (ANN) has remained the most popular tool and has been used for a wide range of hydrological processes such as rainfall-runoff modelling, flood frequency analysis, streamflow prediction, sedimentation, reservoir inflow prediction, water quality modelling, and groundwater modelling, although the majority of the work is in rainfall-runoff modelling (Maier et al. 2010). The rainfall-runoff relationship is a physical phenomenon that plays an important role in water resource management, planning, and flood forecasting. A rainfall-runoff model is a mathematical description of the intricate relationship between variables and parameters. The parameters that need to be considered for rainfall-runoff modelling are the topographical and hydrological features, land use and land cover, the size of the watershed (small, medium, or large), the density of rain-gauge (RG) stations, and the quality and length of the data series (Sinha et al. 2015).

An artificial neural network (ANN) model is a data-driven machine learning approach that gives good results not only for areas prone to frequent, disastrous floods (Kisi et al. 2013; Ruslan et al. 2014; Chaipimonplin & Vangpaisal 2015; Setiono & Hadiani 2015; Kumar & Yadav 2021) but also for arid and semi-arid regions (Riad et al. 2004; Solaimani 2009; Ghumman et al. 2011; Aichouri et al. 2015; Parmar et al. 2016; Hussain et al. 2017). Depending on the time step considered, rainfall-runoff relations have been developed for hourly, six-hourly, daily, weekly, monthly, and annual data; the appropriate time step also depends on the size of the basin, because the time precipitation takes to travel to the basin outlet varies accordingly. In terms of watershed size, most rainfall-runoff simulations have been carried out on medium-sized watersheds with areas ranging from 250 to 2,500 km2 (Kalteh 2008; Zadeh et al. 2010; Rezaeianzadeh et al. 2013; Ruslan et al. 2014). Comparatively less work has been carried out on mini, micro, and milli watersheds. For example, Jain & Indurthy (no date) presented a comparative analysis of event-based rainfall-runoff modelling in a watershed of less than 1 km2 and found the ANN to be the most suitable technique. Kovář et al. (2015), studying a watershed of 26.58 km2, noted that it is hard to estimate runoff for extremely small areas because of the potential severity of torrential water flow. As a watershed's spatial variance rises with size, larger watersheds have heterogeneous characteristics. With increasing watershed area, more water can be stored, and both the quantity and the variety of characteristics increase. For instance, the nature of the ground profile, which includes topography, soil qualities, and land use, will vary as the size of the watershed increases.
Yet data-driven models such as ANN have given good results without consideration of the physical, chemical, or biological characteristics of the basin. Rajurkar et al. (2002) demonstrated that coupling the ANN with a multiple-input, single-output model predicts daily runoff with high accuracy for a large catchment of 17,157 km2, where the spatial variation of rainfall is accounted for by subdividing the catchment and treating the average rainfall of each sub-catchment as a parallel and separate lumped input to the model. Patel & Joshi (2017) simulated an ANN model using the feed-forward back propagation algorithm to establish monthly and annual rainfall-runoff correlations for the Dharoi Watershed, with an area of 21,674 km2, and the results indicated that the ANN model captured the relationship between input and output well. The aim of this paper is to develop a rainfall-runoff model for the large, semi-arid Manjira watershed (9,960 km2), spread along the Deccan plateau of India, by using random forest regression (RFR) and a feed-forward neural network (FFNN) with different input and output parameters, and to compare their accuracy. The paper is organised as follows: the next section gives detailed information about the study area and data, followed by a section explaining theory and methodology. Results and discussion are given next, and the last section gives concluding remarks.

For the present study, the Manjira River basin in Maharashtra is selected; the Manjira is one of the important tributaries of the Godavari River (Figure 1). The Manjira rises in the Balaghat range in the Beed district of Maharashtra at an altitude of about 823 m. The river flows in a generally east and southeasterly direction for 515 km. The total length of the river from its source to its confluence with the Godavari, at an altitude of 329 m, is about 724 km (Central Water Commission and India Meteorological Department 2014). Although the climate of Maharashtra is tropical monsoon, the Manjira basin, which is spread over the Marathwada region, covers the semi-arid portions of Beed, Latur, and Osmanabad districts. The mean daily temperature is between 18 and 22 °C during the winter and above 22 °C during the remaining months. The average annual rainfall over the Manjira River basin is 882 mm. Saigaon is the stream gauge location on the Manjira River, located at 18°03′ N and 77°03′ E at an altitude of 542.723 m. To calculate the drainage area contributing to runoff at the Saigaon stream gauge, a 30-m-resolution Shuttle Radar Topography Mission (SRTM) digital elevation model (DEM) from the USGS website is used. Using this DEM, the drainage basin for the Saigaon stream gauge is delineated in QGIS; it lies between 75°20′ and 77°04′ east longitude and 17°50′ and 19° north latitude, as shown in the location map (Figure 1). The total area of the drainage basin up to the Saigaon river gauge station is 9,960 km2. The interior of the basin is a plateau with elevations ranging from 542 to 841 m and a general eastward slope of 0–2.93 m/km. About 70% of the land cover of the drainage basin is under crops, while 10% is barren land and 10% is grassland. As agriculture is the major source of livelihood, water remains of prime importance.
It has become necessary to address not only the droughts that affect agriculture but also the heavy floods of recent years, caused by excess rainfall in a short period of time, that damage crops. In terms of the geological characteristics of the drainage basin, 90% of the soil is loamy, while 10% is clayey.
Figure 1

Location map of study area Manjira sub watershed.

Forty years of daily rainfall data were collected from the India Meteorological Department (IMD) and the National Hydrological Project (NHP), Nashik, India. Initially, daily rainfall data for six RGs were obtained from IMD for the 40-year period from June 1981 to June 2020, as given in Table 1 (Figure 2). According to the Indian Standard (IS 4987-1968) recommendation, RG density in plain areas must be one station per 520 km2; for a total drainage area of 9,960 km2, six RGs are therefore too few. Daily rainfall data for an additional 15 RGs were therefore collected from NHP, Nashik, Maharashtra, as given in Table 2 (Figure 3). For the same time period, daily runoff data at the stream gauge located at Saigaon (Latitude: 18°03′00″, Longitude: 77°03′00″) were collected from the Central Water Commission, Krishna and Godavari Basin, Hyderabad (KGBH).
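The gauge-density comparison above is simple arithmetic, and a minimal sketch of it is given here. The basin area, the gauge counts, and the 520 km2 figure are taken from the text; the function names are illustrative only, and rounding the required count up to a whole station is an assumption.

```python
import math

BASIN_AREA_KM2 = 9960.0        # drainage area up to Saigaon (from the text)
IS_AREA_PER_GAUGE_KM2 = 520.0  # IS 4987-1968 figure quoted in the text

def area_per_gauge(basin_area_km2, n_gauges):
    """Average area served by one rain gauge."""
    return basin_area_km2 / n_gauges

def gauges_required(basin_area_km2, area_per_gauge_km2):
    """Smallest whole number of gauges meeting the density norm (assumed rounding up)."""
    return math.ceil(basin_area_km2 / area_per_gauge_km2)

imd_only = area_per_gauge(BASIN_AREA_KM2, 6)           # six IMD gauges
imd_plus_nhp = area_per_gauge(BASIN_AREA_KM2, 6 + 15)  # before any gauge is screened out
required = gauges_required(BASIN_AREA_KM2, IS_AREA_PER_GAUGE_KM2)

print(f"IMD only: {imd_only:.0f} km2 per gauge")
print(f"IMD + NHP: {imd_plus_nhp:.0f} km2 per gauge")
print(f"Gauges needed at one per 520 km2: {required}")
```

With six gauges the average coverage is 1,660 km2 per gauge, roughly three times the 520 km2 norm, which is why the additional NHP gauges were sought.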
Table 1

IMD RGs

| Sr. No. | Name | Latitude | Longitude |
|---|---|---|---|
| 1 | Patoda | 18° 43′ 12″ | 75° 42′ 0″ |
| 2 | Chowasala | 18° 48′ 0″ | 75° 28′ 48″ |
| 3 | Kallam | 18° 34′ 12″ | 76° 1′ 12″ |
| 4 | Ausa | 18° 15′ 0″ | 76° 30′ 0″ |
| 5 | Latur | 18° 24′ 0″ | 76° 48′ 0″ |
| 6 | Halsoor | 18° 1′ 12″ | 77° 1′ 12″ |
Table 2

NHP, Nashik RGs

| Sr. No. | Name | Latitude | Longitude |
|---|---|---|---|
| 1 | Alni | 18° 17′ 30″ | 76° 0′ 55″ |
| 2 | Digholamba | 18° 40′ 44″ | 76° 17′ 41″ |
| 3 | Jadhala | 18° 38′ 0″ | 73° 42′ 31″ |
| 4 | Jagalpur | 18° 37′ 45″ | 77° 04′ 11″ |
| 5 | Karajkheda | 18° 03′ 03″ | 76° 16′ 33″ |
| 6 | Limbaganesh | 18° 48′ 00″ | 75° 40′ 0″ |
| 7 | Limbla | 18° 55′ 21″ | 76° 17′ 31″ |
| 8 | Matola | 18° 04′ 57″ | 76° 24′ 25″ |
| 9 | Nalegaon | 18° 25′ 11″ | 76° 48′ 47″ |
| 10 | Nitur | 18° 14′ 25″ | 76° 46′ 38″ |
| 11 | Padoli | 18° 11′ 54″ | 76° 16′ 43″ |
| 12 | Tadola | 18° 22′ 33″ | 76° 03′ 0″ |
| 13 | Ujani | 18° 42′ 42″ | 76° 41′ 25″ |
| 14 | Yellamghat | 18° 47′ 17″ | 75° 49′ 35″ |
| 15 | Yeoti | 18° 55′ 55″ | 76° 14′ 31″ |
Figure 2

DEM with IMD RG location.

Figure 3

IMD & NHP RG location.

The climate of the Saigaon watershed area is semi-arid, with distinct wet and dry seasons. To understand the rainfall pattern, the annual rainfall for the six IMD RGs covering the entire drainage area is plotted (Figure 4). It shows the range of annual rainfall depth from a minimum of 81.8 mm at Patoda in 2011 to a maximum of 1,416.2 mm at Kallam in 1998. Overall, years 1986, 1994, 2011, and 2015 show comparatively less annual rainfall, while years 1990, 1998, and 2016 show the heaviest rainfall.
Figure 4

Annual rainfall graph (years 1981–2020).


Data processing is an important step. First, the percentage of missing data is identified for each RG station, and RGs with more than 30% missing data are eliminated. For the selected RGs, missing data were estimated from other well-correlated gauges at minimum distance by the normal ratio method, which has been reported to be the most accurate and reliable method for estimating missing data (Armanuos et al. 2020). To understand the relationship of each RG's rainfall with runoff in terms of water level and discharge, a correlation computation has been carried out (Figure 5). It can be observed that water level has a higher correlation with all RGs than discharge does. The RGs at Latur, Hulsoor, and Ausa show a higher correlation with water level than the others because of their close proximity to the stream gauge. Similarly, Ausa with Latur and Chowsala with Patoda are highly correlated because of their proximity to each other.
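The missing-data screening and the normal ratio estimate described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses the standard form of the normal ratio method (the mean of neighbouring values, each scaled by the ratio of the target gauge's normal annual rainfall to the neighbour's), and all numbers and function names are hypothetical.

```python
def missing_fraction(series):
    """Share of records that are missing (None) in one gauge's daily series."""
    return sum(value is None for value in series) / len(series)

def normal_ratio_estimate(target_normal, neighbour_normals, neighbour_values):
    """Estimate a missing value at the target gauge from neighbouring gauges.

    P_x = (1/n) * sum((N_x / N_i) * P_i), where N is the normal annual
    rainfall of a gauge and P_i the neighbour's rainfall on the missing day.
    """
    if not neighbour_values or len(neighbour_normals) != len(neighbour_values):
        raise ValueError("need one normal and one value per neighbour")
    scaled = [(target_normal / normal) * value
              for normal, value in zip(neighbour_normals, neighbour_values)]
    return sum(scaled) / len(scaled)

# Hypothetical gauge with 2 of 4 records missing: above the 30% limit, so dropped.
keep_gauge = missing_fraction([1.2, None, 0.0, None]) <= 0.30

# Hypothetical normals (mm/year) and same-day rainfall (mm) at three neighbours.
estimate = normal_ratio_estimate(800.0, [900.0, 750.0, 850.0], [12.0, 8.0, 10.0])
print(keep_gauge, round(estimate, 2))
```

When all gauges have the same normal rainfall, the estimate reduces to the plain arithmetic mean of the neighbouring values.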
Figure 5

Correlation plot.

The present work uses two data-driven models: the random forest regressor (RFR) and the FFNN. Models are developed with different input and output combinations. Simulations are run first with only the 6 IMD RG stations and then with a total of 18 RG stations (IMD plus NHP, Nashik), to understand the effect of the additional gauges on model results. For weighted rainfall calculation, Thiessen polygons are generated for the 6 RGs and for the 18 RGs using QGIS, as shown in Figures 6 and 7, respectively. Simulations are carried out with discharge as well as gauge level as the output parameter. As the region is semi-arid, nearly 80% of the annual rainfall is received during the monsoon season. Hence, model calibration is done both with monsoon-season days only (seasonal data) and with all 365 days (full-year data).
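The weighted rainfall input mentioned above can be illustrated with the usual Thiessen polygon formula, in which each gauge's rainfall is weighted by the area of its polygon. The sketch below assumes the polygon areas have already been exported from the GIS; the areas and rainfall values shown are hypothetical, not the study's.

```python
def thiessen_weighted_rainfall(polygon_areas_km2, gauge_rainfall_mm):
    """Area-weighted mean rainfall: sum(A_i * P_i) / sum(A_i)."""
    if len(polygon_areas_km2) != len(gauge_rainfall_mm):
        raise ValueError("need one polygon area per gauge")
    total_area = sum(polygon_areas_km2)
    weighted_sum = sum(area * rain
                       for area, rain in zip(polygon_areas_km2, gauge_rainfall_mm))
    return weighted_sum / total_area

# Hypothetical polygon areas (km2) summing to the 9,960 km2 basin,
# with hypothetical rainfall (mm) recorded at each gauge on one day.
areas = [2000.0, 3000.0, 4960.0]
rain = [10.0, 20.0, 0.0]
basin_rainfall = thiessen_weighted_rainfall(areas, rain)
print(round(basin_rainfall, 2))
```

A gauge whose polygon covers more of the basin contributes proportionally more to the single weighted input, which is what distinguishes models M5 and M8 from those that use each gauge as a separate input.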
Figure 6

Thiessen polygon for six RGs.

Figure 7

Thiessen polygon for 18 RGs.


The models with different structures are presented in Table 3.

Table 3

Input–output combination of the models

| Model | Input | Abbreviation for input | Output | Full-year data/seasonal data |
|---|---|---|---|---|
| M1 | 6 RGs rainfall | P(1–6) | Discharge (Q) | Full year |
| M2 | 6 RGs rainfall | P(1–6) | Discharge (Q) | Seasonal |
| M3 | 6 RGs rainfall | P(1–6) | Level (L) | Full year |
| M4 | 6 RGs rainfall | P(1–6) | Level (L) | Seasonal |
| M5 | 6 RGs, weighted rainfall (Thiessen polygon method) | Weighted P(1–6) | Level (L) | Full year |
| M6 | 18 RGs | P(1–18) | Discharge (Q) | Full year |
| M7 | 18 RGs | P(1–18) | Level (L) | Full year |
| M8 | 18 RGs, weighted rainfall (Thiessen polygon method) | Weighted P(1–18) | Level (L) | Full year |

Random forest regressor

Random forest is a popular supervised machine learning algorithm for classification and regression problems (Figure 8). It constructs decision trees from different samples of the data and takes their majority vote for classification and their average for regression. The steps involved in RFR are given below.

Step 1: In a random forest, n random records are taken from the dataset having k number of records.

Step 2: Individual decision trees are created for each sample.

Step 3: Each decision tree will generate an output.

Step 4: For classification and regression, the final result is based on the majority vote or average, respectively.
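The four steps above can be sketched in plain Python for the regression case. This is an illustrative toy under stated assumptions, not the model used in the study: sampling is done with replacement and with n equal to k, one third of the inputs is tried at each split, and the tree depth, number of trees, and synthetic rainfall-runoff data are all hypothetical.

```python
import random

def mean(values):
    return sum(values) / len(values)

def sse(values):
    """Sum of squared deviations from the mean, used as the split cost."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def subset(items, indices):
    return [items[i] for i in indices]

def build_tree(rows, targets, rng, n_features, max_depth, depth=0):
    """Step 2: grow one regression tree; a leaf holds the mean target."""
    if depth >= max_depth or len(rows) < 4:
        return mean(targets)
    best = None
    # Only a random subset of the inputs is tried at each split.
    for f in rng.sample(range(len(rows[0])), n_features):
        for threshold in sorted({row[f] for row in rows}):
            left = [i for i, row in enumerate(rows) if row[f] <= threshold]
            right = [i for i, row in enumerate(rows) if row[f] > threshold]
            if not left or not right:
                continue
            cost = sse(subset(targets, left)) + sse(subset(targets, right))
            if best is None or cost < best[0]:
                best = (cost, f, threshold, left, right)
    if best is None:  # no usable split on the sampled inputs
        return mean(targets)
    _, f, threshold, left, right = best
    return (f, threshold,
            build_tree(subset(rows, left), subset(targets, left),
                       rng, n_features, max_depth, depth + 1),
            build_tree(subset(rows, right), subset(targets, right),
                       rng, n_features, max_depth, depth + 1))

def predict_tree(node, row):
    """Step 3: one tree's output for one input record."""
    while isinstance(node, tuple):
        f, threshold, left, right = node
        node = left if row[f] <= threshold else right
    return node

def fit_forest(rows, targets, n_trees=25, max_depth=4, n_features=None, seed=0):
    rng = random.Random(seed)
    n_features = n_features or max(1, len(rows[0]) // 3)  # assumed heuristic
    forest = []
    for _ in range(n_trees):
        # Step 1: draw a random sample of records (assumed: with replacement).
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree(subset(rows, idx), subset(targets, idx),
                                 rng, n_features, max_depth))
    return forest

def predict_forest(forest, row):
    """Step 4 (regression): average the outputs of all trees."""
    return mean([predict_tree(tree, row) for tree in forest])

# Synthetic two-gauge example: runoff is a made-up linear function of rainfall.
data_rng = random.Random(1)
rain = [[data_rng.uniform(0, 50), data_rng.uniform(0, 50)] for _ in range(60)]
runoff = [0.5 * p1 + 0.2 * p2 for p1, p2 in rain]
forest = fit_forest(rain, runoff, n_trees=25, max_depth=4, seed=0)
estimate = predict_forest(forest, [30.0, 10.0])
print(round(estimate, 2))
```

Because every leaf stores a mean of observed targets and the forest averages the leaves, a prediction can never fall outside the range of runoff values seen in training, which is worth keeping in mind when extrapolating to floods larger than any in the record.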

Artificial neural network

An ANN, also called a neural network, is a computing system inspired by the biological neural networks that constitute the human brain. Such systems ‘learn’ to perform tasks by considering examples, generally without being programmed with any task-specific rules. The ANN is a concept in the field of artificial intelligence modelled on the human nervous system. There are many ANN architectures and training methods, including the perceptron and back propagation. The ANN models constructed in this study were of the FFNN type. In the feed-forward model, information is processed in one direction only. While the data may pass through multiple hidden nodes, it always moves forward and never backwards. Detailed theoretical information about ANNs can be found in ASCE Task Committee on Application of Artificial Neural Networks in Hydrology (2000).
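The one-directional flow of information described above can be illustrated with a minimal forward pass. The network size, weights, and activation functions below are hypothetical, since the text does not state the architecture used, and training (for example by back propagation) is deliberately left out.

```python
import math

def identity(x):
    return x

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(w . x + b) for each neuron."""
    return [activation(sum(w * x for w, x in zip(neuron, inputs)) + b)
            for neuron, b in zip(weights, biases)]

def feed_forward(inputs, layers):
    """Pass the signal through each layer in order; nothing flows backwards."""
    signal = inputs
    for weights, biases, activation in layers:
        signal = dense(signal, weights, biases, activation)
    return signal

# Hypothetical network: 2 inputs -> 2 tanh hidden neurons -> 1 linear output.
layers = [
    ([[0.5, -0.5], [1.0, 0.0]], [0.0, 0.0], math.tanh),
    ([[1.0, 1.0]], [0.0], identity),
]
output = feed_forward([1.0, 2.0], layers)
print(output)
```

In a rainfall-runoff setting the inputs would be the gauge rainfall values and the single output the discharge or level, with the weights found by training rather than set by hand as here.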

RFR and FFNN models were developed for all eight input–output combinations listed in Table 3. The performance indicators used for the study are mean absolute error (MAE), root mean square error (RMSE), coefficient of efficiency (R2), and correlation coefficient (R). MAE measures the agreement between the average simulated and observed watershed runoff, and RMSE measures the overall agreement of hydrograph shape. Table 4 gives the performance of the RFR and FFNN models. To compare the results and visualise model-wise performance, scatter plots are given for the RFR and FFNN in Figures 9 and 10, respectively. Time series plots were also developed for the extreme events observed in August–October 1998 (Figure 11) and in September–October 2016 (Figure 12).
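The four indicators named above can be computed as in the sketch below. The text does not give the formulas, so the standard definitions are used here, and taking the coefficient of efficiency in its Nash–Sutcliffe form is an assumption; the observed and simulated values are made up.

```python
import math

def mae(observed, simulated):
    """Mean absolute error."""
    return sum(abs(o - s) for o, s in zip(observed, simulated)) / len(observed)

def rmse(observed, simulated):
    """Root mean square error."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated))
                     / len(observed))

def coefficient_of_efficiency(observed, simulated):
    """Assumed Nash-Sutcliffe form: 1 - sum((o - s)^2) / sum((o - mean(o))^2)."""
    o_mean = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - o_mean) ** 2 for o in observed)
    return 1.0 - residual / variance

def correlation_coefficient(observed, simulated):
    """Pearson correlation coefficient R."""
    o_mean = sum(observed) / len(observed)
    s_mean = sum(simulated) / len(simulated)
    cov = sum((o - o_mean) * (s - s_mean) for o, s in zip(observed, simulated))
    o_var = sum((o - o_mean) ** 2 for o in observed)
    s_var = sum((s - s_mean) ** 2 for s in simulated)
    return cov / math.sqrt(o_var * s_var)

# Made-up series: the simulation is the observation shifted up by one unit.
observed = [0.0, 1.0, 2.0, 3.0, 4.0]
simulated = [1.0, 2.0, 3.0, 4.0, 5.0]
print(mae(observed, simulated), rmse(observed, simulated),
      coefficient_of_efficiency(observed, simulated),
      correlation_coefficient(observed, simulated))
```

The example shows why R alone can flatter a model: a constant bias leaves the correlation at 1 while the error measures and the efficiency coefficient both register the offset.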
Table 4

Model results

| Performance evaluator | Model | M1 | M2 | M3 | M4 | M5 | M6 | M7 | M8 |
|---|---|---|---|---|---|---|---|---|---|
| MAE | RFR | 3.9816 | 6.9224 | 0.1024 | 0.1644 | 0.1024 | 3.9812 | 0.1429 | 0.0967 |
| MAE | FFNN | 0.0313 | 0.0514 | 0.1533 | 0.1197 | 0.1546 | 0.0383 | 0.1480 | 0.1514 |
| RMSE | RFR | 20.6617 | 27.5881 | 0.1703 | 0.2749 | 0.1703 | 20.3023 | 0.2383 | 0.1617 |
| RMSE | FFNN | 0.1179 | 0.1414 | 0.2223 | 0.1794 | 0.2158 | 0.1296 | 0.2193 | 0.2212 |
| R2 | RFR | 0.9852 | 0.9820 | 0.9764 | 0.9599 | 0.9764 | 0.9857 | 0.9556 | 0.9807 |
| R2 | FFNN | 0.9327 | 0.9018 | 0.9206 | 0.9593 | 0.9277 | 0.9365 | 0.9288 | 0.9235 |
| R | RFR | 0.9926 | 0.9910 | 0.9881 | 0.9797 | 0.9903 | 0.9928 | 0.9775 | 0.9779 |
| R | FFNN | 0.9657 | 0.9496 | 0.9595 | 0.9794 | 0.9632 | 0.9677 | 0.9638 | 0.9610 |

The highest correlation coefficients for RFR are 0.9926 (M1) and 0.9928 (M6), which indicates that the RFR model performs best with discharge as output. For FFNN, the highest values are 0.9794 (M4) and 0.9677 (M6), for seasonal level and full-year discharge as output, respectively. These are the top two results for each of the RFR and FFNN models.

Figure 8

Random forest regressor.

Figure 9

Scatter plot for RFR.

Figure 10

Scatter plot for FFNN.

Figure 11

Time series plot for training state (from 20 August to 30 October 1998).

The results of the models are shown in Table 4. Based on these results, the model performance graph for the correlation coefficient is plotted in Figure 13, which gives clear insight into the performance of the models.
Figure 12

Time series plot for the testing state (from 14 September to 14 October 2016).

Figure 13

Model-wise performance plot.


As mentioned in Table 3, models M1–M5 were calibrated with 6 RG data as input, while models M6–M8 were calibrated with 18 RG data. Seventy percent of the dataset is used for training, 15% for validation, and 15% for testing. According to the World Meteorological Organisation (1976), one RG station per 1,500–10,000 km2 is recommended for arid regions, while the Bureau of Indian Standards (BIS) suggests one RG station for up to 500 km2 (Subramanya 2008). The results show that RFR and FFNN models with data from six spatially well-distributed RGs (1,660 km2 per RG) as input gave the same or comparatively better results. As the Manjira watershed belongs to a semi-arid climatic zone, model calibration was done with full-year data (M1 and M3) as well as seasonal data (1 June–31 October; M2 and M4) to identify the effect of the dry period, during which runoff is 0. For FFNN, the seasonal input performs better (M4, 0.9794), while for RFR the whole-year input (M1, 0.9926) performs better than the seasonal input (M2, 0.9910). Models M1, M2, and M6 are calibrated with discharge as the output parameter. RFR models with discharge as output performed better than FFNN. The correlation coefficient of FFNN varies from 0.9496 to 0.9794, whereas that of RFR varies from 0.9775 to 0.9928. The performance of the RFR model is thus at least comparable to that of the FFNN model. Model M6, with 18 RG data as input and discharge as output, is the best-performing RFR model, which is clearly visible in the scatter plot (Figure 9 (M6)). The time series plot for RFR-M6 also shows high accuracy in peak-flow discharge. Furthermore, the time series plots of M1, M2, and M6 for both the 1998 and 2016 flood events show that RFR estimates peak discharge values more precisely than FFNN.
Because the stream is dry during summer and most of the winter, models M3, M4, M5, M7, and M8, with level as output, are less effective at estimating the zero of the gauge reading, i.e., 542.723 m. In the case of FFNN, the M4 model, with rainy-season rainfall from six RGs as input and level as output, is the best-performing model, giving a correlation coefficient of 0.9794. In the case of RFR, the M6 model, with 18 RG data as input and discharge as output, is the best-performing model, giving a correlation coefficient of 0.9928.
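The seasonal selection (1 June–31 October) and the 70/15/15 partition described above can be sketched as follows. The text does not say how records were assigned to each subset; the chronological split used here is an assumption, though it is consistent with the 1998 event appearing in the training period and the 2016 event in the testing period. Function names are illustrative.

```python
from datetime import date, timedelta

MONSOON_START = (6, 1)   # 1 June, from the text
MONSOON_END = (10, 31)   # 31 October, from the text

def is_monsoon_day(day):
    """True for days inside the seasonal window used for models M2 and M4."""
    return MONSOON_START <= (day.month, day.day) <= MONSOON_END

def chronological_split(records, train_pct=70, val_pct=15):
    """Split time-ordered records into training, validation, and testing parts.

    Assumed to be chronological; integer arithmetic avoids rounding surprises.
    """
    n = len(records)
    n_train = n * train_pct // 100
    n_val = n * val_pct // 100
    return (records[:n_train],
            records[n_train:n_train + n_val],
            records[n_train + n_val:])

# Seasonal filter applied to one calendar year of daily dates.
days_2016 = [date(2016, 1, 1) + timedelta(days=i) for i in range(366)]
monsoon_2016 = [d for d in days_2016 if is_monsoon_day(d)]

# Split applied to 100 placeholder records standing in for daily observations.
train, val, test = chronological_split(list(range(100)))
print(len(monsoon_2016), len(train), len(val), len(test))
```

Each seasonal year contributes 153 daily records, so a seasonal model sees well under half of the records available to its full-year counterpart.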

The results of models M5 and M8, which use weighted rainfall as input, do not show any drastic improvement; this is attributed to the gentle slope of the watershed (0–2.93 m/km), which implies little spatial variation in rainfall. The time series plot generated for peak flow from 14 September to 14 October 2016 clearly shows that level estimation is better than discharge estimation for both RFR and FFNN.

To understand the impact of various input–output data combinations on the model and to identify the best-fit model, this study uses RFR and FFNN models to simulate the rainfall-runoff process for a large, semi-arid watershed. It is observed that rainfall data from 6 RGs as input performs as well as rainfall data from 18 RGs. This means that, with due consideration of the characteristics of the watershed, even low-density, spatially well-distributed RG data can provide good results. The model can therefore provide a solution for runoff estimation, and subsequently flood forecasting, in areas where data availability is an issue because of the low density of RGs. From the time series plots for the training state (20 August–30 October 1998) and the testing state (14 September–14 October 2016), it is clearly visible that the RFR model with level as output estimates peak flow more precisely than FFNN. Thus, it is again shown that artificial neural networks are capable of simulating complex rainfall-runoff relationships but need a large volume of data and are less accurate in peak-flow estimation. On the other hand, both RFR and FFNN are less effective at reproducing the zero gauge value observed during the dry season. The FFNN model performs well, with a correlation coefficient of 0.9794 (M4) for P(1–6) as input and level as output. The RFR model performed better, with 0.9928 (M6) and 0.9926 (M1) for P(1–18) as input with discharge as output and P(1–6) as input with discharge as output, respectively. The model performance graph (Figure 13) clearly shows that for the M4 model, where seasonal rainfall data from six RGs are input and level is output, the FFNN result is most accurate while RFR is least accurate. The major effect of climate change is a change in the rainfall pattern, with heavy rainfall in a short period of time causing flash floods in arid and semi-arid regions.
This necessitates accurate estimation and forecasting of runoff to help provide flood warnings. The simulation findings show that data-driven models deliver useful information without an in-depth understanding of watershed characteristics and are useful for the management and planning of water resources.

The authors wish to sincerely thank Meteorological Department, Shivajinagar, Pune; Central Water Commission, Krishna Godavari Basin Organization, Hyderabad and Hydrology Project (SW), Jal Vidnyan Bhawan, Dindori Road, Nashik for providing the necessary data for the research work. The authors also acknowledge ADVIT AI Labs, Pune for technical support.

Data cannot be made publicly available; readers should contact the corresponding author for details.

The authors declare there is no conflict.

Aichouri, I., Hani, A., Bougherira, N., Djabri, L., Chaffai, H. & Lallahem, S. 2015 River flow model using artificial neural networks. In: Energy Procedia. Elsevier Ltd, pp. 1007–1014. https://doi.org/10.1016/j.egypro.2015.07.832.
Armanuos, A. M., Al-Ansari, N. & Yaseen, Z. M. 2020 Cross assessment of twenty-one different methods for missing precipitation data estimation. Atmosphere 11 (4). https://doi.org/10.3390/ATMOS11040389.
ASCE Task Committee on Application of Artificial Neural Networks in Hydrology 2000 Task committee on application of artificial neural networks in hydrology, artificial neural networks in hydrology. II. Hydrologic application. Journal of Hydrologic Engineering 5 (2), 124–136.
Central Water Commission and India Meteorological Department 2014 PMP Atlas for Godavari River Basin. I(November).
Chaipimonplin, T. & Vangpaisal, T. 2015 The efficiency of input determination techniques in ANN for flood forecasting, Mun Basin, Thailand. Journal of Water Resource and Hydraulic Engineering 4 (2), 131–137. https://doi.org/10.5963/jwrhe0402002.
Chandwani, V., Vyas, S. K., Agrawal, V. & Sharma, G. 2015 Soft computing approach for rainfall-runoff modelling: a review. Aquatic Procedia 4, 1054–1061. https://doi.org/10.1016/j.aqpro.2015.02.133.
Ghumman, A. R., Yousry, M., Ghazaw, A. R. & Watanabe, S. K. 2011 Runoff forecasting by artificial neural network and conventional model. Alexandria Engineering Journal 50 (4), 345–350. https://doi.org/10.1016/j.aej.2012.01.005.
Hussain, D., Usmani, A., Verma, D. K., Jamal, F. & Khan, M. A. 2017 Rainfall runoff modelling using artificial neural network. International Journal of Advance Research. Available at: www.ijariit.com.
Jain, A. & Indurthy, S. K. V. P. no date Comparative Analysis of Event-Based Rainfall-Runoff Modeling Techniques-Deterministic, Statistical, and Artificial Neural Networks. https://doi.org/10.1061/ASCE1084-069920038:293.
Kalteh, A. M. 2008 CJES Caspian Journal of Environmental Sciences, pp. 53–58. ©Copyright by The University of Guilan. Available from: http://research.guilan.ac.ir/cjes.
Kisi, O., Shiri, J. & Tombul, M. 2013 Modeling rainfall-runoff process using soft computing techniques. Computers and Geosciences 51, 108–117. https://doi.org/10.1016/j.cageo.2012.07.001.
Kovář, P., Hrabalíková, M., Neruda, M., Neruda, R., Šrejber, J., Jelínková, A. & Bačinová, H. 2015 Choosing an appropriate hydrological model for rainfall-runoff extremes in small catchments. Soil and Water Research 10 (3), 137–146. https://doi.org/10.17221/16/2015-SWR.
Kumar, V. & Yadav, S. M. 2021 Real-time flood analysis using artificial neural network. Recent Trends in Civil Engineering 77, 973–986. https://doi.org/10.1007/978-981-15-5195-6_71.
Maier, H. R., Jain, A., Dandy, G. C. & Sudheer, K. P. 2010 Methods used for the development of neural networks for the prediction of water resource variables in river systems: current status and future directions. Environmental Modelling and Software. https://doi.org/10.1016/j.envsoft.2010.02.003.
Parmar, H. V., Mashru, H. H., Vekariya, P. B., Rank, H. D., Kelaiya, J. H., Pardava, D. M., Patel, R. J. & Vadar, H. R. 2016 Establishment of Rainfall-Runoff Relationship for the Estimation Runoff in Semi-Arid Catchment, AICRP on Irrigation Water Management. Available from: www.arkgroup.co.in.
Patel, A. B. & Joshi, G. S. 2017 Modeling of Rainfall-Runoff Correlations Using Artificial Neural Network – A Case Study of Dharoi Watershed of a Sabarmati River Basin, India. Civil Engineering Journal. Available from: www.CivileJournal.org.
Rajurkar, M. P., Kothyari, U. C. & Chaube, U. C. 2002 Modélisation pluie – débit journalière à base de réseau de neurones artificiel. Hydrological Sciences Journal 47 (6), 865–877. https://doi.org/10.1080/02626660209492996.
Rezaeianzadeh, M., Stein, A., Tabari, H., Abghari, H., Jalalkamali, N., Hosseinipour, E. Z. & Singh, V. P. 2013 Assessment of a conceptual hydrological model and artificial neural networks for daily outflows forecasting. International Journal of Environmental Science and Technology 10 (6), 1181–1192. https://doi.org/10.1007/s13762-013-0209-0.
Riad, S., Mania, J., Bouchaou, L. & Najjar, Y. 2004 Rainfall-runoff model using an artificial neural network approach. Mathematical and Computer Modelling 40 (7–8), 839–846. https://doi.org/10.1016/j.mcm.2004.10.012.
Ruslan, F. A., Samad, A.b.d. M., Zain, Z. M.d. & Adnan, R. 2014 Flood water level modeling and prediction using NARX neural network: Case study at Kelang river. In: Proceedings – 2014 IEEE 10th International Colloquium on Signal Processing and Its Applications, CSPA 2014. IEEE Computer Society, pp. 204–207. https://doi.org/10.1109/CSPA.2014.6805748.
Setiono & Hadiani, R. 2015 Analysis of rainfall-runoff neuron input model with artificial neural network for simulation for availability of discharge at Bah Bolon Watershed. In: Procedia Engineering. Elsevier Ltd, pp. 150–157. https://doi.org/10.1016/j.proeng.2015.11.022.
Sinha, S., Singh, V. & Jakhanwal, M. P. 2015 Rainfall runoff modeling of Punpun River Basin using ANN – a case study. International Journal of Research in Engineering and Social Sciences 5 (5), 2249–9482. Available from: www.indusedu.org.
Solaimani, K. 2009 Rainfall-runoff prediction based on artificial neural network (A case study: Jarahi Watershed). J. Agric. & Environ. Sci 5 (6), 856–865.
Subramanya, K. 2008 Engineering Hydrology_Sabramanya.pdf.
Zadeh, M. R., Amin, S., Khalili, D. & Singh, V. P. 2010 Daily outflow prediction by multi layer perceptron with logistic sigmoid and tangent sigmoid activation functions. Water Resources Management 24 (11), 2673–2688. https://doi.org/10.1007/s11269-009-9573-4.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).