The present study evaluates the potential of Bidirectional Long Short-Term Memory (Bi-LSTM), Convolutional Neural Networks (CNNs), eXtreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LGBM), and Random Forest (RF) for predicting daily inflows to the Sri Ram Sagar Project (SRSP), Telangana, India. Inputs to the models are rainfall, evaporation, time-lagged inflows, and climate indices. Seven combinations (S1–S7) of inputs were formed. Fifteen and a half years of data were considered, of which 11 years were used for training. Hyperparameter tuning was performed with the Tree-Structured Parzen Estimator, and the performance of the algorithms was assessed using Kling–Gupta efficiency (KGE). Results indicate that Bi-LSTM with combination S7 performed better than the others, as evident from KGE values of 0.92 and 0.87 during training and testing, respectively. Furthermore, the Stacking Ensemble Mechanism (SEM) was employed to ascertain its efficacy over the other chosen algorithms, yielding KGE values of 0.94 and 0.89 during training and testing. SEM also simulated peak inflow events satisfactorily. Thus, SEM is a better alternative for reservoir inflow prediction.

  • The Stacking Ensemble mechanism (SEM) has performed better than other ML algorithms (Bi-LSTM, CNN, RF, XGBoost, LGBM) in simulating the daily inflow to the Sri Ram Sagar Project, with KGE values of 0.94 and 0.89 in the training and testing.

  • The optimal combination of climate indices associated with reservoir inflow was found using Shannon's entropy weighted measure.

  • SEM has performed satisfactorily in replicating the peak inflow events.

Reservoirs are essential for water resource management, catering to flood control, irrigation, hydropower generation, and domestic and industrial uses. Inflow is therefore a crucial factor in optimal reservoir operation. Moreover, extreme rainfall events and higher evaporation rates due to climate change alter the inflow pattern, necessitating accurate prediction of inflows with consideration of climate indices. Over the past decade, various researchers have proposed physical, conceptual, empirical, and machine learning (ML) models for simulating reservoir inflows (Alquraish et al. 2021). However, due to extensive data requirements and underlying assumptions, physical and conceptual models may not mimic the inflow precisely. The limitation of empirical algorithms, which mainly assume linearity in the time series, is another challenge in forecasting studies (Wang et al. 2022).

The capability of addressing non-linearity, underlying patterns in the data, and advantage over conventional statistical algorithms make ML more appropriate for prediction (Jothiprakash & Magar 2012; Rozos et al. 2022; Xu et al. 2022a). However, the significant challenge is identifying a suitable algorithm that best fits a chosen situation. In this context, researchers are making efforts to explore multiple algorithms based on their applicability.

Random Forest (RF) is one such technique used in inflow prediction owing to its simple mathematical background. Yang et al. (2017) applied RF, Support Vector Regression, and Artificial Neural Networks for inflow forecasting of the Danjiangkou Reservoir, China, and Clair Engle Lake Reservoir, USA, and found that climate indices influence the inflow. Similar work was reported by Kim et al. (2019) using RF, Artificial Neural Networks, and an Adaptive Neural-Based Fuzzy Inference System for several dams in South Korea, with various climatic indicators as inputs. RF was found to perform best in both case studies. RF's limitations include handling asymmetric and higher-dimensional data (Pes 2021). This prompted researchers to explore boosting algorithms, which have high computational efficiency and can handle large datasets (Pan et al. 2022).

Mehraein et al. (2022) predicted monthly streamflow for three stations in Turkey using eXtreme Gradient Boosting (XGBoost), RF, Artificial Neural Networks, and nonlinear regression. Ibrahim et al. (2022) applied XGBoost, an Adaptive Neuro-Fuzzy Inference System, a Multilayer Perceptron, and Support Vector Regression to forecast inflow to Bukit Tabur Dam, Malaysia, with hyperparameter tuning conducted using a grid search. In both case studies, XGBoost performed best, echoing the outcomes of other investigations (Ni et al. 2020; Dong et al. 2023). However, the challenge for boosting algorithms lies in effectively capturing temporal patterns in sequential data (Kiangala & Wang 2021).

This led to the use of Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNNs), and Recurrent Neural Networks, which can handle nonlinear relationships and spatial-temporal patterns (Najafabadi et al. 2015; Semmelmann et al. 2022). Apaydin et al. (2020) employed three such algorithms for inflow prediction of the Ermenek hydroelectric dam, Turkey, with LSTM performing best. Kim et al. (2021) applied LSTM and a Gated Recurrent Unit to the Andong and Imha Dams, South Korea; the algorithms performed well except for peak events. Khorram & Jehbez (2023) utilized LSTM to analyze the medium and high inflows of the Doroodzan reservoir in Iran, where it outperformed the Support Vector Machine (SVM) and Adaptive Neuro-Fuzzy Inference System. Similar LSTM-based inflow prediction studies have been reported by other researchers (Latif et al. 2021; Skariah & Suriyakala 2022). LSTM was further extended into a bidirectional form, i.e., Bi-LSTM, in which sequential processing is performed both backward and forward, improving its ability to produce reliable predictions (Wegayehu & Muluneh 2022). The application of Bi-LSTM to daily reservoir inflow prediction remains limited.

CNN has convolution and pooling layers in its architecture, enabling it to extract vital information from the input promptly. Shu et al. (2021) employed CNN to predict streamflow in three reservoirs in China and found that it performed considerably better than artificial neural networks and extreme learning machines. Xu et al. (2022b) proposed an improved CNN with a Leaky ReLU activation function for predicting inflow to the Zhexi Reservoir, China; the model retained its accuracy for moderate inflows. Several other investigations have also demonstrated the effectiveness of CNN in simulating inflow (Li et al. 2022; Song 2022).

It has been observed that a single ML algorithm performs best only in some scenarios. Researchers have therefore emphasized ensemble learning, the combination of ML algorithms to improve prediction accuracy. Kermani et al. (2021) reviewed a variety of ensemble architectures of ML algorithms based on Boosting, Bagging, and Stacking approaches for modeling river flow and streamflow. Zhou et al. (2018) ensembled Artificial Neural Networks and regression to enhance monthly prediction accuracy for the Jinsha River, China; the ensemble performed better than the weighted average and simple mean. Huang et al. (2022) examined the efficacy of different deep learning and ML techniques for forecasting the short-, medium-, and long-term inflows of Shimen Reservoir, Taiwan, using 14 years of rainfall data, employing Ensemble Means and Switched Prediction approaches, and concluded that integrating forecasting techniques was more effective and stable than individual ML and deep learning algorithms. Lu et al. (2023) employed the Stacking Ensemble Mechanism (SEM), integrating XGBoost, RF, Adaptive Boosting, and an attention mechanism as the meta-model, for simulating the daily inflow of the Fuchu reservoir; results showed better performance of SEM compared with the other algorithms.

Climate change significantly impacts streamflow patterns. Many large-scale climate indices have recently been introduced as predictors in hydrological modeling (Rajagopalan et al. 2019). These indices can identify extreme events based on long-term climate data records (Moradi et al. 2020). Islam & Imteaz (2020) considered two climate indicators, the Indian Ocean Dipole and El Niño Southern Oscillation, as predictors and trained an Autoregressive Integrated Moving Average model with exogenous input for rainfall forecasting in Western Australia. Rajesh et al. (2023) considered the North Atlantic Oscillation (NAO), Southern Oscillation Index (SOI), and Arctic Oscillation (AO) as input features to LSTM, RF, K-Nearest Neighbor, and a Gradient Boosting Regressor for short-range forecasting of inflow to Bhadra Reservoir, India.

According to the comprehensive review, the following salient research gaps were observed:

  • Limited research has been conducted on ensemble mechanisms of ML algorithms based on stacking for predicting reservoir inflow.

  • A study identifying the significant climate indices associated with a reservoir inflow has yet to be conducted.

  • Light Gradient Boosting Machine (LGBM), with quicker training speed and minimal memory consumption, has not been extensively studied in the context of inflow prediction.

The mentioned research works motivated the implementation of ML algorithms to predict daily inflow to the Sri Ram Sagar Project (SRSP), Telangana, India. Water requirements from this project have increased manifold over the past few years due to substantial agricultural activities. This necessitates precise estimation of inflows to the project, which will help water managers with effective planning.

Thus, the envisaged objectives that can be studied for SRSP are summarized as follows:

  • (1) To identify input variables, including climate indices, suitable for reservoir inflow prediction

  • (2) To explore the applicability of ML algorithms, namely, XGBoost, LGBM, Bi-LSTM, CNN, and RF, for simulating inflows

  • (3) To develop SEM with a suitable ML algorithm based on objective (2).

The remaining manuscript is organized as follows: The case study and data details are shown in Section 2. The framework for methods is detailed in Section 3. In Section 4, results and discussion are presented. Section 5 comprises the summary and conclusions.

SRSP is located at a latitude of 18.96° North and a longitude of 78.34° East in the upper Godavari basin near Pochampadu, a village in the Nizamabad district of Telangana, India. The total reservoir storage capacity is 3,171.48 Million Cubic Meters (MCM), and the active storage is 2,564.94 MCM (Kiran et al. 2021). The Nirmal, Jagtial, Peddapalli, Jayashankar, Karimnagar, Mahabubabad, Hanumakonda, and Warangal districts in Telangana receive irrigation water, and partly drinking water, from SRSP. Figure 1 presents an index map of the project with the selected rain gauge stations, namely, Betmogrra, Bhainsa, Sriram Sagar, Yelli, Nanded, Nizam Sagar, and Degloor, located upstream of the project.
Figure 1: Index map of SRSP.

The daily reservoir inflow data of SRSP from August 2007 to December 2022 (5,362 inflow records) has been collected from the Department of Irrigation and Command Area Development, Telangana, India. Water mainly flows into the reservoir during the monsoon season (June–September). The rest of the year experiences minimal or no inflow.

Daily meteorological data (rainfall, Tmax, and Tmin) have been obtained from the India Meteorological Department, Pune, India. The Thiessen polygon approach is used to compute the mean areal rainfall over the catchment from the seven rain gauge stations (Hwang et al. 2020). Evaporation is calculated using the Ivanov method, an empirical technique based on temperature and relative humidity (Muhammad et al. 2019). The minimum and maximum rainfall values are 0 and 229.05 mm; the corresponding values are 0.76 and 19.31 mm/day for evaporation and 0 and 11,846 m³/s for inflow. The mean values of rainfall, evaporation, and inflow are 2.06 mm, 7.42 mm/day, and 175.02 m³/s, respectively.
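As an illustration of the Thiessen polygon weighting used above, the short sketch below computes mean areal rainfall as an area-weighted average of the seven station records. The weights and rainfall values are hypothetical placeholders, not the actual polygon areas of the SRSP catchment.

```python
import numpy as np

# Hypothetical Thiessen weights (fraction of catchment area per gauge, summing to 1)
weights = np.array([0.18, 0.15, 0.14, 0.13, 0.14, 0.13, 0.13])

# One day of rainfall (mm) at the seven gauges; illustrative values only
station_rain = np.array([0.0, 2.4, 1.1, 0.0, 5.2, 0.7, 0.0])

# Mean areal rainfall = sum of (station rainfall x polygon-area fraction)
areal_rain = float(station_rain @ weights)
print(f"Mean areal rainfall: {areal_rain:.2f} mm")
```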

The present work studies the implication of climate phenomenon indices on reservoir inflow prediction. Information about these indices was downloaded from the respective home pages and used for modeling (not computed by the authors). Details are as follows:

  • The daily values of gridded ocean-atmospheric circulation indices, namely, Sea Level Pressure (SLP), Meridional Wind (MW), Greenland Blocking Index (GBI), Geopotential Height (GH), and Zonal Wind (ZW), were downloaded from the National Oceanic and Atmospheric Administration of the United States (NOAA) (https://psl.noaa.gov/data/timeseries/daily/) for the seven rain gauge stations and combined into a single series using the Thiessen polygon formula.

  • Similarly, teleconnection patterns, namely, Western Pacific Oscillation (WPA), Pacific North American Index (PNA), NAO, and AO, were downloaded from NOAA.

  • NINO1.2, NINO 3, NINO 3.4, and NINO 4 have been downloaded from the World Meteorological Organization (https://climexp.knmi.nl/start.cgi).

  • SOI is downloaded from https://www.longpaddock.qld.gov.au/soi/soi-data-files/

  • Dipole Mode Index (DMI) has been downloaded from Japan Agency for Marine-Earth Science and Technology (https://www.jamstec.go.jp/aplinfo/sintexf/e/iod/dipole_mode_index.html).

  • The climate indices data are globally averaged.

The present section provides brief information about the techniques employed for simulating inflow: XGBoost, LGBM, Bi-LSTM, CNN, RF, and SEM.

XGBoost

The Gradient Boosting Mechanism is an ensemble of Decision Trees (DTs) to make predictions. XGBoost is an enhanced variant of Gradient Boosting Mechanism that incorporates a regularization term to prevent overfitting (Ma et al. 2021). Figure 2 describes the general structure of the XGBoost. Following are the various steps involved:
  • Initializing the model with a baseline prediction, such as the mean, median, or mode of the target variable.

  • Residuals are calculated by subtracting the baseline prediction from the actual target value (Equation (1)):
    (1) $r_i = y_i - \hat{y}_i^{(0)}$
    where $r_i$ is the error for the ith sample, $y_i$ is the actual value of the target variable, and $\hat{y}_i^{(0)}$ is the baseline prediction.
  • A DT identifies the most prominent feature and attempts to predict the residuals. The training process aims to minimize the weighted squared difference between the residuals and the values projected by the tree, with the weights determined by the gradient of the loss function. The weighted squared difference is computed using Equation (2):
    (2) $L_t = \sum_{i} l_i, \quad l_i = w_i\,\bigl(r_i - f_t(x_i)\bigr)^2$
    where $w_i$ is the weight of the ith sample, $f_t(x_i)$ is the prediction of the tth DT for the ith sample, and $l_i$ denotes the loss for the ith sample.
  • The predictions are updated by adding the predicted residuals to the baseline prediction (Equation (3)):
    (3) $\hat{y}_i^{(1)} = \hat{y}_i^{(0)} + f_1(x_i)$
  • Multiple DTs are constructed in a similar manner, and predictions are updated using Equation (4) until the desired accuracy is achieved:
    (4) $\hat{y}_i = \hat{y}_i^{(0)} + \sum_{t=1}^{T} f_t(x_i)$
    where $\hat{y}_i$ is the final prediction value and T represents the number of DTs.
Figure 2: XGBoost structure.

XGBoost consists of several parameters, such as n_estimators, max_depth, reg_alpha, reg_lambda, min_child_weight, gamma, max_leaf_nodes, max_delta_step, sub_sample, colsample_bylevel, and scale_pos_weight. Out of these, the first six are considered hyperparameters. The optimal hyperparameter values have been discussed in the results and discussion section.
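As a minimal illustration of how XGBoost could be tuned with the Tree-Structured Parzen Estimator used in this study, the sketch below pairs XGBRegressor with Optuna (whose default sampler is a TPE) over the search ranges listed in Table 4. The data, validation split, and trial count are placeholders rather than the authors' actual setup.

```python
import optuna
import xgboost as xgb
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Placeholder data standing in for the S7 input matrix (14 features)
X, y = make_regression(n_samples=500, n_features=14, noise=0.1, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, shuffle=False)

def objective(trial):
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 100, 1000),
        "max_depth": trial.suggest_int("max_depth", 1, 40),
        "reg_alpha": trial.suggest_float("reg_alpha", 0.0, 1.0),
        "reg_lambda": trial.suggest_float("reg_lambda", 0.0, 1.0),
        "min_child_weight": trial.suggest_int("min_child_weight", 1, 10),
        "gamma": trial.suggest_float("gamma", 0.0, 1.0),
    }
    model = xgb.XGBRegressor(objective="reg:squarederror", **params)
    model.fit(X_tr, y_tr)
    return mean_squared_error(y_val, model.predict(X_val))

study = optuna.create_study(direction="minimize",
                            sampler=optuna.samplers.TPESampler(seed=42))
study.optimize(objective, n_trials=50)
print(study.best_params)
```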

LGBM

LGBM works on an ensemble of DTs in a Gradient Boosting framework. However, unlike a conventional DT, the continuous values of each feature column are split into discrete bins, making the algorithm run faster. This technique is called histogram or bin splitting (Chakraborty et al. 2020). Another distinctive feature of LGBM that enhances its efficiency and helps avoid overfitting is Gradient-based One-Side Sampling, which retains the training instances with large gradients of the loss function and randomly samples from those with small gradients.

Traditional Gradient Boosting algorithms develop trees depth-wise, which can result in a significant number of empty leaf nodes in larger datasets. In contrast, LGBM grows tree leaf-wise. A new branch is only added to the tree if it increases the accuracy of the model. LGBM predicts the target variable similarly to XGBoost, but the optimal feature split is identified using a distinct procedure. Figure 3 shows the leaf-wise expansion of the LGBM tree structure, where green nodes are expandable and red nodes are non-expandable.
Figure 3: Leaf-wise expansion in the LGBM structure.

Parameters of LGBM are max_depth, n_estimators, learning_rate, num_leaves, max_bin, feature fraction, bagging_fraction, min_data_in_leaf, min_sum_hessian_in_leaf, bagging_freq, lambda_l1, lambda_l2, min_gain_to_split, and min_data_in_bin. Among them, the first seven are considered hyperparameters. Detailed information about hyperparameters is presented in the results and discussion section.
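For illustration, the sketch below configures an LGBMRegressor with leaf-wise growth and histogram binning. The hyperparameter values are loosely based on the S7 column of Table 5, while max_depth (not fully legible in the table) is an assumed value; the data are synthetic placeholders.

```python
import lightgbm as lgb
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=500, n_features=14, noise=0.1, random_state=0)

model = lgb.LGBMRegressor(
    n_estimators=841,        # number of boosting rounds (S7 value in Table 5)
    learning_rate=0.01,
    num_leaves=389,          # leaf-wise growth: limit on leaves per tree
    max_depth=17,            # assumed value; S7 entry not fully legible
    max_bin=281,             # histogram (bin) splitting resolution
    feature_fraction=0.95,   # fraction of features sampled per tree
    bagging_fraction=0.55,   # fraction of data sampled per iteration
    bagging_freq=1,          # perform bagging at every iteration
)
model.fit(X, y)
print(model.predict(X[:5]))
```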

Bi-LSTM

LSTM replaces the simple recurrent unit of Recurrent Neural Networks with a memory unit consisting of input, forget, update, and output gates. The forget gate determines how much past information to discard. The input gate decides how much new input to add to the cell state. The update phase calculates new cell-state values by combining the outputs of the input and forget gates. The output gate determines the new hidden state based on the updated cell state. A detailed description of the working of LSTM is provided by Liang et al. (2021).

Bi-LSTM is an improved version of LSTM capable of learning from both the past and the future of a sequence. This makes it suitable for regression problems in which the order of the input data is crucial (Roy et al. 2022). Figure 4 presents the architecture of Bi-LSTM. It incorporates two LSTM networks that perform the forward and backward calculations of the hidden vectors $\overrightarrow{h_t}$ and $\overleftarrow{h_t}$, respectively (Equations (5) and (6)). Both LSTMs learn from the sequence data, but one reads it backward and the other forward. The outputs of the two LSTMs are then combined through the output weights to produce the forecast $\hat{y}_t$ (Equation (7)):
(5) $\overrightarrow{h_t} = f\bigl(W_{x\overrightarrow{h}}\,x_t + W_{\overrightarrow{h}\overrightarrow{h}}\,\overrightarrow{h}_{t-1}\bigr)$
(6) $\overleftarrow{h_t} = f\bigl(W_{x\overleftarrow{h}}\,x_t + W_{\overleftarrow{h}\overleftarrow{h}}\,\overleftarrow{h}_{t+1}\bigr)$
(7) $\hat{y}_t = W_{\overrightarrow{h}y}\,\overrightarrow{h_t} + W_{\overleftarrow{h}y}\,\overleftarrow{h_t}$
Figure 4: Bi-LSTM architecture.

Here, $W_{x\overrightarrow{h}}$ and $W_{x\overleftarrow{h}}$ are the weights from the input to the forward and backward layers, $W_{\overrightarrow{h}\overrightarrow{h}}$ and $W_{\overleftarrow{h}\overleftarrow{h}}$ denote the hidden-to-hidden weights, and $W_{\overrightarrow{h}y}$ and $W_{\overleftarrow{h}y}$ are the weights from the forward and backward layers to the output layer, respectively.

The parameters of Bi-LSTM are lstm_units, dropout_rate, learning_rate, epochs, and the activation function. All of these are treated as hyperparameters. Detailed information about them is presented in the results and discussion section.
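A minimal Keras sketch of such a bidirectional network is given below, using the S7 optima reported in Table 6 (95 units, dropout 0.05, learning rate 0.005, relu activation). The input shape, batch size, epoch count, and synthetic data are assumptions for illustration, not the authors' implementation.

```python
import numpy as np
from tensorflow.keras.layers import LSTM, Bidirectional, Dense, Dropout, Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam

timesteps, n_features = 1, 14          # assumed shaping of the S7 inputs

model = Sequential([
    Input(shape=(timesteps, n_features)),
    Bidirectional(LSTM(95, activation="relu")),  # forward and backward LSTMs
    Dropout(0.05),
    Dense(1),                                    # daily inflow estimate
])
model.compile(optimizer=Adam(learning_rate=0.005), loss="mse")

# Placeholder sequences shaped (samples, timesteps, features)
X = np.random.rand(256, timesteps, n_features)
y = np.random.rand(256, 1)
model.fit(X, y, epochs=5, batch_size=32, verbose=0)  # few epochs, sketch only
```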

CNN

CNN is extensively employed for image recognition because of its feature extraction ability. For regression, a 1D-CNN generally processes one-dimensional data (such as a sequence of numbers) by applying multiple convolution operations to the input (Shu et al. 2021). The convolution layer is the fundamental unit of a CNN model, and the kernels that define its parameters are its most essential components. Each kernel slides over the input matrix of the training sample to compute dot products, and the resulting matrix is passed through an activation function to obtain the convolution features during the feature extraction step (Equation (8)). The optimal values of the hyperparameters are discussed in the results and discussion section.
(8) $C_i = f\bigl(W_i * X + b_i\bigr)$
where $W_i$ is the weight matrix of the ith kernel, $b_i$ is the bias of the ith kernel, $C_i$ is the convolution feature produced by the ith kernel of the convolution layer, X is the input, and f is the activation function.
Even after the convolution layer is applied, the feature maps remain comparatively large. A pooling layer is therefore employed immediately afterwards to reduce them to the desired size. During pooling, the input map is partitioned into several non-overlapping sub-regions, and a down-sampling function is applied to each sub-region, resulting in a more compact map representation. Max-pooling is the most popular down-sampling operation; it picks the largest value in each examined region of its inputs (Equation (9)):
(9) $P_j = \max_{1 \le k \le m} C_{j,k}$
where m is the number of features in a sub-region and $C_{j,k}$ is the kth convolution feature in the jth sub-region.
After several convolution and pooling operations, various characteristics of the input layer are retrieved and identified, and the output maps have reduced dimensionality. The fully connected vector is obtained by flattening and concatenating the extracted feature maps. A relation is then established between the weights W, bias b, fully connected vector p, and the predicted value Q (Equation (10)). The steps of the CNN are presented in Figure 5.
(10) $Q = f\bigl(Wp + b\bigr)$
Figure 5: CNN architecture.

Filters, strides, kernel_size, pool_size, hidden_nodes, activation function, epochs, number of layers, learning_rate, and batch_size are all parameters of CNN. The first seven of them are termed hyperparameters. Detailed information on the optimal values of these hyperparameters is given in the results and discussion section.
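The sketch below assembles a 1D-CNN of this kind in Keras, treating the 14 S7 inputs as a one-dimensional sequence. The filters (228), hidden nodes (213), pool size (2), and relu activation follow the S7 values reported in the text and Table 7; the kernel size and remaining settings are assumed for illustration.

```python
import numpy as np
from tensorflow.keras.layers import Conv1D, Dense, Flatten, Input, MaxPooling1D
from tensorflow.keras.models import Sequential

n_features = 14                        # S7 inputs treated as a 1-D sequence

model = Sequential([
    Input(shape=(n_features, 1)),
    Conv1D(filters=228, kernel_size=3, activation="relu"),  # kernel_size assumed
    MaxPooling1D(pool_size=2),         # max-pooling over non-overlapping regions
    Flatten(),                         # stretch feature maps into one vector
    Dense(213, activation="relu"),     # fully connected layer
    Dense(1),                          # predicted inflow Q
])
model.compile(optimizer="adam", loss="mse")

X = np.random.rand(256, n_features, 1)   # placeholder data
y = np.random.rand(256, 1)
model.fit(X, y, epochs=5, batch_size=64, verbose=0)
```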

RF

RF uses bagging and has a tree-based structure (Desai & Ouarda 2021). DTs are used as base learners to make predictions, and the use of multiple trees reduces the influence of noise and outliers, making RF resistant to overfitting. RF also serves as an embedded feature selection technique by generating an importance score, namely, Random Forest Importance (RFI), for each input variable (Carranza et al. 2021). The RF algorithm works in the following steps:
  • The entire dataset of size T is randomly resampled into N bootstrap subsamples, and a DT is constructed on each bootstrapped sample.

  • Each DT recursively splits the data into smaller subsets according to a threshold value (typically a point that minimizes the Mean Square Error) associated with a feature variable. The contribution of a feature to the reduction in prediction error is determined by estimating the decrease in Mean Square Error; the larger this decrease, the higher the RFI of the feature.

  • To predict a new data point, RF passes it through each DT. The optimal number of trees is determined through hyperparameter tuning. The outputs of all the DTs are averaged to obtain the final prediction. Figure 6 outlines the sequential operation of the RF.

Figure 6: RF algorithm.

The RF algorithm comprises several parameters such as n_estimators, max_sample, max_features, min_sample_split, max_depth, max_leaf_nodes, and min_samples_leaf, out of which the first four are considered hyperparameters. The optimal values of the hyperparameters have been discussed in the results and discussion section.
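A minimal scikit-learn sketch of an RF regressor, including extraction of the RFI scores used later for climate-index screening, is shown below. The hyperparameter values are illustrative assumptions rather than the tuned values of Table 8, and the data are synthetic.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=500, n_features=15, noise=0.1, random_state=0)

rf = RandomForestRegressor(
    n_estimators=284,        # number of trees (illustrative)
    max_depth=6,             # illustrative depth limit
    min_samples_split=2,
    min_samples_leaf=1,
    max_features="sqrt",     # sqrt of the feature count tried at each split
    random_state=0,
)
rf.fit(X, y)

# Random Forest Importance (RFI): normalized mean decrease in squared-error
# impurity attributed to each input feature
for idx, score in enumerate(rf.feature_importances_):
    print(f"feature {idx}: RFI = {score:.3f}")
```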

SEM

SEM involves an algorithm (also known as a meta-model) that receives the outputs of individual ML algorithms as input and attempts to discover the optimal way to combine the input predictions into a more accurate output prediction. The predicted outcomes of each algorithm are regarded as a meta-feature (Gu et al. 2022). In addition, the original features of the dataset are included, resulting in a new dataset. In SEM, the choice of meta-model is crucial. The best-performing ML algorithm has been chosen as the meta-model for training the new dataset. Hyperparameter optimization of the meta-model is performed to determine the optimal architecture of the algorithm. Figure 7 shows the workflow of an ensemble of ML algorithms using SEM. The stepwise workflow of SEM is as follows (Lu et al. 2023):
Figure 7

Flowchart representing main steps in SEM.

Figure 7

Flowchart representing main steps in SEM.

Close modal

Step 1: Training of ML algorithms with hyperparameter tuning

Let $X_{\mathrm{train}} \in \mathbb{R}^{m \times n}$ and $y_{\mathrm{train}}$ be the input features and target variable of the training dataset, respectively, and let $X_{\mathrm{test}} \in \mathbb{R}^{p \times n}$ be the features of the testing dataset. Here, m and p are the numbers of training and testing samples, and n is the number of features. Let $M_i$ represent the ith ML algorithm with hyperparameters $\theta_i$ (Equation (11)):
(11) $\theta_i^{*} = \mathrm{Tune}\bigl(M_i, X_{\mathrm{train}}, y_{\mathrm{train}}\bigr)$
$\mathrm{Tune}(\cdot)$ denotes the hyperparameter tuning process, which can be implemented using various methods. After obtaining the best parameters $\theta_i^{*}$ for the ith algorithm, each algorithm is trained on the training data (Equation (12)):
(12) $M_i^{*} = M_i\bigl(X_{\mathrm{train}}, y_{\mathrm{train}}; \theta_i^{*}\bigr)$
  • $M_i^{*}$ makes predictions on the training and testing sets using Equations (13) and (14):
    (13) $\hat{y}_{\mathrm{train}}^{(i)} = M_i^{*}\bigl(X_{\mathrm{train}}\bigr)$
    (14) $\hat{y}_{\mathrm{test}}^{(i)} = M_i^{*}\bigl(X_{\mathrm{test}}\bigr)$

Step 2: Development of a meta-model

The training predictions of the ML algorithms ($\hat{y}_{\mathrm{train}}^{(1)}, \ldots, \hat{y}_{\mathrm{train}}^{(k)}$) along with $X_{\mathrm{train}}$ are combined to create a training meta-feature matrix (Equation (15)), and the target variable $y_{\mathrm{train}}$ is used to train a meta-model $M_{\mathrm{meta}}$ (Equation (17)). Hyperparameter tuning is performed to find the optimal hyperparameters for $M_{\mathrm{meta}}$ (Equation (16)):
(15) $X_{\mathrm{train}}^{\mathrm{meta}} = \bigl[X_{\mathrm{train}},\ \hat{y}_{\mathrm{train}}^{(1)}, \ldots, \hat{y}_{\mathrm{train}}^{(k)}\bigr]$
(16) $\theta_{\mathrm{meta}}^{*} = \mathrm{Tune}\bigl(M_{\mathrm{meta}}, X_{\mathrm{train}}^{\mathrm{meta}}, y_{\mathrm{train}}\bigr)$
(17) $M_{\mathrm{meta}}^{*} = M_{\mathrm{meta}}\bigl(X_{\mathrm{train}}^{\mathrm{meta}}, y_{\mathrm{train}}; \theta_{\mathrm{meta}}^{*}\bigr)$
where k is the number of base ML algorithms.

Step 3: Final predictions using the meta-model

The testing meta-feature matrix is developed using the testing predictions of each ML algorithm ($\hat{y}_{\mathrm{test}}^{(1)}, \ldots, \hat{y}_{\mathrm{test}}^{(k)}$) and $X_{\mathrm{test}}$ (Equation (18)):
(18) $X_{\mathrm{test}}^{\mathrm{meta}} = \bigl[X_{\mathrm{test}},\ \hat{y}_{\mathrm{test}}^{(1)}, \ldots, \hat{y}_{\mathrm{test}}^{(k)}\bigr]$
$M_{\mathrm{meta}}^{*}$ finally makes predictions on $X_{\mathrm{test}}^{\mathrm{meta}}$, as shown in Equation (19):
(19) $\hat{y}_{\mathrm{test}} = M_{\mathrm{meta}}^{*}\bigl(X_{\mathrm{test}}^{\mathrm{meta}}\bigr)$
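A simplified, self-contained sketch of this stacking workflow is given below. Generic scikit-learn regressors stand in for the tuned XGBoost, LGBM, CNN, and RF base learners and for the Bi-LSTM meta-model; the data and model choices are placeholders illustrating Equations (12)–(19), not the authors' configuration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import Ridge

X, y = make_regression(n_samples=400, n_features=14, noise=0.1, random_state=0)
X_train, X_test, y_train = X[:280], X[280:], y[:280]

# Level-0 learners (stand-ins for the tuned base ML algorithms), Eq. (12)
base_models = [RandomForestRegressor(n_estimators=100, random_state=0),
               GradientBoostingRegressor(random_state=0)]
for m in base_models:
    m.fit(X_train, y_train)

def meta_features(X_part, models):
    """Append each base model's predictions to the original features, Eq. (15)/(18)."""
    preds = [m.predict(X_part).reshape(-1, 1) for m in models]   # Eq. (13)/(14)
    return np.hstack([X_part] + preds)

meta_model = Ridge()                         # stand-in for the Bi-LSTM meta-model
meta_model.fit(meta_features(X_train, base_models), y_train)          # Eq. (17)
y_hat_test = meta_model.predict(meta_features(X_test, base_models))   # Eq. (19)
print(y_hat_test[:5])
```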

Other related supporting approaches employed

The Bayesian sampling technique Tree-Structured Parzen Estimator (Nguyen et al. 2020; Sipper 2022) is used for hyperparameter tuning.

Spearman correlation coefficient (SCC) is a statistical metric that defines the association between two variables (Zhou et al. 2023).

Forward feature selection (FFS) is a dimensionality reduction technique based on selecting the most significant features of the dataset. At each stage, the algorithm adds one feature, trains the model, and calculates the model's efficiency using the root mean square error (Amiri et al. 2011). The features are then ranked using F-values, defined as the ratio of the root mean square error of the model with and without the feature; an F-value thus indicates how much the feature enhances model performance.

The entropy weight method (EWM) assigns weights to various features based on a decision matrix (Lee & Chang 2018). The weight computation involves formulating the decision matrix, normalizing it, and estimating entropy.
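A compact sketch of the entropy weight computation is shown below: the decision matrix is column-normalized, the entropy of each criterion is estimated, and weights follow from each criterion's divergence. The example matrix is a placeholder; in this study the rows would be the 15 climate indices and the columns the RFI, SCC, and FFS scores.

```python
import numpy as np

def entropy_weights(D):
    """Entropy weight method for an (alternatives x criteria) matrix of non-negative scores."""
    P = D / D.sum(axis=0, keepdims=True)        # normalize each criterion column
    m = D.shape[0]
    logP = np.where(P > 0, np.log(P), 0.0)      # convention: 0 * log(0) = 0
    e = -(P * logP).sum(axis=0) / np.log(m)     # entropy of each criterion
    d = 1.0 - e                                 # degree of divergence
    return d / d.sum()                          # weights sum to 1

# Placeholder decision matrix: rows = candidate indices, columns = three selection methods
D = np.array([[0.20, 0.32, 176.3],
              [0.03, 0.24, 145.9],
              [0.04, 0.25, 148.2],
              [0.00, 0.00,   5.8]])
print(entropy_weights(D))
```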

Kling–Gupta efficiency (KGE) is a performance metric that combines the ratio of standard deviations ($\alpha$), the ratio of means ($\beta$), and the correlation (R) between observed and predicted data (Mathevet et al. 2023). KGE ranges from −∞ to 1, with 1 representing maximum efficacy (Equation (20)):
(20) $\mathrm{KGE} = 1 - \sqrt{(R-1)^2 + (\alpha-1)^2 + (\beta-1)^2}, \quad \alpha = \dfrac{\sigma_{\mathrm{sim}}}{\sigma_{\mathrm{obs}}}, \quad \beta = \dfrac{\mu_{\mathrm{sim}}}{\mu_{\mathrm{obs}}}$
where $\sigma_{\mathrm{obs}}$ and $\sigma_{\mathrm{sim}}$ are the standard deviations of the observed and predicted data, respectively, $\mu_{\mathrm{sim}}$ and $\mu_{\mathrm{obs}}$ are the means of the simulated and observed values, and R is the Pearson correlation coefficient between the observed and predicted data.
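The KGE of Equation (20) can be computed directly, as in the short function below (variable names are generic placeholders).

```python
import numpy as np

def kge(observed, simulated):
    """Kling-Gupta efficiency as defined in Equation (20)."""
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(simulated, dtype=float)
    r = np.corrcoef(obs, sim)[0, 1]        # Pearson correlation R
    alpha = sim.std() / obs.std()          # ratio of standard deviations
    beta = sim.mean() / obs.mean()         # ratio of means
    return 1.0 - np.sqrt((r - 1)**2 + (alpha - 1)**2 + (beta - 1)**2)

# Example: kge([10, 50, 200, 80], [12, 45, 190, 95])
```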
The methodological workflow of the present study is divided into four phases (refer to Figure 8).
  • The best climate indices attributed to reservoir inflow have been selected in the first phase.

  • Seven input scenarios are studied in the second phase to train the ML algorithms. These scenarios also incorporate the selected climate indices.

  • In the third phase, SEM is implemented, and a comparative assessment has been performed with other ML algorithms.

  • Comparative evaluation of ML algorithms simulating peak flow is performed in the fourth phase.

Figure 8: Methodological workflow of the study.

This section divides the results into four phases (Figure 8).

Selection of best climate indices influencing reservoir inflow discharges

Fifteen climate indices are considered in this study based on previous research (Moradi et al. 2020). The RFI, FFS, and SCC methods determine the most significant indices corresponding to daily reservoir inflow. The relative importance of the three feature selection methods is calculated using EWM. The highest weight of 0.5 is assigned to RFI, followed by SCC (0.26) and FFS (0.24). Normalized weighted scores are estimated for each index and ranked accordingly. A few studies (Rucker et al. 2015) assert that median splits are robust for identifying critical variables, and this approach is employed in the present study.

Table 1 shows the results of feature selection methods employed to find the most significant climate indices. Rankings of the index for each of the techniques have also been presented. The most significant climate indices attributed to reservoir inflow have been highlighted in bold.

Table 1

Ranking of climate indices based on features selection methods

Climate index | RFI score (rank) | SCC absolute coefficient (rank) | FFS F-value (rank) | EWM normalized weighted score (final rank)
AO | 0.000 (13) | 0.002 (14) | 5.800 (13) | 0.002 (14)
GBI | 0.201 | 0.321 | 176.340 | 0.194
NAO | 0.016 (10) | 0.004 (13) | 4.290 (14) | 0.012 (11)
PNA | 0.001 (13) | 0.002 (14) | 1.360 (15) | 0.001 (15)
DMI | 0.028 | 0.237 | 145.870 | 0.075
SOI | 0.001 (13) | 0.193 | 90.010 | 0.042
WPA | 0.007 (11) | 0.025 (11) | 11.280 (11) | 0.009 (12)
GH | 0.043 | 0.245 | 148.230 | 0.085
MW | 0.020 | 0.013 (12) | 11.180 (12) | 0.016 (10)
SLP | 0.039 | 0.038 (10) | 32.970 (10) | 0.034
ZW | 0.004 (12) | 0.206 | 99.050 | 0.046
NINO 1.2 | 0.022 | 0.290 | 67.710 | 0.06 (13)
NINO 3 | 0.241 | 0.235 | 73.660 | 0.188
NINO 3.4 | 0.124 | 0.241 | 115.540 | 0.125
NINO 4 | 0.252 | 0.260 | 158.020 | 0.183

Note: The most significant climate indices attributing to reservoir inflow have been highlighted in bold.

As observed from Table 1, there is no uniformity in the rankings across RFI, SCC, and FFS, which may be due to their different scoring philosophies. According to FFS and SCC, GBI is the most significant index, whereas NINO 4 is the best predictor based on RFI. The other indices, based on teleconnection patterns (AO, NAO, PNA, MW, and WPA), exhibited the poorest performance across all three methods. This necessitates the use of EWM.

The normalized weighted scores of the indices from EWM suggest that GBI is the most sensitive climatic indicator of reservoir inflow, followed by NINO 3 and NINO 4. The results show that six indices have weighted scores greater than the median value of 0.06. Accordingly, the predictors (with respective weights) GBI (0.194), NINO3 (0.188), NINO4 (0.183), NINO3.4 (0.125), GH (0.085), and DMI (0.075) have been chosen as inputs for the ML algorithms to simulate inflow and construct cause–effect models.

Selection of input scenarios and training of ML algorithms

A 70:30 splitting ratio has been chosen for training and testing (Khosravi et al. 2018). The training dataset covers 11 years (1 August 2007 to 31 July 2018), whereas testing covers 4½ years (1 August 2018 to 31 December 2022). In addition, standardization is performed wherever required. Table 2 provides a statistical description of the training (3,751 records) and testing (1,611 records) datasets, including the mean, standard deviation, minimum, and maximum values of each feature.

Table 2

Statistical parameters of the input variables (P(t), E(t), Q(t), best climate indices (GBI, NINO3, NINO 3.4, NINO4, GH, DMI)) in the training and testing dataset

Statistical parameter | P(t) (mm) | E(t) (mm/day) | Q(t) (m³/s) | GBI (m) | NINO 3 (°C) | NINO 3.4 (°C) | NINO 4 (°C) | GH (m) | DMI (°C)
Mean | 2.39 (2.30) | 7.60 (7.26) | 129.88 (267.60) | 5,320.14 (5,329.54) | 0.24 (−0.44) | 0.20 (−0.44) | 0.26 (−0.17) | 80.39 (80.25) | 0 (0.14)
Standard deviation | 5.93 (9.60) | 4.83 (5.03) | 549.42 (1,086.74) | 179.68 (180.74) | 0.91 (0.49) | 0.91 (0.55) | 0.70 (0.64) | 33.99 (34.76) | 0.51 (0.79)
Minimum | 0 (0) | 0.79 (0.76) | 0 (0) | 4,853.03 (4,853.01) | −1.70 (−1.45) | −1.81 (−1.72) | −1.56 (−1.21) | −4.25 (−29.50) | −1.54 (−1.37)
Maximum | 92.80 (229.05) | 19.31 (18.80) | 11,846 (11,786.41) | 5,707.53 (5,703.43) | 3.07 (0.74) | 3.29 (0.90) | 2.09 (1.32) | 167.75 (164.00) | 1.43 (3.09)

Note: Values in parentheses represent values corresponding to the testing dataset.

Training and testing datasets are statistically comparable. Rainfall (P(t)) in the catchment area during the training period has a mean of 2.39 mm, similar to that of the testing period (2.30 mm). The mean daily evaporation rate (E(t)) in the training period (7.60 mm/day) is slightly higher than in testing (7.26 mm/day). Inflow (Q(t)) in the training phase has a slightly higher maximum value (11,846 m³/s) and a lower mean value (129.88 m³/s) than in the testing phase, where the maximum observed value was 11,786.41 m³/s and the mean was 267.60 m³/s. The maximum GBI values (5,707.53 and 5,707.43 m) for the training and testing phases are similar, and the comparable means (5,320.14 m in training and 5,329.54 m in testing) suggest that average GBI stays consistent across both datasets. The standard deviation of GH is 33.99 m in training and marginally higher at 34.76 m in testing. The standard deviation of DMI in testing (0.79) is higher than in training (0.51), indicating greater variability of DMI in the testing dataset. The standard deviation of NINO 4 is similar for the training (0.70 °C) and testing (0.64 °C) phases. The NINO indices during training exhibit comparatively higher values than those during testing, as evidenced by their minimum, maximum, and mean values.

The Pearson correlation coefficient is adopted to determine the optimal number of time-lagged inflow days. Lags of up to 12 days are considered for this purpose. The coefficients of the lagged inflow values are displayed in Figure 9. The correlation decreases progressively as the lag increases, and Q(t − 12) has the lowest correlation of 0.29 with the current inflow.
Figure 9: Correlation of lagged inflow with respect to current inflow.

Q(t − 1) and Q(t − 2) have correlation coefficients of 0.87 and 0.72, respectively. These are chosen as input variables for the time series models, as their values exceed the threshold of 0.7 (Ratner 2009). Rainfall and evaporation lags of up to 2 days are also considered to maintain consistency.
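The lag screening described above could be reproduced along the lines of the pandas sketch below, which correlates an inflow series with its own lagged copies and keeps lags whose coefficient exceeds 0.7. The synthetic series is a placeholder for the SRSP inflow record.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Placeholder daily inflow series; a rolling mean adds the persistence real inflows show
inflow = pd.Series(rng.gamma(2.0, 90.0, size=5362)).rolling(5, min_periods=1).mean()

lag_corr = {lag: inflow.corr(inflow.shift(lag)) for lag in range(1, 13)}
selected_lags = [lag for lag, r in lag_corr.items() if r > 0.7]   # threshold of 0.7
print(lag_corr, selected_lags)
```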

Seven scenarios (S1–S7) are discussed, and detailed information is presented in Table 3. In all scenarios, Q(t) is the output. Scenarios S1 and S2 are based on time series models in which only lagging inflow values are used to train various ML algorithms. Scenarios S3–S6 contain cause-and-effect variables such as P(t), E(t) and climate indices. In scenario S7, the inputs of all preceding scenarios are combined. A generalized framework representing the input–output of the scenarios is presented (Equation (21)):
(21) $Q(t) = f\bigl(Q(t-1), \ldots, Q(t-n),\ P(t), P(t-1), \ldots, P(t-n),\ E(t), E(t-1), \ldots, E(t-n),\ \text{climate indices}\bigr)$
where Q(t − n) is the reservoir inflow lagged by n days (m³/s), P(t) is the present-day rainfall (mm), P(t − n) is the rainfall lagged by n days (mm), E(t) is the present-day evaporation (mm/day), and E(t − n) is the evaporation lagged by n days (mm/day).
Table 3

Input combinations for predicting reservoir inflow

Scenarios | Model inputs
S1 | Q(t − 2), Q(t − 1)
S2 | Q(t − 1)
S3 | P(t), P(t − 1), P(t − 2), E(t), E(t − 1), E(t − 2)
S4 | P(t), E(t)
S5 | NINO 3, NINO 3.4, NINO 4, GBI, DMI, GH
S6 | NINO 3, NINO 3.4, NINO 4, GBI, DMI, GH, P(t), E(t)
S7 | NINO 3, NINO 3.4, NINO 4, GBI, DMI, GH, P(t), P(t − 1), P(t − 2), E(t), E(t − 1), E(t − 2), Q(t − 1), Q(t − 2)

Tables 4–8 illustrate the hyperparameter tuning of the ML algorithms using the Tree-Structured Parzen Estimator for S1–S7. KGE values for training and testing are also part of these tables.

Table 4

Hyperparameter tuning and performance assessment of XGBoost

Hyperparameter | Range of parameter | Optimal value (S1, S2, S3, S4, S5, S6, S7)
n_estimators | 100–1,000 | 400, 310, 115, 105, 255, 555, 755
max_depth | 1–40 | 13, 27
reg_alpha | 0–1 | 0.6, 0.5, 0.1, 0.6, 0.5, 0.7
reg_lambda | 0–1 | 0.2, 0.2, 0.4, 0.5, 0.8, 0.5, 0.8
min_child_weight | 1–10 | –
gamma | 0–1 | 0.3, 0.3, 0.5, 0.1, 0.6, 0.9, 0.5
Training KGE | – | 0.81, 0.75, 0.56, 0.41, 0.99, 0.97, 0.99
Testing KGE | – | 0.8, 0.79, 0.51, 0.4, 0.73, 0.69, 0.85
Table 5

Hyperparameter tuning and performance assessment of LGBM

Hyperparameter | Range of parameter | Optimal value (S1, S2, S3, S4, S5, S6, S7)
max_depth | 3–20 | 12, 14, 19, 17
n_estimators | 100–1,000 | 741, 812, 893, 339, 913, 461, 841
learning_rate | 0.001–0.1 | 0.09, 0.04, 0.003, 0.02, 0.04, 0.03, 0.01
num_leaves | 10–500 | 129, 429, 314, 350, 149, 265, 389
max_bin | 50–300 | 236, 127, 222, 232, 239, 177, 281
feature_fraction | 0.1–1 | 0.11, 0.83, 0.51, 0.97, 0.88, 0.65, 0.95
bagging_fraction | 0.1–1 | 0.7, 0.75, 0.3, 0.7, 0.75, 0.9, 0.55
Training KGE | – | 0.81, 0.78, 0.47, 0.42, 0.98, 0.97, 0.9
Testing KGE | – | 0.82, 0.82, 0.46, 0.41, 0.71, 0.66, 0.85
Table 6

Hyperparameter tuning and performance assessment of Bi-LSTM

Hyperparameter | Range of parameter | Optimal value (S1, S2, S3, S4, S5, S6, S7)
lstm_units | 64–256 | 230, 182, 148, 208, 249, 188, 95
dropout_rate | 0.0–0.5 | 0.12, 0.37, 0.41, 0.32, 0.49, 0.23, 0.05
learning_rate | 0.0001–0.1 | 0.001, 0.001, 0.003, 0.003, 0.008, 0.015, 0.005
activation | [relu, tanh, sigmoid] | relu, relu, relu, relu, tanh, tanh, relu
epochs | 1–100 | 50, 29, 98, 94, 92, 89, 55
Training KGE | – | 0.78, 0.8, 0.24, 0.17, 0.7, 0.86, 0.92
Testing KGE | – | 0.83, 0.81, 0.22, 0.14, 0.56, 0.7, 0.87
Table 7

Hyperparameter tuning and performance assessment of CNN

Hyperparameter | Range of parameter | Optimal value (S1, S2, S3, S4, S5, S6, S7)
filters | 32–316 | 214, 298, 67, 315, 308, 165, 228
kernel_size | 3–7 | –
pool_size | 1–3 | –
strides | 1–3 | –
hidden nodes | 64–256 | 159, 181, 243, 255, 90, 227, 213
activation | [relu, tanh, sigmoid] | relu, relu, sigmoid, tanh, relu, tanh, relu
Training KGE | – | 0.75, 0.76, 0.14, 0.18, 0.47, 0.62, 0.85
Testing KGE | – | 0.78, 0.81, 0.14, 0.17, 0.43, 0.53, 0.86
Table 8

Hyperparameter tuning and performance assessment of RF

Hyperparameter | Range of parameter | Optimal value (S1, S2, S3, S4, S5, S6, S7)
n_estimators | 10–500 | 458, 181, 387, 445, 340, 314, 284
max_depth | 1–20 | 15, 18
min_samples_split | 1–10 | –
min_samples_leaf | 1–5 | –
Training KGE | – | 0.82, 0.81, 0.49, 0.42, 0.84, 0.82, 0.85
Testing KGE | – | 0.81, 0.82, 0.46, 0.4, 0.63, 0.62, 0.84

XGBoost (refer to Table 4): In each scenario, the number of DT (n_estimators) varied; however, in most cases, the optimal value of this hyperparameter is on the lower side of the search space. In S1–S4, a minimum tree depth of 1 is achieved. In most of the instances, reg_alpha falls between 0.5 and 0.7. Similarly, the weight of the child nodes of the trees (min_child_weight) is primarily at the upper end of the range (6–9). Parameters such as max_leaf_nodes, max_delta_step have been assigned a default value of 1, whereas sub_sample, colsample_bylevel, scale_pos_weight have a value of 0.

XGBoost achieved its highest KGE values of 0.99, 0.97, and 0.99 in S5, S6, and S7 in training. Interestingly, KGE values in testing these scenarios are unsatisfactory, specifically for S5 and S6. In S3 and S4, the performance is unsatisfactory in training and testing.

LGBM (refer to Table 5): The optimal DT in the model (n_estimators) has been found in the upper side of the range (741–931) for all the scenarios except S4 and S6. The optimal depth of the DT (max_depth) for all scenarios varied across the search space. In most scenarios, the num_leaves hyperparameter has obtained values greater than 200, indicating higher model complexity. LGBM has obtained the optimal values for max_bin, and feature_fraction on the upper side of the search space for most scenarios. Default values of 20 and 0.003 have been set to parameters such as min_sum_hessian_in_leaf and min_data_in_leaf, respectively. The default values for other parameters, including lambda_l1, bagging_freq, min_gain_to_split, and lambda_l2 have been set to zero.

LGBM has performed similarly to XGBoost in all scenarios. In training, S5 attained the highest KGE of 0.98, followed by S6 (0.97) and S7 (0.9). In testing, S7 achieved a KGE value of 0.85. Notably, the testing efficacy of S5 and S6 decreased by 28 and 32%, respectively, compared to training, indicating lower generalization capability in these scenarios.

Bi-LSTM (refer to Table 6): The optimal number of lstm_units in the LSTM layer ranges from 95 to 249 across S1–S7. The optimal values for dropout_rate and learning_rate varied across the search space. The ‘relu’ activation function was the best in all scenarios except S5 and S6. In most instances, optimal values of epochs are on the upper side of the range.

Bi-LSTM has achieved its highest KGE values of 0.92 and 0.87 in training and testing for S7. S1, S2, and S6 have also performed well, as evidenced by their training KGE values (0.78, 0.8, and 0.86) and testing KGE values (0.83, 0.81, and 0.7), respectively. The algorithm did not perform well in S3 and S4.

CNN (refer to Table 7): The number of filters in the convolution layer was on the upper side of the search space for all the scenarios except for S3 and S6. The optimal number of kernel_size varies from 3 to 7 across S1–S7. In S3–S7, the optimal pool size is two. Across all scenarios, the activation function is not consistent. The number of convolution layers, learning_rate, and batch_size have been set to their default values of 1, 0.001, and 64, respectively.

The best performance of CNN has been obtained in S7 for training and testing with KGE values of 0.85 and 0.86, respectively. S1 and S2 have also shown good performance, achieving KGE values of 0.75 and 0.76 during training and 0.78 and 0.81 during testing. The cause–effect models (S3 and S4) performed unsatisfactorily, as evidenced by their low KGE values.

RF (refer to Table 8): The optimal values for min_samples_split and min_samples_leaf are on the lower side of the search space across S1–S7. The optimal values of n_estimators do not exhibit significant variation except for S2. The optimal values for max_depth (2–6) are on the lower side of the search space for most scenarios. The max_sample parameter has been set to none, indicating that all samples are considered for each tree. The max_features parameter has been set to sqrt, indicating that the number of features considered for optimal node separation is the square root of the total number of features. Setting max_leaf_nodes to none indicates no upper limit on the number of leaf nodes.

RF has achieved the highest KGE value in training (0.85) and testing (0.84) for S7. RF has also performed well for S5 and S6 in training, as evidenced by their respective KGE values (0.84 and 0.82). Notably, the performance of these scenarios is moderate in the testing. RF has also not performed well in S3 and S4, similar to other algorithms.

Selection of best input combination and implementation of SEM

It is evident from Tables 4–8 that S7 performed better than the time series (S1 and S2) and cause-and-effect models (S3–S6). Consequently, the inputs of S7 have been considered suitable for enhancing the predictive accuracy of the ML algorithms. Figure 10(a)–10(e) present the scatter plots between observed and predicted inflow values for S7.
Figure 10: Performance of the algorithms in S7 for the testing period: (a) Bi-LSTM, (b) XGBoost, (c) CNN, (d) LGBM, and (e) RF.

All the ML algorithms performed well, with KGE values larger than 0.8 in testing. Bi-LSTM achieved the maximum KGE of 0.87 during the testing phase, followed by CNN (0.86), XGBoost (0.85), LGBM (0.85), and RF (0.84). In training, S7 with XGBoost attained the maximum KGE of 0.99; nonetheless, the performance decreased by about 14% during the testing phase, i.e., to 0.85. Compared to its training phase, the testing performance of Bi-LSTM reduced by only 5%. Consequently, Bi-LSTM is considered superior to XGBoost in S7 and is used as the basis for SEM.

In SEM, the optimal hyperparameter values for lstm_units, dropout_rate, epochs, and learning_rate are determined to be 124, 0.2, 62, and 0.001, respectively. Among relu, tanh, and sigmoid, relu is the suitable activation function. A radar chart reflects the comparative assessment of SEM with the ML algorithms in the training and testing (Figure 11).
Figure 11: Comparative analysis of the SEM with other ML algorithms.

Figure 11 presents KGE values of SEM (0.94 and 0.89 during the training and testing) and five ML algorithms. KGE values of SEM are observed to be better than those of Bi-LSTM (0.92 and 0.87).

Peak reservoir inflow analysis

The efficacy of the ML algorithms is also explored for peak reservoir inflow analysis. The 98th percentile has been utilized to identify extreme events (Bharti et al. 2016). Thirty-four extreme events were identified, of which 20 are used for training. The resulting KGE values of XGBoost, LGBM, Bi-LSTM, CNN, and RF for training are 0.93, 0.46, 0.85, 0.8, and 0.67, and for testing are 0.77, 0.31, 0.81, 0.8, and 0.59, respectively. The testing efficacy of XGBoost decreased by 17% compared to training, indicating lower generalization capability. Thus, Bi-LSTM, with KGE values of 0.85 and 0.81, has been selected as the meta-model for SEM. SEM obtained KGE values of 0.86 and 0.81 during training and testing, indicating its high performance in simulating peak inflow events.
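The extreme-event screening described above amounts to a simple percentile cut, sketched below on a placeholder inflow array standing in for the SRSP record.

```python
import numpy as np

rng = np.random.default_rng(1)
inflow = rng.gamma(2.0, 90.0, size=5362)         # placeholder daily inflow series (m3/s)

threshold = np.percentile(inflow, 98)            # 98th-percentile cutoff for extremes
peak_events = inflow[inflow >= threshold]        # events retained for the peak analysis
print(f"threshold = {threshold:.1f} m3/s, events = {peak_events.size}")
```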

Figure 12(a) and 12(b) illustrate how well the extreme simulated inflows match the observed ones using the chosen ML algorithms and SEM during training and testing, respectively.
Figure 12: (a) Simulation of peak reservoir inflows using ML algorithms in training and (b) simulation of peak reservoir inflows using ML algorithms in testing.

Figure 12(a) shows that XGBoost, SEM, and Bi-LSTM performed well in simulating peak inflows during training. All the algorithms underestimated most of the peak inflow events, except SEM. XGBoost was able to replicate the observed peak flows quite efficiently. Bi-LSTM underestimated 16 out of 20 extreme occurrences, with an average residual of 875 m³/s. The performance of the LGBM and RF algorithms was moderate, as they consistently underestimated the majority of peak inflows, with average errors of 1,947 and 1,726 m³/s, respectively. In contrast, SEM marginally overestimated all the peak inflows, with an average residual of 598 m³/s.

It is observed from Figure 12(b) that Bi-LSTM and XGBoost mostly underestimated the peak inflow events in testing. LGBM was not able to simulate the peaks, as seen from the larger gap between the observed and simulated lines. With an average residual of 1,294 m³/s, Bi-LSTM underestimated 10 peak inflow occurrences. CNN underestimated 8 of 14 events, with an average error of 1,219 m³/s. On the other hand, SEM marginally overestimated almost every peak inflow, with an average residual of 978 m³/s. RF also underestimated the majority of peak inflow occurrences, with an average error of 1,885 m³/s.

Discussion

Most previous studies have simulated reservoir inflows on a monthly time scale (Noorbeh et al. 2020; Allawi et al. 2021; Han et al. 2023), because aggregating data to the monthly scale over extended periods reduces the noise and variability inherent in daily data, making it easier for ML algorithms to identify patterns. The present study has, however, undertaken a more intricate approach by considering the daily timescale in predicting inflow to the SRSP. In this regard, a substantial amount of training data spanning 11 years (3,751 records) has been considered, in line with previous research works (Bai et al. 2016; Lee & Kim 2021; Saab et al. 2022). A larger training dataset allows the algorithm to be trained on a broader range of climatic fluctuations affecting the inflow.

Frequency and distribution of rainfall are partially governed by the intricate relationship of climate patterns across various continents and oceans (Maity & Kashid 2011; Hasanean et al. 2023). These were expressed as inflow-related climate indices (Muluye & Coulibaly 2007; Kim et al. 2019). Relevant discussion in the context of the case study is as follows:

NINO indices (NINO 3.4, NINO 3, and NINO 4) measure the differences in the sea surface temperature (SST), indicating developments for El Niño and La Niña events. This is responsible for variability in the global wind patterns, leading to a shift in monsoon circulation over India (Azad & Rajeevan 2016). Thus, NINO indices have a strong association with inflows of SRSP, which the results of the present research work have also evidenced (refer to Table 1).

DMI directly influences atmospheric circulation over the Indian Ocean, strongly impacting cloud formation and rainfall patterns. GH indicates pressure gradients in the atmosphere. Its variability will affect the wind patterns, impacting the Indian monsoon. A lower GH value for SRSP (mean GH of 80.39 m and 80.25 m in training and testing) shows areas of low atmospheric pressure, leading to frequent cloud formation and increased rainfall over the catchment area (Jabbar & Hassan 2022).

GBI reflects the atmospheric blocking patterns over Greenland, which impacts global atmospheric circulation, notably in the Northern Atlantic Ocean (Barrett et al. 2020). This blocking pattern over the Atlantic Ocean influences the timing, strength, and distribution of Indian summer monsoon rainfall. A higher value of GBI in the present study (mean GBI of 5,320.14 m and 5,329.54 m in training and testing) indicates enhanced rainfall in the SRSP catchment area and higher inflows correspondingly.

On the other hand, climate indices based on teleconnection patterns such as the WPA, PNA, NAO, and AO have shown relatively weaker association with the daily inflows of SRSP. SRSP is located in the tropical region, whereas these indices primarily represent atmospheric circulation patterns of the mid-latitude Northern Hemisphere, far from the case study area (Yu & Lin 2016). Thus, the influence of such indices on the tropical atmospheric dynamics could be weaker than that of other climate indices. Indices such as SLP, MW, and ZW are based on coarser spatial scales covering wider regions of atmospheric conditions. Thus, their direct impact on the localized rainfall patterns of the SRSP catchment area might not have been strong. The present study employed a mechanism to find out the most significant climate indices, which is based on EWM, along the same lines as previous studies (Gelati et al. 2014; Amnatsan et al. 2018).

The present work also reveals that the combined model (S7: Q(t − 1), Q(t − 2), P(t), P(t − 1), P(t − 2), E(t), E(t − 1), E(t − 2), GBI, GH, NINO 3, NINO 3.4, NINO 4, and DMI) performed better than the cause–effect and time series models for simulating inflows, which echoes the work of Magar & Jothiprakash (2011). All the ML algorithms (XGBoost, LGBM, Bi-LSTM, CNN, and RF) performed very well in simulating the inflow patterns under S7. Among them, Bi-LSTM showed the best efficiency, matching the analysis of Feizi et al. (2022). Bi-LSTM performs better because it can learn from both forward and backward directions, while the other algorithms are unidirectional (Mughees et al. 2021). SEM with Bi-LSTM as the meta-model has been implemented to improve the predictive accuracy. The results show that SEM (KGE of 0.94 and 0.89 in training and testing) performed relatively better than Bi-LSTM (KGE of 0.92 and 0.87 in training and testing). Granata et al. (2022) drew similar conclusions in their study on streamflow forecasting, and Lu et al. (2023) also found that SEM performed better than XGBoost, RF, and AdaBoost.

The present research indicates that climate indices in conjunction with rainfall, evaporation, and time-lagged inflow significantly improve predictive accuracy, in contrast to the utilization of climate indices as the sole inputs (S5: GBI, GH, NINO 3, NINO 3.4, NINO 4, and DMI). This may be due to the global averaging of the climate indicators, which are not adjusted or downscaled for the case study.

In peak inflow analysis, SEM has shown promising results matching the extreme values. Neural-based ML algorithms such as Bi-LSTM and CNN have marginally underestimated the peak flows. Similar observations have been reported by Forghanparast & Mohammadi (2022) and Granata et al. (2022). Underestimation of extreme inflows will aid decision-makers in preparing for the worst-case scenario and implementing water conservation measures. The findings of the present study have significant implications for research on reservoir inflow forecasting, specifically from the viewpoint of climatic indices. There are many uncertainties in the data of climate indices. Inflow prediction can be improved if these uncertainties are minimized, which will be considered as future work.

The present study modeled the nonlinear relationships among climate indices, meteorological variables, and time-lagged inflow for predicting daily inflow to SRSP, India. Five ML algorithms, XGBoost, LGBM, Bi-LSTM, CNN, and RF, were employed for simulating inflow. Climate indices were evaluated using RFI, FFS, and SCC, and the results from these three methods were aggregated using EWM. The combination of time-lagged historical inflows, rainfall and evaporation with their time-lagged variables, and the six influential indices (scenario S7) proved the most suitable input set for the ML algorithms. An ensemble framework based on the stacking mechanism was then employed. The conclusions are as follows:

  • 1. GBI, GH, NINO 3, NINO 3.4, NINO 4, and DMI are the most influential climate indices on the inflow to the reservoir.

  • 2. SEM utilizing Bi-LSTM (the best ML algorithm) as the meta-model performed marginally better than Bi-LSTM, LGBM, RF, and CNN.

  • 3. SEM predicted the peak inflow events quite well, with KGE values of 0.86 and 0.81 during training and testing, respectively. Notably, SEM marginally overestimated these inflows.

Further research may explore the incorporation of land use characteristics, groundwater and soil data, finer-resolution climate indices, and a climate change framework.

Data cannot be made publicly available; readers should contact the corresponding author for details.

The authors declare no conflict of interest.

Allawi M. F., Hussain I. R., Salman M. I. & El-Shafie A. 2021 Monthly inflow forecasting utilizing advanced artificial intelligence methods: A case study of Haditha dam in Iraq. Stochastic Environmental Research and Risk Assessment 35 (11), 2391–2410.

Amiri F., Yousefi M. R., Lucas C., Shakery A. & Yazdani N. 2011 Mutual information-based feature selection for intrusion detection systems. Journal of Network and Computer Applications 34 (4), 1184–1199.

Apaydin H., Feizi H., Sattari M. T., Colak M. S., Shamshirband S. & Chau K. W. 2020 Comparative analysis of recurrent neural network architectures for reservoir inflow forecasting. Water 12 (5), 1500.

Barrett B. S., Henderson G. R., McDonnell E., Henry M. & Mote T. 2020 Extreme Greenland blocking and high-latitude moisture transport. Atmospheric Science Letters 21 (11), 1002.

Bharti V., Singh C., Ettema J. & Turkington T. A. R. 2016 Spatiotemporal characteristics of extreme rainfall events over the Northwest Himalaya using satellite data. International Journal of Climatology 36 (12), 3949–3962.

Carranza C., Nolet C., Pezij M. & van der Ploeg M. 2021 Root zone soil moisture estimation with Random Forest. Journal of Hydrology 593, 125840.

Chakraborty D., Elhegazy H., Elzarka H. & Gutierrez L. 2020 A novel construction cost prediction model using hybrid natural and light gradient boosting. Advanced Engineering Informatics 46, 101201.

Feizi H., Apaydin H., Sattari M. T., Colak M. S. & Sibtain M. 2022 Improving reservoir inflow prediction via rolling window and deep learning-based multi-model approach: Case study from Ermenek dam, Turkey. Stochastic Environmental Research and Risk Assessment 36 (10), 3149–3169.

Gelati E., Madsen H. & Rosbjerg D. 2014 Reservoir operation using El Niño forecasts – Case study of Daule Peripa and Baba, Ecuador. Hydrological Sciences Journal 59 (8), 1559–1581.

Hasanean H. M., Almaashi A. K. & Labban A. H. 2023 Variability in global climatic circulation indices and its relationship. Atmosphere 14 (12), 1741.

Huang I. H., Chang M. J. & Lin G. F. 2022 An optimal integration of multiple machine learning techniques to real-time reservoir inflow forecasting. Stochastic Environmental Research and Risk Assessment 36, 1541–1561.

Ibrahim K. S. M. H., Huang Y. F., Ahmed A. N., Koo C. H. & El-Shafie A. 2022 Forecasting multi-step-ahead reservoir monthly and daily inflow using machine learning models based on different scenarios. Applied Intelligence 53, 10893–10916.

Jabbar M. A. R. & Hassan A. S. 2022 Evaluation of geopotential height at 500 hPa with rainfall events: A case study of Iraq. Al-Mustansiriyah Journal of Science 33 (4), 1–8.

Kermani M., Batelaan O., Fadaee M. & Hinkelmann R. 2021 Ensemble machine learning paradigms in hydrology: A review. Journal of Hydrology 598, 126266.

Khorram S. & Jehbez N. 2023 A hybrid CNN-LSTM approach for monthly reservoir inflow forecasting. Water Resources Management 37, 4097–4121.

Kiran S. J. K., Reddy N. K. & Karthik C. D. 2021 Influence of sediment yield over life of reservoir using Arc SWAT, to adjudge useful life of reservoir by ANOVA analysis for Sri Ram Sagar Project. Journal of Water Resource Engineering and Management 8 (1), 17–26.

Latif S. D., Ahmed A. N., Sathiamurthy E., Huang Y. F. & El-Shafie A. 2021 Evaluation of deep learning algorithm for inflow forecasting: A case study of Durian Tunggal reservoir, Peninsular Malaysia. Natural Hazards 109 (1), 351–369.

Lee H. C. & Chang C. T. 2018 Comparative analysis of MCDM methods for ranking renewable energy sources in Taiwan. Renewable and Sustainable Energy Reviews 92, 883–896.

Lu M., Hou Q., Qin S., Zhou L., Hua D., Wang X. & Cheng L. 2023 A stacking ensemble model of various machine learning models for daily runoff forecasting. Water 15 (7), 1265.

Ma M., Zhao G., He B., Li Q., Dong H., Wang S. & Wang Z. 2021 XGBoost-based method for flash flood risk assessment. Journal of Hydrology 598, 126382.

Mathevet T., Le Moine N., Andréassian V., Gupta H. & Oudin L. 2023 Multi-objective assessment of hydrological model performances using Nash–Sutcliffe and Kling–Gupta efficiencies on a worldwide large sample of watersheds. Comptes Rendus. Géoscience 355 (S1), 1–25.

Moradi A. M., Dariane A. B., Yang G. & Block P. 2020 Long-range reservoir inflow forecasts using large-scale climate predictors. International Journal of Climatology 40 (13), 5429–5450.

Mughees N., Mohsin S. A., Mughees A. & Mughees A. 2021 Deep sequence to sequence Bi-LSTM neural networks for day-ahead peak load forecasting. Expert Systems with Applications 175, 114844.

Muhammad M. K. I., Nashwan M. S., Shahid S., Ismail T. B., Song Y. H. & Chung E. S. 2019 Evaluation of empirical reference evapotranspiration models using compromise programming: A case study of Peninsular Malaysia. Sustainability 11 (16), 4267.

Najafabadi M. M., Villanustre F., Khoshgoftaar T. M., Seliya N., Wald R. & Muharemagic E. 2015 Deep learning applications and challenges in big data analytics. Journal of Big Data 2 (1), 1–21.

Ni L., Wang D., Wu J., Wang Y., Tao Y., Zhang J. & Liu J. 2020 Streamflow forecasting using extreme gradient boosting model coupled with Gaussian mixture model. Journal of Hydrology 586, 124901.

Noorbeh P., Roozbahani A. & Kardan Moghaddam H. 2020 Annual and monthly dam inflow prediction using Bayesian networks. Water Resources Management 34, 2933–2951.

Pan S., Zheng Z., Guo Z. & Luo H. 2022 An optimized XGBoost method for predicting reservoir porosity using petrophysical logs. Journal of Petroleum Science and Engineering 208, 109520.

Rajagopalan B., Erkyihun S. T., Lall U., Zagona E. & Nowak K. 2019 A nonlinear dynamical systems-based modeling approach for stochastic simulation of streamflow and understanding predictability. Water Resources Research 55 (7), 6268–6284.

Rajesh M., Anishka S., Viksit P. S., Arohi S. & Rehana S. 2023 Improving short-range reservoir inflow forecasts with machine learning model combination. Water Resources Management 37 (1), 75–90.

Ratner B. 2009 The correlation coefficient: Its values range between +1/−1, or do they? Journal of Targeting, Measurement and Analysis for Marketing 17 (2), 139–142.

Roy D. K., Sarkar T. K., Kamar S. S. A., Goswami T., Muktadir M. A., Al-Ghobari H. M. & Mattar M. A. 2022 Daily prediction and multi-step forward forecasting of reference evapotranspiration using LSTM and Bi-LSTM models. Agronomy 12 (3), 594.

Rucker D. D., McShane B. B. & Preacher K. J. 2015 A researcher's guide to regression, discretization, and median splits of continuous variables. Journal of Consumer Psychology 25 (4), 666–678.

Saab S. M., Othman F., Tan C. G., Allawi M. F., Sherif M. & El-Shafie A. 2022 Utilizing deep learning machine for inflow forecasting in two different environment regions: A case study of a tropical and semi-arid region. Applied Water Science 12 (12), 272.

Semmelmann L., Henni S. & Weinhardt C. 2022 Load forecasting for energy communities: A novel LSTM-XGBoost hybrid model based on smart meter data. Energy Informatics 5 (1), 1–21.

Shu X., Ding W., Peng Y., Wang Z., Wu J. & Li M. 2021 Monthly streamflow forecasting using convolutional neural network. Water Resources Management 35, 5089–5104.

Skariah M. & Suriyakala C. D. 2022 Forecasting reservoir inflow combining exponential smoothing, ARIMA, and LSTM models. Arabian Journal of Geosciences 15 (14), 1292.

Wang S., Peng H., Hu Q. & Jiang M. 2022 Analysis of runoff generation driving factors based on hydrological model and interpretable machine learning method. Journal of Hydrology: Regional Studies 42, 101139.

Wegayehu E. B. & Muluneh F. B. 2022 Short-term daily univariate streamflow forecasting using deep learning models. Advances in Meteorology 2022, 1860460.

Xu C., Wang Y., Fu H. & Yang J. 2022a Comprehensive analysis for long-term hydrological simulation by deep learning techniques and remote sensing. Frontiers in Earth Science 10, 1–16.

Xu Y., Liu Y., Jiang Z., Yang X., Wang X., Zhang Y. & Qin Y. 2022b Improved convolutional neural network and its application in non-periodical runoff prediction. Water Resources Management 36 (15), 6149–6168.

Yang T., Asanjan A. A., Welles E., Gao X., Sorooshian S. & Liu X. 2017 Developing reservoir monthly inflow forecasts using artificial intelligence and climate phenomenon information. Water Resources Research 53 (4), 2786–2812.

Zhou J., Xu K., Xiang T. & Yu Y. 2023 Composition analysis and identification of ancient glass objects based on spearman correlation analysis model. Highlights in Science, Engineering and Technology 33, 80–87.