This study coupled ensemble learning with residual error (RE) correction to propose a more accurate hydrologic model for time-series prediction of reservoir inflow. To enhance prediction capability in mountain catchments, three deep learning (DL) models, namely the encoder–decoder gated recurrent units (ED-GRU), the encoder–decoder long short-term memory network (ED-LSTM), and the combined convolutional neural network with LSTM (CNN-LSTM), were trained to predict reservoir inflow for lead times of 1–24 h. The prediction outputs from the three DL models were then fed into the categorical gradient boosting regression (CGBR) model to resolve the highly non-linear relationship between model inputs and outputs. In the final step, RE correction was applied to the outcomes of the CGBR model to construct the proposed hybrid model. The proposed model was applied to simulate the hourly inflow to the Shihmen and Feitsui Reservoirs. Compared with the three DL models, it improved performance by 66.2% on average. The results demonstrate that the proposed model accurately predicts reservoir peak and total inflows and also performs well for storm events with multi-peak hydrographs.

  • An augmented hydrologic model is proposed by integrating ensemble learning with residual error correction methods.

  • The proposed model can accurately simulate the reservoir inflow with multi-peak and prolonged periods.

  • Across lead times of 1–24 h, the proposed model improved prediction accuracy by 66.2% on average.

Under the impact of global climate change, the frequency of extreme weather phenomena has increased significantly worldwide. Taiwan, located at the junction of the Northwest Pacific Ocean and the Asian continent, is affected by an average of three to four typhoons yearly. The annual typhoon season brings abundant rainfall, providing the most crucial sources of water resources throughout the year in Taiwan. However, if typhoon-induced rainfall results in a large amount of runoff, it also may cause downstream floods and further flood disaster loss. Accordingly, the flood control operation of reservoirs during typhoon periods is a challenging, intricate, and important issue. To carefully and concisely assess flood control operations, the reservoir operators will consider several factors, such as reservoir inflow, dam security, downstream flood mitigation, and water storage utilization, to decide whether to regulate water release in advance. Therefore, the accurate estimation of reservoir inflow will benefit reservoir managers in effectively operating the regulation release, water resource supply, and flood control.

Many studies have focused on developing and applying reservoir inflow estimation models, which can be divided into two main categories: hydrological physical-based (HP) models and data-driven (DD) models. Existing HP models include the Sacramento model (Anderson et al. 2006), the Xin'anjiang (XAJ) model (Bao et al. 2010; Cui et al. 2021), the watershed-scale long-term hydrologic impact assessment model (watershed-scale L-THIA) (Ryu et al. 2016), the Hydrologic Engineering Center-Hydrologic Modeling System (HEC-HMS) (Yang & Yang 2014), the Hydrological Simulation Program-Fortran (HSPF) (Albek et al. 2019) and the Hydrologiska Byråns Vattenbalansavdelning (HBV) model (Li et al. 2014; Gelete et al. 2023). The HP models can be employed to estimate the reservoir inflow based on the hydrological processes of the basin. However, many hydro-geomorphological parameters need to be calibrated, and detailed information about a watershed's dynamic hydrological processes must be available. Furthermore, because of their high computational cost, HP models are less adaptable in application.

In recent years, with the gradual advance of artificial intelligence (AI) technology, a body of literature has emerged on applying DD models to estimate reservoir inflow. The DD-based models predict time-series problems without solving the underlying physical processes. These models utilize the hydrologic data to construct the non-linear relationship between inputs and outputs and can perform noticeably well compared to the HP model. Therefore, the DD models have been applied to the predictions of reservoir inflow based on different AI techniques. For instance, Wang et al. (2014) applied a support vector machine (SVM), genetic programming (GP), and a seasonal autoregressive method to predict monthly reservoir inflow at the Three Gorges Reservoir located on the Yangtze River. Yang et al. (2017) employed and compared random forest (RF), artificial neural network (ANN), and SVM models for predicting monthly reservoir inflows for Trinity Lake in the USA and the Danjiangkou Reservoir in China. Compared to SVM and ANN, RF achieved the best performance in reservoir inflow prediction. Honorato et al. (2018) combined ANN with the wavelet transform approach to build a hybrid model. Their model was applied to the monthly prediction of reservoir inflows at the Sobradinho Dam, located on the São Francisco River in Northeast Brazil. Babaei et al. (2019) employed the ANN and SVM models for inflow prediction at the Zayandehroud Dam Reservoir. Their comparison results indicated that the SVM produces better predictions than the ANN model. Ngamsanroaj & Tamee (2019) employed ANN with 3,169 data records to predict the daily inflow of the Bhumibol reservoir in Tak Province, Thailand. Rezaie-Balf et al. (2019) proposed a reservoir monthly inflow estimation model based on ANN with a 130-year inflow dataset. The model was applied to Aswan High Dam, Egypt, for reservoir inflow estimation in the lead times of 1–6 months. Zarei et al. (2021) employed the SVM, regression tree, GP, and ANN to forecast the reservoir inflow with lead times of 1–2 months at the Dez, Karkheh, and Gotvand Reservoirs in Iran.

More recently, DD models based on deep learning (DL) techniques have been proposed owing to the growing availability of measured data, and they have been increasingly used to solve hydrology problems. DL-based models, such as the recurrent neural network (RNN), gated recurrent units (GRUs), convolutional neural network (CNN), and long short-term memory network (LSTM), require more valid and significant data but less human intervention in the training and prediction process. Moreover, different DL techniques can be integrated into a combined model to improve on the prediction performance of a single model. Therefore, various DL or hybrid models have been proposed to predict reservoir inflows. For example, Yang et al. (2019) used three RNNs, namely non-linear autoregressive models with exogenous input (NARX), LSTM, and genetic algorithm-based NARX (GA-NARX), to simulate the inflows of three reservoirs located in the upper Chao Phraya River basin. Their results indicated that the GA-NARX yields the best prediction performance among the three RNNs. Apaydin et al. (2020) presented a comparative analysis of daily reservoir inflow at the Ermenek hydroelectric dam reservoir, Turkey. They employed the GRU, LSTM, and RNN models and found that the LSTM model has the best prediction accuracy for reservoir inflow. Based on the combined CNN with LSTM (CNN-LSTM) model, Herbert et al. (2021) used 30 years of reservoir inflow data at the Upper Stillwater Reservoir located in Utah for reservoir inflow forecasting. Latif et al. (2021) employed and evaluated LSTM, SVM, and ANN models for inflow forecasting at the Durian Tunggal Reservoir, Peninsular Malaysia. They found that the LSTM has the best prediction accuracy among the tested models. Lee & Kim (2021) applied the LSTM model with nearly 15 years of hydrology data to predict the reservoir inflow of Soyang Dam in Korea. The results showed that the LSTM model agrees well with the measured data.
Zhang et al. (2021) combined the CNN, the extreme gradient boosting model (XGBoost), and a partial least squares model (PLS) to propose a new hybrid model for the inflow prediction in the Jinshuitan reservoir. Indeed, the combined model presents better performance than the other single models. Feizi et al. (2022) utilized four different RNN models, bidirectional long short-term memory (BiLSTM), GRU, LSTM, and simple RNN, in the framework of the rolling window technique for proposing a hybrid model. Their model achieved reliable and accurate prediction results in the Ermenek Dam, Turkey. Maddu et al. (2022) employed various DD-based models, including RF, gradient boosting regressor (GBR), K-nearest neighbors (KNN) regressor, and LSTM, to propose a hybrid model. Their proposed hybrid model performs better than the standalone RF, GBR, KNN, and LSTM models in predicting daily reservoir inflows on two headwater reservoirs in California, USA, and Bhadra reservoir, India.

According to the above literature, DD-based reservoir inflow estimation models have been successfully applied in different countries and at different temporal scales (hourly, daily, weekly, and monthly). The results indicated that the DD-based models are accurate and fast in predicting the non-linear trend of the reservoir inflows. However, the quality of the data used can affect the prediction accuracy of DD models, and thus a sensitivity analysis, for example of the combination of inputs, should be conducted before model prediction. Furthermore, applying DD models to predictions with longer lead times often results in lower accuracy for peak and time-to-peak inflow due to the uncertainty of model prediction. Therefore, substantially reducing the errors of predictions with longer lead times would help reservoir managers make optimal operations for mitigating the flood risk downstream of the reservoir and conserving available water.

The primary purpose of this study is to propose an augmented hydrologic model for reservoir inflow predictions in the lead times of 1–24 h. The proposed model was constructed based on combining ensemble learning with the residual error (RE) correction. For ensemble learning, three DL models, encoder–decoder-based GRU (ED-GRU), encoder–decoder-based LSTM (ED-LSTM), and CNN-LSTM, were utilized to improve the performance in the predictions with longer lead times. Based on the categorical gradient boosting regression (CGBR), the predicted results from these three DL models were then employed as the inputs of the RE correction for further significantly improving the accuracy both in peak inflow error (PIE) and error of time-to-peak inflow. The proposed model was applied to the Shihmen and Feitsui Reservoirs in Taiwan. The hourly hydrological data, including rainfall and reservoir inflows, were collected to train the proposed model. Based on the collected data, the trained model was used to predict reservoir inflows with longer lead times of up to 24 h. Seven evaluation criteria, namely the correlation coefficient (CC), mean absolute error (MAE), root mean square error (RMSE), Nash-Sutcliffe efficiency (NSE), reservoir PIE, reservoir total inflow error (TIE) and generalization ability (GA), were selected to assess the prediction performance of the proposed model. To further evaluate the capability and reliability of the proposed model in reservoir inflow simulation, the predicted results by the proposed model were also compared with those using the single ED-GRU, ED-LSTM, and CNN-LSTM models.

To summarize, the three main objectives of this study are (i) to propose a new hydrologic model by combining three DL models with RE correction, (ii) to evaluate the predictive accuracy of the proposed model in 1–24-h lead times for the Shihmen and Feitsui Reservoirs, and (iii) to compare the model's usability with that of the other DL models, providing insights into reservoir inflow simulation at mountain catchments.

Two sites, the Shihmen and Feitsui Reservoirs located in northern Taiwan, were selected. The location map of the two reservoirs, including the rainfall gauging stations, is shown in Figure 1. The Shihmen Reservoir is located in the middle and lower reaches of the Dahan River, with a catchment area of about 763.4 km². There are nine rainfall stations in the upper reaches of the Shihmen catchment area. The Shihmen Reservoir serves flood control, irrigation, power generation, and water supply functions. It mainly supplies public water to New Taipei City and Taoyuan City, making it an essential reservoir in northern Taiwan. Nearly 56 years (1966–2022) of hydrological data were collected for the Shihmen Reservoir. Figure 2(a) shows the measured rainfall and reservoir inflow for the Shihmen Reservoir. Table 1 shows that the maximum average rainfall in the catchment area of the Shihmen Reservoir in the past 55 years was 96.08 mm/h, and the maximum reservoir inflow was 8,593.85 m³/s during Typhoon Aere in 2004.
Table 1

The summary characteristics of the collected hydrologic measured data

| Reservoir | Hydrology data | Training: Mean | Training: Max | Training: Min | Test: Mean | Test: Max | Test: Min |
|---|---|---|---|---|---|---|---|
| Feitsui | Rainfall (mm/h) | 2.50 | 54.4 | | 2.12 | 45.5 | |
| Feitsui | Inflow (m³/s) | 188.42 | 3,597.1 | 2.78 | 190.82 | 3,016.64 | |
| Shihmen | Rainfall (mm/h) | 2.34 | 96.08 | | 1.98 | 46.89 | |
| Shihmen | Inflow (m³/s) | 411.08 | 8,593.85 | 0.36 | 351.42 | 5,385.08 | 1.06 |

Figure 1

The location map of the reservoir watershed including the rainfall stations.

Figure 2

The measured hydrologic data showing rainfall and reservoir inflow for the (a) Shihmen and (b) Feitsui Reservoirs.


The Feitsui Reservoir is located in the lower reaches of the Beishi Creek. There are six rainfall stations in the upper reaches of the catchment area, and the catchment area is about 303 km². The Feitsui Reservoir serves flood control, power generation, and water supply functions and is an essential water source for Taipei City. We collected hourly rainfall and reservoir inflow data for the past 16 years (2006–2022). Figure 2(b) shows the measured rainfall and reservoir inflow for the Feitsui Reservoir. Table 1 shows that the maximum average rainfall in the catchment area in the past 16 years was 54.4 mm/h, and the maximum reservoir inflow was 3,597.1 m³/s during Typhoon Sinlaku in 2008.

ED-LSTM

The LSTM model is an advanced type of the classical RNN model (Hochreiter & Schmidhuber 1997). The LSTM uses gated feedback connections to discard unnecessary information and retain useful information; thus, it captures long-range dependencies better than a plain RNN. The neural network unit of LSTM includes an input layer, memory cells, and an output layer. In addition, the memory cell of LSTM consists of three main gates: input, output, and forget gates.

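The core architecture of the LSTM neural network is expressed as (Xiang et al. 2020; Xu et al. 2020; de Moura et al. 2022):

$$f_t = \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) \tag{1}$$

$$i_t = \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) \tag{2}$$

$$\tilde{c}_t = \tanh\left(W_c\,[h_{t-1}, x_t] + b_c\right) \tag{3}$$

$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \tag{4}$$

$$o_t = \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) \tag{5}$$

$$h_t = o_t \odot \tanh\left(c_t\right) \tag{6}$$

where $f_t$, $i_t$, $o_t$, and $c_t$ stand for the vectors of information for the forget gate, input gate, output gate and memory cells, respectively; $\tilde{c}_t$ is the candidate memory state; $W_f$, $W_i$, $W_o$, and $W_c$ denote the weighted parameter matrices for the forget gate, input gate, output gate and memory cells, respectively; $b_f$, $b_i$, $b_o$, and $b_c$ represent the bias for the forget gate, input gate, output gate and memory cells, respectively; $\sigma$ stands for the non-linear sigmoid activation function; tanh denotes the hyperbolic tangent function; $\odot$ denotes element-wise multiplication; $x_t$ is the input vector at time step t; $h_{t-1}$ represents the output vector of the hidden state at the previous time step t − 1; and $h_t$ denotes the output vector of the hidden state at time step t.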
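
As a minimal sketch of how one LSTM time step updates its memory, the following pure-Python function applies the forget, input and output gates to a scalar hidden state. The function name `lstm_step`, the layout of the `params` dictionary, and the weight values are illustrative assumptions rather than settings from this study; each gate has one weight on the previous hidden state, one on the current input, and a bias.

```python
import math


def sigmoid(x):
    """Logistic activation used by the three gates."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step for a scalar input and a scalar hidden state.

    params maps each of "f", "i", "o", "c" to a tuple (w_h, w_x, b):
    weight on h_prev, weight on x_t, and bias (hypothetical layout).
    """
    def pre(name):
        w_h, w_x, b = params[name]
        return w_h * h_prev + w_x * x_t + b

    f_t = sigmoid(pre("f"))            # forget gate: how much old memory to keep
    i_t = sigmoid(pre("i"))            # input gate: how much new memory to admit
    c_cand = math.tanh(pre("c"))       # candidate memory content
    c_t = f_t * c_prev + i_t * c_cand  # updated memory cell
    o_t = sigmoid(pre("o"))            # output gate
    h_t = o_t * math.tanh(c_t)         # new hidden state
    return h_t, c_t


# Illustrative weights only (not calibrated values).
params = {
    "f": (0.5, 0.1, 0.0),
    "i": (0.4, 0.3, 0.0),
    "o": (0.2, 0.6, 0.0),
    "c": (0.7, 0.9, 0.0),
}

h, c = 0.0, 0.0
for x in [0.1, 0.8, 0.3]:  # a short, already normalized input sequence
    h, c = lstm_step(x, h, c, params)
```

Because the hidden state is the output gate times a hyperbolic tangent, it always stays strictly between −1 and 1, which is one reason the inputs are normalized before training.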

The single LSTM model is suitable for single-step time-series forecasting. For multi-step forecasting problems, the encoder–decoder framework is widely used to extend the LSTM so that it can produce multiple outputs (Cho et al. 2014). In the encoder part, the variable-length input sequence is encoded into a fixed-length vector representation. In the decoder part, the fixed-length vector is decoded into a variable-length output sequence, which allows continuous prediction over several steps. Therefore, this study employed the ED-LSTM for the simulations of reservoir inflow.

ED-GRU

The GRU model, which is a simplified type of LSTM, includes two main core gates: update and reset gates (Ayzel & Heistermann 2021). The update gate controls how much of the important memory from the previous time step is retained. The reset gate is used to forget insignificant information obtained from the previous time step. With only these two core gates, the GRU model can accelerate the DL process.

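The neural network structure of the GRU model is formulated as follows (Ayzel & Heistermann 2021):

$$z_t = \sigma\left(W_z\,[h_{t-1}, x_t] + b_z\right) \tag{7}$$

$$r_t = \sigma\left(W_r\,[h_{t-1}, x_t] + b_r\right) \tag{8}$$

$$\tilde{h}_t = \tanh\left(W_h\,[r_t \odot h_{t-1}, x_t]\right) \tag{9}$$

$$h_t = z_t \odot h_{t-1} + \left(1 - z_t\right) \odot \tilde{h}_t \tag{10}$$

where $z_t$ and $r_t$ denote the vectors related to information for the update gate and reset gate, respectively; $W_z$, $W_r$, and $W_h$ stand for the matrices with weighted parameters for the update gate, reset gate and output (candidate hidden state), respectively; $h_t$ denotes the vector of the hidden state at time step t and $\tilde{h}_t$ its candidate value; tanh is the hyperbolic tangent activation function; and $b_z$ and $b_r$ represent the bias for the update gate and reset gate, respectively.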

The encoder–decoder architecture performs well on multi-output time-series prediction problems. Thus, it is combined with the GRU here, and the resulting model is referred to as the ED-GRU model. For a detailed architectural description of the ED-GRU, one can refer to Cho et al. (2014).
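
To make the roles of the update and reset gates concrete, here is a minimal scalar sketch of one GRU step. It assumes the convention of Cho et al. (2014), in which the update gate weights the previous hidden state; some libraries swap the roles of the gate and its complement. The function name `gru_step`, the `params` layout, and the weights are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic activation used by the update and reset gates."""
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x_t, h_prev, params):
    """One GRU time step for a scalar input and a scalar hidden state.

    params["z"] and params["r"] are (w_h, w_x, b) tuples; params["h"] is
    (w_h, w_x) for the candidate state (hypothetical layout, no bias).
    """
    w_h, w_x, b = params["z"]
    z_t = sigmoid(w_h * h_prev + w_x * x_t + b)   # update gate
    w_h, w_x, b = params["r"]
    r_t = sigmoid(w_h * h_prev + w_x * x_t + b)   # reset gate
    w_h, w_x = params["h"]
    # The reset gate scales how much of the old state enters the candidate.
    h_cand = math.tanh(w_h * (r_t * h_prev) + w_x * x_t)
    # Assumed convention: z_t keeps old memory, (1 - z_t) admits the candidate.
    return z_t * h_prev + (1.0 - z_t) * h_cand


# Illustrative weights only (not calibrated values).
params = {"z": (0.3, 0.5, 0.0), "r": (0.6, 0.2, 0.0), "h": (0.8, 1.0)}

h = 0.0
for x in [0.1, 0.8, 0.3]:  # a short, already normalized input sequence
    h = gru_step(x, h, params)
```

With a strongly positive update-gate bias the state is carried forward almost unchanged, which is the "store the previous memory" behaviour described in the text.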

CNN-LSTM

The CNN is a feedforward neural network originally developed for image recognition. It can also be applied to analyze non-linear time-series problems. The CNN has several desirable properties: lower complexity, weight sharing, and down-sampling through pooling. Thus, the CNN can perform well in time-series prediction through efficient feature extraction.

The core framework of CNN includes the input, convolution, pooling, dense, and output layers (Livieris et al. 2022). The output vector of the convolution layer is defined as (Xu et al. 2022):

$$C = f_{\mathrm{c}}\left(W_{\mathrm{c}} \ast X + b_{\mathrm{c}}\right) \tag{11}$$

where $C$ stands for the time-series outputs from the convolution layer; $b_{\mathrm{c}}$ denotes the bias; $W_{\mathrm{c}}$ represents the weighted matrix of the convolution layer; $f_{\mathrm{c}}$ is the activation function of the convolution layer; the symbol ‘∗’ stands for the convolution operation; and X is the input vector.

In the pooling layer, the sequence features from the convolutional layer are reduced using the commonly used operation of global maximum pooling. Then the dense layer is employed to integrate the temporal features. For the final output layer, the prediction result is defined as (Xu et al. 2022):

$$D = f_{\mathrm{d}}\left(W_{\mathrm{d}}\,P + b_{\mathrm{d}}\right) \tag{12}$$

$$y = f_{\mathrm{out}}\left(W_{\mathrm{out}}\,D + b_{\mathrm{out}}\right) \tag{13}$$

where $W_{\mathrm{d}}$ and $W_{\mathrm{out}}$ represent the weighted matrices in the dense and output layers, respectively; $b_{\mathrm{d}}$ and $b_{\mathrm{out}}$ are the biases of the dense and output layers, respectively; $f_{\mathrm{d}}$ and $f_{\mathrm{out}}$ are the activation functions of the dense and output layers, respectively; $P$ stands for the result of the global maximum pooling; $D$ is the output vector of the dense layer; and y is the final prediction output vector.
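
The layer sequence described above (convolution, global maximum pooling, dense layer, output layer) can be sketched in a few lines of pure Python. This is an illustration under simplifying assumptions: a single input channel, ReLU activations in the convolution and dense layers, a linear output layer, and hand-picked weights. The names `conv1d` and `cnn_forward` are hypothetical.

```python
def relu(v):
    """Rectified linear unit."""
    return max(0.0, v)


def conv1d(series, kernel, bias):
    """Valid 1-D convolution (cross-correlation form) followed by ReLU."""
    k = len(kernel)
    return [
        relu(sum(kernel[j] * series[i + j] for j in range(k)) + bias)
        for i in range(len(series) - k + 1)
    ]


def cnn_forward(series, filters, dense_w, dense_b, out_w, out_b):
    """Convolution -> global max pooling -> dense (ReLU) -> linear output."""
    # One feature map per filter, each reduced to its global maximum.
    pooled = [max(conv1d(series, kernel, bias)) for kernel, bias in filters]
    # Dense layer: one row of weights per dense unit.
    dense = [
        relu(sum(w * p for w, p in zip(row, pooled)) + b)
        for row, b in zip(dense_w, dense_b)
    ]
    # Linear output layer producing a single prediction.
    return sum(w * d for w, d in zip(out_w, dense)) + out_b


# Illustrative input and weights only.
series = [0.0, 1.0, 3.0, 2.0, 0.0]
filters = [([1.0, -1.0], 0.0), ([0.5, 0.5], 0.0)]
y = cnn_forward(series, filters, dense_w=[[1.0, 1.0]], dense_b=[0.0],
                out_w=[2.0], out_b=1.0)
```

Global maximum pooling keeps only the strongest response of each filter, so the dense layer sees one value per filter regardless of the length of the input window.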

The single CNN model is suitable for spatial non-linear prediction problems but has limitations for predicting temporal time-series problems. To improve on the performance of a single DL model, combining two or more DL models is an advanced and novel approach to obtaining the individual benefits of each DL model. For instance, based on the framework of spatial (CNN) and temporal (LSTM) feature learning, the hybrid model named CNN-LSTM has been proposed and used in the recent literature, where it yields a remarkable and robust performance in simulating non-linear time-series problems compared to the single DL model (Wegayehu & Muluneh 2021). Reservoir inflow is a complex hydrological process with both temporal and spatial variations. Thus, the CNN-LSTM model is employed herein to predict the inflows for the two reservoir sites. More details about the neural network framework of CNN-LSTM can be found in Wegayehu & Muluneh (2021).

Proposed model

This paper proposes an augmented hydrologic model for prediction of hourly inflow for reservoirs at mountain catchments. The proposed model includes three primary submodules: a reservoir inflow prediction model, an RE prediction model, and an RE correction model. Since the proposed model is highly dependent on ensemble learning, an accurate ensemble model is essential to provide the correct result. Consequently, the CGBR model is applied to train both the reservoir inflow prediction model and the RE prediction model for enhancing the performance of proposed model.

For the training of the reservoir inflow prediction model, DL models can produce predictions close to the measurements by learning from the data, retaining valuable information, and forgetting needless information. Hence, three DL-based models (i.e., ED-GRU, ED-LSTM, and CNN-LSTM) were employed herein as the base learners of the ensemble. The outputs from the three DL models were then used as the input vectors of the CGBR model to construct the reservoir inflow prediction model and increase the prediction accuracy for lead times of 1–24 h.

As for constructing the RE prediction model, the predicted outputs from the trained reservoir inflow prediction model were used to produce the RE time-series. On the basis of the CGBR, the RE prediction model is then trained. Both the trained reservoir inflow and RE prediction models were finally utilized to update and achieve the final RE correction model.

The prediction model for reservoir inflows

The output Y of reservoir hourly inflow from the proposed model is expressed as (Kan et al. 2020):

$$Y = f_{\mathrm{CGBR}}(X) \tag{14}$$

in which X denotes the matrix vector of inputs comprised of the outputs from the three DL models and $f_{\mathrm{CGBR}}$ stands for the objective function of CGBR.

The CGBR is a DD ensemble learning model based on the gradient-boosted decision tree technique. For classification or regression problems, the CGBR employs a permutation-driven algorithm to deal with categorical features, which helps to avoid overfitting and to reduce the memory consumption of the training process. Compared with other popular ensemble learning models such as extreme gradient boosting and the light gradient boosting machine, the CGBR has been reported to perform best in several time-series prediction problems (Gao et al. 2022).

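In Equation (14), the matrix vector of outputs Y is defined as:

$$Y = \left[\hat{Q}_{t+L}^{(1)}, \hat{Q}_{t+L}^{(2)}, \ldots, \hat{Q}_{t+L}^{(n)}\right]^{\mathrm{T}} \tag{15}$$

where n is the total number of data samples, $\hat{Q}_{t+L}$ represents the predicted vectors of reservoir inflow at time t + L, and L denotes the lead time.

The matrix vector of inputs X in Equation (14) is defined as:

$$X = \left[\hat{Q}_{t+L}^{\mathrm{GRU}}, \hat{Q}_{t+L}^{\mathrm{LSTM}}, \hat{Q}_{t+L}^{\mathrm{CNN}}\right] \tag{16}$$

in which $\hat{Q}_{t+L}^{\mathrm{GRU}}$, $\hat{Q}_{t+L}^{\mathrm{LSTM}}$ and $\hat{Q}_{t+L}^{\mathrm{CNN}}$ are, respectively, the predicted vectors of reservoir inflow at time t + L using the ED-GRU, ED-LSTM and CNN-LSTM models.

Accordingly, the predicted vectors of the reservoir inflow using the three DL models can be expressed as:

$$\hat{Q}_{t+L}^{\mathrm{GRU}} = f_{\mathrm{GRU}}\left(Q_{t-L}, R_{t-L}, R_{t+L}\right) \tag{17}$$

$$\hat{Q}_{t+L}^{\mathrm{LSTM}} = f_{\mathrm{LSTM}}\left(Q_{t-L}, R_{t-L}, R_{t+L}\right) \tag{18}$$

$$\hat{Q}_{t+L}^{\mathrm{CNN}} = f_{\mathrm{CNN}}\left(Q_{t-L}, R_{t-L}, R_{t+L}\right) \tag{19}$$

where $f_{\mathrm{GRU}}$, $f_{\mathrm{LSTM}}$, and $f_{\mathrm{CNN}}$ are the objective functions of the ED-GRU, ED-LSTM and CNN-LSTM models, respectively; $Q_{t-L}$ and $R_{t-L}$ are the antecedent measured reservoir hourly inflow and rainfall at time step t − L, respectively; and $R_{t+L}$ stands for the future hourly rainfall at time step t + L. Because real-time forecasting was not implemented at this stage, the observed rainfall dataset was used as the rainfall values at time step t + L for model evaluation.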

The prediction model for RE

Before constructing the RE prediction model, the time-series of RE are estimated by:

$$E_t = \hat{Q}_t - Q_t^{\mathrm{obs}} \tag{20}$$

where $\hat{Q}_t$ denotes the predicted vectors of reservoir inflow by the proposed model and $Q_t^{\mathrm{obs}}$ represents the measured data. With the RE time-series, the CGBR is employed to construct the RE prediction model, which is defined as:

$$\hat{E}_{t+L} = f_{\mathrm{CGBR}}\left(E_{t-m}\right) \tag{21}$$

in which $E_{t-m}$ is the antecedent time-series of RE at time step t − m, m is the proper antecedent time and $\hat{E}_{t+L}$ denotes the predicted RE for time t + L.

RE correction model

For the final reservoir hourly inflow prediction, the results from the reservoir inflow and RE prediction models are combined into corrected results, which are expressed as follows:

$$\hat{Q}_{t+L}^{\mathrm{c}} = \hat{Q}_{t+L} - \hat{E}_{t+L} \tag{22}$$

where $\hat{Q}_{t+L}^{\mathrm{c}}$ is the corrected reservoir hourly inflow at time t + L.
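
The three submodules can be sketched end to end with simple stand-ins. In the sketch below the CGBR meta-learner is replaced by a plain average of the three base predictions, and the CGBR-based RE predictor is replaced by a one-parameter least-squares fit; both swaps are assumptions made only to keep the example dependency-free. The RE is assumed to be prediction minus measurement, the toy numbers are invented, and for brevity the slope is fitted and applied on the same short series, whereas a real application would fit it on training data only.

```python
def stack_mean(base_preds):
    """Stand-in meta-learner: average the base-model predictions per time step.

    The study trains a CGBR model at this point instead.
    """
    return [sum(vals) / len(vals) for vals in zip(*base_preds)]


def residuals(pred, obs):
    """RE series, assumed here to be prediction minus measurement."""
    return [p - o for p, o in zip(pred, obs)]


def fit_re_slope(re, lag):
    """Least-squares slope (no intercept) relating RE at step k - lag to RE at step k."""
    num = sum(re[k - lag] * re[k] for k in range(lag, len(re)))
    den = sum(re[k - lag] ** 2 for k in range(lag, len(re)))
    return num / den if den else 0.0


def correct(pred, re, slope, lag):
    """Subtract the predicted RE from every prediction that has a lagged RE."""
    out = list(pred[:lag])  # no antecedent RE available for the first steps
    for k in range(lag, len(pred)):
        out.append(pred[k] - slope * re[k - lag])
    return out


# Toy data: measured inflow and three base predictions that all overshoot.
obs = [10.0, 20.0, 40.0, 30.0, 20.0]
pred_gru = [12.0, 24.0, 48.0, 36.0, 24.0]
pred_lstm = [11.0, 22.0, 44.0, 33.0, 22.0]
pred_cnn = [13.0, 26.0, 52.0, 39.0, 26.0]

stacked = stack_mean([pred_gru, pred_lstm, pred_cnn])
re = residuals(stacked, obs)
slope = fit_re_slope(re, lag=1)
corrected = correct(stacked, re, slope, lag=1)
```

The design point the sketch illustrates is that the correction step only helps when the residuals are autocorrelated: a systematic over- or under-prediction at one step carries information about the error at the next.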

Performance evaluation metrics

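The performance of the proposed model can be evaluated using statistical indices. Seven evaluation criteria are employed herein, namely CC, MAE, RMSE, NSE, PIE, TIE and GA, which are defined as (Chen et al. 2012, 2014; Cai et al. 2021; Guo et al. 2021, 2023):

$$\mathrm{CC} = \frac{\sum_{i=1}^{n}\left(Q_i^{\mathrm{obs}} - \bar{Q}^{\mathrm{obs}}\right)\left(\hat{Q}_i - \bar{\hat{Q}}\right)}{\sqrt{\sum_{i=1}^{n}\left(Q_i^{\mathrm{obs}} - \bar{Q}^{\mathrm{obs}}\right)^2 \sum_{i=1}^{n}\left(\hat{Q}_i - \bar{\hat{Q}}\right)^2}} \tag{23}$$

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{Q}_i - Q_i^{\mathrm{obs}}\right| \tag{24}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{Q}_i - Q_i^{\mathrm{obs}}\right)^2} \tag{25}$$

$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{n}\left(Q_i^{\mathrm{obs}} - \hat{Q}_i\right)^2}{\sum_{i=1}^{n}\left(Q_i^{\mathrm{obs}} - \bar{Q}^{\mathrm{obs}}\right)^2} \tag{26}$$

$$\mathrm{PIE} = \frac{\left|\hat{Q}_{p} - Q_{p}^{\mathrm{obs}}\right|}{Q_{p}^{\mathrm{obs}}} \times 100\% \tag{27}$$

$$\mathrm{TIE} = \frac{\left|\hat{Q}_{T} - Q_{T}^{\mathrm{obs}}\right|}{Q_{T}^{\mathrm{obs}}} \times 100\% \tag{28}$$

$$\mathrm{GA} = \frac{\mathrm{RMSE}_{\mathrm{test}}}{\mathrm{RMSE}_{\mathrm{train}}} \tag{29}$$

where $Q_i^{\mathrm{obs}}$ and $\hat{Q}_i$ are, respectively, the measured and predicted reservoir inflows; $\bar{Q}^{\mathrm{obs}}$ and $\bar{\hat{Q}}$ are, respectively, the mean values of the measured and predicted reservoir inflows; $Q_{p}^{\mathrm{obs}}$ and $\hat{Q}_{p}$ are, respectively, the measured and predicted reservoir peak inflows; $Q_{T}^{\mathrm{obs}}$ and $\hat{Q}_{T}$ denote the measured and predicted reservoir total inflows, respectively; n is the number of data points; and $\mathrm{RMSE}_{\mathrm{train}}$ and $\mathrm{RMSE}_{\mathrm{test}}$ are the RMSE values at the training and test stages, respectively.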

The CC and NSE values represent the agreement between the simulated reservoir inflows and the measured dataset. The CC ranges from −1 to 1, and the NSE can vary from negative infinity to 1. The closer the CC or NSE is to 1, the better the reservoir inflow prediction. In addition, the evaluation metrics MAE, RMSE, PIE, and TIE are employed to examine the prediction accuracy of DD models. The smaller the MAE, RMSE, PIE, and TIE, the better the model performance. Moreover, the GA is adopted to investigate the model's applicability. When the GA is close to 1, the model is well trained and accurately predicts the given non-linear problem. In contrast, a GA value smaller or larger than 1.0 indicates that the model is undertrained or overtrained, respectively (Guo et al. 2023).
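
For readers who want to reproduce the indices, a minimal pure-Python sketch follows. Three points are assumptions rather than definitions confirmed by the text: PIE and TIE are taken as absolute percentage errors of the peak and the total inflow, the peak is taken as the series maximum, and GA is taken as the ratio of test RMSE to training RMSE. The function names are hypothetical.

```python
import math


def cc(obs, pred):
    """Pearson correlation coefficient between measured and predicted series."""
    mo, mp = sum(obs) / len(obs), sum(pred) / len(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov / math.sqrt(var_o * var_p)


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def nse(obs, pred):
    """Nash-Sutcliffe efficiency."""
    mo = sum(obs) / len(obs)
    return 1.0 - (sum((o - p) ** 2 for o, p in zip(obs, pred))
                  / sum((o - mo) ** 2 for o in obs))


def pie(obs, pred):
    """Peak inflow error in percent (absolute form assumed)."""
    return abs(max(pred) - max(obs)) / max(obs) * 100.0


def tie(obs, pred):
    """Total inflow error in percent (absolute form assumed)."""
    return abs(sum(pred) - sum(obs)) / sum(obs) * 100.0


def ga(rmse_test, rmse_train):
    """Generalization ability, assumed to be test RMSE over training RMSE."""
    return rmse_test / rmse_train


# Toy series: the prediction is the measurement shifted up by one unit.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [2.0, 3.0, 4.0, 5.0]
scores = {name: fn(obs, pred) for name, fn in
          [("CC", cc), ("MAE", mae), ("RMSE", rmse),
           ("NSE", nse), ("PIE", pie), ("TIE", tie)]}
```

The toy case shows why several indices are reported together: a constant bias leaves the CC at 1 while the NSE drops sharply, so correlation alone would hide the error.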

Model implementation and flowchart

A lead time of up to 24 h was selected to investigate the performance of the proposed model in predicting inflows for the two reservoir sites. If too many antecedent RE values are used as inputs, the model cannot effectively learn the non-linear relationship between the inputs and outputs. Trial-and-error experiments showed that the antecedent RE at time step t − 1 alone provides sufficient performance. Hence, an m value of 1 h was employed in this study. As for the rainfall inputs, the values at each rainfall station could be individually selected and fed into the proposed model. However, according to existing studies (Xiang et al. 2020; Lee & Kim 2021), using a single input averaged over the rainfall stations is a commonly used and more efficient approach that provides good enough performance. Therefore, the rainfall data were averaged using the Thiessen polygon method and used as the inputs for model training and prediction.
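
The Thiessen polygon average is an area-weighted mean: each station's rainfall is weighted by the share of the catchment that lies closest to that station. A minimal sketch is given below; the station names, polygon areas and rainfall values are invented for illustration, and the function name `thiessen_average` is hypothetical.

```python
def thiessen_average(station_rain, polygon_areas):
    """Area-weighted catchment rainfall for one time step.

    station_rain:  station name -> rainfall at that station (mm/h)
    polygon_areas: station name -> area of its Thiessen polygon (km^2)
    """
    total_area = sum(polygon_areas.values())
    return sum(station_rain[name] * area / total_area
               for name, area in polygon_areas.items())


# Hypothetical three-station catchment.
areas = {"A": 100.0, "B": 200.0, "C": 100.0}
rain_now = {"A": 10.0, "B": 4.0, "C": 0.0}
catchment_rain = thiessen_average(rain_now, areas)
```

Because the polygon areas are fixed by the station layout, the weights can be computed once and reused for every hour of the record.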

Various functions could be used to select the activation function in the DL models. Existing reservoir inflow prediction studies (Apaydin et al. 2020; Xiang et al. 2020) have demonstrated promising results using the rectified linear unit (ReLU) for the optimum activation function. Therefore, the present study employed the ReLU function combined with the Adam optimizer for three DL models. In addition, it is crucial to determine the parameter in the CGBR model. Based on the previous study (Gao et al. 2022), the number of trees is one of the most sensitive parameters. Therefore, this paper employed the grid search approach to yield the optimal value for tree numbers.

Figure 3 shows the flowchart used to construct and test the proposed model. The process comprises hourly data collection, data preprocessing, model training, and testing, and is summarized as follows:

  • (1) The observed historical datasets of hourly rainfall and reservoir inflow were gathered.

  • (2) Min-max normalization was used to scale the collected dataset proportionally into the interval between −1 and 1, in order to reduce overfitting and improve prediction accuracy.

  • (3) The normalized dataset was divided into training and testing subsets at a ratio of 7:3. Seventy per cent of the dataset was used to learn and construct the model, and the remaining 30% was employed for model evaluation.

  • (4) Three predicted outputs were obtained from Equations (17) to (19) through the three DL models. With these outputs, the CGBR obtains the predicted reservoir inflows for 1–24-h lead times based on the reservoir inflow prediction model using Equation (14).

  • (5) The time-series dataset of RE for lead times of 1–24 h was obtained using Equation (20). The RE of reservoir inflows was then predicted through Equation (21).

  • (6) Based on Equation (22), the prediction results of the reservoir inflow were corrected and updated as the final results.

  • (7) Seven statistical indicators were employed to assess the proposed model's performance. The proposed model was also compared with the three DL models.
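
Steps (2) and (3) above can be sketched as follows. Two details are assumptions, since the text does not state them: the split is taken in chronological order, and the scaling bounds are taken from the training subset only so that no information from the test period leaks into training. The function names and the toy series are hypothetical.

```python
def chronological_split(series, train_fraction=0.7):
    """Split a time series into leading training and trailing test parts."""
    cut = round(len(series) * train_fraction)
    return series[:cut], series[cut:]


def minmax_scale(values, lo, hi):
    """Map values linearly from [lo, hi] to [-1, 1]."""
    span = hi - lo
    return [2.0 * (v - lo) / span - 1.0 for v in values]


def minmax_inverse(scaled, lo, hi):
    """Map scaled values from [-1, 1] back to the original units."""
    return [(s + 1.0) / 2.0 * (hi - lo) + lo for s in scaled]


# Toy hourly inflow record (m3/s), ten values.
inflow = [0.0, 50.0, 100.0, 25.0, 75.0, 100.0, 0.0, 50.0, 25.0, 75.0]

train, test = chronological_split(inflow)   # 7 training values, 3 test values
lo, hi = min(train), max(train)             # bounds from training data only
train_scaled = minmax_scale(train, lo, hi)
test_scaled = minmax_scale(test, lo, hi)    # may fall outside [-1, 1]
```

Predictions made in the scaled space are converted back to m3/s with `minmax_inverse` before the evaluation metrics are computed.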

Figure 3

The flowchart for the development, application, and evaluation of the proposed model.


Sensitivity analysis for the influence of input factors on predictions

The DD model is constructed mainly based on the given dataset for resolving the relationships between the inputs and outputs of system variables. Hence, the accuracy of DD models' predictions is highly dependent on the data characteristics of the system state variables. This section aims to analyze how the input variables affect the prediction results by the DD model, which is needed before constructing the proposed model.

Four different combinations of input variables were designed and listed in Table 2. The first combination of input variables, namely C1, considers only the antecedent hourly rainfall from time t to t–24 and was adopted as the inputs. As for the second combination of inputs, C2, only the antecedent hourly reservoir inflows were utilized. The combination of antecedent (time step from t to t–24) and future (time step from t + 1 to t + 24) rainfall was used as the third combination, namely C3. In the final combination, namely C4, all inputs were considered, including the antecedent rainfall, the antecedent reservoir inflows, and the future rainfall.
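
The four input combinations can be assembled from the hourly series with simple slicing, as in the sketch below. The ordering of the features inside each vector and the use of the same 25-value window (t − 24 to t) for the antecedent inflows in C2 and C4 are assumptions; the function name `build_inputs` and the toy series are hypothetical.

```python
def build_inputs(rain, inflow, t, combo, window=24):
    """Return the input vector for time index t under combination C1-C4."""
    if t - window < 0 or t + window >= len(rain):
        raise ValueError("time index too close to the ends of the record")
    past_rain = rain[t - window: t + 1]           # rainfall at t-24 ... t
    past_inflow = inflow[t - window: t + 1]       # inflow at t-24 ... t
    future_rain = rain[t + 1: t + window + 1]     # rainfall at t+1 ... t+24
    if combo == "C1":
        return past_rain
    if combo == "C2":
        return past_inflow
    if combo == "C3":
        return past_rain + future_rain
    if combo == "C4":
        return past_rain + past_inflow + future_rain
    raise ValueError("unknown combination: " + combo)


# Toy hourly series of 100 values each.
rain = [float(i) for i in range(100)]
inflow = [100.0 + i for i in range(100)]

sizes = {c: len(build_inputs(rain, inflow, 30, c))
         for c in ("C1", "C2", "C3", "C4")}
```

With a 24-h window the vectors hold 25, 25, 49 and 74 values for C1 to C4, which makes explicit how much additional information C4 gives the model compared with C1 or C2.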

Table 2

Detailed results of performance evaluation using the CGBR model for two study areas (values averaged over 1–24-h lead times)

| Reservoir | Combination | CC | MAE (m³/s) | RMSE (m³/s) | NSE |
|---|---|---|---|---|---|
| Shihmen | C1 | 0.73 | 187.46 | 362.34 | 0.52 |
| Shihmen | C2 | 0.70 | 192.12 | 385.41 | 0.46 |
| Shihmen | C3 | 0.95 | 101.09 | 167.13 | 0.91 |
| Shihmen | C4 | 0.97 | 65.81 | 131.46 | 0.94 |
| Feitsui | C1 | 0.64 | 102.04 | 207.88 | 0.42 |
| Feitsui | C2 | 0.51 | 117.41 | 224.74 | 0.31 |
| Feitsui | C3 | 0.93 | 60.19 | 110.88 | 0.84 |
| Feitsui | C4 | 0.95 | 49.87 | 98.11 | 0.87 |

The CGBR model with the four combinations was employed to predict the reservoir inflows of the two study reservoirs for 1–24 h lead times. Based on the grid search approach with a learning rate of 0.1, the optimal numbers of trees in the CGBR model were 500 and 50 for the Shihmen and Feitsui Reservoirs, respectively. Figure 4 compares the measured and predicted reservoir inflows with a 24-h lead time for the four input combinations over the whole test dataset. The results indicated that the reservoir inflows predicted by the CGBR model with C3 and C4 were closer to the measured data than those predicted with C1 and C2. To further examine the difference between the results of the four combinations, scatter plots for a lead time of 24 h using the CGBR model with the four input combinations on all test datasets are shown in Figure 5. The results predicted with C3 and C4 lie close to the 45° line, whereas those predicted with C1 and C2 lie away from it. Based on this qualitative analysis (Figures 4 and 5), the CGBR model with C3 or C4 predicted the reservoir inflows accurately.
Figure 4

The predicted results showing a 24-h lead time using the CGBR model with four input combinations on testing dataset in the (a) Shihmen and (b) Feitsui Reservoirs.

Figure 5

The scatter plots by the CGBR model with four input combinations for a lead time of 24 h based on the testing dataset of the (a) Shihmen and (b) Feitsui Reservoirs.


For the quantitative analysis, four commonly used evaluation criteria, namely CC, MAE, RMSE, and NSE, were adopted to assess the performance of the four combinations. As shown in Table 2, C1 and C2 gave the worst prediction performance, indicating that a DD model based only on the antecedent rainfall or only on the antecedent reservoir inflows cannot produce reasonable and accurate predictions. The results also indicated that the DD model with C4 achieved the highest accuracy on all four criteria. Therefore, the optimal combination, C4, was employed in the following sections.
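For reference, the four criteria can be computed as in the sketch below. It uses the standard textbook definitions (Pearson correlation coefficient, mean absolute error, root-mean-square error, and Nash–Sutcliffe efficiency); the paper's own equations are not part of this section, so the exact forms used by the authors are assumed to match. The function name `evaluate` and the sample values are illustrative.

```python
import math


def evaluate(obs, pred):
    """Return CC, MAE, RMSE and NSE for paired observed/predicted inflows."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    sq_err = sum((o - p) ** 2 for o, p in zip(obs, pred))
    return {
        "CC": cov / math.sqrt(var_o * var_p),                     # Pearson correlation
        "MAE": sum(abs(o - p) for o, p in zip(obs, pred)) / n,    # same unit as the inflow
        "RMSE": math.sqrt(sq_err / n),                            # same unit as the inflow
        "NSE": 1.0 - sq_err / var_o,                              # 1 means a perfect fit
    }


# Illustrative values only (not measured data)
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
print(evaluate(observed, predicted))
```

CC and NSE are dimensionless, while MAE and RMSE carry the unit of the inflow, which is why Table 2 reports them in m³/s.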

Model performance evaluation on predictions in the lead times of 1–24 h

All datasets were first used to investigate the prediction capacity of the proposed model for lead times of 1–24 h, because the complete record poses a non-linear time-series problem against which the proposed model can be verified. Based on the grid search approach with a learning rate of 0.1, the optimal number of trees in the CGBR model was 100 for the Shihmen Reservoir and 20 for the Feitsui Reservoir.

For the Shihmen Reservoir, Figure 6 compares the measured test dataset with the inflows simulated by the four models for lead times of 6, 12, 18, and 24 h. As the lead time becomes longer, the simulation errors of all models grow. However, the results predicted by the proposed model agree closely with the measurements, while the ED-GRU model fits the measured data worst. To evaluate the simulation performance quantitatively, Table 3 summarizes the CC, MAE, RMSE, NSE, and GA of all presented models at the training and test stages. For the test dataset with a lead time of 24 h, the RMSE values of the proposed, ED-GRU, ED-LSTM, and CNN-LSTM models were 85.75, 272.32, 187.46, and 180.54 m³/s, respectively. The proposed model thus yields the smallest RMSE, whereas the ED-GRU model gives the largest. In addition, the proposed model achieved the best overall performance in terms of CC, MAE, RMSE, NSE, and GA among the four presented models.
Table 3

Performance evaluation results by four presented DD models regarding the overall dataset

                              Training                             Test
Reservoir Model     Lead (h)  CC    MAE (m³/s)  RMSE (m³/s)  NSE   CC    MAE (m³/s)  RMSE (m³/s)  NSE   GA
Shihmen   ED-GRU    6         0.96  88.17       203.24       0.92  0.96  84.78       170.53       0.91  0.84
Shihmen   ED-GRU    12        0.97  89.41       183.99       0.93  0.95  89.80       177.99       0.90  0.97
Shihmen   ED-GRU    18        0.96  94.62       190.44       0.93  0.93  99.18       204.53       0.86  1.07
Shihmen   ED-GRU    24        0.91  126.24      294.45       0.83  0.87  125.95      272.32       0.76  0.92
Shihmen   ED-LSTM   6         0.98  67.30       159.53       0.95  0.97  64.33       140.29       0.94  0.88
Shihmen   ED-LSTM   12        0.98  70.45       150.69       0.95  0.97  70.84       148.43       0.93  0.99
Shihmen   ED-LSTM   18        0.98  72.16       145.89       0.96  0.96  74.91       160.16       0.92  1.10
Shihmen   ED-LSTM   24        0.97  84.27       193.32       0.93  0.95  86.44       187.46       0.89  0.97
Shihmen   CNN-LSTM  6         0.99  58.64       104.44       0.98  0.97  74.26       135.05       0.94  1.29
Shihmen   CNN-LSTM  12        0.99  67.45       115.99       0.97  0.96  89.65       165.26       0.91  1.42
Shihmen   CNN-LSTM  18        0.99  75.82       125.14       0.97  0.96  95.25       168.42       0.91  1.35
Shihmen   CNN-LSTM  24        0.98  85.22       142.19       0.96  0.95  102.96      180.54       0.89  1.27
Shihmen   Proposed  6         1.00  24.31       48.62        1.00  1.00  22.93       47.13        0.99  0.97
Shihmen   Proposed  12        1.00  21.87       43.40        1.00  1.00  22.96       48.13        0.99  1.11
Shihmen   Proposed  18        1.00  24.41       50.03        0.99  1.00  24.10       52.43        0.99  1.05
Shihmen   Proposed  24        0.99  39.44       78.80        0.99  0.99  39.59       85.75        0.98  1.09
Feitsui   ED-GRU    6         0.76  72.81       205.49       0.55  0.77  66.30       183.36       0.57  0.89
Feitsui   ED-GRU    12        0.69  81.66       233.99       0.42  0.69  81.14       213.54       0.41  0.91
Feitsui   ED-GRU    18        0.64  87.62       248.84       0.34  0.61  92.36       234.09       0.30  0.94
Feitsui   ED-GRU    24        0.50  107.21      273.22       0.20  0.44  110.24      260.16       0.13  0.95
Feitsui   ED-LSTM   6         0.87  60.85       157.81       0.73  0.90  63.36       147.70       0.72  0.94
Feitsui   ED-LSTM   12        0.89  65.42       145.30       0.77  0.90  69.04       143.33       0.74  0.99
Feitsui   ED-LSTM   18        0.90  67.21       139.91       0.79  0.89  72.97       142.93       0.74  1.02
Feitsui   ED-LSTM   24        0.80  92.60       189.81       0.62  0.75  91.75       189.88       0.54  1.00
Feitsui   CNN-LSTM  6         0.98  28.01       56.05        0.97  0.94  48.33       102.46       0.87  1.83
Feitsui   CNN-LSTM  12        0.98  27.20       58.62        0.96  0.92  53.56       118.60       0.82  2.02
Feitsui   CNN-LSTM  18        0.98  28.70       58.37        0.96  0.92  55.56       120.32       0.81  2.06
Feitsui   CNN-LSTM  24        0.97  34.22       78.08        0.94  0.90  65.58       142.84       0.74  1.83
Feitsui   Proposed  6         0.99  19.65       38.60        0.98  0.99  25.91       47.53        0.97  1.23
Feitsui   Proposed  12        0.99  20.81       40.77        0.98  0.99  26.70       50.92        0.97  1.25
Feitsui   Proposed  18        0.99  20.07       39.37        0.98  0.99  26.37       51.82        0.97  1.32
Feitsui   Proposed  24        0.98  29.71       65.36        0.95  0.96  38.76       86.88        0.90  1.33
Figure 6

Comparisons of measured and predicted reservoir inflows for testing dataset with (a) 6-h, (b) 12-h, (c) 18-h, and (d) 24-h lead times in the Shihmen Reservoir.

For the Feitsui Reservoir, Figure 7 compares the reservoir inflow hydrographs simulated by the four models with the measured data at different lead times. The ED-GRU model underestimates inflows above 1,500; 1,000; 800; and 500 m³/s for the 6-, 12-, 18-, and 24-h lead times, respectively. In addition, the ED-LSTM model underestimates inflows above 1,000; 1,500; 2,000; and 2,500 m³/s for the 6-, 12-, 18-, and 24-h lead times, respectively. The proposed and CNN-LSTM models agree with the measured data to a similar degree. According to the quantitative results in Table 3, the proposed model had the smallest MAE and RMSE for lead times of 6, 12, 18, and 24 h in both the training and test stages, whereas the ED-GRU model had the largest. In addition, for the test dataset with a lead time of 24 h, the NSE obtained by the proposed model is 594.5, 67.2, and 22.0% higher than those of the ED-GRU, ED-LSTM, and CNN-LSTM models, respectively. Moreover, the proposed model produced a lead-time averaged GA smaller than 1.3, which is still relatively favorable and acceptable.
Figure 7

Comparisons of measured and predicted reservoir inflows for testing dataset with (a) 6-h, (b) 12-h, (c) 18-h, and (d) 24-h lead times in the Feitsui Reservoir.


The performance evaluation results in this section show that the proposed model, built in the framework of RE correction, produces accurate predictions overall for the Shihmen and Feitsui Reservoirs at lead times of 1–24 h when all datasets are considered.
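GA is not defined within this section. As a hedged observation, the GA column of Table 3 is reproduced by the ratio of the test RMSE to the training RMSE, so the sketch below assumes that definition. It recomputes GA for the proposed model in the Feitsui Reservoir and the mean over the four tabulated lead times; the variable names are illustrative.

```python
# (training RMSE, test RMSE) in m3/s, proposed model, Feitsui Reservoir (Table 3)
rmse_pairs = {
    6: (38.60, 47.53),
    12: (40.77, 50.92),
    18: (39.37, 51.82),
    24: (65.36, 86.88),
}

# Assumed definition: GA = test RMSE / training RMSE (values near 1 suggest little overfitting)
ga = {lead: test / train for lead, (train, test) in rmse_pairs.items()}
ga_rounded = {lead: round(value, 2) for lead, value in ga.items()}
mean_ga = sum(ga.values()) / len(ga)

print(ga_rounded)
print(round(mean_ga, 2))
```

Under this assumption the four ratios match the tabulated 1.23, 1.25, 1.32, and 1.33, and their mean stays below 1.3.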

Comparison of predictions for different typhoon or storm events

To further investigate the model performance, the capacity of the proposed model to predict reservoir inflows during typhoon or storm periods was also tested. First, Figure 8(a)–8(d) compares the measured data with the reservoir inflows simulated by the proposed model and the three DL models with 6-h, 12-h, 18-h, and 24-h lead times, respectively, for Typhoon Sinlaku (September 2008) in the Shihmen Reservoir. The ED-GRU model produces significantly larger prediction errors than the other models. In addition, the proposed model predicts the overall hourly trend of the two-peak reservoir inflow hydrograph well. Figure 9 compares the measured data with the reservoir inflows simulated by the four DD models with different lead times for Typhoon Jangmi (September 2008) in the Shihmen Reservoir. All models underestimate the measured peak inflow. However, the proposed model still provides slightly better predictions than the three DL models.
Figure 8

Comparisons of measured and predicted reservoir inflows for Typhoon Sinlaku (September 2008) with lead times of (a) 6 h, (b) 12 h, (c) 18 h, and (d) 24 h in the Shihmen Reservoir.

Figure 9

Comparisons of measured and predicted reservoir inflows for Typhoon Jangmi (September 2008) with lead times of (a) 6 h, (b) 12 h, (c) 18 h, and (d) 24 h in the Shihmen Reservoir.


Based on the quantitative results for PIE and TIE summarized in Table 4, for Typhoon Trami (August 2013), the lead-time averaged absolute PIE (LPIE) of the proposed, ED-GRU, ED-LSTM, and CNN-LSTM models is 7.1, 6.1, 22.0, and 13.6%, respectively; the ED-GRU model yields the smallest value and the proposed model the second smallest. For the other three events, the proposed model obtains the smallest LPIE: 0.3, 4.0, and 8.9% for Typhoon Krosa (October 2007), Typhoon Sinlaku (September 2008), and Typhoon Jangmi (September 2008), respectively. For the total reservoir inflows, the proposed model yields the smallest lead-time averaged absolute TIE (LTIE): 1.7, 1.9, 2.8, and 3.2% for Typhoon Krosa (October 2007), Typhoon Sinlaku (September 2008), Typhoon Jangmi (September 2008), and Typhoon Trami (August 2013), respectively.

Table 4

The summary evaluation results concerning the different typhoon or storm events for two study reservoirs

                                                      PIE (%)                                 TIE (%)
Reservoir Event                             Lead (h)  Proposed  ED-GRU    ED-LSTM   CNN-LSTM  Proposed  ED-GRU    ED-LSTM   CNN-LSTM
Shihmen   Typhoon Krosa (October 2007)      6         −0.42     −3.41     −1.23     −1.88     −1.89     −16.84    −10.69    −3.43
Shihmen   Typhoon Krosa (October 2007)      12        −0.22     −1.82     −0.80     −2.08     −2.19     −11.98    −9.00     −1.26
Shihmen   Typhoon Krosa (October 2007)      18        −0.36     −1.22     −0.49     −2.07     −4.80     −9.82     −6.10     −3.56
Shihmen   Typhoon Krosa (October 2007)      24        −2.20     −1.41     −2.50     −1.43     −9.41     −12.44    −17.21    −7.17
Shihmen   Typhoon Sinlaku (September 2008)  6         −0.99     −4.97     −11.20    −6.32     −2.55     −18.02    −15.25    −5.16
Shihmen   Typhoon Sinlaku (September 2008)  12        −1.37     5.93      −11.35    −10.15    −4.92     −20.68    −11.80    −9.44
Shihmen   Typhoon Sinlaku (September 2008)  18        1.01      −0.41     −9.66     −16.37    −1.61     −30.54    −7.48     −6.73
Shihmen   Typhoon Sinlaku (September 2008)  24        −10.71    −32.45    −38.81    −23.96    −3.27     −44.84    −19.81    −7.73
Shihmen   Typhoon Jangmi (September 2008)   6         −11.65    −27.07    −30.16    −24.13    −3.77     −11.16    −14.98    −8.46
Shihmen   Typhoon Jangmi (September 2008)   12        −16.33    −34.59    −34.41    −29.99    −3.51     −13.51    −23.55    −11.41
Shihmen   Typhoon Jangmi (September 2008)   18        −17.82    −32.65    −37.15    −33.66    −5.65     −17.37    −24.13    −11.41
Shihmen   Typhoon Jangmi (September 2008)   24        −14.07    −22.02    −30.89    −28.83    −3.00     −22.16    −24.88    −12.89
Shihmen   Typhoon Trami (August 2013)       6         1.98      −9.81     −21.82    0.40      −2.71     −3.91     −9.09     13.92
Shihmen   Typhoon Trami (August 2013)       12        −3.73     −8.94     −23.91    15.18     1.78      4.22      −8.72     28.37
Shihmen   Typhoon Trami (August 2013)       18        5.99      1.13      −20.17    12.82     −1.90     7.36      −4.41     29.72
Shihmen   Typhoon Trami (August 2013)       24        21.03     4.42      −22.25    25.98     7.71      2.50      −9.05     33.35
Feitsui   Typhoon Dujuan (September 2015)   6         −13.14    −43.15    −47.67    5.37      −3.40     −40.81    −31.28    0.03
Feitsui   Typhoon Dujuan (September 2015)   12        −13.23    −72.37    −47.43    6.15      −5.58     −56.91    −23.66    −0.98
Feitsui   Typhoon Dujuan (September 2015)   18        −13.42    −77.10    −40.56    4.78      −4.50     −64.78    −19.54    −0.14
Feitsui   Typhoon Dujuan (September 2015)   24        −20.45    −76.46    −44.86    −9.02     −10.85    −69.41    −31.10    −16.31
Feitsui   Typhoon Megi (September 2016)     6         −27.49    −15.56    −42.97    −53.37    −11.80    −24.12    −37.86    −36.20
Feitsui   Typhoon Megi (September 2016)     12        −26.72    −55.57    −61.91    −53.49    −13.07    −44.11    −42.32    −45.35
Feitsui   Typhoon Megi (September 2016)     18        −31.18    −70.73    −55.60    −54.15    −13.38    −57.77    −44.47    −42.34
Feitsui   Typhoon Megi (September 2016)     24        −28.27    −74.01    −56.00    −56.63    −18.31    −63.89    −50.67    −45.95
Feitsui   Storm (October 2017)              6         −24.19    −2.03     16.72     −29.13    −9.10     −20.20    −21.76    −16.63
Feitsui   Storm (October 2017)              12        −5.30     −16.91    −10.22    −29.19    −8.62     −30.70    −22.57    −24.19
Feitsui   Storm (October 2017)              18        1.62      −28.57    −22.11    −28.56    −7.61     −40.36    −25.64    −22.59
Feitsui   Storm (October 2017)              24        −3.40     −27.65    −23.19    −29.44    −11.34    −48.10    −31.93    −29.62
Feitsui   Typhoon Mitag (September 2019)    6         −19.76    −24.36    −62.79    −37.54    −9.17     −24.43    −23.15    −16.80
Feitsui   Typhoon Mitag (September 2019)    12        −29.32    −62.77    −62.21    −49.55    −10.47    −38.07    −20.50    −26.09
Feitsui   Typhoon Mitag (September 2019)    18        −34.50    −75.71    −54.19    −58.85    −9.82     −45.84    −19.34    −30.81
Feitsui   Typhoon Mitag (September 2019)    24        −39.96    −81.06    −54.80    −58.68    −17.68    −50.36    −22.84    −36.31

For the Feitsui Reservoir, Figure 10(a)–10(d) compares the measured data with the reservoir inflows simulated by the four DD models with lead times of 6, 12, 18, and 24 h for Typhoon Dujuan (September 2015). The CNN-LSTM model agrees well overall with the measured hourly reservoir inflow data. Figure 11 compares the measured data with the reservoir inflows simulated by the four DD models with different lead times for Storm (October 2017). This event presents complex non-linear properties, with multi-peak hydrographs and extended periods of up to 70 h. The proposed model gives the best predictions, with relatively small errors in the peak inflow and the time to peak. The three DL models produce larger errors, especially the ED-GRU model.
Figure 10

Comparisons of measured and predicted reservoir inflows for Typhoon Dujuan (September 2015) with lead times of (a) 6 h, (b) 12 h, (c) 18 h, and (d) 24 h in the Feitsui Reservoir.

Figure 11

Comparisons of measured and predicted reservoir inflows for Storm (October 2017) with lead times of (a) 6 h, (b) 12 h, (c) 18 h, and (d) 24 h in the Feitsui Reservoir.


According to the quantitative results for PIE and TIE summarized in Table 4, for Typhoon Dujuan (September 2015), the LPIE of the proposed, ED-GRU, ED-LSTM, and CNN-LSTM models is 15.1, 67.3, 45.1, and 6.3%, respectively; the CNN-LSTM model yields the smallest value and the proposed model the second smallest. For the other three events, the LPIE of the proposed model is 28.4, 8.6, and 30.9% for Typhoon Megi (September 2016), Storm (October 2017), and Typhoon Mitag (September 2019), respectively. Regarding the total reservoir inflows for Typhoon Dujuan (September 2015), the LTIE of the proposed, ED-GRU, ED-LSTM, and CNN-LSTM models is 6.1, 58.0, 26.4, and 4.4%, respectively; the CNN-LSTM model predicts the total inflow best and the proposed model second best. For the other three events, the proposed model yields the smallest LTIE among the presented models: 14.1, 9.2, and 11.8% for Typhoon Megi (September 2016), Storm (October 2017), and Typhoon Mitag (September 2019), respectively.
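The LPIE values quoted for Typhoon Dujuan can be recomputed from the Table 4 entries as the mean of the absolute PIE over the four tabulated lead times. The sketch below does this for the four models; the dictionary and function names are illustrative, and the same routine applied to the TIE columns gives the LTIE.

```python
# PIE (%) at lead times of 6, 12, 18 and 24 h, Typhoon Dujuan, Feitsui Reservoir (Table 4)
pie_dujuan = {
    "Proposed": [-13.14, -13.23, -13.42, -20.45],
    "ED-GRU": [-43.15, -72.37, -77.10, -76.46],
    "ED-LSTM": [-47.67, -47.43, -40.56, -44.86],
    "CNN-LSTM": [5.37, 6.15, 4.78, -9.02],
}


def lead_time_mean_abs(errors):
    """Average the absolute percentage errors over the lead times."""
    return sum(abs(e) for e in errors) / len(errors)


lpie = {model: round(lead_time_mean_abs(v), 1) for model, v in pie_dujuan.items()}
print(lpie)
```

Taking absolute values before averaging keeps over- and under-predictions at different lead times from cancelling each other.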

Improved accuracy of the proposed model

The Taylor diagrams in Figure 12 summarize three essential statistics: RMSE, CC, and standard deviation. The result of the proposed model lies closer to the measured value than those of the three DL models.
Figure 12

Taylor diagrams related to the results of the three DL models and the proposed model for the (a) Shihmen and (b) Feitsui Reservoirs.


To further investigate the improved accuracy of the proposed model, Table 5 lists the overall simulation performance in terms of LRMSE, LE-PIE, and LE-TIE. The LRMSE is defined as the lead-time averaged RMSE. The LE-PIE (or LE-TIE) is defined as the lead-time and event-averaged PIE (or TIE), calculated by taking the absolute values of PIE (or TIE) and then averaging over all lead times and events.
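As a worked check of this definition, the sketch below averages the absolute Table 4 entries of the proposed model over the four tabulated lead times and the four Feitsui events. The variable and function names are illustrative, and the use of only the 6-, 12-, 18-, and 24-h lead times is an assumption based on what Table 4 reports.

```python
# Proposed model, Feitsui Reservoir: errors (%) at 6, 12, 18, 24 h per event (Table 4)
pie = {
    "Dujuan 2015": [-13.14, -13.23, -13.42, -20.45],
    "Megi 2016": [-27.49, -26.72, -31.18, -28.27],
    "Storm 2017": [-24.19, -5.30, 1.62, -3.40],
    "Mitag 2019": [-19.76, -29.32, -34.50, -39.96],
}
tie = {
    "Dujuan 2015": [-3.40, -5.58, -4.50, -10.85],
    "Megi 2016": [-11.80, -13.07, -13.38, -18.31],
    "Storm 2017": [-9.10, -8.62, -7.61, -11.34],
    "Mitag 2019": [-9.17, -10.47, -9.82, -17.68],
}


def lead_event_mean_abs(errors_by_event):
    """Mean of |error| over every lead time of every event."""
    values = [abs(e) for errors in errors_by_event.values() for e in errors]
    return sum(values) / len(values)


le_pie = lead_event_mean_abs(pie)
le_tie = lead_event_mean_abs(tie)
print(round(le_pie, 2), round(le_tie, 2))
```

Under these assumptions the results agree with the Feitsui LE-PIE and LE-TIE entries for the proposed model in Table 5.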

Table 5

Summary evaluation results of overall performance by four presented DD models

Reservoir  Evaluation criterion  Proposed  ED-GRU  ED-LSTM  CNN-LSTM
Shihmen    LRMSE (m³/s)          58.36     206.34  159.09   162.32
Shihmen    LE-PIE (%)            5.09      12.02   18.55    14.70
Shihmen    LE-TIE (%)            2.41      15.46   13.51    12.13
Feitsui    LRMSE (m³/s)          59.29     222.79  155.96   121.06
Feitsui    LE-PIE (%)            20.75     50.25   43.95    35.24
Feitsui    LE-TIE (%)            10.29     44.99   29.29    24.40

As shown in Table 5, the proposed model achieved the lowest prediction errors in terms of LRMSE, LE-PIE, and LE-TIE among the four presented DD models. For the Shihmen Reservoir, the LRMSE of the proposed model is 71.7, 63.3, and 64.0% lower than those of the ED-GRU, ED-LSTM, and CNN-LSTM models, respectively. For the Feitsui Reservoir, the corresponding reductions are 73.4, 62.0, and 51.0%. These results indicate that the proposed model improves the overall prediction accuracy and outperforms the three DL models.

Moreover, Table 5 indicates that, for the Shihmen Reservoir, the proposed model improves the LE-PIE by 57.7, 72.6, and 65.4% compared with the ED-GRU, ED-LSTM, and CNN-LSTM models, respectively. For the Feitsui Reservoir, the corresponding improvements are 58.7, 52.8, and 41.1%. Furthermore, when the LE-TIE values are included in the evaluation, the overall accuracy improvement of the proposed model over the three DL models averages 66.2%. The model validation results indicate that the proposed model is accurate and efficient, and that it is superior to the three DL models in simulating the peak and total reservoir inflows. It is thus helpful for practical application in reservoir operation and flood control.
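The percentages above follow from the Table 5 entries as relative reductions, (baseline − proposed) / baseline × 100. The sketch below recomputes them. How the 66.2% figure was averaged is not spelled out in the text; one reading that reproduces it, assumed here, is the mean of the LE-PIE and LE-TIE improvements over both reservoirs and all three DL models. The names `table5` and `reductions` are illustrative.

```python
# Table 5 rows as (Proposed, ED-GRU, ED-LSTM, CNN-LSTM)
table5 = {
    ("Shihmen", "LRMSE"): (58.36, 206.34, 159.09, 162.32),
    ("Shihmen", "LE-PIE"): (5.09, 12.02, 18.55, 14.70),
    ("Shihmen", "LE-TIE"): (2.41, 15.46, 13.51, 12.13),
    ("Feitsui", "LRMSE"): (59.29, 222.79, 155.96, 121.06),
    ("Feitsui", "LE-PIE"): (20.75, 50.25, 43.95, 35.24),
    ("Feitsui", "LE-TIE"): (10.29, 44.99, 29.29, 24.40),
}


def reductions(row):
    """Relative error reduction (%) of the proposed model against each DL model."""
    proposed, *baselines = row
    return [(b - proposed) / b * 100.0 for b in baselines]


improvement = {key: reductions(row) for key, row in table5.items()}

# Assumption: the overall figure averages the LE-PIE and LE-TIE improvements only
overall = [v for (res, crit), vals in improvement.items() if crit != "LRMSE" for v in vals]
overall_mean = sum(overall) / len(overall)

for key, vals in improvement.items():
    print(key, [round(v, 1) for v in vals])
print("overall:", round(overall_mean, 1))
```

Including the LRMSE reductions in the average would give a slightly lower figure, which is why the LE-PIE and LE-TIE reading is assumed here.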

This study implemented three DL techniques, ED-GRU, ED-LSTM, and CNN-LSTM, to model the reservoir inflows for lead times of 1–24 h in the Shihmen and Feitsui Reservoirs, Taiwan. The CGBR model, a well-known ensemble learning technique, was combined with the three DL models through the RE correction approach. Measured hourly rainfall and reservoir inflow data covering more than 10 years were used to train and construct the proposed model, which can predict the non-linear time-series behavior of reservoir inflow on an hourly timescale. Seven evaluation indicators were employed to compare the estimates with the measured data and to analyze the inflow estimation performance of the proposed model comprehensively. The results demonstrate that coupling the three DL models and the CGBR model with RE correction provides accurate and satisfactory predictions of the peak and total reservoir inflows.

The significant contributions of this study are as follows:

  • (1) The proposed model can accurately resolve reservoir inflows with multi-peak hydrographs and prolonged periods. It also has a clear advantage over the standalone DL models in resolving the non-linear time-series evolution of the reservoir inflow.

  • (2) The proposed model improves the overall accuracy for lead times of 1–24 h by an average of 66.2% compared with the three standalone DL models (i.e., ED-GRU, ED-LSTM, and CNN-LSTM).

  • (3) The LTIE values of the proposed model are within 3% and 11% for the Shihmen and Feitsui Reservoirs, respectively. Thus, the proposed model can be considered an efficient, accurate, and reliable tool for reservoir inflow forecasting.

This study applied the proposed model to simulate hourly reservoir inflows for lead times of 1–24 h, which can assist reservoir managers in the early warning of flood operations during typhoons or extreme rainfall events. However, the proposed model is built mainly on DL models, which are black-box models. Hence, applying the DL models to hydrology problems does not account for the physical hydrological processes. Accordingly, the proposed model is limited in interpretability, and its prediction performance is strongly affected by the quality of the hydrological dataset used. To overcome these problems, future studies should consider combining DD and HP models to share their advantages and minimize their drawbacks. Additionally, integrating a high-resolution (5 km) gridded rainfall dataset could improve the accuracy of the rainfall inputs of the model. Furthermore, the proposed model with RE correction could be extended to a DD framework with a monthly temporal scale and applied to modeling drought indicators for the main reservoirs in Taiwan, enhancing the model's applicability.

The authors thank the Northern Region Water Resources Office, Water Resources Agency, and Ministry of Economic Affairs, Taiwan, for providing the rainfall and reservoir inflow data of the Shihmen Reservoir. The measured rainfall and reservoir inflow data of the Feitsui Reservoir provided by the Taipei Feitsui Reservoir Administration, Taipei City Government, Taiwan, are also acknowledged.

All relevant data are included in the paper or its Supplementary Information.

The authors declare there is no conflict.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).