Abstract
Rainfall is a precious water resource, especially for Shenzhen, where local water resources are scarce. An effective rainfall prediction model is therefore essential for improving water supply efficiency and water resources planning in Shenzhen. In this study, a deep learning model based on the zero-sum game (ZSG) was proposed to predict ten-day rainfall, regular models were constructed for comparison, and cross-validation was performed to further compare the generalization ability of the models. The sliding window mechanism, differential evolution genetic algorithm, and discrete wavelet transform were developed to address data non-stationarity, local optimal solutions, and noise filtration, respectively. The k-means clustering algorithm was used to discover potential laws in the dataset and provide a reference for the sliding window. Mean square error (MSE), Nash–Sutcliffe efficiency coefficient (NSE), and mean absolute error (MAE) were applied for model evaluation. The results indicated that the ZSG better guided the parameter adjustment process and improved the generalization ability of the models, and that the generalization ability of the bidirectional model was superior to that of the unidirectional model. The ZSG-based models showed clear superiority over the regular models, providing the lowest MSE (1.29%) and MAE (7.5%) and the highest NSE (21.75%) in ten-day rainfall prediction.
HIGHLIGHTS
Proposing a deep learning model based on the zero-sum game.
Extending the unidirectional propagation model to a bidirectional propagation model.
Introducing a sliding window mechanism to address data non-stationarity.
Designing a differential evolution genetic algorithm to escape local optimal solutions.
Using the discrete wavelet transform to filter out noise.
INTRODUCTION
Shenzhen is densely populated and built up, and its degree of modernization is high. It therefore faces a serious shortage of local water resources, and its water supply is heavily dependent on water diversion. Rainfall is a precious water resource that can be used in urban water supply scheduling and can effectively relieve water supply pressure. However, uncertainty in rainfall depth is an important cause of inappropriate water diversion plans for the following year. Hence, accurate rainfall prediction is crucial to water resources planning and to the water diversion plan. Mechanistic models depend, to a certain extent, on the external physical environment and require a large amount of measured data. The measured data often contain missing values and outliers, the external physical environment keeps changing, and the modeling time of mechanistic models is long. These problems greatly affect the predictions of mechanistic models and may weaken the effectiveness of the predicted results. In contrast, a large number of studies show that data-driven models achieve high accuracy in rainfall prediction (Bagirov et al. 2017; Ni et al. 2020; Ridwan et al. 2021).
Hydrological data, such as rainfall, often have potential long-term trends and periodic patterns, which can be scientifically predicted with the appropriate tools. In recent years, data-driven methods, such as regression analysis (Danandeh Mehr et al. 2019; Ali et al. 2020), artificial neural networks (ANN) (Jaddi & Abdullah 2018; Sulaiman & Wahab 2018; Liu et al. 2019), time series models (Le et al. 2019; Poornima & Pushpalatha 2019), and deep learning models (Yen et al. 2019), have been used to predict rainfall. Data-driven models can discover the potential quantitative relationships in the data through self-learning and autonomously learn previously unknown knowledge; their modeling speed is fast, their accuracy is high, and they are not affected by changes in the external physical environment. Ramana et al. (2013) applied wavelet analysis and an ANN to effectively predict monthly rainfall and showed that wavelet neural network models are more effective than ANN models. Liu & Shi (2019) improved monthly rainfall prediction with genetic programming. Many similar studies also show that data-driven models perform well in monthly rainfall prediction.
However, Shenzhen has a large floating population, its water supply dispatch is dominated by short-term dispatching, and the dispatching period is generally one week to ten days. Meanwhile, reservoirs are the main water storage facilities, and the water supply reservoirs are mainly medium and small reservoirs. The time span of monthly rainfall is too long to support water supply dispatching, so rainfall cannot be used effectively, and a great deal of water tends to be abandoned from the reservoirs in the flood season. Therefore, it is of great importance to accurately predict ten-day rainfall.
In the process of ten-day rainfall prediction, it is found that the stationarity (Ng et al. 2020) of ten-day rainfall data is worse than that of monthly rainfall data. Because of this non-stationarity, the generalization ability of data-driven models tends to deteriorate: ANN models tend to overfit (Sari et al. 2017; Brodeur et al. 2020), and regression models easily produce spurious regression equations. Although Estévez et al. (2020) used a wavelet neural network model to handle the non-stationarity of local weather conditions in monthly rainfall, the non-stationarity of ten-day rainfall data is more severe. In addition, the learning ability of the ANN is insufficient, which leads to poor predicted results. Most importantly, there is no standard to evaluate the advantages and disadvantages of the modeling process.
The purpose of this study is to solve the above problems and accurately predict ten-day rainfall in Shenzhen. A deep learning model based on the zero-sum game (ZSG) (Dahmani et al. 2020), coupling a bidirectional (Chen et al. 2014) long short-term memory (BiLSTM) network and a support vector machine (SVM), was proposed for ten-day rainfall prediction. The sliding window (SW) mechanism and the k-means clustering algorithm (KCA) were developed for data pre-processing, and adaptive moment estimation (AME), the differential evolution genetic algorithm (DEGA), and least squares (LS) were used to solve the model. The accuracy of the ZSG-based models was compared with that of the regular models in terms of the mean square error (MSE), Nash–Sutcliffe efficiency coefficient (NSE), and mean absolute error (MAE), and cross-validation (CV) was performed to further compare the prediction accuracy of the models. The influence of the SW and the discrete wavelet transform (DWT) on prediction accuracy was further compared and discussed. The methods in this study can not only provide guidance for hydrologic forecasting in other fields, but also provide a reference for solving the problems of non-stationarity and noise filtration.
STUDY AREA AND DATA
Shenzhen (Figure 1), located on the shore of the South China Sea and adjacent to Hong Kong, is a special economic zone of China and the first fully urbanized city in the country. Shenzhen has a subtropical marine climate, and its annual average temperature is 22.3 °C. The rapid growth of the population and the economy has increased water demand, which poses a huge challenge for water supply in Shenzhen. However, Shenzhen has abundant rainfall, with an average annual amount of 1,830 mm. The rainfall is concentrated from April to September, accounting for more than 80% of the annual total. The abundant rainfall in the flood season increases the amount of water abandoned from reservoirs, leading to serious waste of water resources. In this study, the daily average rainfall for the 40-year period from 1981 to 2020 in Shenzhen was used; the data from January 1981 to December 2020 are from the Meteorological Bureau of Shenzhen Municipality. The daily data are aggregated into ten-day data, and each year consists of 36 ten-day periods, so the 40 years of daily data yield 1,440 ten-day values and the length of the dataset in this study is 1,440.
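As an illustration of this aggregation, a minimal sketch is given below. It assumes the usual ten-day convention of three periods per month (days 1–10, 11–20, and 21 to the month end), which is the only split that yields 36 periods per year; the function name and the use of pandas are illustrative assumptions, not the authors' code.

```python
# Minimal sketch of daily-to-ten-day aggregation (assumption: three ten-day
# periods per month, i.e. days 1-10, 11-20, and 21 to month end).
import pandas as pd

def to_tendaily(daily: pd.Series) -> pd.Series:
    """daily: rainfall values indexed by a DatetimeIndex; returns ten-day sums."""
    period = daily.index.day.map(lambda d: 0 if d <= 10 else (1 if d <= 20 else 2))
    return daily.groupby([daily.index.year, daily.index.month, period]).sum()

# Example: 40 years of daily data -> 40 * 36 = 1,440 ten-day values.
```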
METHODS
Model development

First, the measured data (Y) and noise data (Noise) are imported into the SVM model for training, so that the SVM model learns to label the measured data as 1 and all other data as 0; the aim is that only the measured data are recognized as 1. If the error between the predicted data generated by the BiLSTM model and the measured data is large, the discriminant result of the SVM model is 0. For the predicted data to be discriminated as 1, they must have a small error with respect to the measured data, which clarifies the optimization direction for the BiLSTM model. Therefore, the smaller the error between the measured and predicted data, the closer the discriminant result is to 1. Then, the input data (X) are imported into the BiLSTM model to generate the predicted data. After the generated data are discriminated by the SVM model, the result is fed back to the BiLSTM model, and the parameter adjustment process can be better guided through the ZSG between the two models. The entire modeling process is therefore the ZSG process of the SVM-BiLSTM (SBiLSTM) model.
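A minimal sketch of this ZSG loop is given below, assuming a Keras BiLSTM generator and a scikit-learn SVC discriminator. The window length, round counts, and the use of the mean discriminator probability as a stopping signal (rather than a back-propagated loss term) are illustrative assumptions, not the paper's exact implementation.

```python
# Minimal sketch of the SVM-BiLSTM (SBiLSTM) zero-sum game loop (assumptions noted above).
import numpy as np
import tensorflow as tf
from sklearn.svm import SVC

WINDOW = 36  # assumed input window of ten-day values

def build_generator():
    return tf.keras.Sequential([
        tf.keras.Input(shape=(WINDOW, 1)),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),
        tf.keras.layers.Dense(1),
    ])

def zsg_train(x_train, y_train, rounds=20, epochs_per_round=10):
    """x_train: (samples, WINDOW, 1); y_train: (samples,) measured values."""
    gen = build_generator()
    gen.compile(optimizer=tf.keras.optimizers.Adam(0.01), loss="mse")
    noise = np.random.normal(y_train.mean(), y_train.std(), size=y_train.shape)
    svm = SVC(probability=True)
    for _ in range(rounds):
        # (a) discriminant process: SVM learns measured data -> 1, noise -> 0
        feats = np.concatenate([y_train, noise]).reshape(-1, 1)
        labels = np.concatenate([np.ones(len(y_train)), np.zeros(len(noise))])
        svm.fit(feats, labels)
        # (b) generation process: BiLSTM predicts, SVM scores the predictions
        gen.fit(x_train, y_train, epochs=epochs_per_round, verbose=0)
        pred = gen.predict(x_train, verbose=0).ravel()
        score = svm.predict_proba(pred.reshape(-1, 1))[:, 1].mean()
        if score > 0.99:  # predictions are (nearly) indistinguishable from measured data
            break
    return gen
```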
Process of ZSG: (a) the discriminant process; (b) the generation process of predicted data.
SW
Optimizer
The genetic algorithm (GA) is a commonly used heuristic algorithm. The GA (Delgoda et al. 2017; Sotomayor et al. 2018) drives the evolution of individuals in the population, but population degradation may also occur. Sometimes this phenomenon is pronounced, resulting in poor fitness: even after many generations of evolution, the individuals in the population do not improve. If a self-adaptively decaying mutation probability is set, the probability value that decreases with iterations cannot help individuals obtain higher fitness, so the GA cannot converge. Therefore, a DEGA is developed in this study: mutants are generated from the difference between parent individuals, and new individuals are produced by crossing the mutants with the parents, which resolves the degradation phenomenon.
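A minimal sketch of the differential mutation idea is given below, assuming real-valued genomes and a greedy parent-versus-trial selection; the scale factor F and crossover rate CR are illustrative values, not the paper's settings.

```python
# Minimal sketch of one DEGA generation: differential mutation + crossover + selection.
import numpy as np

def dega_step(pop, fitness_fn, F=0.5, CR=0.7, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    n, dim = pop.shape
    new_pop = pop.copy()
    for i in range(n):
        # mutant from the difference of two parents added to a third parent
        a, b, c = pop[rng.choice([j for j in range(n) if j != i], 3, replace=False)]
        mutant = a + F * (b - c)
        # cross the mutant with parent i
        trial = np.where(rng.random(dim) < CR, mutant, pop[i])
        # keep the trial only if it improves fitness, preventing degradation
        if fitness_fn(trial) > fitness_fn(pop[i]):
            new_pop[i] = trial
    return new_pop
```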
In addition, in order to reduce fluctuation in the training process, an adaptive learning rate is set in this study. The initial learning rate is set to 0.01 so that the model can converge quickly. If the training error continues to decrease, the learning rate does not change. When the error rebounds, a monitor starts to track the following 100 rounds of training. If the errors of these 100 rounds continue to decrease, the learning rate remains unchanged; otherwise, the learning rate is multiplied by 0.9 to reduce the step size of parameter adjustment. The monitor is then turned off and must wait 100 rounds of training before it can be turned on again. When the error rebounds to more than twice the current error, the model is considered to have jumped out of a local optimal solution, so the learning rate is reset to 0.01.
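The rule above can be approximated as in the sketch below; the exact bookkeeping of the monitor is an assumption, while the constants (0.01, 0.9, 100 rounds, and the 2x rebound threshold) are the values quoted in the text.

```python
# Loose sketch of the adaptive learning-rate monitor described above.
def adapt_learning_rate(errors, lr=0.01, window=100, factor=0.9, reset_lr=0.01):
    """errors: list of per-round training errors; returns the final learning rate."""
    monitoring, start, wait = False, 0, 0
    for t in range(1, len(errors)):
        if wait > 0:                        # monitor is switched off for `window` rounds
            wait -= 1
            continue
        if errors[t] > 2 * errors[t - 1]:   # strong rebound: escaped a local optimum
            lr = reset_lr
        elif errors[t] > errors[t - 1] and not monitoring:
            monitoring, start = True, t     # rebound: start tracking the next 100 rounds
        elif monitoring and t - start >= window:
            if errors[t] >= errors[start]:  # no net improvement over the window
                lr *= factor                # shrink the parameter-adjustment step
            monitoring, wait = False, window
    return lr
```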
Exploratory data analysis
Before modeling, exploratory data analysis (Xiao et al. 2012) is carried out to find the potential laws of the dataset. In this study, the KCA (Kim & Parnichkun 2017; Hamid 2019) is applied to data of different time spans, and the results show an obvious law in the clustering of the annual data. The data are normalized to present the clustering results clearly.
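A minimal sketch of this exploratory clustering is given below, assuming min–max normalization of the 40 × 36 annual matrix and k-means with two clusters; breakpoints are read off where the cluster label switches.

```python
# Minimal sketch of the k-means exploratory analysis on annual data.
import numpy as np
from sklearn.cluster import KMeans

def annual_clusters(tendaily, start_year=1981):
    """tendaily: length-1440 array of ten-day rainfall (36 values per year)."""
    years = np.asarray(tendaily, float).reshape(-1, 36)            # 40 x 36 matrix
    norm = (years - years.min()) / (years.max() - years.min())     # min-max normalization
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(norm)
    breaks = [start_year + i for i in range(1, len(labels)) if labels[i] != labels[i - 1]]
    return labels, breaks   # breakpoint years where the cluster label changes
```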
According to Figure 4, the annual rainfall data from 1981 to 2020 are divided into two categories, and the annual data are split at four breakpoints: 1992, 2001, 2008, and 2017. Therefore, the annual rainfall has periodic patterns, which can provide a reference for the SW. Based on these laws, s1 is set to 36 and s2 is set to 2 to smooth the first-order SW series. As can be seen from Figure 5, the law of the second-order SW series is simple and its trend is relatively obvious, which makes it easier to construct data-driven models. The second-order SW series is used for modeling in this study, and the predicted results of the original time series can be obtained by reverse SW.
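The SW section above defines the exact operation used in this study; purely for illustration, the sketch below treats the SW as a moving-average smoothing whose reverse is a recursive reconstruction, which also shows why errors accumulate during reverse SW.

```python
# Illustrative sketch only: SW treated as a moving average, with a recursive reverse.
import numpy as np

def sw(series, s):
    """Forward SW with window s; output length shrinks by s - 1."""
    return np.convolve(np.asarray(series, float), np.ones(s) / s, mode="valid")

def reverse_sw(smoothed, head, s):
    """Reverse SW given the first s - 1 original values (`head`).
    Any error in `smoothed` feeds back into later terms, so errors accumulate."""
    out = list(np.asarray(head, float))
    for m in np.asarray(smoothed, float):
        out.append(s * m - sum(out[-(s - 1):]) if s > 1 else m)
    return np.array(out)

# Example: reverse_sw(sw(x, 36), x[:35], 36) reconstructs x up to rounding error.
```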
Model evaluation
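For reference, a minimal sketch of the three criteria used throughout the results (MSE, NSE, and MAE) is given below, assuming their standard definitions; the percentage scaling shown in the tables is applied afterwards.

```python
# Minimal sketch of the evaluation criteria, assuming the standard definitions.
import numpy as np

def evaluate(obs, pred):
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    mse = np.mean((obs - pred) ** 2)                                   # mean square error
    nse = 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)
    mae = np.mean(np.abs(obs - pred))                                  # mean absolute error
    return {"MSE": mse, "NSE": nse, "MAE": mae}   # tables report NSE and MAE as percentages
```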
RESULTS AND DISCUSSION
Training and validation results
The training results are shown in Table 1. All the models converge well on the training set. Except for the LSTM model, the evaluation criteria of the other three models differ little. The training results of the BiLSTM model are close to those of the SLSTM model, so the fitting ability of the BiLSTM model is close to that of the SLSTM model. According to the evaluation criteria, the error, fitting degree, and average deviation of the SBiLSTM model are superior to those of the other three models. Therefore, the fitting ability of the SBiLSTM model is the strongest and that of the LSTM model is the weakest, while the fitting abilities of the BiLSTM and SLSTM models are superior to that of the LSTM model.
Results of four models on the training set
| Model | MSE/10⁻⁴ | NSE/% | MAE/% |
|---|---|---|---|
| LSTM | 0.54 | 89.00 | 0.52 |
| BiLSTM | 0.17 | 96.65 | 0.27 |
| SLSTM | 0.17 | 96.49 | 0.27 |
| SBiLSTM | 0.1 | 98.03 | 0.2 |
Table 2 shows the evaluation criteria of the four models on the validation set. All four models show good validation results, which highlights the effectiveness of the deep learning models. The SBiLSTM model has the optimal MSE, NSE, and MAE, while the LSTM model has the worst. The validation results of the BiLSTM model are still very close to those of the SLSTM model, so the learning ability of the BiLSTM model is close to that of the SLSTM model. The validation results of both the BiLSTM and SLSTM models are superior to those of the LSTM model, so the learning abilities of the BiLSTM and SLSTM models are superior to that of the LSTM model.
Results of four models on the validation set
| Model | MSE/10⁻⁴ | NSE/% | MAE/% |
|---|---|---|---|
| LSTM | 0.15 | 94.26 | 0.31 |
| BiLSTM | 0.11 | 95.80 | 0.24 |
| SLSTM | 0.12 | 95.68 | 0.24 |
| SBiLSTM | 0.07 | 97.47 | 0.19 |
Test results
Table 3 presents the test results of the four models. On the test set, the MSE and MAE of the four models, sorted in ascending order, are: SBiLSTM, BiLSTM, SLSTM, and LSTM; the NSE, sorted from largest to smallest, follows the same order. Therefore, the SBiLSTM model has the minimum error and average deviation and the highest fitting degree; its generalization ability is the strongest, and it is the closest to an unbiased prediction. Compared with the BiLSTM model, the MSE of the SBiLSTM model decreases by 40%, the NSE increases by 1.46%, and the MAE decreases by 18.18%, so the prediction accuracy of the SBiLSTM model is superior to that of the BiLSTM model. Similarly, the prediction accuracy of the SLSTM model is superior to that of the LSTM model. Thus, the generalization ability of the SBiLSTM model is superior to that of the BiLSTM model, and the generalization ability of the SLSTM model is superior to that of the LSTM model. This shows that the ZSG can improve the generalization ability of the models and increase the prediction accuracy.
Results of four models on the test set
| Model | MSE/10⁻⁴ | NSE/% | MAE/% |
|---|---|---|---|
| LSTM | 0.24 | 90.27 | 0.38 |
| BiLSTM | 0.1 | 96.05 | 0.22 |
| SLSTM | 0.1 | 95.90 | 0.23 |
| SBiLSTM | 0.06 | 97.45 | 0.18 |
Compared with the LSTM model, the MSE of the BiLSTM model decreases by 58.33%, the NSE increases by 6.4%, and the MAE decreases by 42.11%, so the prediction accuracy of the BiLSTM model is superior to that of the LSTM model. Similarly, the prediction accuracy of the SBiLSTM model is superior to that of the SLSTM model. These results show that the generalization ability of the bidirectional model is superior to that of the unidirectional model. Since the BiLSTM model already has a relatively strong generalization ability, the improvement of the SBiLSTM model over it is modest; however, the prediction accuracy of the SLSTM model is much better than that of the LSTM model, and the prediction accuracy of the BiLSTM model is slightly superior to that of the SLSTM model. The generalization ability of the BiLSTM model is superior to that of the SLSTM model, indicating that bidirectional propagation is essential for improving generalization ability.
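These percentage changes correspond to relative differences computed from the values in Table 3; for the BiLSTM model versus the LSTM model, for example, (0.24 − 0.10)/0.24 ≈ 58.33%, (96.05 − 90.27)/90.27 ≈ 6.40%, and (0.38 − 0.22)/0.38 ≈ 42.11%.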
Violin parameters of second-order SW series of four models
| Model | CIUL/% | UQ/% | Median/% | LQ/% | IQR/% | CILL/% |
|---|---|---|---|---|---|---|
| LSTM | 11.79 | 5.66 | 3.13 | 1.57 | 4.09 | 0 |
| BiLSTM | 6.98 | 3.16 | 1.39 | 0.62 | 2.54 | 0 |
| SLSTM | 7.37 | 3.34 | 1.51 | 0.65 | 2.69 | 0 |
| SBiLSTM | 5.71 | 2.58 | 1.17 | 0.49 | 2.09 | 0 |
Predicted result distribution plot of second-order SW series of four models. Please refer to the online version of this paper to see this figure in colour: http://dx.doi.org/10.2166/aqua.2021.086.
The CIUL, UQ, LQ, IQR, and CILL are confidence interval upper limit, upper quartile, lower quartile, interquartile range, and confidence interval lower limit, respectively: IQR = UQ − LQ; confidence interval (CI) = CIUL − CILL.
The violin plot represents the distribution of the relative error of each predicted value. The wider the violin at a given interval, the higher the density of relative errors in that interval. The larger the height of the CI, the more dispersed the relative error distribution. If the height of the violin is large, the range of relative errors between the measured and predicted values is large, which reveals poor model stability.
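As an illustration, the violin parameters can be computed roughly as below; the boxplot-style 1.5 × IQR rule for the confidence limits is an assumption, since the text only states IQR = UQ − LQ and CI = CIUL − CILL.

```python
# Rough sketch of the violin parameters from an array of relative errors (in %).
import numpy as np

def violin_params(rel_err):
    rel_err = np.asarray(rel_err, float)
    lq, median, uq = np.percentile(rel_err, [25, 50, 75])
    iqr = uq - lq
    ciul = min(rel_err.max(), uq + 1.5 * iqr)       # assumed whisker-style upper limit
    cill = max(rel_err.min(), lq - 1.5 * iqr, 0.0)  # relative errors are non-negative
    return {"CIUL": ciul, "UQ": uq, "Median": median, "LQ": lq, "IQR": iqr, "CILL": cill}
```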
According to Table 4 and Figure 7, although the CILLs of the four models are all 0, the CIULs differ considerably. The CI height of the SBiLSTM model is the smallest, and the relative error at the widest part of its violin is the smallest. The relative errors of the SBiLSTM model are mostly distributed between 0 and 5.71%, and the SBiLSTM model has the optimal CIUL, UQ, median, LQ, IQR, and CILL. These results indicate that the relative error distribution of the SBiLSTM model is the best, so the stability of the SBiLSTM model is the strongest.
Compared with the SBiLSTM model, the SLSTM model has larger violin parameters, revealing a more dispersed relative error distribution. Similarly, the violin parameters of the BiLSTM model are superior to those of the LSTM model, and also to those of the SLSTM model. These results indicate that the relative error distribution of the BiLSTM model is superior to those of the LSTM and SLSTM models, revealing that bidirectional propagation is also critical to model stability.
During the process of reverse SW, the errors accumulate, and the prediction accuracy of the second-order SW series determines that of the first-order SW series. Therefore, we can infer that the first-order SW series of the SBiLSTM model has the minimum error. The first-order SW series is obtained by reverse SW (Table 5). The SBiLSTM model still has the smallest MSE and MAE and the highest NSE, and the ranking of prediction accuracy from high to low is: SBiLSTM, BiLSTM, SLSTM, and LSTM. The MSE, NSE, and MAE of the LSTM model are the worst: its fitting degree decreases markedly, and the error and average deviation between the measured and predicted values increase markedly.
Prediction results of first-order SW series of four models
| Model | MSE/10⁻⁴ | NSE/% | MAE/% |
|---|---|---|---|
| LSTM | 0.44 | 82.45 | 0.53 |
| BiLSTM | 0.22 | 91.19 | 0.37 |
| SLSTM | 0.23 | 90.81 | 0.38 |
| SBiLSTM | 0.12 | 95.06 | 0.26 |
Figure 8 presents the absolute error bar plot of the second-order and first-order SW series. The SBiLSTM model has the shortest error bars, while the error bars of the other three models all become longer, because the error accumulates during the process of reverse SW. Finally, after reversing the SW twice, the predicted results of the original series are obtained (Table 6). The error of the predicted values is small in the initial stage; however, as the error accumulates, the predicted values of the models begin to deviate from the measured values. The LSTM model deviates greatly from the measured values, while the other three models deviate only slightly (Figure 9).
Prediction results of original series of four models
| Model | MSE/% | NSE/% | MAE/% |
|---|---|---|---|
| LSTM | 1.93 | −10.68 | 9.89 |
| BiLSTM | 1.50 | 8.95 | 7.94 |
| SLSTM | 1.52 | 7.65 | 8.0 |
| SBiLSTM | 1.29 | 21.75 | 7.5 |
Absolute error bar plot of second-order SW series and first-order SW series.
According to Table 6, the MSE, NSE, and MAE of the four models, sorted from best to worst, are: SBiLSTM, BiLSTM, SLSTM, and LSTM. Since the measured values include many zero values, the NSE of the LSTM model becomes negative, while the NSE of the other three models remains positive. The SBiLSTM model has the smallest error and average deviation, and the MSE and MAE of the BiLSTM model are superior to those of the LSTM and SLSTM models.
Compared with the BiLSTM and SLSTM models, the SBiLSTM model has higher prediction accuracy, indicating that the ZSG improves the generalization ability of the model and that the generalization ability of the bidirectional model is superior to that of the unidirectional model.
CV
In order to further compare the prediction accuracy of the four models, cross-validation is performed. The predicted results of the original series are presented in Table 7 and Figure 10.
Prediction results of original series of four models on cross-validation set
| Model | MSE/% | NSE/% | MAE/% |
|---|---|---|---|
| LSTM | 1.93 | −4.23 | 9.87 |
| BiLSTM | 1.79 | −0.81 | 8.47 |
| SLSTM | 1.75 | 1.49 | 8.38 |
| SBiLSTM | 1.46 | 17.77 | 7.63 |
According to Table 7 and Figure 10, the prediction accuracy of the four models on the CV set is good and close to the results in Table 6, which reveals that the generalization ability of the deep learning models is very strong. Compared with the other three models, the SBiLSTM model has the optimal MSE, MAE, and NSE, indicating that the error, fitting degree, and average deviation between the measured and predicted values are optimal. The prediction accuracy of the SBiLSTM model is superior to that of the SLSTM model, and the prediction accuracy of the BiLSTM model is superior to that of the LSTM model. Therefore, the prediction accuracy of the ZSG-based deep learning models is superior to that of the regular deep learning models. These results also reveal that the ZSG can improve the generalization ability of models and that the bidirectional models have stronger generalization ability. In contrast to the results in Table 6, the prediction accuracy of the SLSTM model is superior to that of the BiLSTM model on the CV set, although the gap is not large; this is because the error of the second-order SW series of the SLSTM model is smaller.
DISCUSSION
In order to compare the influence of the DWT on prediction accuracy, the SBiLSTM model, which has the highest prediction accuracy, is selected to present the predicted results. The predicted results without DWT for the first-order SW series and the original series are shown in Tables 8 and 9, respectively. The prediction accuracy without DWT differs markedly from that with DWT, and the MSE, NSE, and MAE of the predicted results without DWT are obviously worse.
Prediction results of first-order SW series
| Model | MSE/10⁻⁴ (without DWT) | MSE/10⁻⁴ (DWT) | NSE/% (without DWT) | NSE/% (DWT) | MAE/% (without DWT) | MAE/% (DWT) |
|---|---|---|---|---|---|---|
| SBiLSTM | 1.20 | 0.12 | 52.00 | 95.06 | 1.00 | 0.26 |
| SBiLSTM-CV | 0.32 | 0.13 | 91.85 | 96.58 | 0.43 | 0.25 |
Prediction results of original series
| Model | MSE (without DWT) | MSE/% (DWT) | NSE (without DWT) | NSE/% (DWT) | MAE (without DWT) | MAE/% (DWT) |
|---|---|---|---|---|---|---|
| SBiLSTM | 2.82 | 1.29 | −160.71 | 21.75 | 1.02 | 7.5 |
| SBiLSTM-CV | 0.96 | 1.46 | −50.72 | 17.77 | 0.61 | 7.63 |
For the predicted results of the first-order SW series without DWT, all the evaluation criteria deteriorate to a certain extent, leading to larger errors during the process of reverse SW. As the error accumulates, the predicted values of the original series deteriorate seriously and the predicted results are completely distorted. A fitting plot of the predicted values is presented in Figure 11 to clearly show the influence of the DWT.
Fitting plot: (a) fitting plot of first-order SW series, (b) fitting plot of original series.
Obviously, noise filtration is essential for reverse SW. When the forward SW is carried out, the influence of noise is weakened to some extent; nevertheless, during the process of reverse SW, the influence of noise is amplified. If the noise is not filtered out, the effective data are gradually amplified into noise. According to Figure 11(a), the error of the predicted results without DWT increases obviously, while the error of the predicted results with DWT is small. As can be seen from Figure 11(b), the predicted results without DWT have a small error in the initial stage, but, owing to the accumulation of error, all the results are finally amplified into noise, while the predicted results with DWT still maintain a small error, indicating that the DWT can effectively filter out noise during the process of reverse SW.
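A minimal sketch of DWT-based noise filtration is given below, assuming PyWavelets with a db4 wavelet, three decomposition levels, and a universal soft threshold; the paper does not state its wavelet family or thresholding rule, so these are illustrative choices.

```python
# Minimal sketch of discrete wavelet transform denoising (assumptions noted above).
import numpy as np
import pywt

def dwt_denoise(series, wavelet="db4", level=3):
    coeffs = pywt.wavedec(series, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745     # noise level from the finest detail band
    thr = sigma * np.sqrt(2 * np.log(len(series)))     # universal threshold
    coeffs[1:] = [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[: len(series)]
```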
Meanwhile, in order to compare the influence of the SW on prediction accuracy, the predicted results of the SBiLSTM model without SW are compared with those with SW in terms of the violin plot (Figure 12 and Table 10).
Violin parameters of original series
| Model | CIUL | UQ | Median | LQ | IQR | CILL |
|---|---|---|---|---|---|---|
| SBiLSTM-SW | 2.01 | 1.01 | 0.74 | 0.34 | 0.67 | 0 |
| SBiLSTM-without SW | 4.66 | 2.12 | 0.78 | 0.43 | 1.69 | 0 |
| SBiLSTM-SW-CV | 2.58 | 1.24 | 0.74 | 0.35 | 0.89 | 0 |
| SBiLSTM-without SW-CV | 4.48 | 2.04 | 0.77 | 0.42 | 1.62 | 0 |
Violin plot of original series: (a) violin plot on original set, (b) violin plot on CV set.
According to the violin plot, the predicted results with SW are obviously superior to those without SW. Compared with the violin parameters without SW, the violin parameters with SW are better, and the IQR and CI are smaller.
Prediction results in 2022
Based on the data from 1981 to 2020, the rainfall amounts corresponding to an especially wet year, wet year, normal year, dry year, and especially dry year in Shenzhen can be calculated (Figure 13). The predicted rainfall in 2022 is 1,821.14 mm, which corresponds to a normal year (Figure 14).
In 2022, the rainfall from April to September accounts for 85.26% of the annual rainfall, and this is a period of heavy water supply in Shenzhen. The predicted ten-day rainfall can provide strong support for the following year's water diversion plan and for water resources planning. Based on the predicted ten-day rainfall, greater utilization of rainfall can relieve the pressure on water diversion and reduce the water abandoned from the reservoirs, thus improving water supply efficiency.
CONCLUSIONS
In this study, a ZSG-based deep learning model coupling an SVM and a BiLSTM network is proposed to predict ten-day rainfall. The predicted results are compared with those of the regular deep learning models, and CV is performed. The AME and LS are used to solve the model, and the DEGA and DWT are used to address the local optimal solution and noise problems. The SW mechanism is introduced to solve the problem of non-stationarity, and the KCA is developed to find the potential laws of the dataset. MSE, NSE, and MAE are used for model evaluation.
The ten-day data are obtained by summing the daily data. Through the ZSG between the SVM and BiLSTM models, the BiLSTM model can generate predicted data with high accuracy, and the predicted results of the original series can be obtained by reverse SW. The results show that the prediction accuracy of the model is high and that the SBiLSTM model is the closest to an unbiased prediction. The discussion further confirms the effectiveness of the proposed methods. Based on the above experimental results, the proposed methods possess three advantages:
The KCA can discover period breakpoints in long time series, which provides a reference for the window size of the SW.
The SW mechanism can largely solve the non-stationarity problem of the time series and improve the prediction accuracy of the model. During the process of reverse SW, the DWT can effectively filter out noise.
The ZSG can help the BiLSTM model optimize the parameter adjustment process, find the optimal solution more accurately, and improve the generalization ability of the models. Meanwhile, the bidirectional models have stronger generalization ability than the unidirectional models.
ACKNOWLEDGEMENTS
This study is supported by the Scientific Research Projects of IWHR (01882103, 01882104), China Three Gorges Corporation Research Project (Contract No: 202103044), National Natural Science Foundation of China (51679089), and Innovation Foundation of North China University of Water Resources and Electric Power for PhD graduates.
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.