## Abstract

In this study, a deep learning model based on a zero-sum game (ZSG) was proposed for accurate water demand prediction. Ensemble learning was introduced to enhance the generalization ability of the models, and a sliding average was designed to address the non-stationarity of the time series. Because deep learning models cannot predict water supply fluctuations caused by emergencies, a hypothesis testing method combining Student's *t*-test and the discrete wavelet transform was proposed to generate an envelope interval of the predicted values, within which rolling revisions are carried out. The methods were applied to Shenzhen, a megacity with extremely scarce water resources. The results showed that the regular bidirectional models were superior to the unidirectional models, and the ZSG-based bidirectional models were superior to the regular bidirectional models. Bidirectional propagation helped improve the generalization ability of the model, and ZSG better guided the model toward the optimal solution. The fluctuations in water supply were mainly caused by the floating population, but they remained within the envelope interval of the predicted values. The predicted values after rolling revisions were very close to the measured values.

## HIGHLIGHTS

Exploratory data analysis can reveal regularities in the dataset, and this analysis can provide a reference for modeling.

A deep learning model based on zero-sum game was proposed to better guide the model to find the optimal solution.

The ensemble learning was introduced so as to enhance the generalization ability of models.

The Student's *t*-test and discrete wavelet transform were proposed to generate the envelope interval of the predicted values to make rolling revisions.

The sliding average was designed to solve the non-stationarity problem of time series.


## INTRODUCTION

The 14th Five-Year Plan period is a critical period for Shenzhen's socialist modernization, and a crucial period for building the Guangdong-Hong Kong-Macao Greater Bay Area, deepening reform and opening up, and accelerating the transformation of the economic development mode. Water is an essential resource for urban development. Without scientific planning and layout in advance, the lack of water resources is likely to become an obstacle and bottleneck for Shenzhen in building itself into a world-class Bay Area city. To carry out water supply scheduling more effectively and improve the efficiency of water supply, the water demand of the city in the coming period must be determined, and the limited water resources must be used rationally on this basis. Water demand prediction is therefore very important for the water supply of Shenzhen.

In previous water demand prediction, historical water supply data were usually used as the basis to predict next year's water demand according to the multi-year average growth rate. Although the average growth rate may be valid for the next few years, there are large errors in the results for the following year. In recent years, data-driven models with self-learning and self-adaptive capabilities have been increasingly applied, such as multiple regression, autoregression, artificial neural network (ANN), radial basis function neural network (RBFNN), time series and deep learning models. There is a considerable body of literature on regression models (see, among others, Cabral *et al*., 2019; Filgueiras *et al*., 2020; García-Nieto *et al*., 2020; Huang *et al*., 2020; Knappett *et al*., 2020; Zarei & Mahmoudi, 2020). Literature related to neural network includes Bomers *et al*. (2019), Pyo *et al*. (2020), Kumar *et al*. (2020), Yang *et al*. (2021) and Carreau & Guinot (2021), among many others.

However, extensive research and experimentation have shown that regression models depend heavily on the selection of input features and the reliability of the data, which limits the models considerably. Whether the features are extracted through the model's self-learning or selected manually, there is usually multicollinearity (Graham, 2003; Kroll & Song, 2013) among them, which leads to unstable outputs, and the chosen input features may become invalid. The ANN model can achieve good performance in simple fitting and classification, while for complex problems, its generalization ability is affected by the network structure (Ruiming & Shijie, 2020). Although the generalization ability of the model improves as the numbers of hidden layers and neurons increase, the model then has a high probability of overfitting (Piotrowski & Napiorkowski, 2013; Advani *et al*., 2020). In addition, because the ANN model adjusts its parameters globally, its convergence is slow (Moayedi & Armaghani, 2018). Although many researchers have applied advanced algorithms to improve the generalization ability of the ANN model, such as the genetic algorithm (Armaghani *et al*., 2018) and the wavelet transform (Dalkiliç & Hashimi, 2020), the improvement remains limited. By contrast, the RBFNN model (Bonanno *et al*., 2012; Montazer & Giveki, 2015) approximates a given sequence by the weighted sum of radial basis functions (RBF), and the model involves only local parameter adjustment, so it converges quickly (Ebtehaj *et al*., 2016; Nie *et al*., 2017). A large number of experiments show that the generalization ability of RBFNN is superior to that of the ANN model. Time series and deep learning models are currently the most widely used models because of their excellent generalization ability. Studies such as Laloy *et al*. (2017), Shen (2018), Qin *et al*. (2018), Zhou *et al*. (2019), He *et al*. (2020) and Xiang *et al*. (2020), among others, have applied these advanced models with good results.

In addition, data-driven models are solved according to predefined algorithms, and there is no specific standard for evaluating the solution process, which reduces the reliability of the model. Although deep learning models have strong generalization ability, they cannot cope with emergencies, such as the COVID-19 epidemic in 2020, because data-driven models cannot predict the water supply fluctuations that emergencies cause. When the COVID-19 epidemic broke out amid the 2020 Spring Festival holiday, citizens were required to quarantine at home and work online. Shenzhen then had only its permanent resident population, without the floating population, so the pattern of water supply was bound to differ from previous years. Therefore, the prediction results of data-driven models alone cannot truly reflect the actual water demand. In this study, a perturbation factor is introduced into the water demand prediction, and rolling revisions are carried out. The predicted values can be revised based on the government's policies on emergencies or on recently measured water supply data, so that the prediction results conform to the actual water demand.

This study explored and compared the generalization ability of different deep learning models in water demand prediction. Regular bidirectional models (Chen *et al*., 2014; Zhang *et al*., 2018) were developed to compare the influence of unidirectional and bidirectional propagation on the generalization ability of models. The zero-sum game (ZSG) (Aviram *et al*., 2014) was proposed to guide the model more effectively toward the optimal solution, and ensemble learning was introduced to enhance the randomness of the model so as to further enhance its generalization ability. The following models were developed: Gaussian RBFNN based on ensemble learning (GRBFNNEL), unidirectional long short-term memory (LSTM) recurrent neural network (RNN) (Mouatadid *et al*., 2019; Qi *et al*., 2019), unidirectional gated recurrent unit (GRU) (Gao *et al*., 2020), regular bidirectional LSTM (BiLSTM), regular bidirectional GRU (BiGRU), ZSG-based BiLSTM-GRBFNNEL (BiLSTMG) and ZSG-based BiGRU-GRBFNNEL (BiGRUG). LSTM and GRU are deep learning models that are extensively used nowadays, and their bidirectional variants were designed to explore more effective predictions. Meanwhile, Student's *t*-test (*T*-test) (Hu *et al*., 2017; de-Almeida-Pereira & Veiga, 2019) was used to carry out hypothesis testing, combined with the discrete wavelet transform (DWT) (Kashani *et al*., 2017; Supratid *et al*., 2017), to generate the envelope interval of predicted values, and rolling revisions were carried out. Finally, the methods were applied to the daily water demand prediction of Shenzhen and of eight districts of Shenzhen whose water supply patterns are not entirely similar. If water demand can be predicted accurately, the water diversion plan for the next year can be made rationally, water diversion and flood-season rainfall can be fully utilized to reduce abandoned water from reservoirs, and the efficiency of water supply can be improved. Accurate water demand prediction is especially important for areas where local water resources are scarce and water supply cannot meet the demand.

## MATERIALS AND METHODS

### Study area and data sources

As a sub-provincial city in Guangdong Province, Shenzhen (Figure 1) is a megacity in China, but its local water resources are extremely scarce. Shenzhen is located south of the Tropic of Cancer, between 113°43′ and 114°38′E and between 22°24′ and 22°52′N. Although rainfall is abundant, its distribution over the year is uneven: it is concentrated from April to September, which accounts for more than 80% of the annual rainfall. Groundwater accounts for only 0.1% of the annual water supply, and more than 85% of the annual water supply relies on water diversion. Water diversion is therefore crucial to the water supply safety of Shenzhen, and Shenzhen is required to make the water diversion plan for the next year at the end of each year. Because of increasing water supply pressure in Guangdong Province, the annual water diversion quota of Shenzhen has also been limited. The accuracy of water demand prediction is therefore very important for making a scientific water diversion plan and improving the efficiency of water supply.

The data in this study are gap-free measurements of daily water supply from Shenzhen Water Group and the Shenzhen Digital Water System, covering January 1, 2015 to December 31, 2020.

### Deep learning enhancement model based on ZSG

RBFNN maps the input directly to the hidden layer through the RBF, without connection weights. The output of the network is the linear weighted sum of the hidden layer outputs, and the connection weights of the output layer are the adjustable parameters of the RBFNN model, so the RBFNN model involves only local parameter adjustment. Meanwhile, a large number of experiments have found that RBFNN can solve problems that are difficult for ANN.

… *et al*., 2015; Nourani *et al*., 2018). In each training round of GRBFNNEL, some neurons were randomly discarded with a certain probability (0.1 in this study), so that each round was equivalent to generating a new model. The GRBFNNEL model obtained after training was equivalent to the integration of these new models, and was thereby improved into an enhancement model.

Here, *h* is the Gaussian kernel function; $\lVert x - c \rVert$ is the Euclidean distance between any point *x* in space and the center *c*; $\sigma^2$ is the variance; *w* is the connection weight and *n* is the number of neurons.
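The Gaussian RBF output and the random discarding of neurons described above can be illustrated with a minimal sketch. It is not the authors' implementation: the $2\sigma^2$ normalisation of the kernel is the common convention and is assumed here, and the centers, weights and function names are hypothetical.

```python
import math
import random

def gaussian_kernel(x, c, sigma2):
    """Gaussian RBF h = exp(-||x - c||^2 / (2 * sigma2)); the 2*sigma2 form is assumed."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return math.exp(-sq_dist / (2.0 * sigma2))

def rbfnn_forward(x, centers, sigma2, weights, drop_prob=0.0, rng=None):
    """Weighted sum of Gaussian kernels; each neuron is skipped with probability drop_prob."""
    rng = rng or random.Random()
    total = 0.0
    for c, w in zip(centers, weights):
        if drop_prob > 0.0 and rng.random() < drop_prob:
            continue  # neuron discarded for this training round
        total += w * gaussian_kernel(x, c, sigma2)
    return total

# Hypothetical centers and output-layer weights, for illustration only.
centers = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
weights = [0.5, -0.2, 0.8]
x = [1.0, 0.0]

full_output = rbfnn_forward(x, centers, 1.0, weights)                # all neurons active
train_output = rbfnn_forward(x, centers, 1.0, weights,
                             drop_prob=0.1, rng=random.Random(0))   # one training round
```

Because a different subset of neurons is active in each round, every round effectively trains a slightly different network, which is the sense in which the trained model integrates many sub-models.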

Here, *o*, *h*, *b*, *x*, *W*, $m_{\text{learn}}$, $m_t$ and *t* are the LSTM's model output, hidden layer output, bias, input, connection weight, memory learned at the current time, memory after attenuation and time, respectively, alongside the attenuation factor; *u*, *r*, *W*, *h* and *x* are the GRU's update factor, update unit, connection weight, output and input, respectively, alongside the memory to be transferred; tanh and *σ* are the activation functions.

To compare their generalization ability, LSTM and GRU models were developed in this study. At the same time, considering that a unidirectional model cannot learn from bidirectional propagation, BiLSTM (Figure 2) and BiGRU were also developed, and each was used as the generative model (GM). The GRBFNNEL model was used as the discriminative model (DM) in the ZSG process to discriminate the predicted data generated by the GM. When the error between the predicted data and the measured data is smaller (the predicted data and the measured data are more similar), the discrimination result is close to 1 (real); otherwise, the discrimination result is close to 0 (fake). In this way, ZSG is expected to better guide the optimization of the GM.
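The interplay between GM and DM can be sketched with the standard minimax losses used in adversarial training. This is an assumption made for illustration, not a reproduction of the paper's loss equations; the function names are hypothetical, and the discrimination results are taken as given numbers in (0, 1) rather than computed by a GRBFNNEL model.

```python
import math

def dm_loss(d_real, d_fake):
    """Discriminator loss: rewards scoring measured data near 1 and predicted data near 0."""
    real_term = sum(math.log(d) for d in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - d) for d in d_fake) / len(d_fake)
    return -(real_term + fake_term)

def gm_loss(d_fake):
    """Generator loss: falls as the discriminator scores predicted data closer to 1."""
    return sum(math.log(1.0 - d) for d in d_fake) / len(d_fake)

# Hypothetical discrimination results for measured data and for two sets of predictions.
d_measured = [0.9, 0.8]
poor_predictions = [0.1, 0.2]   # easily recognised as generated
good_predictions = [0.8, 0.9]   # hard to tell from measured data

gm_poor, gm_good = gm_loss(poor_predictions), gm_loss(good_predictions)
dm_poor = dm_loss(d_measured, poor_predictions)
dm_good = dm_loss(d_measured, good_predictions)
```

In this formulation the term involving the predicted data enters the two losses with opposite signs, so any gain for the GM on that term is an equal loss for the DM.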

Meanwhile, to help the model escape local optima, heuristic algorithms were applied to the modeling process. At present, the most commonly used heuristic algorithm is the genetic algorithm. However, while crossover and mutation give the population opportunities to evolve, they also carry a large probability of degradation, so a better population may not be obtained even after many consecutive iterations. Therefore, this study used an evolutionary genetic algorithm to carry out the heuristic search, which no longer controlled the generation of offspring according to the fitness value. Instead, a mutation was generated from the parent and crossed with the parent individual to generate a new individual, with the aim of eliminating degradation.

Here, *E* is the expectation; 0 and 1 are the labels; *fm*, *sm*, *m*, *g*, *fv* and *sv* are the first-order moment estimation, second-order moment estimation, momentum, gradient, revision of the first-order moment estimation and revision of the second-order moment estimation, respectively; the remaining quantities are the parameters of the model, the learning rate and the number of iterations *t*; the small constant in the update is $10^{-8}$; ***M*** and ***P*** are the measured value vector and predicted value vector, respectively.
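The quantities defined above (first- and second-order moment estimates, their revisions, and a constant of $10^{-8}$) have the structure of the Adam optimizer. The sketch below shows one Adam update for a single scalar parameter under that assumption; the decay rates `beta1` and `beta2` and the learning rate are the usual defaults, not values taken from the paper.

```python
import math

def adam_step(theta, g, fm, sm, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based iteration count."""
    fm = beta1 * fm + (1.0 - beta1) * g        # first-order moment estimation
    sm = beta2 * sm + (1.0 - beta2) * g * g    # second-order moment estimation
    fv = fm / (1.0 - beta1 ** t)               # revision (bias correction) of fm
    sv = sm / (1.0 - beta2 ** t)               # revision (bias correction) of sm
    theta = theta - lr * fv / (math.sqrt(sv) + eps)
    return theta, fm, sm

# Minimise the toy objective f(theta) = theta^2, whose gradient is 2 * theta.
theta, fm, sm = 1.0, 0.0, 0.0
history = []
for t in range(1, 201):
    theta, fm, sm = adam_step(theta, 2.0 * theta, fm, sm, t)
    history.append(theta)
```

The revisions matter mostly in the first iterations, when the moment estimates are still biased toward their zero initial values.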

### Perturbation factor

In the first half of 2020, because of COVID-19, most areas in China were in lockdown from February. In March, citizens of some low-risk areas were allowed to travel with health codes. After the Qingming Festival in April, most areas in China gradually lifted the lockdown. According to the measured data, water supply decreased year-on-year by 10.1% in February, 7.13% in March and 4.2% in April, because a large part of the floating population could not return to Shenzhen. After International Labor Day on May 1, regions other than high-risk areas basically returned to normal. The water supply in May 2020 exceeded that in May 2019, a year-on-year increase of 5.05%. Because of COVID-19, some companies encountered operating difficulties and some were even forced to shut down; a large number of people were unemployed or had resigned, and more people went to Shenzhen for job opportunities. In June 2020, the total water supply increased by 6.17% year-on-year. July was the month with the largest water supply in Shenzhen, and the water supply in July increased by 13.85% year-on-year, overloading some waterworks. During that time, the water diversion projects were all running at full capacity, but the water level of the main water supply reservoirs was still falling. This indicates that the floating population is the main factor behind fluctuations in water supply. Therefore, in this study, the floating population was taken as the perturbation factor to provide a reference for rolling revisions.

### Rolling revision

The *T*-test is a statistical method for testing hypotheses, which is used to test whether there is a significant difference between the average values of two samples. In this study, *T*-test hypothesis testing (Equation (22)) is applied to generate the envelope interval of the predicted data:

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} \tag{22}$$

where $\bar{x}$, $\mu$, *s* and *n* are the average value of the new sample, the average value of the total sample, the standard deviation of the new sample and the size of the new sample, respectively.

The *T*-test uses a 95% confidence level (a significance level of 0.05) for hypothesis testing. First, the *T*-test assumes that there is no significant difference between the average values of the two samples (the H$_0$ hypothesis); the H$_1$ hypothesis is the opposite of H$_0$. If the *p*-value of the *T*-test statistic is greater than 0.05, the H$_0$ hypothesis is retained; otherwise, it is considered that there is a significant difference between the average values of the two samples, and the H$_1$ hypothesis is accepted. Hypothesis testing of the historical water supply data in Shenzhen shows that, although there are some fluctuations in water supply, the test results all support the H$_0$ hypothesis. Therefore, this study assumed that the water supply data in 2020 met the same confidence level as the total sample. Combined with DWT (Equations (23) and (24)), the *T*-test was carried out in reverse to generate the envelope interval of the predicted data based on the 95% confidence level of the total sample. This envelope interval was used as the standard for rolling revisions.

In Equations (23) and (24), DWT$_f$, *a*, *t*, *τ* and *ψ* are the wavelet transform coefficient, scale, time, deviation and wavelet base, respectively.
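The *T*-test statistic and its use "in reverse" can be sketched as follows. This is one possible reading, offered as an assumption: the statistic is inverted at a critical value to give the range of sample means that would not be rejected. The DWT step is omitted, the critical value is supplied by the caller (for example from a *t* table), and the sample values are hypothetical.

```python
import math
import statistics

def t_statistic(new_sample, total_mean):
    """One-sample t statistic of new_sample against the mean of the total sample."""
    n = len(new_sample)
    x_bar = statistics.mean(new_sample)
    s = statistics.stdev(new_sample)  # sample standard deviation (n - 1 denominator)
    return (x_bar - total_mean) / (s / math.sqrt(n))

def envelope_from_t(total_mean, s, n, t_crit):
    """Invert the statistic: sample means inside the interval keep |t| <= t_crit."""
    half_width = t_crit * s / math.sqrt(n)
    return total_mean - half_width, total_mean + half_width

# Hypothetical new sample compared with a total-sample mean of 100.
new_sample = [98.0, 102.0, 100.0, 101.0, 99.0]
t_value = t_statistic(new_sample, 100.0)

# 2.776 is the two-sided 95% critical value for 4 degrees of freedom.
lower, upper = envelope_from_t(100.0, statistics.stdev(new_sample), len(new_sample), 2.776)
```

Any new sample whose mean falls between `lower` and `upper` would leave the H$_0$ hypothesis in place at the chosen level, which is why the interval can serve as a bound for revisions.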

## RESULTS

### Data exploration and analysis

After obtaining the dataset, this research first carried out the exploratory data analysis (EDA). EDA is helpful for mining the laws of the dataset and analyzing the potential characteristics and quantitative relationships of data, which can provide some reference for modeling.

Figure 3 illustrates the autocorrelation and partial autocorrelation analysis of the water supply time series in Shenzhen. The shaded area is the 95% confidence interval, and 50 lags were selected for this correlation analysis. The autocorrelation falls into the confidence interval after 5 lags, and the partial autocorrelation falls into the confidence interval after 1 lag. The correlation structure of the time series is therefore good, which indicates good stationarity, and the data can be used directly in modeling. Figure 4 shows the lag plots of the water supply time series, in which most of the points are distributed along the diagonal. In the plot for a lag of 6, the distribution of some points starts to get significantly worse. The models developed in this study were therefore constructed with 5 lags.
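As an illustration of the two ideas in this paragraph, the sketch below computes a sample autocorrelation and builds supervised samples from the 5 previous values. The helper names and the toy series are hypothetical; the paper's own preprocessing code is not given.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series at a given non-negative lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom

def make_lagged_samples(series, n_lags=5):
    """Pair each value with the n_lags values that precede it."""
    inputs, targets = [], []
    for i in range(n_lags, len(series)):
        inputs.append(series[i - n_lags:i])
        targets.append(series[i])
    return inputs, targets

# Toy series standing in for daily water supply.
toy_series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
acf_lag1 = autocorrelation(toy_series, 1)
inputs, targets = make_lagged_samples(toy_series, n_lags=5)
```

With 5 lags, a series of length *n* yields *n* − 5 samples, since the first 5 values have no complete history.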

### Model construction

The LSTM, GRU, BiLSTM, BiGRU, BiLSTMG and BiGRUG models were developed in this study. All the models and algorithms were implemented in Python 3. A large number of experiments showed that if training is stopped at the minimum training error, a memory effect may occur due to overtraining, which weakens the generalization ability of the models. The dataset was divided into a training set and a verification set at a ratio of 7:3. All data of 2020 in the verification set were used as the test set, and the remaining data in the verification set were used as the validation set.
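The split described above can be sketched on the calendar of the dataset. The sketch assumes the 7:3 split is chronological and is made on raw days; the paper does not state either detail, and building lagged samples first would change the counts slightly.

```python
from datetime import date, timedelta

# Daily records from January 1, 2015 to December 31, 2020, as stated in the data section.
start, end = date(2015, 1, 1), date(2020, 12, 31)
days = [start + timedelta(days=i) for i in range((end - start).days + 1)]

n_train = int(len(days) * 0.7)          # 7:3 split, assumed chronological
train_days = days[:n_train]
verification_days = days[n_train:]

# Within the verification set, 2020 is the test set and the rest is the validation set.
test_days = [d for d in verification_days if d.year == 2020]
validation_days = [d for d in verification_days if d.year < 2020]
```

Under these assumptions the test set is the whole of 2020, so the models are evaluated on exactly the period affected by COVID-19.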

Here, ***M*** and ***P*** are the measured value vector and the predicted value vector; var is the variance of the measured data; $s_{\min}$ and $s_{\max}$ are the minimum and maximum values of the data; $s_i$ is the value at time step *i* and *n* is the length of the data.
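The evaluation standards and the normalization implied by these definitions can be sketched as follows. The formulas are the standard definitions of MSE, MAE, Nash-Sutcliffe efficiency (NSE) and min-max normalization, assumed here because the paper's equations did not survive; the function names and sample values are hypothetical.

```python
def min_max_normalize(series):
    """Scale values to [0, 1] using the minimum and maximum of the data."""
    s_min, s_max = min(series), max(series)
    return [(s - s_min) / (s_max - s_min) for s in series]

def mse(measured, predicted):
    """Mean squared error between measured and predicted values."""
    return sum((m - p) ** 2 for m, p in zip(measured, predicted)) / len(measured)

def mae(measured, predicted):
    """Mean absolute error between measured and predicted values."""
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)

def nse(measured, predicted):
    """Nash-Sutcliffe efficiency: 1 minus squared error over the spread of measured data."""
    mean_m = sum(measured) / len(measured)
    residual = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    spread = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - residual / spread

measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
scores = {"MSE": mse(measured, predicted),
          "MAE": mae(measured, predicted),
          "NSE": nse(measured, predicted)}
```

An NSE of 1 means a perfect fit, 0 means the model is no better than predicting the mean of the measured data, and negative values mean it is worse than that.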

| Model | MSE (%) | NSE (%) | MAE (%) |
|---|---|---|---|
| GRU | 0.49 | 82.34 | 5.39 |
| LSTM | 0.82 | 70.40 | 7.97 |
| BiGRU | 0.37 | 86.68 | 4.65 |
| BiLSTM | 0.33 | 88.02 | 4.77 |
| BiGRUG | 0.29 | 89.69 | 4.27 |
| BiLSTMG | 0.27 | 90.38 | 4.03 |


| Model | MSE (%) | NSE (%) | MAE (%) |
|---|---|---|---|
| GRU | 0.48 | 38.79 | 5.65 |
| LSTM | 1.13 | −42.8 | 9.99 |
| BiGRU | 0.35 | 55.24 | 4.57 |
| BiLSTM | 0.42 | 47.06 | 5.69 |
| BiGRUG | 0.32 | 59.10 | 4.82 |
| BiLSTMG | 0.27 | 66.04 | 4.19 |


As can be seen in Table 1, when the modeling is completed, the MSE and MAE of the six models in the training set are all small, and the NSE is high, which reflects that the six models converge well. Based on the three evaluation standards, in the training set, the MSE, NSE and MAE of the BiGRU and BiLSTM models are superior to those of GRU and LSTM models. The MSE, NSE and MAE of BiGRUG and BiLSTMG models are superior to those of BiGRU and BiLSTM models. It can be seen that the ranking order of the three types of models based on the fitting ability from strong to weak is the ZSG-based bidirectional model, the regular bidirectional model and the unidirectional model. Among them, the BiLSTMG model has the strongest fitting ability. As detailed in Table 2, the results in the validation set are similar. The accuracy of ZSG-based bidirectional model is still higher than other models, and BiLSTMG still has the highest accuracy.

### Prediction

Table 3 shows the prediction results of the six models. On all three evaluation standards, the MSE, NSE and MAE of the BiGRUG and BiLSTMG models are superior to those of the BiGRU and BiLSTM models, and the MSE, NSE and MAE of the BiGRU and BiLSTM models are superior to those of the GRU and LSTM models. The three types of models can be ranked from strong to weak in terms of generalization ability: the ZSG-based bidirectional model, the regular bidirectional model and the unidirectional model. Among them, the generalization ability of the BiLSTMG model is the strongest. The MSE of the BiLSTMG model is 29.79% lower than that of the BiLSTM model, and the MSE of the BiGRUG model is 39.34% lower than that of the BiGRU model; the NSE of the BiLSTMG model is 2.63% higher than that of the BiLSTM model, and the NSE of the BiGRUG model is 4.47% higher than that of the BiGRU model; the MAE of the BiLSTMG model is 20.83% lower than that of the BiLSTM model, and the MAE of the BiGRUG model is 19.93% lower than that of the BiGRU model. It can be seen that ZSG more effectively decreases the error and average deviation of the model and enhances its generalization ability. Similarly, the evaluation standards of the bidirectional models are significantly superior to those of the unidirectional models, which indicates that bidirectional propagation can improve the generalization ability of the model. The GRU model does not have an independent memory unit, so the accuracy of BiGRU and GRU is not much different. Because the LSTM model has an independent memory unit and the BiLSTM model combines the forward and backward memory units, the accuracy of the BiLSTM model is greatly improved compared with LSTM. Therefore, the BiLSTM model has stronger learning ability, and its prediction accuracy is superior to that of the LSTM and BiGRU models.
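The percentage comparisons in this paragraph follow from the Table 3 values by a relative-change calculation, which the sketch below reproduces. The helper name `relative_change` is hypothetical; the numbers are copied from Table 3.

```python
# Test-set results from Table 3 (all values in %).
table3 = {
    "BiGRU":   {"MSE": 0.61, "NSE": 89.70, "MAE": 6.12},
    "BiLSTM":  {"MSE": 0.47, "NSE": 91.95, "MAE": 5.76},
    "BiGRUG":  {"MSE": 0.37, "NSE": 93.71, "MAE": 4.90},
    "BiLSTMG": {"MSE": 0.33, "NSE": 94.37, "MAE": 4.56},
}

def relative_change(new, old):
    """Change of new relative to old, in percent; negative means a reduction."""
    return (new - old) / old * 100.0

# Compare each ZSG-based model with its regular bidirectional counterpart.
changes = {}
for zsg, regular in (("BiLSTMG", "BiLSTM"), ("BiGRUG", "BiGRU")):
    changes[zsg] = {
        metric: round(relative_change(table3[zsg][metric], table3[regular][metric]), 2)
        for metric in ("MSE", "NSE", "MAE")
    }
```

Negative entries for MSE and MAE indicate lower error for the ZSG-based model, while a positive entry for NSE indicates a higher efficiency score.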

| Model | MSE (%) | NSE (%) | MAE (%) |
|---|---|---|---|
| GRU | 0.70 | 88.09 | 6.64 |
| LSTM | 1.03 | 82.59 | 8.87 |
| BiGRU | 0.61 | 89.70 | 6.12 |
| BiLSTM | 0.47 | 91.95 | 5.76 |
| BiGRUG | 0.37 | 93.71 | 4.90 |
| BiLSTMG | 0.33 | 94.37 | 4.56 |


Figure 6 shows the violin plots of the relative error (RE) of the six models after removing outliers greater than 1. The top horizontal line segment represents the confidence interval upper limit (CIUL) of the RE distribution; the upper, middle and lower line segments of the box represent the upper quartile (UQ), median and lower quartile (LQ) of the RE distribution; and the bottom line segment represents the confidence interval lower limit (CILL) of the RE distribution. The size of the confidence interval (CI) is equal to CIUL minus CILL. The external curve is the probability density curve of the RE distribution, and the black dots represent outliers. The RE distribution parameters of the six models are shown in Table 4.

| Model | CIUL (%) | UQ (%) | Median (%) | LQ (%) | CILL (%) |
|---|---|---|---|---|---|
| GRU | 28.19 | 13.76 | 8.02 | 4.14 | 0 |
| LSTM | 25.34 | 16.6 | 13.69 | 10.77 | 2.03 |
| BiGRU | 26.92 | 12.64 | 7.35 | 3.12 | 0 |
| BiLSTM | 21.33 | 11.36 | 8.02 | 4.71 | 0 |
| BiGRUG | 19.58 | 10.19 | 6.68 | 3.93 | 0 |
| BiLSTMG | 18.66 | 9.35 | 5.69 | 3.14 | 0 |


As detailed in Table 4 and Figure 6, the RE distribution from good to poor can be sorted as the ZSG-based bidirectional model, the regular bidirectional model and the unidirectional model. Among them, the CILL of LSTM is not 0, and the largest probability density is in the interval [0.1, 0.2]. Therefore, the MSE of the LSTM model is the largest. Compared with BiLSTM, although the LQ and median of BiGRU model are smaller, the UQ and CIUL of BiGRU model are larger. It is revealed that compared with BiLSTM, the predicted values of BiGRU model are smaller at some time, but the predicted values are larger at most times. Therefore, the CI of RE distribution of BiGRU model is large, indicating that the discrete degree of RE distribution is large and the stability of the BiGRU model is bad. On the contrary, the CI of RE distribution of the BiLSTM model is small, indicating that the BiLSTM model is more stable compared with the BiGRU model. Compared with the BiGRUG model, the CIUL, UQ, median and LQ of BiLSTMG model are all smaller, which shows that the RE distribution of the BiLSTMG model is better and the stability of BiLSTMG model is better.

Figure 7 compares the fitting results of the BiLSTMG and BiGRUG models. The other four models have larger errors, so they are not displayed. Although the accuracy of the BiLSTMG and BiGRUG models is high, the predicted values of the two models are larger than the measured data in February and smaller than the measured data from June to December. This is because a deep learning model can only learn the underlying patterns of historical data and cannot predict the water supply fluctuations caused by COVID-19.

Figure 8 shows the ZSG process of the BiLSTMG model. When the loss of GM increases, the loss of DM decreases, and when the loss of GM decreases, the loss of DM increases, which reflects the ZSG process between GM and DM. According to Equations (13) and (14), a decrease in GM loss means that the discrimination result of the predicted values is close to 1, which necessarily increases the DM loss: as the number of iterations increases, the predicted data generated by GM become very similar to the measured data, so DM finds them harder to tell apart. Conversely, when the loss of GM increases, the loss of DM decreases.

### Cross-validation

To further validate the generalization ability of the models, the dataset was divided into the training set and the test set at a ratio of 8:2 for cross-validation. The test results, RE distribution parameters and violin plot of six models in cross-validation are, respectively, detailed in Table 5, Table 6 and Figure 9.

| Model | MSE (%) | NSE (%) | MAE (%) |
|---|---|---|---|
| GRU | 0.63 | 88.03 | 6.25 |
| LSTM | 1.03 | 80.34 | 9.06 |
| BiGRU | 0.53 | 89.88 | 5.60 |
| BiLSTM | 0.44 | 91.55 | 5.64 |
| BiGRUG | 0.34 | 93.49 | 4.76 |
| BiLSTMG | 0.30 | 94.34 | 4.33 |


| Model | CIUL (%) | UQ (%) | Median (%) | LQ (%) | CILL (%) |
|---|---|---|---|---|---|
| GRU | 25.05 | 12.33 | 7.37 | 3.86 | 0 |
| LSTM | 23.35 | 15.9 | 13.43 | 10.93 | 3.48 |
| BiGRU | 25.07 | 11.62 | 6.26 | 2.65 | 0 |
| BiLSTM | 19.46 | 10.62 | 7.51 | 4.71 | 0 |
| BiGRUG | 17.56 | 9.27 | 6.29 | 3.73 | 0 |
| BiLSTMG | 16.11 | 8.23 | 5.29 | 2.97 | 0 |


It can be seen in Table 5 that the accuracy of the three types of models can be ranked from high to low as the ZSG-based bidirectional model, the regular bidirectional model and the unidirectional model. This once again validates that the ZSG can more effectively optimize the modeling process. The generalization ability of the bidirectional model is superior to that of the unidirectional network, indicating that bidirectional propagation can improve the generalization ability of the model. Among them, the BiLSTMG model still has the smallest MSE and the highest NSE. Similarly, the MSE and NSE of the BiLSTM and BiLSTMG models are both superior to those of BiGRU and BiGRUG models, indicating that the learning ability of the BiLSTM model is stronger.

As detailed in Table 6, the RE distribution parameters of the six models are similar to the results in Table 4, the parameters of the BiLSTMG model are still optimal and the stability of the BiLSTMG model is still the strongest. Figure 9 illustrates the violin plot of the six models after removing outliers greater than 1. The violin plot of the six models is similar to the results in Figure 6.

Figure 10 illustrates the error bar plot between the predicted and measured values of the BiGRUG and BiLSTMG models. The predicted values of the two models are close to the measured value, but the error bars of the BiGRUG model are longer than those of the BiLSTMG model at most times, indicating that the accuracy of BiLSTMG is higher.

### Rolling revision

To solve the problem that the deep learning model cannot cope with fluctuations caused by emergencies, this study carried out rolling revisions to the predicted values. The BiLSTMG model with the highest accuracy was selected for the rolling revisions, and the 7:3 dataset was used to display the revision results of the predicted values.

The study used the histogram (Figure 11) of the total sample to obtain its distribution, and the *T*-test was carried out in reverse to generate the envelope interval (Figure 12) of the predicted values. The envelope interval corresponds to scaling the predicted value by a factor between 0.97 and 1.03.

Figure 12 shows that the upper limit and lower limit of envelope interval exactly meet the upper limit and lower limit of water supply fluctuation, which reveals that although the water supply can be affected by emergencies, the fluctuation is still within the envelope interval of the predicted values. Therefore, the threshold step is set to 0.01, and the revision interval of [0.97, 0.98, 0.99, 1, 1.01, 1.02, 1.03] can be generated. According to the national policy and the measured data, the predicted value is multiplied by the revision coefficient for rolling revisions.

In February 2020, most cities in China were in lockdown, and citizens had to stay at home and work online. The floating population therefore could not return to Shenzhen, and only permanent residents remained in the city, making this the least populated period. The predicted value in February should thus be revised by a coefficient of 0.97. In March, citizens in low-risk areas were allowed to travel with the health code issued by the State Council. The floating population in Shenzhen then began to increase, so the revision coefficient in March should be set to 0.98. In April, most cities in China lifted their lockdowns, and the revision coefficient should be changed to 0.99. The change in the floating population in the following month due to national policies is predictable, so the water demand in the following month can be revised directly. After International Labor Day on May 1, most regions in China, except for some high-risk areas, had largely returned to normal. When there is no national policy guidance, the predicted values can be revised by referring to the most recent measured data from the Shenzhen Digital Water System. The water supply scheduling in Shenzhen is carried out on weekdays, and the fourth week of each month is extended to the end of the month, so the rolling revisions are based on weekly water supply data.

After International Labor Day on May 1, the water supply of Shenzhen returned to the level of 2019, and the revision coefficient should be changed to 1. In May 2020, the water supply increased steadily, in line with the water supply law of the past 5 years. However, compared with May 2019, the water supply in May 2020 increased by 5.05%, which was close to the maximum monthly growth rate of the past 5 years. According to the water supply law over the years, July and August are the months with the highest water supply. It can therefore be inferred that the water demand would increase in June, and the revision coefficient in June should be set to 1.01. According to the measured data, the water supply in June 2020 increased by 6.17% year-on-year. It can be inferred that the water demand in July is higher than that in June, and the revision coefficient in July should be set to 1.02. According to the measured data, the water supply in July 2020 increased by 13.85% year-on-year, so the revision coefficient in August was set to 1.03. The water supply did not decrease until the end of September, at which time the revision coefficient should be set to 1.02. The same rule was applied from October to December. In this way, the predicted values were revised, and the revision results for the 48 weeks of 2020 were obtained, as shown in Figure 13.
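The month-by-month revisions described above can be sketched as a lookup of coefficients applied to weekly predictions. This is only an illustration: the names `STATED_COEFFICIENTS`, `year_on_year_growth` and `revise_weekly` are hypothetical, only the February to August coefficients stated in the text are encoded, and the fallback coefficient of 1.0 for other months is a placeholder assumption rather than a value from the study.

```python
# Revision coefficients stated in the text for 2020 (month number -> coefficient).
STATED_COEFFICIENTS = {2: 0.97, 3: 0.98, 4: 0.99, 5: 1.00, 6: 1.01, 7: 1.02, 8: 1.03}


def year_on_year_growth(current, previous):
    """Percentage change of the current value relative to the same period a year earlier."""
    return (current - previous) / previous * 100.0


def revise_weekly(weekly_predictions, coefficients, default=1.0):
    """Multiply each weekly prediction by the coefficient of its month.

    weekly_predictions: list of (month, predicted_value) pairs.
    default: placeholder for months without a stated coefficient (an assumption).
    """
    return [value * coefficients.get(month, default)
            for month, value in weekly_predictions]


if __name__ == "__main__":
    # Toy predictions, not data from the study.
    toy = [(2, 100.0), (5, 100.0), (8, 200.0)]
    print(revise_weekly(toy, STATED_COEFFICIENTS))
    print(year_on_year_growth(105.05, 100.0))
```

In practice the coefficient for the coming month would be chosen by an operator from policy information or from recent measured growth, as the text describes; the lookup table only records the outcome of those decisions.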

Figure 13 shows that the error between the predicted values after rolling revisions and the measured values is small, so the revised predictions can provide strong support for water supply scheduling. Based on the predicted values, the water supply scheduling of Shenzhen can make more effective use of water diversion and control the water levels of reservoirs in the flood season to make more use of rainfall and reduce abandoned water from reservoirs.

## DISCUSSION

Based on the BiLSTMG model, this study went on to predict the daily water demand in 2020 in eight districts of Shenzhen whose water supply laws are not entirely similar: Futian, Luohu, Yantian, Nanshan, Bao'an, Longgang, Longhua and Guangming. Among them, Futian, Luohu, Yantian and Nanshan are dominated by the tertiary industry, which accounts for 88.4, 90.77, 83.16 and 83.03% in the four districts, respectively. The other four districts are dominated by the secondary industry, and Guangming District has the highest proportion of secondary industry, reaching 84.12%.

First, this study carried out a correlation analysis to obtain the correlation coefficient matrix of the water supply of the eight districts (Figure 14). As can be seen in Figure 14, the water supplies of the eight districts are positively correlated to different degrees. The correlation between Yantian District and Luohu District is the weakest, and the correlation between Longgang District and Bao'an District is the strongest. The population and industrial structure differ from district to district, which leads to different water supply laws and to quite different correlation coefficients.
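A correlation coefficient matrix of this kind can be sketched as follows. The text does not state which correlation coefficient was used, so the Pearson coefficient here is an assumption, and the function names and toy series are hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def correlation_matrix(series_by_name):
    """Return a nested dict: matrix[a][b] is the correlation of series a and b."""
    names = list(series_by_name)
    return {a: {b: pearson(series_by_name[a], series_by_name[b]) for b in names}
            for a in names}


if __name__ == "__main__":
    # Toy daily supply series, not data from the study.
    toy = {
        "district_A": [1.0, 2.0, 3.0, 4.0],
        "district_B": [2.0, 4.0, 6.0, 8.0],
        "district_C": [4.0, 3.0, 2.0, 1.0],
    }
    matrix = correlation_matrix(toy)
    for name, row in matrix.items():
        print(name, {k: round(v, 2) for k, v in row.items()})
```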

In the sliding average, *x*, *T*, *t*, *s* and *l* are the original series, the sliding series, time, the size of the sliding window and the length of the original series, respectively.
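As an illustration of the sliding average and of reverse sliding, the sketch below uses a trailing window of size *s* over a series of length *l*, producing *l* − *s* + 1 averaged values. This windowing, the function names and the assumption that the first *s* original values are known when restoring the series are all illustrative choices; the exact formulation used in the study may differ.

```python
def sliding_average(x, s):
    """Average each window of s consecutive values of x.

    Returns len(x) - s + 1 values: T[t] = mean(x[t : t + s]).
    """
    return [sum(x[t:t + s]) / s for t in range(len(x) - s + 1)]


def reverse_sliding(T, head, s):
    """Restore the original series from the sliding series T.

    head: the first s values of the original series (assumed known).
    Uses x[t + s] = x[t] + s * (T[t + 1] - T[t]), which follows from
    subtracting two consecutive window sums.
    """
    x = list(head)
    for t in range(len(T) - 1):
        x.append(x[t] + s * (T[t + 1] - T[t]))
    return x


if __name__ == "__main__":
    # Toy series, not data from the study.
    original = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    smoothed = sliding_average(original, 3)
    print(smoothed)
    print(reverse_sliding(smoothed, original[:3], 3))
```

Because the restoration is recursive, errors in the predicted sliding series accumulate along the restored series, which is worth keeping in mind when choosing the window size.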

Figure 16 shows that the time series are strongly correlated after the sliding average is applied, and the prediction results can be restored by reverse sliding. The prediction results for the eight districts are listed in Table 7. According to Table 7, the prediction accuracy for the eight districts is high, indicating that the generalization ability of the BiLSTMG model is strong. The results for Luohu District are the weakest of the eight (NSE of 67.28%, compared with 83.66% or higher for the other districts), yet the model still follows its water supply, which indicates that the sliding average is an effective way to address the non-stationarity problem of time series.

| District | MSE (%) | NSE (%) | MAE (%) |
|---|---|---|---|
| Futian | 0.34 | 91.82 | 4.42 |
| Luohu | 0.89 | 67.28 | 7.55 |
| Yantian | 0.70 | 83.66 | 6.51 |
| Nanshan | 0.66 | 87.69 | 5.79 |
| Bao'an | 0.42 | 94.25 | 4.99 |
| Longgang | 0.50 | 93.93 | 5.59 |
| Longhua | 0.44 | 92.97 | 5.16 |
| Guangming | 0.38 | 94.75 | 4.80 |
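For readers who want to reproduce metrics of this kind, the sketch below computes MSE, MAE and the Nash-Sutcliffe efficiency (NSE) using their standard definitions. It is an assumption that these match the definitions used in the study; in particular, the percentage scaling in the table presumably relies on the normalization defined in the methods, which is not reproduced here. The function names are hypothetical.

```python
def mse(observed, predicted):
    """Mean squared error."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of the squared error
    to the variance of the observations about their mean.
    1 is a perfect fit; 0 means no better than predicting the mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / spread


if __name__ == "__main__":
    # Toy values, not data from the study.
    obs = [1.0, 2.0, 3.0, 4.0]
    pred = [1.1, 1.9, 3.2, 3.8]
    print(mse(obs, pred), mae(obs, pred), nse(obs, pred))
```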


## CONCLUSIONS

In this study, ZSG-based deep learning models were proposed for water demand prediction. The goal is to support water supply dispatching in Shenzhen so that the water diversion plan for the following year can be made scientifically. With accurate water demand prediction, water diversion can be used rationally, and water-use efficiency can be improved. During the flood season, the water level of the reservoirs can be controlled to a certain extent to make more use of rainfall and reduce abandoned water from reservoirs.

In this study, ensemble learning was introduced to improve the generalization ability of the models. AME + LS was used to solve the model, and an evolutionary genetic algorithm was adopted to help the model escape local optima. In addition, the study revealed that the floating population is the largest perturbation factor affecting the fluctuation of water supply. The *t*-test and DWT were used to generate the envelope interval of the predicted values. The results show that ZSG can better guide the model to find the optimal solution and improve its generalization ability. The prediction accuracy of the BiLSTMG model is the highest. Although the BiLSTMG model cannot predict the water supply fluctuation caused by emergencies, the fluctuation remains within the envelope interval of the predicted values, and rolling revisions based on the envelope interval are carried out to revise the predicted values. Although the industrial structures and water supply laws of the eight districts differ, the BiLSTMG model can accurately predict the water demand of the eight districts, which indicates that the research method is universal. The following conclusions were drawn:

1. EDA can discover knowledge from the dataset, and correlation analysis of the dataset can provide a reference for modeling.
2. The accuracy of the regular bidirectional models is superior to that of the unidirectional models, which indicates that bidirectional propagation is conducive to improving the generalization ability of the model.
3. The accuracy of the ZSG-based bidirectional models is superior to that of the regular bidirectional models, which indicates that ZSG can better enhance the generalization ability of the model. The MSE, NSE and MAE values of the BiLSTMG model are 0.33, 94.37 and 4.56%, respectively, and the corresponding values in the cross-validation set are 0.3, 94.34 and 4.33%. The RE distribution parameters and violin plot of the BiLSTMG model are the best, and this model has the strongest stability and generalization ability.
4. Compared with the BiGRU model, the BiLSTM model has stronger learning ability, higher accuracy and better stability, which indicates that the independent memory unit is more efficient.
5. The sliding average can effectively solve the non-stationarity problem of time series.

This research method can also provide a reference for other parts of China and for other countries. Although the external environment is constantly changing, the deep learning model does not depend on the external environment, so the method can be applied to water demand prediction in other areas. Water demand prediction is especially crucial in areas where the water supply cannot meet the demand. Without accurate water demand prediction, water supply is blind, and a large amount of water may be abandoned from reservoirs, wasting limited water resources. If water demand can be predicted accurately, limited water resources can be used scientifically. In addition, in the flood season, the reservoir water level can be kept appropriately low so that as much rainfall as possible is used to relieve the stress on water supply.

## ACKNOWLEDGEMENTS

This research was funded by the Scientific Research Projects of IWHR (01882103, 01882104), China Three Gorges Corporation Research Project (Contract No: 202103044), National Natural Science Foundation of China (51679089), and Innovation Foundation of North China University of Water Resources and Electric Power for PhD Graduates. The authors sincerely thank the editor and the anonymous reviewers for their insightful comments and constructive suggestions, which helped us improve the paper.

## AUTHOR CONTRIBUTIONS

X.L. designed methods and algorithms, developed the six models, analyzed the data and drafted the manuscript. X.S., J.C., Y.Z. and Y.H. revised the manuscript.

## CONFLICT OF INTEREST

The authors declare no conflict of interest.

## DATA AVAILABILITY STATEMENT

All relevant data are included in the paper or its Supplementary Information.