Regional water demand is basic data for regional engineering planning, design and management. Making full use of multi-source data and prior knowledge to obtain high-precision regional water demand quickly and economically is of great significance to the optimal allocation of regional water resources. To predict regional water demand accurately, this study took Yulin City as the research area and predicted the city's water demand from 2017 to 2019. To address the oscillating character of the regional water demand sequence and the over-fitting problem of traditional prediction models, this study proposed the non-dominated sorting genetic algorithm II-fractional order reverse accumulative grey model (NSGAII-FORAGM). The oscillating regional water demand sequence was transformed into a monotonically decreasing non-negative sequence. Based on the transformed sequence, an optimization model was constructed with two objective functions, ‘maximum (or minimum) order’ and ‘best fit to historical data’, and the NSGAII method was adopted to solve it. The three structural choices of the model, ‘fractional order’, ‘reverse accumulation’ and ‘obtaining the order through a multi-objective optimization model’, were tested on the water use sequences of three sectors (industry, tertiary industry and domestic) in Yulin City, and the performance of the method was compared with NSGAII-IORAGM, NSGAII-FOFAGM and SOGA-FORAGM. The average relative errors of the proposed model for the simulation of industry, tertiary industry (the service sector of the economy, which encompasses a wide range of businesses) and domestic water demand were 15.54%, 11.20% and 9.98%, respectively. The average relative errors of the proposed model for the prediction of industry, tertiary industry and domestic water demand were 9.46%, 7.90% and 1.80%, respectively.
For the simulation of the water demand sequences of the three sectors, apart from the SOGA-FORAGM, none of the other three models was absolutely dominant in terms of simulation average relative error. The average relative prediction error of the proposed model was the smallest (for the industry, tertiary industry and domestic sequences it was lower than the best result of the comparison models by 0.97%, 0.72% and 4.5%, respectively), indicating that, compared with the other models, the proposed model is applicable to water demand prediction for the various sectors (industry, tertiary industry and domestic) in the region and can improve the accuracy of the prediction results.

  • The oscillating water demand sequence is transformed into a monotonically decreasing non-negative sequence.

  • The reverse accumulation method enables the model to make full use of the information of the new data.

  • As the model order, the fractional order can improve the prediction accuracy of the model.

  • Determining the order through the multi-objective optimization model can prevent the model from overfitting.

The optimal allocation of water resources is an effective means to realize the sustainable development and utilization of water resources (Zhou et al. 2015). Accurately predicting the regional water demand is a vital method to realize the optimal allocation of water resources in the region, and it has certain significance for the rational water distribution of the region (Zhai et al. 2009; Buck et al. 2020). Therefore, the accurate forecasting of regional water demand has been an urgent problem to be solved with the increasing shortage of water resources.

The forecasting methods of water demand mainly include linear and nonlinear forecasting models. Linear prediction models mainly include regression analysis methods (Anele et al. 2017; Villarin 2019) and the quota method (Babel et al. 2007). These models have been applied in different fields; for example, Kitessa et al. (2021) used linear regression to predict urban energy and water demand, and Li et al. (2015) used the quota method to predict the water demand in Gui'an. However, these methods are linear and cannot reflect the nonlinear relationships in water consumption prediction. They also have their own shortcomings. The regression analysis method requires highly stable data (Chen et al. 2003). The quota method predicts mainly from the characteristics of the main water demand and the current water allocation quota, but the preparation of the quota involves subjectivity and uncertainty, which makes accurate prediction difficult (Hou et al. 2018). To solve this problem, nonlinear prediction models have been proposed, such as neural network models (Maidment & Parzen 1984; Maidment et al. 1985; Guo et al. 2018), system dynamics methods (Chhipi-Shrestha et al. 2017), support vector machine algorithms (Pena-Guzman et al. 2016) and the grey model (Wu et al. 2017). These models have already been applied in engineering practice and can reflect the nonlinear relationships in the water demand forecasting process, but certain problems remain. For example, neural network models and support vector machine algorithms require a large amount of data for model training and are prone to overfitting (Tetko et al. 1995). The system dynamics method places high demands on its operators, which has limited its wider adoption (Huang et al. 2004). The grey model, by contrast, is a method for studying ‘poor information’, ‘small samples’ and uncertainty (Liu et al. 2013), and is widely used in economics, finance and other fields. The amount of historical regional water demand data is small, and regional water demand is affected by many factors, so the sequence oscillates. Therefore, the grey model is applicable to forecasting water demand.

Currently, the improvement of the grey model is divided into two aspects. One is to extend the order of the grey model from a positive integer number to a positive real number, such as the fractional order forward accumulation model (Wu et al. 2013b; Mao et al. 2016; Li et al. 2021), which can effectively improve the prediction accuracy (Wu et al. 2015). The other is to change the forward accumulation to the reverse accumulation to improve the model structure, such as the first-order reverse accumulative model (Che et al. 2013; Xiao et al. 2014). Reverse accumulation uses more information about the new data than the old data, which can effectively improve the performance of the model (Liao & Luo 2011). Combining the advantages of these two aspects, this study used the fractional order reverse accumulative grey model to predict regional water demand, which had good research prospects and has attracted the attention of many scholars (Xiong et al. 2019). The fractional order reverse accumulative grey model also had less predictive disturbance and could utilize new information of the sequence compared with the fractional-order forward accumulation model and the first-order backward accumulation model. The fractional order reverse accumulative grey model is mainly suitable for monotonous non-negative decreasing series (Lian et al. 2013). At the same time, the actual historical water demand data has a certain degree of oscillation. In response to these problems, it is necessary to transform historical water demand data into a monotonous non-negative decreasing sequence. However, there is still no more uniform conversion method to transform the sequence into a monotonous non-negative decreasing sequence.

The order of the fractional order reverse accumulative grey model has a great influence on the model prediction effect. In the past, the order was mostly determined by constructing an optimization model with ‘the best fit of historical data’ as the objective function and solving it with an intelligent optimization algorithm (Li et al. 2021). Because this approach took only the degree of fit to historical data as the objective, it was prone to over-fitting: the model over-learned the noise in the historical data.

In view of the above problems, the ideas of reverse accumulation and fractional order were added to the grey model, and a fractional order reverse accumulative grey model was constructed. The water demand sequence was converted into a monotonically decreasing non-negative sequence, a multi-objective model for the order was constructed based on ‘the best fit of historical data’ and ‘maximum (or minimum) order’, and the NSGAII was used to find the optimal order set. The optimal order was then selected according to the fitting effect on the verification set. Finally, the future water demand sequence was predicted with the optimal order and the fractional order reverse accumulative grey model. The water demand forecast for the water use sectors of Yulin City was taken as an example to verify the model, and its forecast results were compared with those of the comparison models.

It is necessary to understand the regional water demand in water resources planning and management, and the model of this study can predict the regional water demand, so it can provide reference for the actual water resource planning and design personnel. The model established in this study can provide guidance for production and domestic water use for the water allocation business of water resources management departments, which is of great significance to improving the current situation of increasingly scarce water resources. The forecast results can provide support for the regional development of water plans.

Overall framework

The specific flow chart of NSGAII-FORAGM is shown in Figure 1.

Figure 1

NSGAII-FORAGM structure diagram.


Method introduction

Data preprocessing

The collected historical data were divided into three types: training data, verification data and test data. Since the historical water demand sequence was an oscillating sequence, the training data set and the verification data set needed to be converted into monotonic non-negative sequences. According to Qian & Dang (2009), if there exist c, d ∈ {2,3,…,m} satisfying
(1)
then the sequence X = (x1,x2,…,xn,xn+1,…,xm) is called an oscillation (fluctuation) sequence. Here X is the sequence composed of the training data set and the verification data set, where the first n values are the training data and m is the total number of values in the training and verification data sets. Letting T = max{xk − xk−1 | k = 2,3,…,m}, the oscillating sequence can be transformed into a monotonic non-negative decreasing sequence by formula (2).
(2)
where yk (k = 1,2,…,n) is the converted training data sequence, which is monotonically decreasing and non-negative.
Formula (3) can be obtained from formula (2),
(3)
After simplification, formula (4) can be obtained,
(4)
The expression in formula (4) is always non-positive because T = max{xk − xk−1 | k = 2,3,…,m}. Therefore, the transformed sequence is monotonically decreasing and non-negative, which meets the requirements of the model.
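The transformation step can be illustrated with a short sketch. The exact form of formula (2) is not reproduced here, so the code assumes one shift that is consistent with the derivation described in the text, namely y_k = x_k + (m − k)·T, which gives y_k − y_{k−1} = x_k − x_{k−1} − T ≤ 0. The function names and the sample series are hypothetical.

```python
from typing import List, Tuple


def to_decreasing(x: List[float]) -> Tuple[List[float], float]:
    """Shift an oscillating non-negative series so that it never increases.

    Assumed form (an illustration, not necessarily the paper's formula (2)):
    y_k = x_k + (m - k) * T for k = 1..m, with T = max(x_k - x_{k-1}).
    """
    m = len(x)
    t = max(x[k] - x[k - 1] for k in range(1, m))
    # 0-based index i corresponds to k = i + 1, so the offset is (m - 1 - i) * T.
    y = [x[i] + (m - 1 - i) * t for i in range(m)]
    return y, t


def restore(y: List[float], t: float) -> List[float]:
    """Invert the assumed shift to get back to the original scale."""
    m = len(y)
    return [y[i] - (m - 1 - i) * t for i in range(m)]


# Hypothetical oscillating series, not data from the study.
series = [10.0, 13.0, 11.0, 16.0, 14.0]
shifted, step = to_decreasing(series)
print(shifted, step)
```

At the position where the largest increase occurs, the shifted series is flat rather than strictly decreasing, which matches the "non-positive" wording used for formula (4).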

Fractional order reverse accumulative grey model

The fractional order reverse accumulative grey model has the advantages of less prediction disturbance and the ability to use new information compared with the traditional grey model. According to the principle of the fractional order reverse accumulative grey model, the r-order reverse accumulative operator can be written as:
(5)
where r is the order, the left-hand side is the r-order reverse cumulative value of the sequence, and k is the serial number of the water use year.
For n7 ∈ R with n7 ∉ {0, −1, −2, −3, …}, Γ(n7) denotes the Gamma function of the real number n7,
(6)
It can be deduced that the Gamma function has the following recursive relationship through the integration by parts:
(7)
In particular, when n7 ∈ N,
(8)
The coefficient of y0(i) is
(9)
Formula (9) can be transformed into formula (10)
(10)
According to the properties of the above Gamma function, formula (10) can be written as formula (11)
(11)

This conversion is conducive to subsequent programming calculations.
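As an illustration of why the Gamma-function form is convenient to program, the sketch below computes the accumulation weights as a ratio of Gamma functions and applies them in reverse. It assumes the weight form commonly used in the fractional accumulation literature, Γ(r + j) / (Γ(j + 1)·Γ(r)) with j = i − k, which may differ in notation from formulas (5)-(11); the function names are hypothetical.

```python
import math
from typing import List


def coef(r: float, j: int) -> float:
    """Weight Gamma(r + j) / (Gamma(j + 1) * Gamma(r)) for order r > 0.

    Computed through log-Gamma to avoid overflow for long sequences.
    """
    return math.exp(math.lgamma(r + j) - math.lgamma(j + 1) - math.lgamma(r))


def reverse_accumulate(y0: List[float], r: float) -> List[float]:
    """r-order reverse accumulation: position k combines y0[i] for all i >= k."""
    n = len(y0)
    return [
        sum(coef(r, i - k) * y0[i] for i in range(k, n))
        for k in range(n)
    ]


# With r = 1 every weight equals 1, so the result is a plain reverse cumulative sum.
print(reverse_accumulate([3.0, 2.0, 1.0], 1.0))
# With 0 < r < 1 the weights decay with distance from position k.
print(reverse_accumulate([3.0, 2.0, 1.0], 0.5))
```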

The time response formula of the model is
(12)
In formula (12), the left-hand side is the predicted r-order cumulative value; the coefficients a and b can be obtained by formulas (13)-(15):
(13)
(14)
(15)

Formulas (13)-(15) give the estimated values of the parameters a and b.

The time response formula yields the r-order accumulated sequence. To obtain the simulated values, first apply (1 − r)-order accumulation to this sequence, then take the first-order difference, and finally restore the result according to formula (2).
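The restoration step described above can be checked numerically. The sketch below is a minimal illustration, assuming the standard fractional accumulation weights c_0 = 1, c_j = c_{j−1}·(r + j − 1)/j and a reverse first-order difference; it shows that (1 − r)-order accumulation followed by differencing undoes an r-order reverse accumulation. All names are hypothetical, and the final inversion of formula (2) is not included.

```python
from typing import List


def weights(r: float, n: int) -> List[float]:
    """First n accumulation weights via c_0 = 1, c_j = c_{j-1} * (r + j - 1) / j."""
    w = [1.0]
    for j in range(1, n):
        w.append(w[-1] * (r + j - 1) / j)
    return w


def reverse_accumulate(seq: List[float], r: float) -> List[float]:
    """r-order reverse accumulation of seq."""
    n = len(seq)
    w = weights(r, n)
    return [sum(w[i - k] * seq[i] for i in range(k, n)) for k in range(n)]


def restore_from_r_order(seq_r: List[float], r: float) -> List[float]:
    """(1 - r)-order reverse accumulation, then a first-order reverse difference."""
    first = reverse_accumulate(seq_r, 1.0 - r)
    # Reverse difference: value(k) = first(k) - first(k + 1); the last value is kept.
    return [first[k] - first[k + 1] for k in range(len(first) - 1)] + [first[-1]]


original = [30.0, 28.0, 21.0, 21.0, 14.0]  # hypothetical decreasing series
accumulated = reverse_accumulate(original, 0.4)
recovered = restore_from_r_order(accumulated, 0.4)
print(recovered)
```

The round trip works because an r-order accumulation followed by a (1 − r)-order accumulation is equivalent to a single first-order accumulation.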

Method for determining the optimal order of the model

With the best fit of historical data as the objective function, the inherent laws of historical data can be found, see formula (16) for details. However, it should be pointed out that over-reliance on this formula will also cause a risk of overfitting.
(16)

In the formula, f2(r) represents the degree to which the model fits historical data.

In order to overcome the over-fitting problem of the calculation model, this study selected the ‘maximum (or minimum) order’ as the objective function, and combined formula (16) to construct a multi-objective optimization model.
(17)

The constraints of the model are expressed as 0 < r < 1.

f1(r) represents the maximum order (or minimum order). Only using formula (16) as the objective function to find the model order would cause the model to overfit. Therefore, it is necessary to prevent the model from over-learning; that is, to prevent over-fitting to historical data, and formula (17) was constructed. The fitting effect of historical data and the model order would show a positive or negative correlation in different intervals. When the two show a positive correlation, the order needs to be reduced to reduce the historical data fitting effect to prevent over-fitting; that is, the minimum order is the goal; when the two show a negative correlation, the order needs to be increased; that is, the maximum order is the goal. Therefore, it is necessary to perform trial calculations on both forms of formula (17) to find the optimal order.

The NSGA-II was used to solve the optimization model composed of formula (16)–formula (17), and a Pareto solution set was obtained, and each order in the solution set was inputted into the fractional-order reverse cumulative grey model to simulate the data in the validation data set. The simulation effect was evaluated according to formula (18), and the order with the best simulation effect was selected as the final order of the model.
(18)

f3(r) represents the relative error of the model's simulation of the validation set data. The obtained final order was then used with the fractional order reverse accumulative grey model to predict the data of the test data set.
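The selection logic can be sketched as follows. This is a simplified stand-in, not the NSGA-II itself: candidate orders are scanned on a grid, the non-dominated set for the two objectives (order and training error) is extracted, and the final order is the front member with the smallest validation error. The two error functions are hypothetical placeholders for the fitted grey model.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]


def pareto_front(points: Sequence[Point]) -> List[Point]:
    """Keep the points that no other point dominates (both objectives minimised)."""
    front = []
    for p in points:
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and q != p for q in points
        )
        if not dominated:
            front.append(p)
    return front


def select_order(
    orders: Sequence[float],
    train_error: Callable[[float], float],
    valid_error: Callable[[float], float],
    minimise_order: bool = True,
) -> float:
    """Pick the order on the Pareto front with the lowest validation error."""
    # Maximising the order is handled by minimising its negative.
    sign = 1.0 if minimise_order else -1.0
    scored = [(sign * r, train_error(r)) for r in orders]
    candidates = [sign * f1 for f1, _ in pareto_front(scored)]
    return min(candidates, key=valid_error)


# Hypothetical error curves standing in for the model's training and validation errors.
orders = [i / 10 for i in range(1, 10)]   # grid inside the constraint 0 < r < 1
train_err = lambda r: abs(r - 0.72)
valid_err = lambda r: abs(r - 0.4)

print(select_order(orders, train_err, valid_err, minimise_order=True))
print(select_order(orders, train_err, valid_err, minimise_order=False))
```

Running both forms mirrors the trial calculation described in the text, where the minimum-order and maximum-order variants are both solved before the final order is chosen.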

Model input and output

The rolling prediction method was adopted, the step size of the prediction was one step, and the historical data before the prediction year was used as the training set and the verification set. The model was constructed according to the above method to predict the water demand in one year. By analogy, rolling predictions on all data in the test set were realized.
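The rolling scheme can be sketched as an expanding-window loop. The forecaster below is a deliberately trivial placeholder (it repeats the last observation); in the study its role would be played by the full pipeline of transformation, order selection and grey model fitting. The names are hypothetical.

```python
from typing import Callable, List, Sequence


def rolling_forecast(
    series: Sequence[float],
    n_test: int,
    forecaster: Callable[[Sequence[float]], float],
) -> List[float]:
    """One-step-ahead forecasts for the last n_test points, refitting each time."""
    preds = []
    start = len(series) - n_test
    for t in range(start, len(series)):
        history = series[:t]          # only data before the target year is used
        preds.append(forecaster(history))
    return preds


def naive_last_value(history: Sequence[float]) -> float:
    """Placeholder forecaster (assumption): repeat the most recent observation."""
    return history[-1]


# Hypothetical series; the last three values play the role of the test years.
demand = [1.0, 2.0, 3.0, 4.0, 5.0]
print(rolling_forecast(demand, 3, naive_last_value))
```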

Comparison model and evaluation index

Comparison model

In order to test the prediction performance of the model proposed in this study, the ‘non-dominated sorting genetic algorithm II-fractional order forward accumulative grey model (NSGA II-FOFAGM)’, ‘non-dominated sorting genetic algorithm II- integer order reverse accumulative grey model (NSGA II-IORAGM)’ and ‘single objective genetic algorithm-fractional order reverse accumulative grey model (SOGA- FORAGM)’ were established respectively. The details are as follows.

The NSGA II-FOFAGM adopts the forward accumulation method to build the model; this is the only difference between it and the model built in this study. Replacing formula (5) with formula (19) in the model of this study yields the NSGA II-FOFAGM.
(19)
yrf(k) is the forward cumulative value of the sequence, and the remaining parts are the same as before. The NSGA II-FOFAGM can be used as a comparative model to verify the influence of the reverse accumulation on the prediction performance of the model.

The order of the NSGA II-IORAGM is fixed at one, which is the only difference between this model and the model in this study; that is, the NSGA II-IORAGM can be regarded as the special case of the model in this study in which the order is one. Order one is also the order commonly used in previous grey prediction models (Liu et al. 2011; Yao et al. 2012; Wu et al. 2013a; Mao et al. 2016). The NSGA II-IORAGM can be used as a comparative model to verify the influence of the fractional order on the prediction performance of the model.

The only difference between the SOGA-FORAGM and the model in this study is that the SOGA-FORAGM uses ‘optimal historical fitting’ as its single objective function; that is, the SOGA-FORAGM can be regarded as the model in this study with the ‘maximum (minimum) order’ objective removed. The SOGA-FORAGM can be used as a comparative model to verify the impact of ‘whether overfitting has been considered’ on the prediction performance of the model.

Evaluation index

At present, there are many measures of a model's prediction error, including the Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and so on. Different studies select one or more of them, and the model with the smallest error under a given criterion is generally considered the most accurate. Donkor et al. (2014) argued that defining the reliability of a prediction requires constructing a threshold based on an error measure so that the confidence of different prediction models can be compared. Relative error is a criterion that can be used with such a threshold to compare the accuracy of models, because it is dimensionless (Sebri 2016). The calculation formula of relative error is shown in formula (20).
(20)
where MAPE represents the relative error of the model and l represents the number of data points in the prediction set. The specific comparison flowchart is shown in Figure 2.
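For reference, the relative error measure can be sketched as below, assuming the usual MAPE definition (the mean of |predicted − actual| / |actual|, expressed in percent). The function name and the sample values are hypothetical.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error in percent; actual values must be non-zero."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - a) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical values, not data from the study.
print(mape([100.0, 200.0], [110.0, 180.0]))
```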
Figure 2

Model comparison flowchart.


Data collection and basic situation introduction

This study took Yulin City as an example. Due to the uneven distribution of water resources in Yulin City and the increasingly serious water pollution, 83% of the counties in the city have per capita water resources below 1,000 m3, which has reached the severe water shortage standard (OrlińskaWoźniak et al. 2013). The per capita water resources are less than 500 m3/person to varying degrees, which has reached the extreme water shortage standard (OrlińskaWoźniak et al. 2013). Therefore, it is necessary to predict the city's water demand in order to provide a basis for the city's overall planning of water resources.

The water demand of the city from 2017 to 2019 was forecast. Water demand data for Yulin City from 2000 (2004 for the tertiary industry) to 2019 were collected from the 2000–2019 Shaanxi Water Resources Bulletin and Yulin Water Resources Bulletin. The characteristic parameters of the water demand data of the city's industry, tertiary industry and domestic sectors over the years are shown in Table 1.

Table 1

Characteristic parameter table of water demand sequence in Yulin City from 2000(2004) to 2019

Sequence period | Water demand sector | Mean value (100 million m3) | Skewness coefficient | Hurst index
2000–2019 | Industry | 1.292 | 0.188 | 0.929
2004–2019 | Tertiary industry | 0.124 | 0.216 | 0.942
2000–2019 | Domestic | 0.694 | 0.910 | 0.928

From the mean values in the third column of Table 1, it can be seen that the main water users in Yulin City were industry and domestic use, while the tertiary industry ranked last. Although data showed that the tertiary industry in the city had developed rapidly, it still accounted for a small share of the city's water demand. From the skewness coefficients in the fourth column, all three series were right-skewed; that is, the mean was greater than the median, indicating that there were extremely large values in the three series and that the consistency of the data was affected to a certain extent. This is not conducive to prediction with data-driven models that assume data consistency. From the Hurst index in the fifth column and the literature (Xie et al. 2009), it can be seen that the three sequences exhibited mutation. This was consistent with the conclusion drawn from the skewness coefficient, indicating that the water demand time series of the three sectors all had certain inconsistencies, which should be considered when constructing a data-driven model.

Model application and result analysis

According to formula (2), the data sequences of three sections were preprocessed to make the transformed data conform to the characteristics of ‘non-negative’ and ‘monotonous decreasing’, which can be used as the training and verification data set of the fractional reverse accumulation grey model.

It can be seen from Figure 3 that the transformed data are ‘non-negative’ and ‘monotonically decreasing’ and can be used as training and verification data sets for the fractional order reverse accumulative grey model. The transformed water demand sequences of the three sectors from 2000 (2004) to 2016 were divided into a training set and a validation set. Based on the training set, a multi-objective optimization model for the order was constructed with ‘minimum order’ and ‘the best fit of historical data’ as the objective functions, and the NSGA-II was used to solve it and obtain the Pareto solution set. Similarly, a second multi-objective optimization model was constructed with ‘maximum order’ and ‘the best fit of historical data’ as the objective functions and solved with the NSGA-II to obtain its Pareto solution set. This was the first of the rolling water demand forecasts for the three sectors, and its target was the water demand of the three sectors in 2017.

Figure 3

Water demand after conversion.


The target of the second forecast was the water demand of three sections in 2018, and the basic data was the water demand sequence of three sections from 2000 (2004) to 2017. The target of the third forecast was the water demand of three sections in 2019, and the basic data was the water demand sequence of three sections from 2000 (2004) to 2018. The method for obtaining the second and third optimal order sets was the same as the first one. The difference was the basic data and prediction objects. The finally obtained Pareto solution set is shown in Figure 4.

Figure 4

Pareto set of orders of various sections.


From Figure 4(n) and 4(o), it can be seen that the simulation error of the domestic training data was positively correlated with the order over the range 0–1, but such cases were rare. Over the range 0–1, the Pareto solution sets of the two models (the minimum order model and the maximum order model) form curves, and the simulation error was positively correlated with the order in some intervals and negatively correlated in others. Therefore, the sign of the correlation between the model's simulation error and the order was difficult to determine in advance, which shows that it was necessary to construct the model for both situations in this study. The figure also shows that the simulation error on the training data was not a continuous function of the order. This is difficult to handle for analytical methods that require continuity, whereas the intelligent optimization algorithm used in this study imposes weaker conditions and was therefore applicable. Overall, the solutions in each panel basically conform to a Pareto solution set, indicating that the model in this study and the corresponding solution algorithm were reasonable.

In the first prediction, the orders in the Pareto solution set of each of the three sectors were substituted into the NSGAII-FORAGM one by one, and the data of the first verification data set of each sector were simulated. The optimal order of the first forecast model for each sector was determined according to formula (20). The optimal orders of the second and third prediction models were determined in the same way from the corresponding Pareto solution sets and verification data sets. The simulation errors on the validation set for the different predictions and sectors are shown in Figure 5.

Figure 5

The simulation error of each order in each section on the validation set.


Comparing Figures 4 and 5, it can be concluded that the optimal order of each sector's forecast was not the order with the smallest simulation error on the training set. This shows that determining the optimal order by the method in this article can help prevent over-fitting in prediction.

According to the optimal order, combined with the fractional order reverse accumulative grey model, the water demand sequence of three sections was simulated and predicted. The results are shown in Figure 6.

Figure 6

The predictions of this model for various sections.


The posterior variance ratio and the small error probability are commonly used to evaluate the point prediction performance of grey models. A smaller posterior variance ratio means that the residual sequence (the differences between the model calculated values and the actual values) is less dispersed relative to the original water demand sequence, indicating better model performance. A greater small error probability means that a higher proportion of points are fitted with small error, and thus better model performance. According to the model ranking table, when the posterior variance ratio is less than 0.35 and the small error probability is greater than 0.95, the model performance is at the optimal level (excellent); when the posterior variance ratio is less than 0.5 and the small error probability is greater than 0.8, the model performance is at the sub-optimal level (good). The specific calculation method of the indices and the model ranking table can be found in the literature (Hsu & Wen 1998; Hu et al. 2016).
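The two indices can be sketched as follows. The paper defers their exact definitions to the cited literature, so this sketch assumes the form commonly used for grey models: the ratio C = S2 / S1 of the residual standard deviation to the standard deviation of the original series, and the proportion P of residuals whose deviation from the mean residual is below 0.6745·S1. Population standard deviations are an assumption, the label for results below the 'good' level is hypothetical, and only the two grades stated above are encoded.

```python
import statistics
from typing import Sequence, Tuple


def posterior_checks(actual: Sequence[float], fitted: Sequence[float]) -> Tuple[float, float]:
    """Return (posterior variance ratio C, small error probability P)."""
    residuals = [a - f for a, f in zip(actual, fitted)]
    s1 = statistics.pstdev(actual)      # spread of the original series
    s2 = statistics.pstdev(residuals)   # spread of the residuals
    mean_res = statistics.fmean(residuals)
    small = sum(abs(e - mean_res) < 0.6745 * s1 for e in residuals)
    return s2 / s1, small / len(residuals)


def grade(c: float, p: float) -> str:
    """Map (C, P) to the two levels quoted in the text."""
    if c < 0.35 and p > 0.95:
        return "excellent"
    if c < 0.5 and p > 0.8:
        return "good"
    return "below good"  # hypothetical label; lower levels are not listed in the text


# Hypothetical values, not data from the study.
observed = [10.0, 12.0, 14.0, 16.0, 18.0]
simulated = [10.5, 11.5, 14.5, 15.5, 18.5]
c_ratio, p_small = posterior_checks(observed, simulated)
print(round(c_ratio, 3), p_small, grade(c_ratio, p_small))
```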

According to the data in Figure 6, the posterior variance ratios of the three predictions for industrial water demand were 0.348, 0.347 and 0.271, respectively, and the small error probabilities were all 1, so the predictive performance of the fractional order reverse accumulative grey model for the industrial water demand series was at the optimal level (excellent). The posterior variance ratios of the three predictions for tertiary industry water demand were 0.177, 0.182 and 0.203, respectively, and the small error probabilities were all 1, so the prediction performance for the tertiary industry water demand sequence was also at the optimal level (excellent). The posterior variance ratios of the three predictions for domestic water demand were 0.233, 0.209 and 0.195, respectively, and the small error probabilities were all 1, so the prediction performance for the domestic water demand sequence was likewise at the optimal level (excellent). These data show that the model in this study had excellent forecasting performance for the water demand sequences of the three sectors. The prediction average relative errors (the average of the relative errors of the three predictions) of the model for industry, tertiary industry and domestic were 9.46%, 7.90% and 1.80%, respectively, and the simulation average relative errors (the average of the relative errors of the three simulations) were 15.538%, 11.197% and 9.983%, respectively. The prediction average relative error was therefore smaller than the simulation average relative error. This was because the model adopted a verification set, a double-objective function and the reverse accumulation method, which prevented over-fitting and made greater use of new information. Preventing over-fitting raises the simulation average relative error, whereas the reverse accumulation method lets the model use information from the newest points and improves prediction performance, which lowers the forecast average relative error. As a result, the forecast average relative error obtained was smaller than the simulated average relative error.

Model comparison

In order to test the prediction performance of the model in this study, the NSGA II-IORAGM, SOGA-FORAGM and NSGA II-FOFAGM were compared as comparison models. The prediction results of each model are shown in Table 2 and Figure 7.

Table 2

Comparison table of prediction effects of various models (Unit: %)

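Item | Water demand sector | NSGAII-FORAGM | NSGA II-IORAGM | SOGA-FORAGM | NSGA II-FOFAGM
Simulated average relative error | Industry | 15.54 | 18.30 | 10.49 | 15.81
Simulated average relative error | Tertiary industry | 11.20 | 10.83 | 10.42 | 11.20
Simulated average relative error | Domestic | 9.98 | 8.02 | 4.29 | 4.94
Forecast average relative error | Industry | 9.46 | 32.46 | 10.43 | 15.91
Forecast average relative error | Tertiary industry | 7.90 | 8.62 | 8.90 | 9.31
Forecast average relative error | Domestic | 1.80 | 15.55 | 6.30 | 1.86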
Figure 7

Comparison of predicted values of various models (Unit: 100 million m3). (a) Industry. (b) Tertiary industry. (c) Domestic.


It can be seen from Table 2 and Figure 7 that, except for the SOGA-FORAGM model (this model uses the best historical fit as the objective function to obtain parameters), the average relative error of the simulation of the other three models did not have an absolute dominance for the simulation of the water demand sequence of the three sections. However, the prediction average relative error with the model in this study was the smallest compared with other comparison models, indicating that the model in this study had certain applicability to the water demand forecast of various sectors (industry, tertiary industry and domestic) in the region.

Discussion of results

According to Table 1, the three water use sequences were all right-skewed (skewness coefficient, fourth column), and the Hurst index (fifth column) also indicated variation in the three sequences. There was therefore a certain degree of variation in the water demand sequences of the three sectors (industry, tertiary industry and domestic), which affected the consistency of the data. If a model cannot distinguish the data before and after the mutation in an inconsistent sequence, its prediction results will inevitably be affected. A model established by the reverse accumulative method gives new data greater weight, so that new information can be fully utilized and the prediction accuracy improved. The difference between the NSGAII-FORAGM and the NSGA II-FOFAGM was that the NSGAII-FORAGM used the reverse accumulative method to construct the model, so its prediction accuracy should be better. In Table 2, the average relative error of the NSGAII-FORAGM's predictions of the water demand sequences of the three sectors was smaller than that of the NSGA II-FOFAGM, which supports this view. However, for the simulation of the training set and the validation set, because the two models place different emphasis on the data, neither was absolutely dominant in simulation average relative error. The simulated average relative errors of the two models in Table 2 each had their own advantages and disadvantages, which is consistent with this view.

The difference between the NSGAII-FORAGM and the NSGAII-IORAGM is that the order of the NSGAII-FORAGM is fractional, which can improve simulation and prediction accuracy. This explains why, in Table 2, the average relative error of the NSGAII-FORAGM for the water demand forecasts of the three sectors was smaller than that of the NSGAII-IORAGM. These results support the view that ‘fractional order can improve the simulation and prediction accuracy of the model’, which is consistent with the literature (Yao et al. 2012). For the simulation of the training and validation sets, the NSGAII-FORAGM addresses over-fitting when determining the order (by adding a validation set and an additional objective function, ‘maximum (or minimum) order’), which puts it at a certain disadvantage in data simulation relative to the NSGAII-IORAGM. At the same time, its fractional order improves simulation accuracy. Overall, neither model was absolutely dominant in simulation average relative error, which is consistent with Table 2, where each model performed better on some sequences.

The SOGA-FORAGM used only ‘the best historical fit’ as the objective function when constructing the optimization model to obtain the model order. This caused over-learning of the data before the change and led to a decrease in prediction performance, especially for inconsistent time series. In contrast, the NSGAII-FORAGM in this study added the objective function ‘maximum (or minimum) order’ to the objective function ‘best historical fit’, which effectively alleviated over-fitting. Therefore, its average relative prediction error was smaller than that of the SOGA-FORAGM for all three inconsistent water demand series. However, because the SOGA-FORAGM obtained the order using ‘best historical fit’ as its objective function, its simulation average relative error for the training and validation sets was smaller than that of the other models.
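The contrast between a single-objective and a two-objective search for the order can be sketched with a simple non-dominated filter. This is not NSGA-II itself (it omits crossover, mutation and crowding distance); it only shows the dominance test on which such methods rely. The candidate values are hypothetical, and the ‘minimum order’ variant of the second objective is assumed, so both objectives are minimized.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(candidates):
    """Keep the candidates that no other candidate dominates (all objectives minimized)."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Hypothetical candidates as (historical fit error in %, order).
candidates = [(12.0, 0.2), (10.0, 0.5), (11.0, 0.8), (9.0, 0.9)]

pareto_front = non_dominated(candidates)      # (11.0, 0.8) is dominated by (10.0, 0.5)
single_objective_choice = min(candidates)     # best fit only: (9.0, 0.9)
```

A single-objective search returns only the best-fitting candidate, whereas the two-objective formulation retains a set of trade-off solutions from which an order can later be chosen, for example by its error on a validation set.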

In general, the prediction results of the three comparison models were reasonable, but the model in this study was more applicable, especially for the prediction of time series with inconsistent data. The model in this study is therefore applicable to water demand forecasting in Yulin City.

The reverse accumulation method was used to construct the sequence so as to make full use of the information in new data, and the ‘in between’ idea was used to extend the order from a positive integer to a positive real number and thereby improve the accuracy of the model. To address the oscillating characteristics of water demand sequences and the data requirements of the fractional order reverse accumulative grey model, a method of transforming the oscillating series into a monotonically decreasing non-negative series was proposed. In addition to the traditional objective (the best fit to historical data), the objective function ‘maximum (or minimum) order’ was added to prevent over-fitting, giving a multi-objective optimization model for the order, which was solved with the NSGA-II method. The optimal order was then selected according to the fit on the validation set and combined with the fractional order reverse accumulative grey model to predict regional water demand.

The results showed that, compared with the existing models NSGAII-IORAGM, NSGAII-FOFAGM and SOGA-FORAGM, the model established in this study had the best prediction accuracy and can effectively improve generalization ability in most cases.

NSGAII-FOFAGM cannot make full use of the new information in historical data, and its prediction performance was not optimal for data with poor consistency. Regional water use sequences are currently affected by climate and human activities, which reduces the consistency of the data, so NSGAII-FOFAGM cannot respond well to new situations. The model in this study was built with the reverse accumulation method, which gives new data greater weight, addresses the problem of inconsistent data, makes full use of new information during model learning, and thereby improves prediction accuracy.

The order of the NSGAII-IORAGM is an integer, whereas the order of the model in this study is fractional, which theoretically gives higher prediction accuracy. For example, the NSGAII-FORAGM predicted water use in the three sectors better than the NSGAII-IORAGM.

The SOGA-FORAGM used only ‘the best historical fit’ as the objective function when constructing the optimization model to obtain the model order, which caused over-fitting and degraded the forecasts, especially for inconsistent time series. In contrast, the NSGAII-FORAGM added the objective function ‘maximum (or minimum) order’ to the objective function ‘best historical fit’, which effectively alleviated over-fitting, so the NSGAII-FORAGM predicted better than the SOGA-FORAGM.

Overall, the relative prediction errors of the model in this study for the industry, tertiary industry and domestic sequences were lower than those of the best-performing comparison model by 0.97%, 0.72% and 4.5%, respectively. Therefore, the three model construction methods, ‘fractional order’, ‘reverse accumulation’ and ‘obtaining the order through a multi-objective optimization model’, can improve the performance of the model.

The model established in this study can predict regional water demand more accurately, allowing regional water demand to be analyzed in advance. The prediction results can provide a basis for the optimal allocation of regional water resources and support the optimal dispatch and rational use of the various regional water resources.

Although this study obtained some valuable results, it should be noted that it was based on the water demand time series of three sectors in Yulin City; more research is needed to determine the performance of the model (NSGAII-FORAGM) in other situations (geographical and climatic regions).

The authors would like to thank Shanxi Provincial Department of Education.

This study was funded by the Postgraduate Education Innovation Project of Shanxi Province (RC1900001671).

All relevant data are included in the paper or its Supplementary Information.

Babel M. S., Das Gupta A. & Pradhan P. 2007 A multivariate econometric approach for domestic water demand modeling: an application to Kathmandu, Nepal. Water Resources Management 21 (3), 573–589.

Buck S., Auffhammer M., Soldati H. & Sunding D. 2020 Forecasting residential water consumption in California: rethinking model selection. Water Resources Research 56 (1), 25.

Chen Y. Q., Pong A. & Xing B. 2003 Rank regression in stability analysis. Journal of Biopharmaceutical Statistics 13 (3), 463–479.

Chhipi-Shrestha G., Hewage K. & Sadiq R. 2017 Water-energy-carbon nexus modeling for urban water systems: system dynamics approach. Journal of Water Resources Planning and Management 143 (6), 11.

Donkor E. A., Mazzuchi T. A., Soyer R. & Roberson J. A. 2014 Urban water demand forecasting: review of methods and models. Journal of Water Resources Planning and Management 140 (2), 146–159.

Guo G. C., Liu S. M., Wu Y. P., Li J. Y., Zhou R. & Zhu X. Y. 2018 Short-term water demand forecast based on deep learning method. Journal of Water Resources Planning and Management 144 (12), 11.

Hou B. D., Yang R. X., Zhan X. Z., Tian W. K., Li B. Q., Xiao W. H., Wang J. H., Zhou Y. Y. & Zhao Y. 2018 Conceptual framework and computational research of hierarchical residential household water demand. Water 10 (6), 18.

Hsu C. I. & Wen Y. H. 1998 Improved grey prediction models for the trans-pacific air passenger market. Transportation Planning & Technology 22 (2), 87–107.

Hu X., Wang Y., Yu Y., Wang D. & Tian Y. 2016 Research on the concentration prediction of nitrogen in red tide based on an optimal Grey Verhulst model. Mathematical Problems in Engineering 2016, 9786107.

Huang X. Q., Kang S. Z. & Wang J. L. 2004 A preliminary study on predicting method for the demand of irrigation water resource. Journal of Irrigation and Drainage 23 (4), 13–15.

Kitessa B. D., Ayalew S. M., Gebrie G. S. & Teferi S. T. 2021 Long-term water-energy demand prediction using a regression model: a case study of Addis Ababa city. Journal of Water and Climate Change 12 (6), 2555–2578.

Li X. N., Zhao X. J., Shen X. M. & Wei Z. X. 2015 Prediction of water demand in Gui'an city of Guizhou province in China. International Forum on Energy, Environment Science and Materials 2015, 1351–1358.

Li J., Song S. B., Kang Y., Wang H. J. & Wang X. J. 2021 Prediction of urban domestic water consumption considering uncertainty. Journal of Water Resources Planning and Management 147 (3), 14.

Lian Z. W., Dang Y. G. & Wang Z. X. 2013 Properties of accumulated generating operation in opposite-direction and optimization of GOM(1,1) model. Xitong Gongcheng Lilun yu Shijian/System Engineering Theory and Practice 33 (9), 2306–2312.

Liu Y. R., Hu Y. & Hou M. L. 2011 A fractional order grey prediction algorithm. Journal of Grey System 14 (4), 139–144.

Liu S. F., Forrest J. & Yang Y. J. 2013 Advances in grey systems research. Journal of Grey System 25 (2), 1–18.

Maidment D. R. & Parzen E. 1984 Time patterns of water use in six Texas cities. Journal of Water Resources Planning & Management 110 (1), 90–106.

Maidment D., Miaou S.-P. & Crawford M. 1985 Transfer function models of daily urban water use. Water Resources Research 21, 425–432.

Mao S. H., Gao M. Y., Xiao X. P. & Zhu M. 2016 A novel fractional grey system model and its application. Applied Mathematical Modelling 40 (7–8), 5063–5076.

OrlińskaWoźniak P., Wilk P. & Gębala J. 2013 Water Availability in Reference to Water Needs in Poland.

Qian W. Y. & Dang Y. G. 2009 GM(1,1) model based on oscillation sequences. Systems Engineering-Theory & Practice 23 (4), 149–154.

Sebri M. 2016 Forecasting urban water demand: a meta-regression analysis. Journal of Environmental Management 183, 777–785.

Tetko I. V., Livingstone D. J. & Luik A. I. 1995 Neural-network studies. 1. Comparison of overfitting and overtraining. Journal of Chemical Information and Computer Sciences 35 (5), 826–833.

Wu L., Liu S., Yao L., Yan S. & Liu D. 2013a Grey system model with the fractional order accumulation. Communications in Nonlinear Science and Numerical Simulation 18 (7), 1775–1785.

Wu L. F., Liu S. F., Yao L. G., Yan S. L. & Liu D. L. 2013b Grey system model with the fractional order accumulation. Communications in Nonlinear Science and Numerical Simulation 18 (7), 1775–1785.

Wu L. F., Liu S. F., Fang Z. G. & Xu H. Y. 2015 Properties of the GM(1,1) with fractional order accumulation. Applied Mathematics and Computation 252, 287–293.

Wu H. A., Zeng B. & Zhou M. 2017 Forecasting the water demand in Chongqing, China using a grey prediction model and recommendations for the sustainable development of urban water consumption. International Journal of Environmental Research and Public Health 14 (11), 1386.

Xiao X. P., Guo H. & Mao S. H. 2014 The modeling mechanism, extension and optimization of grey GM (1,1) model. Applied Mathematical Modelling 38 (5–6), 1896–1910.

Xie P., Chen G. & Lei H. 2009 Hydrological alteration analysis method based on Hurst coefficient. Journal of Basic Science and Engineering 17 (1), 32–39.

Xiong P. P., Shi J., Pei L. L. & Ding S. 2019 A novel linear time-varying GM(1,N) model for forecasting haze: a case study of Beijing, China. Sustainability 11 (14), 14.

Yao T., Forrest J. & Gong Z. 2012 Generalized discrete GM (1,1) model. Grey Systems: Theory and Application 2, 4–12.

Zhai C., Zhang H. & Zhang X. 2009 Application of System Dynamics in the Forecasting Water Resources Demand in Tianjin Polytechnic University. Dalin, China.

Zhou Y. L., Guo S. L., Xu C. Y., Liu D. D., Chen L. & Wang D. 2015 Integrated optimal allocation model for complex adaptive system of water resources management (II): case study. Journal of Hydrology 531, 977–991.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY-NC-ND 4.0), which permits copying and redistribution for non-commercial purposes with no derivatives, provided the original work is properly cited (http://creativecommons.org/licenses/by-nc-nd/4.0/).