Short-term (e.g., hourly) prediction of urban water consumption (or demand) is of great significance for the optimal operation of intelligent water distribution pump stations. In this study, three single models (autoregressive integrated moving average (ARIMA), back-propagation (BP) neural network and support vector machine (SVM)) and three hybrid models (ensemble empirical mode decomposition (EEMD)-ARIMA, EEMD-BP and EEMD-SVM) were developed and compared in terms of prediction accuracy and application convenience. A 31-day (1-month) hourly flow series from a water distribution division in Shanghai was used for the demonstration case study, of which 30 days of data were used for model training and 1 day of data for model verification. Finally, the effects of historical data length on the prediction accuracy of the three hybrid models were analyzed, and the optimal historical data length for each hybrid model was obtained. Results reveal that (1) the mean absolute percentage errors (MAPE) of EEMD-ARIMA, EEMD-BP, EEMD-SVM, ARIMA, BP and SVM are 5.2036, 1.4460, 1.3424, 5.7891, 4.3857 and 3.8470%, respectively; (2) in terms of prediction accuracy and practical convenience, EEMD-SVM performs best among the six models; (3) the EEMD algorithm is effective for improving the prediction accuracy of the three single models; (4) the optimal historical data lengths for EEMD-ARIMA, EEMD-BP and EEMD-SVM are 11, 11 and 10 days, respectively.

  • Three single models (ARIMA, BP and SVM) and three hybrid models (EEMD-ARIMA, EEMD-BP and EEMD-SVM) were compared for the prediction of hourly water demand.

  • EEMD-SVM performs best among the six prediction models.

  • The EEMD algorithm is significant for improving prediction accuracy.

  • The optimal historical data length for intelligent algorithms should be greater than a week.

Graphical Abstract

  • ARIMA: autoregressive integrated moving average
  • ARMA: autoregressive moving average
  • ANN: artificial neural network
  • BP: back-propagation
  • DNN: deep neural network
  • ES: exponential smoothing
  • EMD: empirical mode decomposition
  • EEMD: ensemble empirical mode decomposition
  • IMF: intrinsic mode function
  • LSSVM: least squares support vector machine
  • MAPE: mean absolute percentage error
  • MAE: mean absolute error
  • MSE: mean square error
  • Res: residual
  • RMSE: root mean square error
  • RF: random forest
  • SVM: support vector machine
  • WDN: water distribution network

Urban water demand forecasting has a great effect on improving the stability of water supply. As the number of water customers grows, the pressure on the water supply pipeline network increases, which, in turn, raises the risk of pipeline bursts and leaks and reduces the sustainability of peak water supply. Short-term water demand forecasting helps quantify the probability of pipeline bursts during the water supply process and identify possible leaks in the pipeline network (Hutton & Kapelan 2015; Brentan et al. 2017; Du et al. 2021). In addition, the prediction of urban water demand plays an important role in reducing pump station electricity consumption. Urban waterworks consume a large amount of electricity in producing high-quality drinking water for users. In the United States, about 75 billion kWh of electricity is used for water purification and transmission each year, accounting for 4% of total electricity consumption, at a cost of about 4 billion US dollars (Goldstein & Smith 2002). In China, electricity for water supply accounts for 30–50% of total water production costs (Shu et al. 2010). The operating costs of water supply pump stations in the water distribution network constitute the largest expenditure of water supply enterprises worldwide (Zyl et al. 2004). The running cost of a pump mainly comprises maintenance and electricity costs, and electricity usually accounts for most of the total as the price of electricity has been rising globally (Mala-Jetmarova et al. 2017). Researchers proposed and tested an adaptive weighted sum genetic algorithm, which demonstrated the ability to achieve optimal pump scheduling that reduces the energy consumption and daily maintenance costs of the water supply system (Abiodun & Ismail 2013). The problem of optimal pump scheduling has also been studied in other previous research (de la Perrière et al. 2014; Ghaddar et al. 2015; Carpitella et al. 2019; Luna et al. 2019). In the Netherlands, relevant experiments show that the overall energy costs of water distribution networks (WDNs) based on predictive flow control are around 1.7–7.4% lower than those of traditional level-based flow control systems (Bakker et al. 2013a). Reducing energy consumption in production is crucial, especially in the context of China's current ‘peak carbon’ and ‘carbon neutral’ targets; optimizing pumping stations in water distribution systems can reduce unnecessary energy consumption and thus carbon emissions from power plants. Short-term water demand forecasting therefore makes a fundamental contribution to the real-time intelligent control and scheduling of pumping stations.

There are a number of problems with building urban water demand forecasting models, as water demand fluctuates over time and future water demand at certain times is affected by previous water demand, demographic, holiday and weather conditions (Donkor et al. 2014; Romano & Kapelan 2014). When modeling, if there are many factors to consider, it will inevitably result in more model input variables and increase model complexity and calculation time. At the same time, for water utilities, some factors are more difficult to obtain reliably than historical water demand data, causing inconvenience in practical application. It has been shown that reliable predictions can be achieved using historical hourly water demand data as the only input when predicting short-term water demand on an hourly scale (Cutore et al. 2008; Bakker et al. 2013b). As a result, short-term water demand forecasting is increasingly being investigated in order to better manage urban water supply systems.

At present, the methods used to predict urban water demand can be roughly grouped into two categories: traditional statistical methods and artificial intelligence methods (Mikut & Reischl 2011). Traditional statistical methods include exponential smoothing (ES), autoregressive moving average (ARMA), autoregressive integrated moving average (ARIMA) and so on (Odan & Reis 2012). However, traditional statistical models assume stationarity and linearity and have low complexity. They are therefore limited when dealing with nonstationary and nonlinear time-series data, resulting in low prediction accuracy (Shukur & Lee 2015; Zhang et al. 2016, 2017; Ma et al. 2017; Qin et al. 2017). Studies over the past two decades have provided important information on artificial intelligence methods, including random forest (RF), artificial neural network (ANN), support vector machine (SVM) and so on. In view of the nonlinear characteristics of water supply data, some scholars developed an RF regression model to predict daily water demand in Southwest China (Chen et al. 2017). Previous research comparing traditional statistical models with ANN models has found the ANN models superior in processing complex and nonlinear time-series data (Bougadis et al. 2005; Piasecki et al. 2018; Yin et al. 2018). In another study, ANN, deep neural network (DNN), RF and least squares support vector machine (LSSVM) models were used to predict water demand at different time intervals (1, 12 and 24 h), and their performance was compared using R-squared, root mean square error (RMSE), mean square error (MSE) and mean absolute error (MAE); the results show that the ANN model performs best at all of these time intervals (Vijai & Sivakumar 2018). In addition to ANN, some researchers have used SVM regression models based on optimization algorithms to predict short-term urban water demand (Candelieri et al. 2019; Wu et al. 2020b). In some studies, SVM was selected as one of the tools to forecast hourly urban water demand, and the results show that SVM is an appropriate modeling algorithm (Herrera et al. 2010).

In fact, modeling with a single algorithm, whether a traditional statistical method or an artificial intelligence method, has several disadvantages. Traditional statistical models have a fixed functional form and relatively strict assumptions about the sampled data. Although artificial intelligence models can analyze complex and nonlinear data more effectively, they also have defects, such as overfitting and convergence to local optima. When a single model is used for prediction, the original water consumption data are used directly as input without preprocessing. However, water consumption data are generally characterized by nonstationarity, nonlinearity and complexity, which makes the predictions of a single model inaccurate and unstable. An orderly combination of multiple algorithms can exploit their respective advantages to compensate for each other's deficiencies, thereby improving model performance. For complex time series, the ensemble empirical mode decomposition (EEMD) algorithm is an effective processing algorithm; it separates the different frequency components of the data into relatively stable subsequences, which are more conducive to modeling. Using EEMD to preprocess the data reduces the instability of the original data and increases the possibility of achieving high-accuracy predictions. In addition, the intrinsic mode functions (IMFs) generated by the EEMD decomposition help eliminate stochastic volatility, so the prediction effect can be improved. Previous studies have demonstrated that hybrid models based on empirical mode decomposition (EMD) outperform single models in many cases (Lin et al. 2012; Wei & Chen 2012; Wang et al. 2016).

In this study, the EEMD algorithm was chosen as the preprocessing algorithm for short-term water demand data because of its effectiveness in decomposing complex time series. The ARIMA algorithm, a classical traditional statistical method, is widely used in urban water demand forecasting because it has a low computational cost and does not require other influencing factors as inputs (Brentan et al. 2017; Oliveira et al. 2017). Compared with AR, MA and ARMA, ARIMA adds a differencing step, so it can be used for both stationary and nonstationary series and is applicable to a wider range of problems. Therefore, among the statistical models, we chose the ARIMA model for prediction. A back-propagation (BP) neural network has strong learning ability and high efficiency in data processing, which suits the needs of urban water demand forecasting (Qi & Chang 2011). In order to compare traditional statistical methods with artificial intelligence methods, we chose a BP neural network, which is trained to minimize the error between the network output and the ideal value according to an empirical risk minimization criterion (Ai et al. 2022). However, the disadvantage of BP is that it may fall into a local optimum (Rangel et al. 2016). SVM can overcome some of the shortcomings of BP, mainly because its cost function is built on the principle of structural risk minimization rather than the empirical risk minimization principle used by BP (Tripathi et al. 2006; Bazrkar & Chu 2022). Therefore, in this paper, both BP and SVM were chosen so that the prediction performance of these two artificial intelligence algorithms could be compared.

The framework of this paper is organized as follows: the section Methodology describes the details of the proposed algorithms. In the section Results of Demonstration Study, prediction results of different models and the impact of length of historical data on the prediction accuracy are presented. Discussions are stated in the section Discussion. Lastly, the main conclusions are summarized in the section Conclusions.

Urban hourly water consumption data are generally characterized by nonlinearity, nonstationarity, randomness and complexity. Here, a hybrid prediction methodology based on the EEMD algorithm was developed to improve the prediction accuracy. It consists of three parts: (1) the multi-scale decomposition tool, the EEMD algorithm, decomposes the water consumption data series into several IMFs and a residual (Res); (2) the ARIMA, BP and SVM models are then used to predict each component; (3) the prediction results for all components are added together to obtain the final prediction results. The basic framework is shown in Figure 1. In particular, to demonstrate the reliability and efficacy of the EEMD algorithm, three single models (ARIMA, BP and SVM) were first developed to provide benchmark prediction results, and then three hybrid models (EEMD-ARIMA, EEMD-BP and EEMD-SVM) were developed for further improvement. The parameter values of the algorithms involved in this study are summarized in Table 1.
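
The three-part procedure (decompose, predict each component, sum) can be outlined in a few lines of Python. This is only an illustrative sketch and not the implementation used in the study, which relied on MATLAB tools; `hybrid_forecast`, `decompose` and `forecast` are hypothetical names, and the toy stand-ins below are neither EEMD nor ARIMA/BP/SVM.

```python
def hybrid_forecast(series, decompose, forecast, horizon):
    """Decompose a series, forecast every component, and sum the forecasts."""
    components = decompose(series)              # e.g. IMF1..IMF8 and Res
    total = [0.0] * horizon
    for component in components:
        prediction = forecast(component, horizon)   # one model per component
        total = [acc + p for acc, p in zip(total, prediction)]
    return total


# Toy stand-ins (assumptions for illustration only): split the series into
# two equal halves and repeat each component's last value over the horizon.
def toy_decompose(series):
    return [[0.5 * v for v in series], [0.5 * v for v in series]]


def toy_forecast(component, horizon):
    return [component[-1]] * horizon


print(hybrid_forecast([2.0, 4.0, 6.0], toy_decompose, toy_forecast, horizon=2))
# prints [6.0, 6.0]
```

Passing the decomposition and the component model as arguments keeps the outline independent of any particular algorithm, so the same skeleton covers all three hybrid models.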

Table 1

Parameter values for the involved algorithms

Algorithms | Parameters | Single models | Hybrid models: IMF1 | IMF2 | IMF3 | IMF4 | IMF5 | IMF6 | IMF7 | IMF8 | Res | References
EEMD std 0.2 Not applicable Zhang et al. (2008, 2010); Ren et al. (2017)  
NE 100 
ARIMA p 16 10 10 – 
d – 
q 14 10 – 
BP Number of neurons in input layer Guo et al. (2018)  
Number of neurons in hidden layer 11 10 13 10 – 
Training function Trainlm – 
Transfer function ‘tansig’ for hidden layers, ‘purelin’ for output layers Guo et al. (2018)  
Epoch 10,000 – 
Performance 10−5 Chen et al. (2017)  
Training speed 0.01 Herrera et al. (2010)  
Momentum parameter 0.9 – 
SVM -s – 
-t – 
-p 0.01 – 

Note: -s is the type of SVM, and setting it to 3 means ε-SVR; -t is the type of kernel function, and setting it to 2 means radial basis function; -p is the value of ε in the loss function of ε-SVR.

Figure 1

Flowchart of this study.


Empirical mode decomposition (EMD)

EMD is a multi-scale analysis method for nonlinear, nonstationary and complex time-series data (Huang et al. 1998). It decomposes the original sequence based on the time-scale characteristics of the data itself and can extract the local information of the feature set of the original data from the time series (Tiwari & Kanungo 2010). Compared with the traditional time-series decomposition method, it has better adaptive characteristics and can effectively decompose the fluctuation of different frequencies in the data step by step to obtain multiple IMFs and residual (Res).

The EMD process for a given time sequence $x(t)$ is as follows:

  • (1)

    Let $r(t) = x(t)$, $h(t) = x(t)$;

  • (2)

    Determine the local extremum points of sequence $h(t)$;

  • (3)

    The upper envelope $u(t)$ and the lower envelope $l(t)$ are obtained by interpolating the local maxima and the local minima, respectively;

  • (4)
    Calculate the average value of the upper and lower envelopes:
    $$m(t) = \frac{u(t) + l(t)}{2}$$
  • (5)

    Let $d(t) = h(t) - m(t)$. If $d(t)$ meets the conditions: (1) the number of extreme points (including local maximum points and local minimum points) equals the number of zero crossings or differs from it by at most 1; (2) at any point, the average of the local maximum envelope and the local minimum envelope is 0, then $d(t)$ is considered to be an IMF; let $r(t) \leftarrow r(t) - d(t)$, $h(t) \leftarrow r(t)$. If $d(t)$ is not an IMF, let $h(t) \leftarrow d(t)$.

  • (6)

    Repeat steps (2)–(5) until the residual $r(t)$ satisfies the stop condition.
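
One pass through steps (2)–(5) can be sketched in Python as below. This is a simplified illustration under stated assumptions: the envelopes use linear interpolation between extrema and are held constant beyond the first and last extremum, whereas EMD implementations commonly use cubic splines; `local_extrema`, `linear_envelope` and `sift_once` are hypothetical names, and the IMF check and stop condition are left out.

```python
from bisect import bisect_right


def local_extrema(h):
    """Indices of strict interior local maxima and minima of the sequence h."""
    maxima = [k for k in range(1, len(h) - 1) if h[k - 1] < h[k] > h[k + 1]]
    minima = [k for k in range(1, len(h) - 1) if h[k - 1] > h[k] < h[k + 1]]
    return maxima, minima


def linear_envelope(knots, h, n):
    """Piecewise-linear envelope through (k, h[k]) for k in knots (an assumption;
    cubic splines are the more usual choice)."""
    envelope = []
    for t in range(n):
        if t <= knots[0]:
            envelope.append(h[knots[0]])
        elif t >= knots[-1]:
            envelope.append(h[knots[-1]])
        else:
            j = bisect_right(knots, t)          # knots[j-1] <= t < knots[j]
            a, b = knots[j - 1], knots[j]
            w = (t - a) / (b - a)
            envelope.append((1 - w) * h[a] + w * h[b])
    return envelope


def sift_once(h):
    """Subtract the mean of the upper and lower envelopes from h.
    Returns None when there are too few extrema to build envelopes."""
    maxima, minima = local_extrema(h)
    if len(maxima) < 2 or len(minima) < 2:
        return None
    upper = linear_envelope(maxima, h, len(h))
    lower = linear_envelope(minima, h, len(h))
    return [x - (u + l) / 2 for x, u, l in zip(h, upper, lower)]


# An oscillation around a constant level of 5: sifting removes the level.
zigzag = [5 + (-1) ** k for k in range(10)]
print(sift_once(zigzag))
```

For the zigzag input the envelopes are the constants 6 and 4, so the envelope mean is 5 and the sifted sequence alternates between 1 and −1; a monotonic input returns `None`, which corresponds to a residual that can no longer be decomposed.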

Ensemble empirical mode decomposition (EEMD)

Although EMD has been widely successful in applied research, it has the defect of mode mixing (Wang et al. 2012), defined as either a single IMF including a scale of other modes, or a scale existing in different IMFs. Aimed at the problem of mode mixing in EMD, EEMD was proposed.

EEMD adds white noise to the original data before applying EMD. Normally, the added white noise is assumed to obey $N(0, \sigma^2)$, and the data with white noise added are decomposed by EMD. Because different white noise is added to perturb the original data in each trial before the ensemble average is taken, mode mixing is avoided (Wu & Huang 2009).

The detailed process of EEMD is as follows:

  • (1)
    Add random white noise that obeys the normal distribution $N(0, \sigma^2)$ to the original sequence, where $\sigma$ is known, $j = 1, 2, \ldots, N$,
    $$x_j(t) = x(t) + n_j(t)$$
    (1)
    where $x_j(t)$ is the signal after adding white noise for the jth time, $x(t)$ is the original signal and $n_j(t)$ is the white noise;
  • (2)

    Use EMD to decompose the time-series data after adding white noise;

  • (3)
    Repeat steps (1)–(2) with new white noise each time, integrate the decomposition results and take the average as the final result:
    $$c_i(t) = \frac{1}{N}\sum_{j=1}^{N} c_{ij}(t)$$
    (2)
    where N is the number of white noise series added, $c_i(t)$ is the ith IMF of the final result and $c_{ij}(t)$ is the ith IMF obtained after the jth white noise is added.
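
The ensemble averaging in steps (1)–(3) can be sketched as follows. The sketch makes several assumptions that are not stated in the text: the noise standard deviation is taken as 0.2 times the standard deviation of the data (a convention used by some EEMD toolkits), the decomposer `emd` is supplied by the caller, and every trial is assumed to return the same number of components. `eemd` is a hypothetical name, and the identity decomposer in the demo is a stand-in, not EMD.

```python
import math
import random
import statistics


def eemd(signal, emd, n_ensemble=100, noise_ratio=0.2, seed=0):
    """Average the components returned by `emd` over noise-perturbed copies."""
    rng = random.Random(seed)
    sigma = noise_ratio * statistics.pstdev(signal)   # assumed noise scaling
    sums = None
    for _ in range(n_ensemble):
        noisy = [x + rng.gauss(0.0, sigma) for x in signal]   # add white noise
        components = emd(noisy)                               # decompose the noisy copy
        if sums is None:
            sums = [[0.0] * len(signal) for _ in components]
        for acc, comp in zip(sums, components):
            for t, value in enumerate(comp):
                acc[t] += value
    return [[value / n_ensemble for value in acc] for acc in sums]


# Demo with an identity "decomposer" (one component equal to its input):
# the added noise largely cancels in the ensemble mean.
signal = [math.sin(0.3 * t) for t in range(50)]
averaged = eemd(signal, lambda s: [list(s)])
print(max(abs(a - b) for a, b in zip(averaged[0], signal)))
```

The demo only shows that the white noise averages out as the number of trials grows, which is the reason the ensemble mean can be treated as the final result.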

Autoregressive integrated moving average

ARIMA was first proposed by the statisticians Box and Jenkins in the 1970s (Box & Jenkins 1973). Building on ARMA, the original nonstationary sequence is differenced until it becomes a stationary sequence, which is then fitted by ARMA. AR is the autoregressive part and p is the order of autoregression; MA is the moving average part and q is the order of moving average; I is the differencing (integrated) part, and d is the order of differencing required to turn the original nonstationary series into a stationary series. ARIMA thereby addresses the poor fit of the ARMA model to nonstationary time-series data. In the ARIMA model, it is assumed that the future value of the target variable is a linear function of historical data and errors. In daily water use, there is often a certain correlation between hourly water demands, which is consistent with the assumptions of the model, so the model is considered for water demand forecasting in this paper, i.e., it is assumed that the future water demand in the case study is linearly correlated with the past water demand. The model expression is as follows:
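$$x_t = \sum_{i=1}^{p}\varphi_i x_{t-i} + \varepsilon_t - \sum_{j=1}^{q}\theta_j \varepsilon_{t-j}$$
(3)

where $x_t$ is the sample value (after differencing when $d > 0$), $p$ and $q$ are the orders of the model, $\varepsilon_t$ is the current random error interference, and $\varphi_i$ and $\theta_j$ are the parameters.

The structure of ARIMA is as follows:
$$\Phi(B)\nabla^d x_t = \Theta(B)\varepsilon_t$$
(4)
where $\nabla^d = (1 - B)^d$, B is the delay operator; $\Phi(B) = 1 - \varphi_1 B - \cdots - \varphi_p B^p$ is a polynomial of all autoregression coefficients in the ARIMA model; $\Theta(B) = 1 - \theta_1 B - \cdots - \theta_q B^q$ is a polynomial of all moving average coefficients in the ARIMA model; $E(\varepsilon_t) = 0$, $\mathrm{Var}(\varepsilon_t) = \sigma_\varepsilon^2$; $\varepsilon_t$ represents white noise subject to $N(0, \sigma_\varepsilon^2)$; $E(\varepsilon_t \varepsilon_s) = 0$ for $s \neq t$ denotes that the errors at different positions in the sequence are uncorrelated.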
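
The two ingredients of ARIMA, differencing and a linear combination of past values and past errors, can be illustrated with a short Python sketch. It is not a fitting procedure: the coefficients are supplied by hand, the moving average terms are assumed to enter with a minus sign, and `difference` and `arma_one_step` are hypothetical names.

```python
def difference(x, d=1):
    """Apply the difference operator (1 - B) to the sequence d times."""
    for _ in range(d):
        x = [b - a for a, b in zip(x, x[1:])]
    return x


def arma_one_step(x, eps, phi, theta):
    """One-step ARMA(p, q) prediction from past values x and past errors eps.
    Both lists are ordered oldest first; phi[0] and theta[0] apply to lag 1."""
    ar_part = sum(phi[i] * x[-1 - i] for i in range(len(phi)))
    ma_part = sum(theta[j] * eps[-1 - j] for j in range(len(theta)))
    return ar_part - ma_part


# A quadratic trend becomes constant after differencing twice (d = 2).
print(difference([1, 4, 9, 16, 25], d=2))    # prints [2, 2, 2]

# ARMA(2, 1) with hand-picked coefficients: 0.5*3 + 0.25*2 - 0.4*0.5
print(arma_one_step([1, 2, 3], [0, 0, 0.5], phi=[0.5, 0.25], theta=[0.4]))
```

In practice the orders (p, d, q) and the coefficients are estimated from the data; the sketch only shows how a fitted model turns history into a prediction.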

BP neural network

A BP neural network is a feedforward neural network composed of nonlinear transformation units and trained with the error back-propagation algorithm (a gradient descent, supervised learning algorithm). The basic idea is to use gradient search to minimize the error function. The basic process mainly includes (1) forward propagation of information and (2) backward propagation of error (Wu et al. 2020a). The basic structure consists of one input layer, one output layer and one or more hidden layers (Yaqin et al. 2020). It has powerful computational capability and can learn the input–output relationships of samples through training. It is one of the most mature and widely used data mining technologies so far, and it has broad prospects in the field of classification and prediction.

The structure of the three-layer BP neural network is shown in Figure 2. The input signal is passed forward layer by layer, from the neurons of the input layer through the hidden layer to the output layer, and the state of the neurons in each layer affects the neurons in the next layer. If the result of the output layer differs greatly from the expected result, the error is calculated and propagated backwards along the original path. The connection weights between the layers are then adjusted so that the error decreases along the gradient direction, and forward propagation is carried out again with the updated weights. The training is repeated until the output error falls within the allowable range. The process is shown in Figure 3.
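
The forward and backward passes described above can be sketched for a three-layer network with a tanh hidden layer and a linear output, which mirrors the ‘tansig’ and ‘purelin’ transfer functions. This is a minimal illustration under stated assumptions: it uses plain online gradient descent rather than the Levenberg–Marquardt training function (trainlm) listed in Table 1, the data are a toy linear target, and `train_bp` is a hypothetical name.

```python
import math
import random


def train_bp(samples, n_hidden=3, lr=0.05, epochs=200, seed=0):
    """Train a one-hidden-layer network by online gradient descent.
    Returns the weights and the mean squared error recorded per epoch."""
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0
    history = []
    for _ in range(epochs):
        squared_error = 0.0
        for x, y in samples:
            # Forward propagation of information.
            hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                      for row, b in zip(w1, b1)]
            output = sum(w * h for w, h in zip(w2, hidden)) + b2
            error = output - y
            squared_error += error * error
            # Backward propagation of error (gradient of 0.5 * error**2).
            for j in range(n_hidden):
                delta = error * w2[j] * (1.0 - hidden[j] ** 2)
                w2[j] -= lr * error * hidden[j]
                b1[j] -= lr * delta
                for i in range(n_in):
                    w1[j][i] -= lr * delta * x[i]
            b2 -= lr * error
        history.append(squared_error / len(samples))
    return (w1, b1, w2, b2), history


# Toy data: a linear target on a small grid, just to watch the error fall.
samples = [([a / 2, b / 2], 0.25 * a - 0.15 * b)
           for a in range(-2, 3) for b in range(-2, 3)]
params, history = train_bp(samples)
print(history[0], history[-1])
```

The recorded error history plays the role of the stopping check in the training process: training would normally end once the error falls within the allowable range.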

Figure 2

Structure of the BP neural network.

Figure 3

Training process of the BP neural network.


Support vector machine (SVM)

SVM is a machine learning technique proposed by Cortes & Vapnik (1995). It is based on the VC dimension theory of statistical learning and the structural risk minimization principle. According to the characteristics of the given data, it seeks the best trade-off between model complexity and learning ability, so as to obtain the best generalization ability. Compared with the traditional linear model, SVM does not complicate the calculation process. In addition, SVM can obtain the global optimum and is suitable for cases with few samples. SVM is divided into two categories: one is SVC (support vector classifier) for solving classification problems and the other is SVR (support vector regression) for solving regression problems (Malik et al. 2020). A brief introduction to SVR is as follows:

Consider training samples $\{(x_i, y_i)\}$, $i = 1, 2, \ldots, n$, where $x_i$ represents the input column vector corresponding to the ith training sample, and $y_i$ is the output value corresponding to $x_i$. The linear regression function is established as:
$$f(x) = w^{T}\varphi(x) + b$$
(5)
where w is the weight vector in the feature space, b is the bias and $\varphi(x)$ is a nonlinear mapping function.
To optimize formula (5), the norm of w must be minimized, which leads to the following convex quadratic programming problem:
$$\min_{w, b, \xi, \xi^*} \; \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{n}(\xi_i + \xi_i^*)$$
(6)
The constraint conditions are as follows:
$$\begin{cases} y_i - w^{T}\varphi(x_i) - b \le \varepsilon + \xi_i \\ w^{T}\varphi(x_i) + b - y_i \le \varepsilon + \xi_i^* \\ \xi_i, \xi_i^* \ge 0, \quad i = 1, 2, \ldots, n \end{cases}$$
(7)
where C is a penalty parameter that balances the generalization ability and the complexity of the model; $\varepsilon$ represents the error requirement of the regression function, which ensures the sparsity of the solution; and $\xi_i$ and $\xi_i^*$ are two slack variables that bound the deviation of the output value from above and below. The above optimization model is solved by using the Lagrange function and introducing a kernel function.
The inclusion of kernel functions addresses the nonlinearity of the input data. Since the historical water demand data implicitly include other information, such as holidays and weather, the nonlinear characteristics of urban water demand can be handled by mapping the data into a higher dimensional space through the kernel function. The nonlinear SVM function model is obtained as follows:
$$f(x) = \sum_{i=1}^{n}(\alpha_i - \alpha_i^*)K(x_i, x) + b$$
(8)
where $K(x_i, x)$ represents the kernel function, and $\alpha_i$ and $\alpha_i^*$ are Lagrange multipliers.
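
Two pieces of the SVR description can be made concrete in Python: the ε-insensitive loss that the error requirement implies, and a kernel expansion evaluated with a radial basis function. The sketch does not solve the quadratic program; the support vectors, coefficients and bias are hand-picked stand-ins for the output of a solver such as LIBSVM, and all function names are hypothetical.

```python
import math


def eps_insensitive_loss(y, f, eps=0.01):
    """Zero inside the eps-tube around the target, linear outside it."""
    return max(0.0, abs(y - f) - eps)


def rbf_kernel(u, v, gamma=1.0):
    """Radial basis function kernel exp(-gamma * ||u - v||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def svr_predict(x, support_vectors, coeffs, b, gamma=1.0):
    """Kernel expansion: sum of coefficient * K(support vector, x), plus bias.
    Each coefficient stands for a difference of two Lagrange multipliers."""
    return sum(c * rbf_kernel(sv, x, gamma)
               for sv, c in zip(support_vectors, coeffs)) + b


print(eps_insensitive_loss(1.0, 1.005))                  # inside the tube: 0.0
print(svr_predict([0.0], [[0.0]], [2.0], b=0.5))         # 2.0 * K(x, x) + 0.5 = 2.5
```

Errors smaller than ε cost nothing, which is why only some training samples end up with nonzero coefficients and the solution is sparse.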

In this study, the LIBSVM toolbox was used to implement the SVM algorithm (Chang & Lin 2011).

Evaluation indicators of model performance

In order to compare the predictive performance of different models, the RMSE, the mean absolute percentage error (MAPE) and the correlation coefficient (R) were chosen as evaluation indicators. The details of these indicators are shown below.

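The RMSE reflects the degree of dispersion of the model prediction error. The calculation formula is as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(\hat{y}_i - y_i)^2}$$
(9)
where $\hat{y}_i$ is the predicted value, $y_i$ is the actual value and n is the number of samples.
The MAPE reflects the prediction accuracy of the model. The calculation formula is as follows:
$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{\hat{y}_i - y_i}{y_i}\right|$$
(10)
where $\hat{y}_i$ is the predicted value, and $y_i$ is the actual value.
The R is a statistical indicator used to reflect the closeness of the correlation between variables. The calculation formula is as follows:
$$R = \frac{\sum_{i=1}^{n}(y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})}{\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2 \sum_{i=1}^{n}(\hat{y}_i - \bar{\hat{y}})^2}}$$
(11)
where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value, and $\bar{y}$ and $\bar{\hat{y}}$ are the means of $y_i$ and $\hat{y}_i$, respectively.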
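
The three indicators can be computed with a few lines of Python. This is a plain sketch using the usual definitions of RMSE, MAPE and the Pearson correlation coefficient; the function names and the three-point example are illustrative and are not data from the study.

```python
import math


def rmse(actual, predicted):
    """Root mean square error."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / n)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    n = len(actual)
    return 100.0 / n * sum(abs((p - a) / a) for a, p in zip(actual, predicted))


def corr(actual, predicted):
    """Pearson correlation coefficient between actual and predicted values."""
    mean_a = sum(actual) / len(actual)
    mean_p = sum(predicted) / len(predicted)
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 300.0]
print(rmse(actual, predicted), mape(actual, predicted), corr(actual, predicted))
```

For this example the RMSE is about 8.165, the MAPE is 5% and R is about 0.9959; multiplying R by 100 gives the percentage form in which it is reported in the result tables.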

Study data

An hourly water supply flow series was used for the case study. It was measured from July 1, 2013 to July 31, 2013 for a water supply division of Shanghai and contains a total of 744 data points (31 days × 24 h), as shown in Figure 4. The July data were chosen because the forecast is on an hourly scale and has little correlation with seasonal factors; in addition, July is a summer month and covers part of the maximum water demand, which is generally high and highly volatile. Because the work targets water management and the short-term scheduling optimization of pumping stations, the objective of the model is to predict the future hourly water demand in real time, with the historical data updated hourly; furthermore, the hourly water demand appears to follow a daily cycle in the short term. Therefore, the last 24 hourly data points (July 31, 2013) are used as the actual values for verifying the prediction models. To ensure that the models were fully trained, all the remaining historical data were used as the training set, i.e., the first 720 data points (July 1, 2013–July 30, 2013) serve as the known historical data for training.

Figure 4

Actual water supply flow series.


Outlier analysis

We used MATLAB to perform an outlier test on the original sequence with the ‘median’ method, which returns ‘true’ for elements more than three scaled median absolute deviations (MAD) from the median. The scaled MAD is defined as c*median(abs(A − median(A))), where c = −1/(sqrt(2)*erfcinv(3/2)) ≈ 1.4826. Detected outliers are highlighted in Figure 5. Most of the outliers are water demand values in the early morning hours, together with a small number of values in the peak hours. These outliers contribute to the volatility of the whole series but are consistent with the daily water-use pattern, so we chose to keep them.
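
The scaled-MAD rule described above can be reproduced with the Python standard library. This is a sketch of the stated rule and not the MATLAB code used in the study; `scaled_mad_outliers` and `SCALE` are hypothetical names. The constant is written as 1/Φ⁻¹(0.75), which equals −1/(sqrt(2)*erfcinv(3/2)).

```python
import statistics

# 1 / (0.75 quantile of the standard normal) = -1 / (sqrt(2) * erfcinv(3/2))
SCALE = 1.0 / statistics.NormalDist().inv_cdf(0.75)


def scaled_mad_outliers(values, threshold=3.0):
    """Flag values more than `threshold` scaled MADs away from the median."""
    median = statistics.median(values)
    scaled_mad = SCALE * statistics.median(abs(v - median) for v in values)
    return [abs(v - median) > threshold * scaled_mad for v in values]


flows = [10, 11, 9, 10, 12, 10, 50]
print(round(SCALE, 4))                 # prints 1.4826
print(scaled_mad_outliers(flows))      # only the last value is flagged
```

In the example the median is 10 and the scaled MAD is about 1.48, so only the value 50 lies more than three scaled MADs from the median.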

Figure 5

Outliers in original data.


Nonlinear analysis

The linear and nonlinear characteristics of the data can be tested with the BDS test proposed by Broock et al. (1996), applied to the residuals of a linear regression. Before performing the BDS test, the linear correlation component of the original series is removed by fitting a linear autoregressive (AR) model to the series and taking the fitted residuals. The BDS test was then carried out on these residuals with the aid of Eviews software, where the embedding dimension follows Brock's suggestion and r is 0.7 times the variance of the data in the phase space (Broock et al. 1996). As shown in Table 2, the test probabilities (Prob.) are all less than 0.05, which indicates that the data have nonlinear characteristics.

Table 2

Results of the nonlinear test

Dimensions | BDS statistic | Std. error | z-statistic | Prob.
0.085532 0.004984 17.161730 0.00000 
0.137316 0.005686 24.149430 0.00000 
0.174202 0.004877 35.716930 0.00000 

Data normalization

Before using the BP and SVM algorithms, it is important to normalize the data. Normalization maps the data into a common range, which reduces the magnitude of the values and facilitates the calculation. In this paper, we use the mapminmax function in MATLAB to normalize the data with the following formula:
$$y = \frac{(y_{\max} - y_{\min})(x - x_{\min})}{x_{\max} - x_{\min}} + y_{\min}$$
(12)
where $y_{\min}$, $y_{\max}$ are the minimum and maximum values for each row of the output matrix y, respectively, specified as scalars; x is the matrix to be processed, specified as an N-by-Q matrix; and $x_{\min}$, $x_{\max}$ are the minimum and maximum values for each row of the input matrix x, respectively.
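
The min–max mapping can be sketched in Python for a single row of data. The default target range [−1, 1] follows the default of mapminmax; the function names are hypothetical, and raising an error for a constant row is a choice made for this sketch rather than a description of MATLAB's behaviour.

```python
def map_min_max(x, y_min=-1.0, y_max=1.0):
    """Linearly map the values of x onto [y_min, y_max].
    Returns the mapped values and the (x_min, x_max) needed to undo the mapping."""
    x_min, x_max = min(x), max(x)
    if x_max == x_min:
        raise ValueError("constant row: scaling is undefined in this sketch")
    scale = (y_max - y_min) / (x_max - x_min)
    return [(v - x_min) * scale + y_min for v in x], (x_min, x_max)


def map_min_max_reverse(y, x_range, y_min=-1.0, y_max=1.0):
    """Map normalized values back to the original scale."""
    x_min, x_max = x_range
    return [(v - y_min) * (x_max - x_min) / (y_max - y_min) + x_min for v in y]


scaled, x_range = map_min_max([0.0, 5.0, 10.0])
print(scaled)                                   # prints [-1.0, 0.0, 1.0]
print(map_min_max_reverse(scaled, x_range))     # prints [0.0, 5.0, 10.0]
```

Keeping the training minimum and maximum matters in practice, because predictions made in the normalized range must be mapped back with the same values before errors are computed in m³/h.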

Prediction results of the single models

First, three commonly used single models (ARIMA, BP and SVM) were used for prediction. Figure 6 and Table 3 show the predicted results and performance of the three single models. The MAPEs of the three single models were all lower than 10%, which indicates that their prediction performance was suitable for hourly water consumption prediction scenarios. Among them, the MAPEs of the BP and SVM algorithms were lower than 5% and better than that of ARIMA, probably because of their nonlinear modeling ability.

Table 3

Evaluation indicators of single models

Models | RMSE (m³/h) | MAPE (%) | R (%)
ARIMA | 3,626.1686 | 5.7891 | 88.13
BP | 3,183.7223 | 4.3857 | 91.72
SVM | 3,199.7851 | 3.8470 | 90.18
Figure 6

Prediction results of three single models.


Prediction results of three hybrid models

In order to further improve the prediction performance, the EEMD algorithm was introduced to develop hybrid prediction models. Based on the above three single models and the EEMD algorithm, three hybrid models (EEMD-ARIMA, EEMD-BP and EEMD-SVM) were developed and compared in the following sections.

Decomposition results of original data based on EEMD

The MATLAB EEMD toolkit was used to decompose the original hourly water supply flow series. The standard deviation of the added white noise was set to 0.2, and the ensemble number NE (the number of noise-added decompositions) was set to 100 (Zhang et al. 2008, 2010; Ren et al. 2017). The decomposition results are shown in Figure 7. The original water supply flow series was decomposed into eight independent IMF components (IMF1, IMF2, …, IMF8) and a Res component, arranged in order of frequency from high to low. It can be seen from Figure 7 that the period of the eight IMFs lengthens gradually as the frequency decreases, while their amplitude decreases in turn.

Figure 7

Decomposition results of the original series by the EEMD algorithm.


Prediction results of EEMD-ARIMA

The ARIMA algorithm was used to predict each of the eight IMFs and the Res component obtained by EEMD decomposition. For each component, the first 720 data points were used as input to the ARIMA model to predict the next 24 h. The model prediction results were obtained by summing the predictions of all components (eight IMFs and Res). The ARIMA model parameters (p, d, q) corresponding to the different components are shown in Table 4. The prediction results are shown in Figure 8. The final prediction results were compared with the actual values (24 data points), and the performance of the EEMD-ARIMA model was evaluated by RMSE, MAPE and R. According to the calculation results, the RMSE, MAPE and R of the EEMD-ARIMA predictions were 3,713.1656 m³/h, 5.2036% and 86.36%, respectively.

Table 4

EEMD-ARIMA parameter settings

Components (IMFs) | p | d | q
IMF1 
IMF2 
IMF3 
IMF4 10 
IMF5 10 
IMF6 
IMF7 
IMF8 
Res 10 

Note: p is the order of autoregression, q is the order of moving average and d is the order of difference.

Figure 8

Result of the EEMD-ARIMA model.


Prediction results of EEMD-BP

A three-layer BP neural network model was used for prediction. The output layer had one neuron, representing the water demand $Q_t$ to be predicted at time $t$. The input layer had five neurons, meaning that the water supply data at times $t-1, t-2, \ldots, t-5$ were used for prediction (Guo et al. 2018). The model takes the following form:
$$Q_t = f\left(Q_{t-1}, Q_{t-2}, Q_{t-3}, Q_{t-4}, Q_{t-5}\right) \tag{13}$$
where $Q_i$ is the water supply flow at hour $i$.
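The input and output layout of the network can be illustrated with a sliding window over the hourly series. The sketch below assumes that the five inputs are the five immediately preceding hourly values; the helper name and the toy numbers are hypothetical and do not come from the study.

```python
def make_lagged_samples(series, n_lags=5):
    """Build (inputs, targets): each input holds the n_lags values before its target."""
    inputs, targets = [], []
    for t in range(n_lags, len(series)):
        inputs.append(series[t - n_lags:t])   # values at t-5 ... t-1, oldest first
        targets.append(series[t])             # value at time t
    return inputs, targets

# Toy hourly flows (made-up values).
flow = [10, 12, 15, 14, 13, 16, 18]
X, y = make_lagged_samples(flow)
```

A series of N values yields N − 5 training samples, so even a single day of hourly data (24 values) provides 19 samples under this assumption.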
The number of hidden layer neurons was calculated according to the following empirical formula:
$$h = \sqrt{m + n} + a \tag{14}$$
where $h$ is the number of hidden layer neurons; $m$ is the number of input layer neurons; $n$ is the number of output layer neurons; and $a$ is a constant in the range [1,10]. In the case study (m = 5, n = 1), the resulting range of the number of hidden layer neurons was [3,13]. The number of hidden layer neurons for each component (the eight IMFs and Res) was optimized by enumeration, and the results are shown in Table 5. Other parameters of the BP algorithm are shown in Table 6. The predicted results are shown in Figure 9. The RMSE, MAPE and R of EEMD-BP were 1,036.7634 m³/h, 1.4460% and 98.91%, respectively, all better than those of the single BP model. This shows that EEMD decomposition can improve the prediction performance of the BP algorithm.
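The candidate range for the hidden layer can be checked with a few lines of Python. This sketch assumes that Equation (14) is the common empirical rule (square root of the sum of input and output neurons, plus a constant between 1 and 10), and that the lower bound is rounded down and the upper bound rounded up, which is one reading that reproduces the reported range [3,13]. The function name is illustrative.

```python
import math

def hidden_neuron_candidates(n_inputs, n_outputs, a_min=1, a_max=10):
    """Candidate hidden-layer sizes from sqrt(m + n) + a, with a in [a_min, a_max]."""
    base = math.sqrt(n_inputs + n_outputs)
    low = math.floor(base + a_min)    # assumed: lower bound rounded down
    high = math.ceil(base + a_max)    # assumed: upper bound rounded up
    return list(range(low, high + 1))

# Five input neurons and one output neuron, as in the case study.
candidates = hidden_neuron_candidates(5, 1)
```

Enumeration then means training one network per candidate size for each component and keeping the size with the lowest error.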
Table 5

Number of hidden layer neurons in different components

Components  Number of hidden layer neurons
IMF1 
IMF2 
IMF3 11 
IMF4 
IMF5 10 
IMF6 
IMF7 
IMF8 13 
Res 10 
Table 6

BP neural network algorithm parameter settings

| Parameters | Value settings |
| --- | --- |
| Training function | Trainlm |
| Transfer function | ‘tansig’ for hidden layers, ‘purelin’ for output layers |
| Epoch | 10,000 |
| Performance | 10^−5 |
| Training speed | 0.01 |
| Momentum parameter | 0.9 |
Figure 9

Result of the EEMD-BP model.


Prediction results of EEMD-SVM

Similar to the EEMD-BP model, the output of the EEMD-SVM model was the predicted water demand $Q_t$ at time $t$, while its inputs were the water supply data at times $t-1, t-2, \ldots, t-5$. The SVM model takes the following form:
$$Q_t = f_{\mathrm{SVM}}\left(Q_{t-1}, Q_{t-2}, Q_{t-3}, Q_{t-4}, Q_{t-5}\right) \tag{15}$$

Some of the parameter settings in the LIBSVM toolbox are shown in Table 7.

Table 7

Parameter settings of the LIBSVM toolbox

| Parameter | -s | -t | -p |
| --- | --- | --- | --- |
| Value settings | 3 | 2 | 0.01 |

Note: -s is the type of SVM (3 means ε-SVR); -t is the type of kernel function (2 means radial basis function); -p is the ε in the loss function of ε-SVR.
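LIBSVM training routines take their options as a single string, so the settings above can be written as one line. The sketch below only assembles that string; the helper name is hypothetical, and options not reported in this section (for example the cost and kernel width) are deliberately left out rather than guessed.

```python
# Settings described in Table 7 and its note.
svr_settings = {
    "-s": 3,     # SVM type: 3 selects epsilon-SVR
    "-t": 2,     # kernel type: 2 selects the radial basis function
    "-p": 0.01,  # epsilon in the epsilon-SVR loss function
}

def to_libsvm_options(settings):
    """Join flag/value pairs into a LIBSVM-style option string."""
    return " ".join(f"{flag} {value}" for flag, value in settings.items())

options = to_libsvm_options(svr_settings)
```

The resulting string would be passed as the options argument of the LIBSVM training call, once per decomposed component.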

The final prediction result is shown in Figure 10. The RMSE, MAPE and R of EEMD-SVM were 892.9561 m³/h, 1.3424% and 99.21%, respectively. First, all three were better than those of the single SVM model, which indicates that EEMD decomposition can markedly improve the prediction performance of the SVM algorithm. Second, the prediction result of EEMD-SVM was better than that of EEMD-BP.

Figure 10

Result of the EEMD-SVM model.


Impact of historical data length on model prediction accuracy

In practice, computation time increases if the historical training data used for the above models are too long. Conversely, if the training data are too short, model performance is reduced significantly. Therefore, the length of historical data is an important parameter to be optimized in an actual project. Taking MAPE and RMSE as evaluation indicators, the impact of historical data length on the prediction performance of the three hybrid models was investigated.

Impact of historical data length on EEMD-ARIMA model performance

To find the relationship between the length of the training data and prediction performance, the immediately preceding 1, 2, 3, …, 11, 12 days of historical data (training data) were each used to predict the hourly water consumption (24 data points) on July 31. The fitting results are shown in Figure 11, and the variations of MAPE and RMSE are shown in Figure 12. Figures 11 and 12 show that the MAPE and RMSE of the EEMD-ARIMA model generally decrease as the length of historical data increases up to 11 days, and rise again at 12 days. Balancing the need for short training data against prediction accuracy, the optimal historical data length was selected as 11 days.
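The experiment amounts to truncating the training series to its most recent days and re-evaluating the model each time. The following sketch shows that loop in Python; `evaluate` stands in for any routine that trains a model on the given window and returns an error score, and all names are illustrative rather than taken from the study.

```python
HOURS_PER_DAY = 24

def recent_window(series, days):
    """Return the last `days` days of an hourly series."""
    return series[-days * HOURS_PER_DAY:]

def sweep_history_lengths(series, evaluate, max_days=12):
    """Score each history length with a user-supplied evaluate(training_data) callable."""
    return {days: evaluate(recent_window(series, days))
            for days in range(1, max_days + 1)}

# Toy usage: 30 days of placeholder data and a scorer that only reports window size.
hourly_series = list(range(30 * HOURS_PER_DAY))
window_sizes = sweep_history_lengths(hourly_series, evaluate=len)
```

In a real run, `evaluate` would return the MAPE or RMSE of the 24 h forecast obtained from that window.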

Figure 11

Impacts of different length of training data on EEMD-ARIMA prediction performance.

Figure 12

Comparison of RMSE and MAPE for different data lengths for the EEMD-ARIMA model.


Impact of historical data length on EEMD-BP model performance

Similarly, for the EEMD-BP model, the immediately preceding 1, 2, 3, …, 11, 12 days of historical data (training data) were each used to predict the hourly water consumption (24 data points) on July 31. The fitting results are shown in Figure 13, and the variations of MAPE and RMSE are shown in Figure 14. Figures 13 and 14 show that the MAPE and RMSE of the EEMD-BP model generally decrease as the length of historical data increases, reaching their minimum at 11 days. Balancing the need for short training data against prediction accuracy, the optimal historical data length was selected as 11 days.

Figure 13

Impacts of different length of training data on EEMD-BP prediction performance.

Figure 14

Comparison of RMSE and MAPE for different data lengths for the EEMD-BP model.


Impact of historical data length on EEMD-SVM model performance

For the EEMD-SVM model, the immediately preceding 1, 2, 3, …, 11, 12 days of historical data (training data) were likewise each used to predict the hourly water consumption (24 data points) on July 31. The fitting results are shown in Figure 15, and the variations of MAPE and RMSE are shown in Figure 16. Figures 15 and 16 show that the MAPE and RMSE of the EEMD-SVM model generally decrease as the length of historical data increases, reaching their minimum at 10 days. Balancing the need for short training data against prediction accuracy, the optimal historical data length was selected as 10 days.

Figure 15

Impacts of different length of training data on EEMD-SVM prediction performance.

Figure 16

Comparison of RMSE and MAPE for different data lengths for the EEMD-SVM model.


Three single models (ARIMA, BP and SVM) and three hybrid models (EEMD-ARIMA, EEMD-BP and EEMD-SVM) were investigated and compared in this study, as shown in Table 8.

Table 8

Comparison of prediction performance of six models

| Forecasting models | RMSE (m³/h) | MAPE (%) | R (%) |
| --- | --- | --- | --- |
| ARIMA | 3,626.1686 | 5.7891 | 88.13 |
| BP | 3,183.7223 | 4.3857 | 91.72 |
| SVM | 3,199.7851 | 3.8470 | 90.18 |
| EEMD-ARIMA | 3,713.1656 | 5.2036 | 86.36 |
| EEMD-BP | 1,036.7634 | 1.4460 | 98.91 |
| EEMD-SVM | 892.9561 | 1.3424 | 99.21 |

According to the results of the above six models, the optimal prediction model can be selected in terms of prediction accuracy and computation time. According to Table 8, EEMD-SVM performed best among the six models, followed by EEMD-BP, SVM and BP.

First, the prediction performance of the artificial intelligence models (BP and SVM) was superior to that of the linear model ARIMA. This result is consistent with the previous literature (Li & Huicheng 2009; Antunes et al. 2018). The water consumption process is affected by many kinds of factors and is complicated, nonstationary and nonlinear. The artificial intelligence models, based on machine learning, can deal with these characteristics better than traditional linear models. In addition, the artificial intelligence models (BP and SVM) required less computation time than the linear model (ARIMA). Although their physical mechanisms are still not clear, artificial intelligence prediction models (deep learning or data-driven models) will be more and more widely used in future.

Second, the EEMD hybrid models achieved a lower MAPE than the corresponding single models, which reveals that EEMD is an effective algorithm for improving the prediction models. This is supported by previous studies on hybrid prediction models that incorporate EEMD (Lin et al. 2012; Wei & Chen 2012; Wang et al. 2016). After coupling with EEMD decomposition, the MAPE of the ARIMA, BP and SVM models decreased by 10.114, 67.029 and 65.105%, respectively, relative to the single models. The improvement for ARIMA was small, and its RMSE and R did not improve (Table 8). Because suitable models can be selected for the different variation characteristics of the decomposed water demand series, decomposition-and-ensemble prediction methodologies still have great potential for improving prediction accuracy.

Finally, seasonal factors have little influence on short-term (e.g., hourly) water consumption prediction. For forecasting on hourly scales, the use of historical data alone is sufficient (Cutore et al. 2008; Bakker et al. 2013b; Guo et al. 2018). Therefore, the advantages of ARIMA cannot be exploited here, and it is not well suited to this application.
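The quoted MAPE reductions follow directly from Table 8 as relative changes, (single − hybrid) / single × 100. The short Python check below reproduces them; the variable and function names are illustrative.

```python
# MAPE (%) of the single models and their EEMD hybrids, from Table 8.
single_mape = {"ARIMA": 5.7891, "BP": 4.3857, "SVM": 3.8470}
hybrid_mape = {"ARIMA": 5.2036, "BP": 1.4460, "SVM": 1.3424}

def relative_reduction(before, after):
    """Percentage decrease from `before` to `after`."""
    return 100.0 * (before - after) / before

reductions = {
    model: round(relative_reduction(single_mape[model], hybrid_mape[model]), 3)
    for model in single_mape
}
# reductions -> {'ARIMA': 10.114, 'BP': 67.029, 'SVM': 65.105}
```

The same calculation applied to RMSE gives a slightly negative value for ARIMA, since the RMSE of EEMD-ARIMA in Table 8 is higher than that of ARIMA alone.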

Since the EEMD-SVM model performed best of all the models, we ran it again with a longer data series to predict the hourly water demand for the coming week; the fitting results are shown in Figure 17. The resulting MAPE was 1.24%, again a satisfactory performance. Therefore, the EEMD-SVM model is valuable for future pumping station scheduling and management.

Figure 17

Results of EEMD-SVM forecast of hourly water demand for the coming week.

Figure 18

The PACF plot of 12-day historical data.


Besides prediction accuracy, the computation time of the prediction algorithm was another factor to consider. To this end, the historical data used for algorithm training were investigated, and the optimal length of the training data was recommended for the different prediction models. The results are shown in Table 9. In addition, we performed partial autocorrelation analysis on the historical data. As shown in Figure 18, the partial autocorrelation was higher for lags between 150 and 200 than for other ranges, indicating that the data at those lags had a certain influence on prediction accuracy. For hourly data, lags of 150–200 correspond to roughly 6–8 days, i.e., about one week (168 h) earlier, which is consistent with a week-long water-use pattern: the water consumption process has two significant modes, a working day mode and a rest day mode. From this perspective, the historical data should include both modes. Therefore, the length of the training data should be larger than 7 days (one cycle). Taking into account the above reasons and the simulation results in Table 9, 10 or 11 days was the optimal length for the training data of the EEMD hybrid models.
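For readers who want to repeat this kind of analysis, the sketch below computes a sample partial autocorrelation function with the Durbin-Levinson recursion, using only the Python standard library. The source does not state which tool produced Figure 18, so this is one textbook approach rather than the authors' procedure, and the toy series is synthetic (a daily cycle, a weekday offset and random noise).

```python
import math
import random

def autocorrelation(series, max_lag):
    """Biased sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    denom = sum(c * c for c in centered)
    return [sum(centered[i] * centered[i + k] for i in range(n - k)) / denom
            for k in range(max_lag + 1)]

def partial_autocorrelation(series, max_lag):
    """PACF for lags 0..max_lag via the Durbin-Levinson recursion."""
    r = autocorrelation(series, max_lag)
    pacf = [1.0]
    phi_prev = []  # AR coefficients of order k-1
    for k in range(1, max_lag + 1):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_curr = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
        phi_curr.append(phi_kk)
        pacf.append(phi_kk)
        phi_prev = phi_curr
    return pacf

# Synthetic hourly series: 14 days with a daily cycle and a weekday/weekend offset.
rng = random.Random(0)
toy_series = [
    100.0
    + 20.0 * math.sin(2.0 * math.pi * t / 24)
    + (10.0 if (t // 24) % 7 < 5 else 0.0)
    + rng.gauss(0.0, 1.0)
    for t in range(14 * 24)
]
toy_pacf = partial_autocorrelation(toy_series, max_lag=48)
weekly_lag_in_days = 168 / 24  # a lag of 168 h is exactly 7 days
```

Extending `max_lag` past 168 on a sufficiently long hourly record is what allows a weekly dependence to show up in the plot.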

Table 9

Model evaluation indicators using different time lengths

| Historical data length (d) | Forecasting models | RMSE (m³/h) | MAPE (%) |
| --- | --- | --- | --- |
| 1 | EEMD-ARIMA | 117,554.6544 | 167.1663 |
| 1 | EEMD-BP | 4,282.9784 | 5.9437 |
| 1 | EEMD-SVM | 2,978.1158 | 4.1900 |
| 2 | EEMD-ARIMA | 25,529.9031 | 23.7486 |
| 2 | EEMD-BP | 2,360.8980 | 3.3854 |
| 2 | EEMD-SVM | 1,187.2721 | 1.7300 |
| 3 | EEMD-ARIMA | 7,984.7288 | 11.3634 |
| 3 | EEMD-BP | 2,467.6211 | 3.6033 |
| 3 | EEMD-SVM | 1,292.8952 | 2.0000 |
| 4 | EEMD-ARIMA | 13,003.7349 | 18.1692 |
| 4 | EEMD-BP | 1,920.4515 | 2.4817 |
| 4 | EEMD-SVM | 1,062.3696 | 1.6000 |
| 5 | EEMD-ARIMA | 7,585.2915 | 10.5601 |
| 5 | EEMD-BP | 1,870.5918 | 2.8169 |
| 5 | EEMD-SVM | 1,037.3897 | 1.5100 |
| 6 | EEMD-ARIMA | 6,261.3510 | 6.9679 |
| 6 | EEMD-BP | 1,123.9871 | 1.6762 |
| 6 | EEMD-SVM | 971.2917 | 1.3400 |
| 7 | EEMD-ARIMA | 4,956.9568 | 6.5327 |
| 7 | EEMD-BP | 1,170.9357 | 1.7468 |
| 7 | EEMD-SVM | 945.0464 | 1.4500 |
| 8 | EEMD-ARIMA | 3,459.7802 | 4.5406 |
| 8 | EEMD-BP | 1,345.7225 | 1.6781 |
| 8 | EEMD-SVM | 937.1297 | 1.3900 |
| 9 | EEMD-ARIMA | 3,441.4380 | 4.7515 |
| 9 | EEMD-BP | 1,211.5804 | 1.8909 |
| 9 | EEMD-SVM | 1,058.1079 | 1.5600 |
| 10 | EEMD-ARIMA | 3,473.5999 | 4.4029 |
| 10 | EEMD-BP | 1,320.6420 | 1.8609 |
| 10 | EEMD-SVM | 694.4904 | 1.0000 |
| 11 | EEMD-ARIMA | 2,799.1188 | 3.5437 |
| 11 | EEMD-BP | 938.6738 | 1.3486 |
| 11 | EEMD-SVM | 735.0543 | 1.0400 |
| 12 | EEMD-ARIMA | 4,480.1003 | 6.3668 |
| 12 | EEMD-BP | 1,401.2050 | 1.9151 |
| 12 | EEMD-SVM | 986.4205 | 1.4700 |
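The reported optimal lengths can be cross-checked against the MAPE column of Table 9. The sketch below assumes that the table rows are ordered from 1 to 12 days and simply picks the length with the lowest MAPE for each model; this is only one possible criterion, since the authors also weighed the cost of longer training data. Names are illustrative.

```python
# MAPE (%) by historical data length (1 to 12 days), transcribed from Table 9.
mape_by_days = {
    "EEMD-ARIMA": [167.1663, 23.7486, 11.3634, 18.1692, 10.5601, 6.9679,
                   6.5327, 4.5406, 4.7515, 4.4029, 3.5437, 6.3668],
    "EEMD-BP": [5.9437, 3.3854, 3.6033, 2.4817, 2.8169, 1.6762,
                1.7468, 1.6781, 1.8909, 1.8609, 1.3486, 1.9151],
    "EEMD-SVM": [4.1900, 1.7300, 2.0000, 1.6000, 1.5100, 1.3400,
                 1.4500, 1.3900, 1.5600, 1.0000, 1.0400, 1.4700],
}

def best_length(mape_values):
    """Return the 1-based number of days with the lowest MAPE."""
    best_index = min(range(len(mape_values)), key=mape_values.__getitem__)
    return best_index + 1

optimal_days = {model: best_length(values) for model, values in mape_by_days.items()}
# optimal_days -> {'EEMD-ARIMA': 11, 'EEMD-BP': 11, 'EEMD-SVM': 10}
```

All three minima fall beyond 7 days, in line with the argument that the training window should cover at least one weekly cycle.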

In this study, the prediction performance of three single models (ARIMA, BP and SVM) and three hybrid models (EEMD-ARIMA, EEMD-BP and EEMD-SVM) was investigated and compared based on three evaluation indicators (RMSE, MAPE and R). It can be concluded that (1) the MAPEs of EEMD-ARIMA, EEMD-BP, EEMD-SVM, ARIMA, BP and SVM are 5.2036, 1.4460, 1.3424, 5.7891, 4.3857 and 3.8470%, respectively. (2) In terms of prediction accuracy and practical convenience, EEMD-SVM performs best among the six models. (3) The EEMD algorithm is effective for improving the prediction accuracy of the single models. (4) The optimal historical data lengths of EEMD-ARIMA, EEMD-BP and EEMD-SVM are 11, 11 and 10 days, respectively.

We are very grateful to the editors and anonymous reviewers for their insightful suggestions and comments on this paper.

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

All relevant data are included in the paper or its Supplementary Information.

Abiodun F. T. & Ismail F. S. 2013 Pump scheduling optimization model for water supply system using AWGA. In: 2013 IEEE Symposium on Computers & Informatics (ISCI), April 7–9, 2013, pp. 12–17.
Ai P., Song Y., Xiong C., Chen B. & Yue Z. 2022 A novel medium- and long-term runoff combined forecasting model based on different lag periods. Journal of Hydroinformatics. https://doi.org/10.2166/hydro.2022.116.
Antunes A., Andrade-Campos A., Sardinha-Lourenço A. & Oliveira M. S. 2018 Short-term water demand forecasting using machine learning techniques. Journal of Hydroinformatics 20 (6), 1343–1366. doi:10.2166/hydro.2018.163.
Bakker M., Vreeburg J., Palmen L. J., Sperber V., Bakker G. & Rietveld L. C. 2013a Better water quality and higher energy efficiency by using model predictive flow control at water supply systems. Journal of Water Supply: Research and Technology – AQUA 62 (1), 1–13. https://doi.org/10.2166/aqua.2013.063.
Bakker M., Vreeburg J. H. G., van Schagen K. M. & Rietveld L. C. 2013b A fully adaptive forecasting model for short-term drinking water demand. Environmental Modelling & Software 48, 141–151. https://doi.org/10.1016/j.envsoft.2013.06.012.
Bazrkar M. H. & Chu X. 2022 Development of category-based scoring support vector regression (CBS-SVR) for drought prediction. Journal of Hydroinformatics 24 (1), 202–222. https://doi.org/10.2166/hydro.2022.104.
Bougadis J., Adamowski K. & Diduch R. 2005 Short-term municipal water demand forecasting. Hydrological Processes 19 (1), 137–148. https://doi.org/10.1002/hyp.5763.
Box G. E. P. & Jenkins G. M. 1973 Some comments on a paper by Chatfield and Prothero and on a review by Kendall. Journal of the Royal Statistical Society Series A (General) 136 (3), 337–352. https://doi.org/10.2307/2344995.
Brentan B. M., Luvizotto E. Jr, Herrera M., Izquierdo J. & Pérez-García R. 2017 Hybrid regression model for near real-time urban water demand forecasting. Journal of Computational and Applied Mathematics 309, 532–541. https://doi.org/10.1016/j.cam.2016.02.009.
Broock W. A., Scheinkman J. A., Dechert W. D. & LeBaron B. 1996 A test for independence based on the correlation dimension. Econometric Reviews 15 (3), 197–235. https://doi.org/10.1080/07474939608800353.
Candelieri A., Giordani I., Archetti F., Barkalov K., Meyerov I., Polovinkin A., Sysoyev A. & Zolotykh N. 2019 Tuning hyperparameters of a SVM-based water demand forecasting system through parallel global optimization. Computers and Operations Research 106, 202–209. https://doi.org/10.1016/j.cor.2018.01.013.
Carpitella S., Brentan B., Montalvo I., Izquierdo J. & Certa A. 2019 Multi-criteria analysis applied to multi-objective optimal pump scheduling in water systems. Water Supply 19 (8), 2338–2346. https://doi.org/10.2166/ws.2019.115.
Chang C.-C. & Lin C.-J. 2011 LIBSVM: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology 2 (3), Article 27. https://doi.org/10.1145/1961189.1961199.
Chen G., Long T., Xiong J. & Bai Y. 2017 Multiple random forests modelling for urban water consumption forecasting. Water Resources Management 31 (15), 4715–4729. https://doi.org/10.1007/s11269-017-1774-7.
Cortes C. & Vapnik V. 1995 Support-vector networks. Machine Learning 20 (3), 273–297. https://doi.org/10.1007/BF00994018.
Cutore P., Campisano A., Kapelan Z., Modica C. & Savic D. 2008 Probabilistic prediction of urban water consumption using the SCEM-UA algorithm. Urban Water Journal 5 (2), 125–132. https://doi.org/10.1080/15730620701754434.
de la Perrière L. B., Jouglet A., Nace A. & Nace D. 2014 Water planning and management: an extended model for the real-time pump scheduling problem. In: Advances in Hydroinformatics: SIMHYDRO 2012 – New Frontiers of Simulation (Gourbesville P., Cunge J. & Caignaert G., eds.). Springer, Singapore, pp. 153–170.
Donkor E. A., Mazzuchi T. A., Soyer R. & Roberson J. A. 2014 Urban water demand forecasting: review of methods and models. Journal of Water Resources Planning and Management 140 (2). https://doi.org/10.1061/(ASCE)WR.1943-5452.0000314.
Du B., Zhou Q., Guo J., Guo S. & Wang L. 2021 Deep learning with long short-term memory neural networks combining wavelet transform and principal component analysis for daily urban water demand forecasting. Expert Systems with Applications 171, 114571. https://doi.org/10.1016/j.eswa.2021.114571.
Ghaddar B., Naoum-Sawaya J., Kishimoto A., Taheri N. & Eck B. 2015 A Lagrangian decomposition approach for the pump scheduling problem in water networks. European Journal of Operational Research 241 (2), 490–501. https://doi.org/10.1016/j.ejor.2014.08.033.
Goldstein R. & Smith W. 2002 Water & Sustainability (Volume 4): U.S. Electricity Consumption for Water Supply & Treatment – The Next Half Century. EPRI, Palo Alto, CA.
Guo G., Liu S., Wu Y., Li J., Zhou R. & Zhu X. 2018 Short-term water demand forecast based on deep learning method. Journal of Water Resources Planning and Management 144 (12), 04018076. https://doi.org/10.1061/(ASCE)WR.1943-5452.0000992.
Herrera M., Torgo L., Izquierdo J. & Pérez-García R. 2010 Predictive models for forecasting hourly urban water demand. Journal of Hydrology 387 (1), 141–150. https://doi.org/10.1016/j.jhydrol.2010.04.005.
Huang N. E., Shen Z., Long S. R., Wu M. C., Shih H. H., Zheng Q., Yen N.-C., Tung C. C. & Liu H. H. 1998 The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proceedings of the Royal Society of London Series A: Mathematical, Physical and Engineering Sciences 454 (1971), 903–995. https://doi.org/10.1098/rspa.1998.0193.
Li W. & Huicheng Z. 2009 Urban water demand forecasting based on HP filter and fuzzy neural network. Journal of Hydroinformatics 12 (2), 172–184. https://doi.org/10.2166/hydro.2009.082.
Lin C.-S., Chiu S.-H. & Lin T.-Y. 2012 Empirical mode decomposition-based least squares support vector regression for foreign exchange rate forecasting. Economic Modelling 29 (6), 2583–2590. https://doi.org/10.1016/j.econmod.2012.07.018.
Luna T., Ribau J., Figueiredo D. & Alves R. 2019 Improving energy efficiency in water supply systems with pump scheduling optimization. Journal of Cleaner Production 213, 342–356. https://doi.org/10.1016/j.jclepro.2018.12.190.
Mala-Jetmarova H., Sultanova N. & Savic D. 2017 Lost in optimisation of water distribution systems? A literature review of system operation. Environmental Modelling & Software 93, 209–254. https://doi.org/10.1016/j.envsoft.2017.02.009.
Malik A., Tikhamarine Y., Souag-Gamane D., Kisi O. & Pham Q. B. 2020 Support vector regression optimized by meta-heuristic algorithms for daily streamflow prediction. Stochastic Environmental Research and Risk Assessment 34 (11), 1755–1773. https://doi.org/10.1007/s00477-020-01874-1.
Mikut R. & Reischl M. 2011 Data mining tools. Wiley Interdisciplinary Reviews Data Mining & Knowledge Discovery 1 (5), 431–443. https://doi.org/10.1002/widm.24.
Odan F. K. & Reis L. F. R. 2012 Hybrid water demand forecasting model associating artificial neural network with Fourier series. Journal of Water Resources Planning and Management 138 (3). https://doi.org/10.1061/(ASCE)WR.1943-5452.0000177.
Oliveira P. J., Steffen J. L. & Cheung P. 2017 Parameter estimation of seasonal ARIMA models for water demand forecasting using the harmony search algorithm. Procedia Engineering 186, 177–185. https://doi.org/10.1016/j.proeng.2017.03.225.
Piasecki A., Jurasz J. & Kaźmierczak B. 2018 Forecasting daily water consumption: a case study in Torun, Poland. Periodica Polytechnica Civil Engineering 62 (3), 818–824. https://doi.org/10.3311/PPci.11930.
Qi C. & Chang N.-B. 2011 System dynamics modeling for municipal water demand estimation in an urban region under uncertain economic impacts. Journal of Environmental Management 92 (6). https://doi.org/10.1016/j.jenvman.2011.01.020.
Qin M., Li Z. & Du Z. 2017 Red tide time series forecasting by combining ARIMA and deep belief network. Knowledge-Based Systems 125, 39–52. https://doi.org/10.1016/j.knosys.2017.03.027.
Rangel H. R., Puig V., Farias R. L. & Flores J. J. 2016 Short-term demand forecast using a bank of neural network models trained using genetic algorithms for the optimal management of drinking water networks. Journal of Hydroinformatics 19 (1), 1–16. https://doi.org/10.2166/hydro.2016.199.
Ren Y., Suganthan P. N. & Srikanth N. 2017 A comparative study of empirical mode decomposition-based short-term wind speed forecasting methods. IEEE Transactions on Sustainable Energy 6 (1), 236–244. https://doi.org/10.1109/TSTE.2014.2365580.
Romano M. & Kapelan Z. 2014 Adaptive water demand forecasting for near real-time management of smart water distribution systems. Environmental Modelling & Software 60, 265–276. https://doi.org/10.1016/j.envsoft.2014.06.016.
Shu S., Dong Z., Liu S., Zhao M. & Zhao H. 2010 Power saving in water supply system with pump operation optimization. In: 2010 Asia-Pacific Power and Energy Engineering Conference.
Shukur O. B. & Lee M. H. 2015 Daily wind speed forecasting through hybrid KF-ANN model based on ARIMA. Renewable Energy 76, 637–647. https://doi.org/10.1016/j.renene.2014.11.084.
Tiwari A. & Kanungo P. 2010 Dynamic load balancing algorithm for scalable heterogeneous web server cluster with content awareness. In: Trendz in Information Sciences & Computing (TISC2010), December 17–19, 2010, pp. 143–148.
Tripathi S., Srinivas V. V. & Nanjundiah R. S. 2006 Downscaling of precipitation for climate change scenarios: a support vector machine approach. Journal of Hydrology 330 (3), 621–640. https://doi.org/10.1016/j.jhydrol.2006.04.030.
Vijai P. & Sivakumar P. B. 2018 Performance comparison of techniques for water demand forecasting. Procedia Computer Science 143, 258–266. https://doi.org/10.1016/j.procs.2018.10.394.
Wang T., Zhang M., Yu Q. & Zhang H. 2012 Comparing the applications of EMD and EEMD on time–frequency analysis of seismic signal. Journal of Applied Geophysics 83, 29–34. https://doi.org/10.1016/j.jappgeo.2012.05.002.
Wang S., Zhang N., Wu L. & Wang Y. 2016 Wind speed forecasting based on the hybrid ensemble empirical mode decomposition and GA-BP neural network method. Renewable Energy 94, 629–636. https://doi.org/10.1016/j.renene.2016.03.103.
Wei Y. & Chen M. C. 2012 Forecasting the short-term metro passenger flow with empirical mode decomposition and neural networks. Transp Res Pt C-Emerg Technol 21 (1), 148–162. https://doi.org/10.1016/j.trc.2011.06.009.
Wu Z. & Huang N. E. 2009 Ensemble empirical mode decomposition: a noise-assisted data analysis method. Advances in Adaptive Data Analysis 01 (01), 1–41. https://doi.org/10.1142/S1793536909000047.
Wu D., Zhang D., Liu S., Jin Z., Chowwanonthapunya T., Gao J. & Li X. 2020a Prediction of polycarbonate degradation in natural atmospheric environment of China based on BP-ANN model with screened environmental factors. Chemical Engineering Journal 399. https://doi.org/10.1016/j.cej.2020.125878.
Wu S., Han H., Hou B. & Diao K. 2020b Hybrid model for short-term water demand forecasting based on error correction using chaotic time series. Water 12 (6), 1683. https://doi.org/10.3390/w12061683.
Yaqin W., Ronglei G. & Jinzhen Y. 2020 Prediction of coal and gas outburst: a method based on the BP neural network optimized by GASA. Process Safety and Environmental Protection 133, 64–72. https://doi.org/10.1016/j.psep.2019.10.002.
Yin Z., Jia B., Wu S., Dai J. & Tang D. 2018 Comprehensive forecast of urban water-energy demand based on a neural network model. Water 10 (4). https://doi.org/10.3390/w10040385.
Zhang X., Lai K. K. & Wang S.-Y. 2008 A new approach for crude oil price analysis based on empirical mode decomposition. Energy Economics 30 (3), 905–918. https://doi.org/10.1016/j.eneco.2007.02.012.
Zhang J., Yan R., Gao R. X. & Feng Z. 2010 Performance enhancement of ensemble empirical mode decomposition. Mechanical Systems and Signal Processing 24 (7), 2104–2123. https://doi.org/10.1016/j.ymssp.2010.03.003.
Zhang C., Wei H., Zhao J., Liu T., Zhu T. & Zhang K. 2016 Short-term wind speed forecasting using empirical mode decomposition and feature selection. Renewable Energy 96, 727–737. https://doi.org/10.1016/j.renene.2016.05.023.
Zhang J., Wei Y., Tan Z.-f., Wang K., Tian W., Yang L., Zhou P. & Zhang N. 2017 A hybrid method for short-term wind speed forecasting. Sustainability 9 (4). https://doi.org/10.3390/su9040596.
Zyl J., Savic D. A. & Walters G. A. 2004 Operational optimization of water distribution systems using a hybrid genetic algorithm. Journal of Water Resources Planning & Management 130 (2), 160–170. http://dx.doi.org/10.1061/(ASCE)0733-9496(2004)130:2(160).
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).