In this work, the identification of a 24-hours-ahead water demand prediction model based on historical water demand data is considered. As part of the identification procedure, an input variable selection algorithm based on partial mutual information is implemented. It is shown that daily meteorological data are not relevant for water demand prediction, in the sense of partial mutual information, for the analysed water distribution systems of the cities of Tavira, Algarve, Portugal and Evanton East, Scotland, UK. The water demand prediction system is modelled using artificial neural networks, which offer great potential for the identification of complex dynamic systems. An adaptive tuning procedure for the model parameters is also developed to enable the model to adapt to changes in the system. A significant improvement in the prediction ability of such a model over the model with fixed parameters is shown when a certain trend is present in the water demand profile.

## INTRODUCTION

Improving the efficiency of water management is essential for overcoming the growing problem of water scarcity and droughts. Efficient water management implies efficient control of water distribution systems (WDSs), which is based on the availability and accuracy of data on variables that affect the WDS. The WDS control often relies on decision support systems (DSSs). An important element of such a DSS is the water demand prediction system (WDPS). Availability of accurate water demand prediction can lead to a decrease of overall energy and material consumption in the WDS, decrease of loads on elements of the WDS and increase of quality of water delivered to consumers (Bakker *et al.* 2013).

For the analysis in this paper, artificial neural networks (ANNs) are used for short-term water demand prediction with a time horizon of 24 hours. The procedure employed for 1-hour-ahead water demand prediction is readily applicable to predicting water demand at multiple future hourly time instants based on the data available at the current time. Two approaches to such an extension are analysed in this paper: (i) based on dynamic employment of the 1-hour-ahead prediction model; and (ii) based on static models developed for each hour of the 24-hour prediction horizon.

## MATERIAL AND METHODS

The overall water consumption, on the level of a water distribution network, consists of many different units. The pattern of water demand is a non-stationary stochastic time series that may include a trend in the mean, non-constant variance and discontinuities (Ghiassi *et al.* 2008). Research in this area shows that water demand includes non-linearities such that employment of ANNs in water demand prediction shows significantly better results compared with linear regression (Jain *et al.* 2001; Yalcinoz & Eminoglu 2005). Apart from the structure of a model, the performance of a WDPS depends on the model inputs as well.

A detailed methodology for the ANN model development process is given in Maier *et al.* (2010) and is followed in this paper. One of the most important steps in the model development process is input selection, where the vector of appropriate model inputs is determined. However, this step is usually not considered to be of great importance, and most of the input variables are determined heuristically, which can result in including too many or too few input variables (May *et al.* 2008). If one or more relevant input variables are omitted, the model will not be able to describe the whole dynamics and phenomena in the system. The probability of omitting relevant input variables is much higher for time series in which the input candidates are not only different variables but also their lagged values (unless dynamic ANNs are used), which significantly increases the number of potential input variables. Including too many input variables can be caused by poorly assessed relevance of an input variable or by redundancy among the inputs, where some of the chosen variables contain useful information but are interdependent, so the information they carry is partly redundant. As the number of input variables increases, the number of model parameters also increases, which reduces the speed and quality of the model calibration.

The linear correlation coefficient is a commonly adopted measure of dependence between variables. However, the underlying assumption of linearly structured dependence contradicts the goal of developing a model of a non-linear system (May *et al.* 2008). An algorithm that is more appropriate for finding non-linear dependences, and that also considers the interdependences among variables, was developed using the concept of partial mutual information (PMI) (Sharma 2000). In this paper, the original algorithm is adapted in the sense that PMI is estimated using the ‘*k*-th nearest neighbour’ method described in Frenzel & Pompe (2007), which is more accurate and computationally less intensive than the original method. In addition, whereas the original algorithm terminates based on the statistical significance of the estimated PMI, this paper uses a predefined number of relevant inputs *M* as the termination criterion, i.e. the input variable selection (IVS) procedure creates a list of the *M* most relevant inputs.
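As an illustration of the selection loop described above, the following Python sketch combines a *k*-nearest-neighbour estimate of conditional mutual information (written with harmonic numbers and the maximum norm, in the spirit of Frenzel & Pompe 2007) with a greedy search that stops after *M* inputs. It is a minimal sketch rather than the implementation used in this paper: the function names, the toy data and the choice `k=5` are assumptions, and the estimator assumes continuous data without exact ties.

```python
import random

def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n, with H_0 = 0."""
    return sum(1.0 / i for i in range(1, n + 1))

def max_norm(a, b):
    """Maximum-norm distance; 0.0 for empty tuples (no conditioning inputs)."""
    return max((abs(u - v) for u, v in zip(a, b)), default=0.0)

def knn_pmi(x, y, z, k=5):
    """k-NN estimate (in nats) of I(x; y | z).

    x, y: sequences of floats; z: sequence of tuples with the inputs
    selected so far (empty tuples give plain mutual information).
    """
    n = len(x)
    xyz = [(x[i], y[i]) + tuple(z[i]) for i in range(n)]
    xz = [(x[i],) + tuple(z[i]) for i in range(n)]
    yz = [(y[i],) + tuple(z[i]) for i in range(n)]
    zz = [tuple(z[i]) for i in range(n)]
    total = 0.0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Distance to the k-th nearest neighbour in the joint space.
        eps = sorted(max_norm(xyz[i], xyz[j]) for j in others)[k - 1]
        # Neighbour counts strictly inside that radius in each subspace.
        n_xz = sum(max_norm(xz[i], xz[j]) < eps for j in others)
        n_yz = sum(max_norm(yz[i], yz[j]) < eps for j in others)
        n_z = sum(max_norm(zz[i], zz[j]) < eps for j in others)
        total += harmonic(n_z) - harmonic(n_xz) - harmonic(n_yz)
    return total / n + harmonic(k - 1)

def select_inputs(candidates, y, M, k=5):
    """Greedy IVS: repeatedly add the candidate with the highest PMI
    given the inputs already selected, until M inputs are chosen."""
    selected = []
    remaining = dict(candidates)
    n = len(y)
    while remaining and len(selected) < M:
        z = [tuple(candidates[name][i] for name in selected) for i in range(n)]
        scores = {name: knn_pmi(col, y, z, k) for name, col in remaining.items()}
        best = max(scores, key=scores.get)
        selected.append(best)
        del remaining[best]
    return selected

# Toy data (hypothetical): "b" is a near copy of "a", "noise" is irrelevant.
rng = random.Random(0)
n = 150
a = [rng.gauss(0, 1) for _ in range(n)]
c = [rng.gauss(0, 1) for _ in range(n)]
noise = [rng.gauss(0, 1) for _ in range(n)]
b = [ai + 0.01 * rng.gauss(0, 1) for ai in a]
y = [ai + ci + 0.1 * rng.gauss(0, 1) for ai, ci in zip(a, c)]
selected = select_inputs({"a": a, "b": b, "c": c, "noise": noise}, y, M=2)
print(selected)
```

In this toy setting the redundant pair `a`/`b` should contribute only one input, because once one of them is selected the other carries almost no additional information about `y`; a ranking by pairwise correlation alone would not make that distinction.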

*et al.* 2010), and MLPs with one hidden layer are used in this paper for modelling the demand. Feed-forward back-propagation method with Levenberg-Marquardt technique is used as the ANN training procedure because of its advantages in relation to the other gradient-descent techniques (Cigizoglu & Kişi 2005). The available data set for training the ANN is divided into three subsets: ‘training data’ (70%) are used for calculating the gradient and updating the model parameters; ‘testing data’ (15%) are used in the stopping criterion of the training process (cross-validation); and ‘validation data’ (15%) are used to determine the optimal number of inputs (out of *M*) and optimal number of hidden neurons. A set of ANNs with different number of inputs, which are chosen as first *m* elements from a list of *M* most relevant inputs, *m* ≤ *M*, and different number of hidden neurons is trained; the optimal ANN is chosen as one with the lowest mean squared error (MSE) on the validation data. MSE is defined as follows:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(Y_i - \hat{Y}_i\right)^2 \qquad (1)$$

where *n* is the number of data samples for which MSE is calculated, $Y_i$ is the actual output of the *i*th sample and $\hat{Y}_i$ is the model output for the *i*th sample.
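The selection step described above, choosing the candidate with the lowest validation MSE, can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, the in-order 70/15/15 split is an assumption (the paper does not state here how samples are assigned to the subsets), and the candidate scores are the validation MSE values reported in Table 1.

```python
def mse(actual, predicted):
    """Mean squared error over paired samples."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    return sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)) / len(actual)

def split_70_15_15(samples):
    """Split into training, testing and validation parts.

    Samples are taken in order, which is an assumption of this sketch.
    """
    n = len(samples)
    n_train = n * 70 // 100
    n_test = n * 15 // 100
    return (samples[:n_train],
            samples[n_train:n_train + n_test],
            samples[n_train + n_test:])

def pick_best(validation_mse):
    """Return the candidate with the lowest validation MSE."""
    return min(validation_mse, key=validation_mse.get)

# Validation MSE values reported in Table 1 for the MLP candidates.
table1_validation_mse = {
    "MLP(15)": 75.6435, "MLP(20)": 76.0082, "MLP(25)": 74.2241,
    "MLP(30)": 77.8976, "MLP(35)": 75.3533, "MLP(40)": 75.0948,
    "MLP(45)": 77.0206, "MLP(50)": 75.8998,
}
best = pick_best(table1_validation_mse)
print(best)
```

Selecting on the validation subset rather than on the calibration data keeps the choice of network size independent of the samples used to fit and stop the training.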

The above-described procedure can be used to identify a 1-hour-ahead demand prediction model. For identifying a multiple-hours-ahead demand prediction model, we propose two approaches: (i) based on dynamic employment of the 1-hour-ahead prediction model; and (ii) based on static models developed for each hour of the 24-hour prediction horizon. These approaches are shown in Figure 1.
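The difference between the two approaches can be made concrete with a short Python sketch. It is a simplified illustration under stated assumptions: the models are arbitrary callables, only three lagged demands are used as inputs, and the function names and the toy persistence models are hypothetical rather than taken from the paper.

```python
def predict_dynamic(one_step_model, history, horizon=24, lags=(1, 24, 168)):
    """Dynamic approach: apply the 1-hour-ahead model repeatedly and
    feed each prediction back as if it were a measured demand."""
    series = list(history)
    predictions = []
    for _ in range(horizon):
        features = [series[-lag] for lag in lags]
        y_hat = one_step_model(features)
        predictions.append(y_hat)
        series.append(y_hat)  # the prediction becomes a lagged input
    return predictions

def predict_static(models, history, lags=(1, 24, 168)):
    """Static approach: one separate model per hour of the horizon,
    each using only the data available at the current time."""
    features = [history[-lag] for lag in lags]
    return [model(features) for model in models]

# Toy example with hypothetical models on a synthetic hourly series.
history = [float(h % 24) for h in range(200)]

def same_hour_yesterday(features):
    return features[1]  # features[1] is the demand 24 hours earlier

dynamic_forecast = predict_dynamic(same_hour_yesterday, history)

def last_value(features):
    return features[0]  # features[0] is the most recent demand

static_forecast = predict_static([last_value] * 24, history)
```

The dynamic variant needs a single model, whereas the static variant needs one model per hour of the horizon (24 here), so extending the horizon means identifying further models.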

where $\Delta\Theta(k) = \Theta(k+1) - \Theta(k)$, $\alpha$ is the learning coefficient, the gradient is that of the local criterion function on the corresponding data set, and $\gamma_m$ is the non-negative momentum term which speeds up the learning convergence while attenuating the parasitic oscillations.
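A parameter update with a learning coefficient and a momentum term can be sketched as follows. The update equation itself did not survive in this text, so the sketch assumes the standard gradient-descent-with-momentum form, ΔΘ(k) = −α·∇J + γ_m·ΔΘ(k − 1); the function names and the toy quadratic criterion are hypothetical.

```python
def momentum_step(theta, prev_delta, grad, alpha, gamma_m):
    """One update: delta(k) = -alpha * grad + gamma_m * delta(k - 1),
    theta(k + 1) = theta(k) + delta(k). Standard form, assumed here."""
    delta = [-alpha * g + gamma_m * d for g, d in zip(grad, prev_delta)]
    theta_next = [t + d for t, d in zip(theta, delta)]
    return theta_next, delta

def tune(theta, grad_fn, steps, alpha=0.1, gamma_m=0.5):
    """Apply the update repeatedly, starting with a zero previous step."""
    delta = [0.0] * len(theta)
    for _ in range(steps):
        theta, delta = momentum_step(theta, delta, grad_fn(theta), alpha, gamma_m)
    return theta

# Toy local criterion J(theta) = 0.5 * (theta - 3)^2, gradient theta - 3.
tuned = tune([0.0], lambda th: [th[0] - 3.0], steps=100)
print(tuned)
```

In an online setting the gradient would be re-evaluated on the most recent data at each step, so the parameters can follow a slowly changing demand profile instead of staying at their initially trained values.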

## RESULTS AND DISCUSSION

The steps followed in this paper for ANN modelling are described in Maier *et al.* (2010). In May *et al.* (2008), the IVS procedure based on the PMI is described and the advantages of using this measure are discussed. In this paper, PMI is calculated based on the algorithm presented in Frenzel & Pompe (2007), which is more accurate, computationally less intensive and less sensitive to parameter selection compared with the original algorithm. To the best of the authors’ knowledge, online tuning of the water demand prediction has not been reported before.

The WDPS is developed for the selected district metering area in the city of Tavira, Algarve, Portugal, using MATLAB Neural Network Toolbox. Hourly water demand data, daily meteorological data and a set of time indices, which provide information about time of day, day in week etc., are used as the candidates for model inputs. Historical data for the period from January 2011 until December 2012 are used for training the initial demand model. Using the developed IVS procedure, a set of *M* = 20 variables is chosen as the most relevant for the 1-hour-ahead demand prediction in the sense of PMI, in the following order: $d(t-168)$, $d(t-24)$, $d(t-1)$, $t_{c,D} = \cos(2\pi t_D/24)$, $t_{s,D} = \sin(2\pi t_D/24)$, $d(t-48)$, $d(t-167)$, $t_{s,W} = \sin(2\pi t_W/168)$, $t_{c,W} = \cos(2\pi t_W/168)$, $d(t-23)$, $d(t-169)$, $d(t-47)$, $d(t-25)$, $d(t-49)$, $d(t-2)$, $d(t-170)$, $d(t-50)$, $d(t-26)$, $d(t-12)$ and $d(t-11)$, where $d(t-k)$ denotes demand $k$ hours before the hour for which the prediction is to be determined, $t_D$ is hour of day and $t_W$ is hour in week from the last Monday midnight. The model with 20 inputs was chosen as the optimal one, and MLP with one hidden layer that consists of 25 neurons showed the best performance on the independent validation data (Table 1). Figure 3 depicts the regression plots for the chosen 1-hour-ahead prediction model on calibration (union of training and testing data) and validation data. Notation *R* stands for the regression value.
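The 20 selected inputs can be assembled from an hourly demand series and a timestamp as in the following Python sketch. It is illustrative only: the function and key names are hypothetical, the hour indices are assumed to be integers, and `demand[-k]` is assumed to hold $d(t-k)$, i.e. the series ends at the hour before the one being predicted.

```python
import math
from datetime import datetime

# Demand lags (in hours) in the order produced by the IVS procedure.
LAGS = (168, 24, 1, 48, 167, 23, 169, 47, 25, 49, 2, 170, 50, 26, 12, 11)

def time_indices(ts):
    """Cyclic encodings of hour of day and hour of week for timestamp ts."""
    t_d = ts.hour
    t_w = ts.weekday() * 24 + ts.hour  # hours since the last Monday midnight
    return {
        "t_cD": math.cos(2 * math.pi * t_d / 24),
        "t_sD": math.sin(2 * math.pi * t_d / 24),
        "t_sW": math.sin(2 * math.pi * t_w / 168),
        "t_cW": math.cos(2 * math.pi * t_w / 168),
    }

def build_inputs(demand, ts):
    """Return the 20 model inputs for predicting the demand at hour ts.

    demand: hourly demands up to the hour before ts, so that
    demand[-k] is d(t - k).
    """
    if len(demand) < max(LAGS):
        raise ValueError("need at least %d hours of history" % max(LAGS))
    features = {"d(t-%d)" % k: demand[-k] for k in LAGS}
    features.update(time_indices(ts))
    return features

# Usage with a synthetic series (hypothetical values).
history = [float(h) for h in range(200)]
inputs = build_inputs(history, datetime(2012, 1, 2, 0))  # a Monday, 00:00
print(len(inputs))
```

Encoding each time index as a sine and cosine pair keeps adjacent hours close together across the wrap-around, for example hour 23 and hour 0 of the day.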

| Model | Calibration data MSE | Validation data MSE |
|---|---|---|
| Linear | 91.7873 | 89.6176 |
| MLP(15) | 67.6150 | 75.6435 |
| MLP(20) | 64.0207 | 76.0082 |
| MLP(25) | 66.2181 | 74.2241 |
| MLP(30) | 66.0955 | 77.8976 |
| MLP(35) | 66.0251 | 75.3533 |
| MLP(40) | 64.9425 | 75.0948 |
| MLP(45) | 65.9636 | 77.0206 |
| MLP(50) | 65.0910 | 75.8998 |

MLP(*i*) stands for the multilayer perceptron neural network with *i* neurons in the hidden layer.

The procedure described above was also used for the identification of *k*-hours-ahead prediction models, *k* ∈ {2, …, 24}. Both static and dynamic approaches for 24-hours-ahead demand prediction were tested, and they showed quite similar prediction abilities. However, from the computational point of view, the dynamic approach is much more efficient, since only one model needs to be identified and a significantly smaller set of parameters has to be tuned if model adaptation is used. In addition, extending the prediction model to a larger prediction horizon is trivial with the dynamic approach, while the static approach requires additional models to be developed.

| Validation data | Offline model MSE | Offline model MAPE (%) | Online model MSE | Online model MAPE (%) |
|---|---|---|---|---|
| Original | 98.6620 | 8.7325 | 100.0479 | 8.8047 |
| Original + linear | 214.9956 | 6.8601 | 146.1174 | 5.9160 |
| Original + sine | 658.8104 | 8.9333 | 416.3226 | 8.2597 |

Figure 4 shows that the online model tracks the actual demand curve better than the offline model does when a linear trend is added to the original demand.

To show the applicability of the presented approach to other regions, the same analysis was performed for the data set of Evanton East, Scotland, UK. For brevity, only the performance of the offline and online 24-hour-ahead water demand prediction models is shown, in Table 3. The experiment was run under the same conditions as for the Tavira site. The results in Table 3 show even better offline-model performance in terms of MAPE than those in Table 2. The improvement in prediction accuracy obtained with the online tuning procedure follows the same trend as for the Tavira site. Note that MSE is not a relative statistical measure (see Equation (1)), i.e. it depends on the units used for representing demand. Thus, it is not appropriate for comparing model performance between the two sites.
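The unit dependence of MSE noted above can be checked numerically. The Python sketch below rescales a small set of hypothetical demand values by a constant factor, as a change of units would, and compares the two measures; MAPE is taken in its usual form, the mean of |actual − predicted| / |actual| expressed as a percentage, which is an assumption since its definition is not given in this text.

```python
import math

def mse(actual, predicted):
    """Mean squared error: carries the squared units of the demand."""
    return sum((y - p) ** 2 for y, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error (usual definition, assumed here)."""
    return 100.0 * sum(abs((y - p) / y) for y, p in zip(actual, predicted)) / len(actual)

# Hypothetical demand values and predictions in some unit.
actual = [40.0, 55.0, 62.0, 48.0]
predicted = [42.0, 50.0, 65.0, 47.0]

# The same quantities expressed in a unit that is 1000 times smaller.
scale = 1000.0
actual_scaled = [scale * y for y in actual]
predicted_scaled = [scale * p for p in predicted]

mse_ratio = mse(actual_scaled, predicted_scaled) / mse(actual, predicted)
mape_difference = mape(actual_scaled, predicted_scaled) - mape(actual, predicted)
print(mse_ratio, mape_difference)
```

Because MSE grows with the square of the scale factor while MAPE is unaffected, the MSE values of the two sites are not directly comparable, whereas the MAPE values are.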

| Validation data | Offline model MSE | Offline model MAPE (%) | Online model MSE | Online model MAPE (%) |
|---|---|---|---|---|
| Original | 0.03088 | 4.821 | 0.03003 | 4.803 |
| Original + linear | 0.04096 | 4.882 | 0.03656 | 4.713 |
| Original + sine | 0.07098 | 5.463 | 0.05758 | 4.980 |

## CONCLUSIONS

This paper considers the identification of the water demand prediction model with a time horizon of 24 hours using two approaches, static and dynamic. It was shown that both analysed approaches have quite similar predicting abilities. However, from the computational point of view, applying the dynamic approach is much more efficient. Moreover, extension of the prediction model to a larger prediction horizon using the dynamic approach is trivial, while this is not the case for the static approach.

It was shown that for the analysed WDSs of the Tavira and Evanton East sites, daily meteorological variables are not relevant, in the sense of PMI, for demand prediction, i.e. past demand and variables describing time of day and day in week are sufficient to model 1-hour-ahead water demand satisfactorily.

Finally, the experimental results show that using the online tuning procedure of model parameters can lead to a significant improvement of the model prediction ability, in the sense of MSE, when a certain trend exists in the water demand profile. Also, the applicability of the approach for the WDPS modelling was confirmed for two distinct sites.

## ACKNOWLEDGEMENTS

This work has been supported by the European Community Seventh Framework Programme under grant No. 318602 (UrbanWater); and by the European Community Seventh Framework Programme under grant No. 285939 (ACROSS). This support is gratefully acknowledged. The authors would like to thank Tavira Verde and Scottish Water for providing the data for developing the prediction models. The authors are also grateful to the anonymous reviewers for their helpful comments and suggestions on improving the content and the presentation of this paper.