Abstract
Sustainable management of water resources is a key challenge now and will remain so in the future. Water distribution systems have to ensure fresh water for all users under increasing demand driven by the long-term effects of climate change. In this context, a reliable short-term water demand forecasting model is crucial for the optimal management of water resources. This study proposes a novel deep learning model based on long short-term memory (LSTM) neural networks to forecast hourly water demand. Because of the limitations of LSTM in handling multiple input sequences of different time lengths, the proposed model is developed with two modules that process different temporal sequences of data: a first module that deals with short-term meteorological information and a second module that represents the longer-term information of the water demand. The dual-module structure allows a multivariate selection of inputs with sequences of different time lengths. The performance of the proposed model is compared with a conventional multi-layer perceptron (MLP) and a seasonal autoregressive integrated moving average (SARIMA) model in a real case study. The results highlight the potential of the proposed multivariate approach in short-term water demand prediction, which outperforms the more conventional approaches.
HIGHLIGHTS
This study proposes a novel short-term water demand forecasting model using multivariate long short-term memory with meteorological data.
The proposed dual-module structure allows different temporal sequences of data to be processed.
The model is tested on a real case study, outperforming state-of-the-art methods.
This study highlights the importance of a multivariate approach, especially due to climate change.
INTRODUCTION
Nowadays, the efficiency of a water distribution system (WDS) is crucial for preserving drinking water resources, which will demand more sustainable management in the future. The constant increase in water demand from agriculture, industry and households, driven by climate change and socio-economic factors, is forcing practitioners to optimise the management of the available water resources as far as possible.
Water distribution infrastructures, following a trend similar to that of energy grids, have started an important renewal process in recent years to change the paradigm of current networks and move towards the concept of smart grids (Menapace et al. 2020a; Ramos et al. 2020). This shift to a green and smart system is now even more necessary for water supply systems, because climate change is increasing global water demand and putting WDSs under strain (Wang et al. 2016). Climate change cannot be ignored and is now evident in many parts of the world (e.g., Yu et al. 2014; Elkiran et al. 2021), where the availability of water resources is being reduced by climate modification. In this context, water losses, such as leakages in WDSs, are not acceptable (Menapace et al. 2020b; Zanfei et al. 2020) and need to be addressed through proper management of the water systems.
In this context, the era of Big Data and Machine Learning provides new tools and algorithms that are the future of sustainable water use (Savic 2019). These techniques make it possible to develop reliable and advanced tools that can effectively contribute to improving water management. In recent decades, many studies have developed such tools for a variety of tasks, for instance, digital twins for state estimation (Bonilla et al. 2022), algorithms for burst and leakage detection (Wu & Liu 2017), data analysis techniques for smart water metering systems (Rahim et al. 2020), water demand modelling (House-Peters & Chang 2011), intrusion detection (Mboweni et al. 2021) and many other important tasks. Among the many techniques that can be developed with these data-driven approaches, one of the most important is certainly water demand forecasting (e.g., Herrera et al. 2010; Pacchin et al. 2019; Zanfei et al. 2022a). In general, many applications of forecasting models show how such techniques can improve and optimise the management of water resources. Such improvements can be seen in the optimisation of hydropower systems (Avesani et al. 2022), in improving the efficiency of pumping schedules and reducing the operating costs of irrigation water systems (Pulido-Calvo & Gutierrez-Estrada 2009) and in the management of WDSs (Alvisi et al. 2007). For instance, Bakker et al. (2013) showed that the use of a water demand forecasting model allowed important savings in a WDS in the Netherlands. A reliable water demand forecasting model thus allows better management of the water resource, saving both water and energy and lowering costs.
In general, water demand forecasting can be categorised by forecasting horizon: short term, medium term and long term. Medium-term and long-term forecasting are more related to decision problems with longer time horizons, such as planning and investment (Donkor et al. 2014). Short-term forecasting, which is the main target of this study, is particularly important for operational decisions concerning the management and optimisation of a WDS. In this field, many researchers have provided important contributions in the last decade, especially using deep learning techniques, which have achieved great success in many fields (Schmidhuber 2015). This is the case of Guo et al. (2018), who proposed a short-term water demand forecasting model based on a deep learning architecture using gated recurrent units (Chung et al. 2014). The model was tested on data from some district metering areas in China and was shown to outperform conventional models such as the multi-layer perceptron (MLP) and the seasonal autoregressive integrated moving average (SARIMA).
Furthermore, other authors have combined deep learning techniques with ensemble approaches. For instance, Ambrosio et al. (2019) developed a machine learning committee model to forecast the water demand of a time series in Brazil. The authors aimed to build an ensemble machine learning model able to efficiently exploit the strengths of many sub-models (i.e., the single machine learning models). They tested several methodologies to combine the sub-models, showing that the resulting ensemble was able to outperform every single sub-model. Another application was presented by Xenochristou & Kapelan (2020), who developed an ensemble stacked model to predict the water demand of a city in England. In this case too, the authors highlighted the superior performance of the ensemble approach, which outperformed more conventional single models such as an artificial neural network, a random forest and many others.
Although many studies have developed water demand forecasting models using only past observations of water demand as input (e.g., Guo et al. 2018; Mu et al. 2020; Zanfei et al. 2022b), some studies have highlighted the importance of using meteorological variables because of their correlation with water demand (Brentan et al. 2017) and the impact of climate change on water consumption (Fiorillo et al. 2021). The present study proposes a deep learning model based on multivariate long short-term memory (LSTM) (Hochreiter & Schmidhuber 1997) units to build a short-term forecasting model that predicts hourly urban water demand. The aim is a novel, well-performing deep learning model with a multivariate approach for short-term water demand forecasting. Although the scientific literature offers many important and powerful methodologies for such tasks, water demand forecasting is still an open topic that can be improved thanks to the rise of novel techniques and the continuous increase in computational power and data availability. Furthermore, conventional multivariate recurrent neural networks (RNNs) require that, when the input is composed of multiple features, all features have the same time length. Therefore, to overcome the problem of using samples with different time lengths in RNNs, the model is composed of two modules. The first module deals only with past observations of water demand and accounts for short-time dependencies (i.e., immediately preceding observations) and long-time dependencies (i.e., observations from the past week). This module extracts the temporal information of the water demand using past observations over a time horizon of one week. The second module concerns the short-term representation of weather data and extracts meaningful features of the current meteorological condition. It is worth noting that the two modules process data with different time horizons.
The first module uses data with a longer horizon, while the meteorological module deals with shorter-horizon data to provide a representation of the current weather condition. The representations of the two modules are then merged to provide the prediction output. The proposed approach is tested on real water demand data and compared with two conventional methods, namely an MLP model and a SARIMA model. The results highlight the high prediction performance of the proposed methodology, showing that such a deep learning model can outperform the two conventional methods.
The paper is structured as follows: Section Materials and Methods presents the framework of the proposed methodology; Section LSTM units discusses the model architecture with particular attention to the input sequences, and Section Benchmark models presents the MLP and the SARIMA methods adopted as benchmarks in the study. Furthermore, Section Case study presents the data adopted for evaluating the methodology, Section Performance evaluation metrics shows the error measures adopted to assess the model performances, and Section Tuning M-LSTM shows the iterative grid search method used to optimally tune the presented multivariate model. Subsequently, Section Results and discussion presents the achievements and the major highlights of the study and finally, Section Conclusion reports the final remarks.
MATERIALS AND METHODS
The proposed M-LSTM model is shown in Figure 1, where the two-module structure is highlighted. The first module, called the water demand module, deals only with water demand data. This module aims to extract and learn the temporal information of these data using the LSTM units. In particular, the inputs of this module are the past hourly observations of water demand. This temporal sequence is 7 days long, that is, 168 hourly values, so that the LSTM units can capture the daily and weekly seasonalities of the demand. In contrast, the meteorological module uses fewer past observations, in order to provide a representation of the current weather condition at the prediction time frame. Figure 1 also shows the input variables of the meteorological module: temperature, humidity, radiation and rainfall. All these data are fed into the module as a sequence that spans from n time steps before the prediction time step t (past observations) to m time steps after it (forecasts). The parameters n and m depend on the aim of the forecasting model. In this study, the model is tested for two short-term forecasting applications. The first application is the prediction of only the following hourly value. The second application is the simultaneous prediction of the following 24 hourly values. These two tests have different scopes and can support water utilities at different levels. For instance, the first application can support the detection of anomalies in the system, such as bursts, while the second can provide important information for the operational management of the system. For these two applications, the input of the meteorological module is different. For the 1-h prediction, n takes the value three and m the value zero.
This means that the input sequence has a length of four and comprises the three nearest past observations and the value concurrent with the prediction at time step t. For the simultaneous prediction of 24 hourly values, n takes the value zero and m becomes 23, giving a sequence of 24 values that covers the whole target period of the prediction. In this case, a simplification is made by using the historical measured meteorological variables instead of meteorological forecasts, without evaluating the uncertainty related to this simplification. It is worth clarifying that the prediction of 24 hourly values is made simultaneously every 24 time steps, meaning that the output of the model covers the whole day at once.
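The construction of the two input sequences described above can be sketched in plain Python. This is an illustrative sketch under stated assumptions: the function name `build_inputs`, the list-based data layout and the synthetic data are hypothetical and not taken from the study; only the window lengths (168 hourly demand values, and weather values from t − n to t + m) follow the text.

```python
def build_inputs(demand, weather, t, n, m, demand_window=168):
    """Return (demand_seq, weather_seq) for a forecast whose first target hour is t.

    demand  : list of hourly demand values
    weather : list of per-hour tuples (temperature, humidity, radiation, rainfall)
    n, m    : hours before and after t included in the weather sequence
    """
    if t - demand_window < 0 or t - n < 0 or t + m >= len(weather):
        raise ValueError("not enough data around time step t")
    demand_seq = demand[t - demand_window:t]   # the 168 hourly values preceding t
    weather_seq = weather[t - n:t + m + 1]     # hours t-n .. t+m, inclusive
    return demand_seq, weather_seq

# Synthetic data (assumption): 10 days of hourly values.
hours = 24 * 10
demand = [float(h % 24) for h in range(hours)]
weather = [(20.0, 50.0, 0.0, 0.0)] * hours

# 1-h task: n = 3, m = 0 gives a weather sequence of length 4.
d1, w1 = build_inputs(demand, weather, t=200, n=3, m=0)
# 24-h task: n = 0, m = 23 gives a weather sequence of length 24.
d24, w24 = build_inputs(demand, weather, t=192, n=0, m=23)
print(len(d1), len(w1), len(d24), len(w24))  # 168 4 168 24
```

In both tasks the demand sequence keeps the same length, while the weather sequence has length n + m + 1.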
Once the two modules provide their outputs through the LSTM units, a concatenate layer merges their representations into a single one. For the sake of clarity, the concatenation only joins the numerical outputs of the two modules, without any further operation. The concatenated output is then fed into a dense layer, which provides the final prediction with the desired dimension (i.e., number of outputs) depending on the aim of the prediction task. All the parameters of this model are further discussed in Section Tuning M-LSTM, where the iterative grid search from Menapace et al. (2021) is employed to find the optimal set of nodes and layers for both modules that compose the M-LSTM.
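A minimal Keras sketch of such a two-module structure is given here for illustration. It is not the authors' code: the function name `build_m_lstm`, the mean squared error loss and the single-feature demand input are assumptions, while the layer sizes are those reported in Section Tuning M-LSTM (48 and 72 units for the water demand module; 48, 72 and 48 units for the meteorological module).

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_m_lstm(weather_steps=4, n_outputs=1, demand_steps=168, n_weather_vars=4):
    # Water demand module: 168 hourly demand values, one feature per time step.
    demand_in = keras.Input(shape=(demand_steps, 1))
    x = layers.LSTM(48, return_sequences=True)(demand_in)
    x = layers.LSTM(72)(x)

    # Meteorological module: a shorter sequence with four weather variables.
    weather_in = keras.Input(shape=(weather_steps, n_weather_vars))
    y = layers.LSTM(48, return_sequences=True)(weather_in)
    y = layers.LSTM(72, return_sequences=True)(y)
    y = layers.LSTM(48)(y)

    # Concatenate the two representations and map them to the forecast.
    merged = layers.Concatenate()([x, y])
    out = layers.Dense(n_outputs)(merged)

    model = keras.Model(inputs=[demand_in, weather_in], outputs=out)
    model.compile(optimizer="adam", loss="mse")  # loss function is an assumption
    return model

model_1h = build_m_lstm(weather_steps=4, n_outputs=1)     # next-hour forecast
model_24h = build_m_lstm(weather_steps=24, n_outputs=24)  # 24 values at once
```

Because each module has its own input layer, the two sequences can have different lengths; only the final LSTM outputs, which are fixed-size vectors, are concatenated.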
LSTM units
Benchmark models
To compare the proposed multivariate LSTM model with baseline approaches, a conventional MLP model and a SARIMA model are used as benchmarks. The MLP has been widely adopted in many studies (e.g., Bougadis et al. 2005; Adamowski & Karapataki 2010) with remarkable results and is still considered a solid benchmark (e.g., Guo et al. 2018; Mu et al. 2020). These networks are organised as a sequence of units, called perceptrons, that are grouped in layers. An MLP has an input layer that is directly fed with the input data. This layer is followed by one or multiple hidden layers that propagate the information through the network until the output layer provides the desired outcome of the model. The MLP is also called a feedforward network, since the information flows progressively from the input layer through the hidden layers to the output layer. In this study, a conventional three-layer MLP is used, with an input layer, a hidden layer and an output layer. The number of neurons selected for the hidden layer through a conventional grid search is 128, with rectified linear units as activation functions. The training is performed using the Adam optimiser (Kingma & Ba 2014).
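The feedforward pass of such a three-layer MLP can be sketched in plain Python as follows. The sketch only illustrates how information flows from the input layer through a 128-unit ReLU hidden layer to the output layer; the input length of 24, the random untrained weights and all function names are assumptions, and the training with Adam is omitted.

```python
import random

def relu(values):
    # Rectified linear unit applied element-wise.
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    # weights[j] holds the incoming weights of output unit j.
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def mlp_forward(x, params):
    hidden = relu(dense(x, params["W1"], params["b1"]))  # hidden layer
    return dense(hidden, params["W2"], params["b2"])     # linear output layer

def init_params(n_in, n_hidden, n_out, seed=0):
    rng = random.Random(seed)
    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {"W1": matrix(n_hidden, n_in), "b1": [0.0] * n_hidden,
            "W2": matrix(n_out, n_hidden), "b2": [0.0] * n_out}

# Hypothetical sizes: 24 past values in, 128 hidden units, 1 forecast out.
params = init_params(n_in=24, n_hidden=128, n_out=1)
prediction = mlp_forward([1.0] * 24, params)
```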
The SARIMA model has been selected because of the seasonal patterns that affect the water demand time series (Oliveira et al. 2017). The order of the model is chosen using the conventional approach based on the auto-correlation function and the partial auto-correlation function.
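As an illustration of the first of these diagnostics, the sample auto-correlation function can be computed as in the sketch below. The synthetic sinusoidal series with a 24-h cycle is an assumption used only to show how a daily seasonality appears as a peak at lag 24; the partial auto-correlation function and the actual order selection are not reproduced.

```python
import math

def acf(series, max_lag):
    """Sample auto-correlation of a series for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    return [
        sum((series[i] - mean) * (series[i + k] - mean) for i in range(n - k)) / var
        for k in range(max_lag + 1)
    ]

# Synthetic hourly series (assumption): two weeks with a pure daily cycle.
series = [10.0 + 3.0 * math.sin(2.0 * math.pi * h / 24.0) for h in range(24 * 14)]
r = acf(series, max_lag=48)

# Strong positive correlation at lag 24 and negative correlation at lag 12
# point to a seasonal period of 24 hours.
print(round(r[24], 2), round(r[12], 2))
```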
Case study
The demand data shown in panel (a) of Figure 3 exhibit the typical behaviour of water demand, with its daily seasonality. In addition, the small number of users of the WDS affects the variability of the demand signal, which shows some important differences from day to day. The other panels show the meteorological variables adopted in the study: rainfall, temperature, humidity and radiation in panels (b)–(e) of Figure 3, respectively. The data are filtered to deal with the few outliers and missing values in the datasets. In particular, values that are at least 2 standard deviations larger than the mean value for that hour of the day are treated as outliers and filtered out. The K-nearest neighbour algorithm is then adopted to perform the imputation (Zanfei et al. 2022c).
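The hour-of-day outlier rule can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `flag_outliers`, the use of the population standard deviation and the synthetic data are assumptions, and the subsequent K-nearest neighbour imputation is not shown.

```python
from statistics import mean, pstdev

def flag_outliers(values, hours, k=2.0):
    """Flag values at least k standard deviations above the mean of their hour of day."""
    by_hour = {}
    for v, h in zip(values, hours):
        if v is not None:                      # skip missing values
            by_hour.setdefault(h, []).append(v)
    stats = {h: (mean(vs), pstdev(vs)) for h, vs in by_hour.items()}

    flags = []
    for v, h in zip(values, hours):
        if v is None:
            flags.append(False)                # missing values are left to the imputation
        else:
            mu, sd = stats[h]
            flags.append(sd > 0 and v >= mu + k * sd)
    return flags

# Synthetic example (assumption): 10 days of constant demand with one spike at 08:00.
hours = [h for _ in range(10) for h in range(24)]
values = [5.0] * len(hours)
values[24 * 3 + 8] = 50.0
flags = flag_outliers(values, hours)
print(flags.count(True), flags[24 * 3 + 8])  # 1 True
```

Flagged values can then be set to missing and filled together with the original gaps in a single imputation step.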
The datasets have an hourly time step and span almost 7 years, from January 2013 to September 2019. The first 6 years of data are used for training and validating the model, while 2019 (i.e., 9 months of data) is used for testing and evaluating the model performance. A computer running Windows 10 with an AMD Threadripper 3960X processor is used to produce the predictions, using ad hoc code based on the Keras (Chollet et al. 2015) and TensorFlow (Abadi et al. 2016) libraries in Python.
Performance evaluation metrics
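A sketch of typical error measures is given here for illustration. The mean absolute percentage error (MAPE) is the measure named explicitly in the results; the root mean square error (RMSE), the mean absolute error (MAE) and the coefficient of determination (R²) are included as commonly used companions and are an assumption, as are the function names and the toy data.

```python
import math

def rmse(observed, predicted):
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))

def mae(observed, predicted):
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def r2(observed, predicted):
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def mape(observed, predicted):
    # Expressed in percent; observed values must be non-zero.
    return 100.0 / len(observed) * sum(abs((o - p) / o) for o, p in zip(observed, predicted))

# Toy data (assumption), not values from the case study.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [11.0, 18.0, 33.0, 40.0]
scores = {
    "RMSE": rmse(observed, predicted),
    "MAE": mae(observed, predicted),
    "R2": r2(observed, predicted),
    "MAPE": mape(observed, predicted),
}
```

For RMSE, MAE and MAPE lower values indicate a better forecast, whereas for R² values closer to one are better.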
RESULTS AND DISCUSSION
This section presents and discusses the results of the methodology. The section is divided into a first part that shows the results of the tuning process for the M-LSTM and a second part that discusses the forecasting results and the comparison with the benchmark models.
Tuning M-LSTM
Figure 4 shows the heatmap of the rank of each configuration (i.e., a combination of neurons and layers) evaluated, obtained by averaging the scores across the trained models. In particular, training is repeated for each configuration of layers and neurons in the tuning process, each time with a random initialisation. Because training is a stochastic optimisation, the results may vary between runs. To overcome this issue, the iterative grid search finds trends in the performance of each configuration by repeating the tuning and, at each iteration, averaging the score of the model with the results of the previous models. This process is applied to the M-LSTM model because of its complexity and two-module structure. The configuration with a higher and more constant rank is the one that achieves better and more stable results. As shown in Figure 4, the best configuration has two LSTM layers of 48 and 72 neurons for the water demand module and three LSTM layers of 48, 72 and 48 neurons for the meteorological module. In addition, all the models are trained using the Adam optimiser.
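The idea of repeating the grid search and averaging the scores can be sketched as follows. This is a simplified illustration and not the procedure of Menapace et al. (2021): the function names, the toy scoring function that stands in for model training, and the convention that rank 1 denotes the lowest mean error are all assumptions.

```python
import random

def iterative_grid_search(configs, train_and_score, n_iter=5, seed=0):
    """Repeat the grid search n_iter times, tracking running mean error and rank."""
    rng = random.Random(seed)
    mean_error = {c: 0.0 for c in configs}
    rank_history = {c: [] for c in configs}
    for it in range(1, n_iter + 1):
        for c in configs:
            error = train_and_score(c, rng)                # one randomly initialised run
            mean_error[c] += (error - mean_error[c]) / it  # average with previous runs
        ordered = sorted(configs, key=lambda c: mean_error[c])
        for position, c in enumerate(ordered, start=1):
            rank_history[c].append(position)               # 1 = lowest mean error so far
    return mean_error, rank_history

def toy_train_and_score(config, rng):
    # Hypothetical stand-in for training a network: by construction,
    # two-layer configurations have the lowest error, plus random noise.
    return 1.0 + abs(len(config) - 2) + rng.gauss(0.0, 0.05)

configs = [(48,), (48, 72), (48, 72, 48)]
mean_error, rank_history = iterative_grid_search(configs, toy_train_and_score)
best = min(mean_error, key=mean_error.get)
print(best, rank_history[best])
```

A configuration whose rank stays stable across iterations is preferred over one that reaches a good score only in a single run.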
Forecasting results
This section presents and discusses the results of the proposed M-LSTM model in the short-term forecasting of the presented case study. These results highlight the prediction performance of the proposed M-LSTM compared with the benchmark models used in this study. The metrics for the task of predicting just the following hourly value of water demand are reported in Table 1.
M-LSTM (1-h prediction) | MLP (1-h prediction) | SARIMA (1-h prediction)
---|---|---
1.649 | 2.169 | 2.432
0.864 | 1.067 | 1.084
0.934 | 0.913 | 0.897
8.82 | 10.427 | 10.665
Figure 5 shows the predictions of the models in three different periods of the testing dataset. The M-LSTM achieves better performance than the benchmarks. During the period in February, the M-LSTM achieves an overall MAPE of 8.3%, compared to 11.5% for the MLP and 11.8% for the SARIMA. In April, the M-LSTM achieves a MAPE of 10.2%, compared to 10.6 and 11.9% for the MLP and the SARIMA, respectively. The same behaviour can be noticed in June, when the M-LSTM achieves a MAPE of 8.4%, compared to 10.1 and 9.3% for the MLP and the SARIMA, respectively. These results show that the proposed M-LSTM model can provide a more reliable prediction of the water demand in this case study. The full tables of the MAPE for each month are reported in the supplementary material.
Regarding the task of predicting simultaneously 24-hourly values, the results are reported in Table 2.
M-LSTM (24-h prediction) | MLP (24-h prediction) | SARIMA (24-h prediction)
---|---|---
2.368 | 3.444 | 4.173
1.042 | 1.289 | 1.45
0.905 | 0.862 | 0.823
10.592 | 12.09 | 13.932
Table 2 further highlights the ability of the proposed M-LSTM model to predict the water demand time series. In fact, the M-LSTM achieves a score of 0.905 (third row of values in Table 2), which is consistently better than the 0.862 achieved by the MLP and the 0.823 achieved by the SARIMA. It is also worth noting that the M-LSTM model used for simultaneously predicting 24 values achieves results comparable to those of the benchmark models when they predict only the single next hourly value. This confirms the high prediction ability of the proposed model. The performance achieved in the 24-h prediction can also be attributed to the meteorological module, which provides the representation of the actual weather condition, especially thanks to the use of real data that are not affected by the uncertainty of the weather forecast.
Figure 6 highlights the predictions of the models during the three periods. In all the figures, the M-LSTM appears able to follow the true data accurately, providing a reliable prediction. In contrast, the MLP and the SARIMA models do not follow the time series well, especially during the peaks. Once again, the proposed M-LSTM model shows higher prediction performance, which is consistent across the three periods. The M-LSTM achieves a MAPE of 9.1, 13.7 and 10.2% during the periods in February, April and June, respectively. The MLP achieves a MAPE of 11.4, 15.1 and 16.5% and the SARIMA a MAPE of 13.4, 15.4 and 12.9% during February, April and June, respectively. The same behaviour can be noticed during the other months, and the full tables are reported in the supplementary material.
Finally, it is worth mentioning that the M-LSTM is more computationally intensive than the benchmark models. In the 1-h prediction task, an epoch with the M-LSTM requires approximately 10 s, compared to just 1 s with the MLP. This means that a complete training of 100 epochs requires approximately 15 min, compared to the 2 min of the MLP. Similar computation times are observed in the 24-h prediction task.
CONCLUSION
This study proposes a short-term forecasting model based on multivariate LSTM, named M-LSTM. The proposed model is designed with an innovative two-module structure to overcome the issue of using multiple input sequences with different time lengths. A first water demand module deals only with the extraction of features from the time series of water demand. The scope of this first module is to handle the long-time dependencies that characterise water demand. The second module concerns the short-term representation of meteorological data and aims to provide a representation of the weather condition at the prediction moment. The proposed methodology is tuned by adopting an iterative grid search approach and tested on a real time series of water demand. In addition, two conventional benchmark models are used to assess the performance of the M-LSTM: an MLP model and a SARIMA model.
The results show that the M-LSTM model outperforms both benchmarks. In particular, the model is tested on two prediction tasks: the 1-h prediction task, where the aim is to predict only the following hourly water demand value, and the 24-h prediction task, where the purpose is to predict simultaneously 24 values of water demand and cover a whole day. The proposed M-LSTM achieved superior performance in both tasks. Especially when predicting 24 values, the two-module composition of the M-LSTM appears to be particularly effective in providing a reliable prediction. The tests made in this study highlight the high prediction capability of the proposed M-LSTM model.
Future work will address the use of meteorological forecasts instead of historical values for weather data. Furthermore, the uncertainty related to these variables will also be taken into account during the model evaluation.
ACKNOWLEDGEMENTS
This study has been partially funded by the project ‘TESES-Urb—Techno-economic methodologies to investigate sustainable energy scenarios at urban level’ of the Free University of Bozen-Bolzano. The authors thank the anonymous reviewers for their valuable contribution. The authors also thank Novareti S.P.A for providing the data for this study.
DATA AVAILABILITY STATEMENT
Data cannot be made publicly available; readers should contact the corresponding author for details.
CONFLICT OF INTEREST
The authors declare there is no conflict.