Abstract
Inappropriate scheduling plans can cause additional economic losses and compromise the safety of water distribution networks (WDNs). Optimizing scheduling plans that rely on manual experience can help water utilities rationally allocate water plants and pump stations, ensuring the safety, stability, and economy of the water supply system. However, there is a lack of real-time, rational, and optimized scheduling methods. To address this, we propose a novel intelligent scheduling framework based on deep learning. In this framework, two neural network models, the multi-head convolutional gated recurrent unit network (MH-CGRU) and the multi-head gated recurrent unit network (MH-GRU), extract key features from the WDNs. Operating data are used as decision variables to predict and generate scheduling orders for water plants and pump stations, respectively. The rationality of the orders is verified by combining a high-precision online hydraulic model with an evaluation of the operational status of the WDNs. This system has been deployed in a real WDN and put into practical application. From June to November of 2022, the total adoption rate of all orders reached 96.29%, the average deviation between predicted and actual control targets was less than 5%, and energy consumption decreased by 3.05% compared to the previous year.
HIGHLIGHTS
Proposed an optimized, real-time, and secure intelligent control method for water supply networks based on deep learning algorithms.
Presented a data evaluation approach for selecting high-quality samples from monitoring data in the water supply system.
Developed an intelligent verification mechanism that combines a high-precision hydraulic model with scheduling orders for improved control reliability.
INTRODUCTION
The scheduling of water distribution networks (WDNs) is an essential component of urban water supply system operation and management for ensuring hydraulic and water quality safety, as well as energy reduction in the WDNs. On the one hand, with urbanization, the structure and scale of the water supply network become more and more complex, and the pump scheduling that relies on traditional manual experience gradually begins to face challenges. On the other hand, 70–80% of energy consumption stems from pumping stations' transmission and distribution (C. U. W. Supply, D. Association 2021), the overall operating efficiency of pumps in WDNs ranges from 50 to 75%, and the energy consumption of water transmission and distribution units is usually above 370 km3/MPa, far exceeding the industry's optimization target value of 350 km3/MPa (C. U. W. S. Association 2005), which means that there is a large space for energy consumption optimization.
To improve the efficiency of WDNs, numerous studies have attempted to address this issue. The methods developed in these studies can be roughly divided into two categories: rule-based control and optimization algorithm-based methods (Giustolisi et al. 2013). Rule-based control methods typically involve setting thresholds to trigger the operation of pumps. The simplest rule-based control is called fixed trigger levels (FTLs), which involves setting two water levels for controlling the pump's operation. The pump is activated when the tank water level falls below the lower (on-trigger) level, and it is turned off when the water level rises above the upper (off-trigger) level (Alvisi & Franchini 2017). Reduced fixed trigger levels (RFTLs) are improved methods based on FTLs, which distinguish between peak and off-peak periods. The method optimizes the on-trigger level during off-peak periods and the off-trigger level during peak periods. It tries to achieve a beneficial condition by ensuring that the water level in the tank reaches the highest acceptable point at the end of the off-peak period and the lowest acceptable point at the end of the peak period (Alvisi & Franchini 2016; Creaco et al. 2016; Marchi et al. 2016). Compared with the FTL and RFTL methods, the time-variable triggering method for pump control can provide an optimal arrangement between off-peak and peak power consumption periods. This method primarily calculates the triggering level patterns by solving a multi-objective optimization problem to minimize energy consumption and the number of pump station switches (Housh & Salomons 2019; Quintiliani & Creaco 2019). The advantage of these methods is that they can be used for real-time decision-making; however, such real-time decisions only respond to the current operating state without considering the overall optimality of the network operation over a period of time.
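The fixed trigger level rule can be illustrated with a short sketch. It assumes the usual tank-filling configuration, in which the pump starts once the tank level drops to the lower (on-trigger) level and stops once the level reaches the upper (off-trigger) level. The function name, trigger values, and level series are hypothetical and are not taken from the cited studies.

```python
def ftl_pump_state(level: float, pump_on: bool,
                   on_trigger: float, off_trigger: float) -> bool:
    """Return the new pump state under fixed trigger levels (hysteresis)."""
    if level <= on_trigger:
        return True    # tank is low: start pumping
    if level >= off_trigger:
        return False   # tank is full enough: stop pumping
    return pump_on     # between the two triggers: keep the current state


# Hypothetical tank levels (m) at successive time steps.
levels = [3.0, 2.4, 1.9, 2.5, 3.6, 4.1, 3.5]
state = False
states = []
for lv in levels:
    state = ftl_pump_state(lv, state, on_trigger=2.0, off_trigger=4.0)
    states.append(state)
```

Because the rule only looks at the current level, it reacts in real time but cannot anticipate demand or tariff periods, which is the limitation noted for rule-based control.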
The optimization algorithm-based method usually refers to a method that uses an optimization algorithm to decide whether a pump should be on or off at a certain time. Linear programming (LP) is a classical method that can handle optimization problems with constrained objective functions and is suitable for optimization problems with the decision variables of WDNs (Giacomello et al. 2013; Puleoa et al. 2014); however, because WDNs do not satisfy the assumption of a linear system, the optimization effect is poor. Nonlinear programming (NLP) takes into consideration the nonlinear hydraulics of the WDNs and can be applied to practical scenarios; however, the computation time grows exponentially with the number of decision variables, limiting its application to large WDNs (El Mouatasim 2012; Skworcow et al. 2014). The mixed-integer nonlinear programming (MINLP) approach is more computationally efficient than nonlinear programming in this situation, but the issue of computational time consumption remains (Samani & Zanganeh 2010; Fooladivanda & Taylor 2015; Costa et al. 2016; Khatavkar & Mays 2017). Meta-heuristic algorithms, such as genetic algorithms (Costa et al. 2010; Moreira & Ramos 2013; Odan et al. 2015; Makaremi et al. 2017), ant colony optimization (Hashemi et al. 2014; Babaei et al. 2015), and particle swarm optimization (Rajabpour et al. 2015a, 2015b; Rajabpour & Talebbeydokhti 2020), have been proven capable of handling large-scale problems and can be applied to WDNs in the real world. These methods are applicable to both continuous and discrete variables and can search for global optima in parallel.
Nevertheless, for real-time scheduling decisions, the following problems remain: first, the model calculation takes a long time, and it cannot achieve high-precision real-time response when applied to large WDNs; second, such algorithms frequently only consider optimality, without considering the safety of scheduling operations and the overall optimality of scheduling.
In recent years, deep learning algorithms have gained increasing attention as data-driven modeling tools for applications such as computer vision (LeCun et al. 2010; Leibe et al. 2010), natural language processing (Young et al. 2018; Otter et al. 2020), and speech recognition (Dahl et al. 2011; Nassif et al. 2019). For the pump scheduling problem in the WDNs, deep learning-related research is still very limited. Studziński & Ziółkowski (2020) propose a method using neural networks instead of hydraulic models, which can accelerate the model solution. There are also many studies applying deep reinforcement learning in pump station scheduling to optimize the reward function by controlling agents interacting with hydraulic models (Xu et al. 2021; Donâncio et al. 2022; Hu et al. 2023). However, a common problem in these studies is that the optimality and reliability of the scheduling orders depend on the simulation accuracy of the hydraulic model. Different from the above studies, this study automatically extracts temporal and spatial features from real water distribution network (WDN) monitoring data based on deep learning and establishes a mapping relationship between the operational status of the WDN and the scheduling orders.
In general, the main obstacle to the application of optimization and scheduling technologies for WDNs is the inability to balance real-time decision-making, optimization capability, and water supply safety. To address these technical bottlenecks, this paper implements a real-time intelligent scheduling system based on deep learning algorithms and applies it to a large-scale water supply network in Shanghai, with the following features:
Real-time capability: The system captures monitoring data from the water supply system and generates scheduling instructions through a pre-trained deep learning scheduling model at a frequency of every 5 min.
Optimality: The training data are optimized for model performance using various data evaluation indicators, and the results demonstrate improved energy-saving levels when compared to historical energy consumption.
Safety: The system integrates a high-precision hydraulic network model to assist in scheduling decisions and filters out unreasonable orders.
METHODOLOGY
Framework description
The first phase is the offline training process, which includes historical data collection, data preprocessing, data optimization, and ultimately generating a training dataset to train water plant pressure prediction models, pump station pressure prediction models, and water level prediction models.
The second phase is the online scheduling process, which involves receiving real-time monitoring data of the WDNs and inputting it into the deep learning model to obtain future water plant pressure, pump station pressure, and water level predictions. Then, these prediction results are used to trigger corresponding scheduling orders. The system then combines these orders with a high-precision hydraulic model simulation to validate the reasonableness and optimization of the orders. Finally, the orders are pushed to the operator for manual decision-making.
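The online phase can be sketched as a single scheduling step that chains prediction, order generation, and hydraulic validation before anything is pushed to the operator. This is a minimal illustration under stated assumptions: the three callables stand in for the trained deep learning model, the order-triggering logic, and the hydraulic-model check, and all names and toy values are hypothetical.

```python
from typing import Callable, Dict, List

Order = Dict[str, float]
State = Dict[str, float]


def scheduling_step(
    monitoring: State,
    predict: Callable[[State], State],
    to_orders: Callable[[State], List[Order]],
    is_reasonable: Callable[[State, Order], bool],
) -> List[Order]:
    """Run one online step: predict, generate orders, keep the validated ones."""
    predictions = predict(monitoring)        # future pressures / water levels
    candidates = to_orders(predictions)      # candidate scheduling orders
    # Only orders that pass the (simulated) validation reach the operator.
    return [o for o in candidates if is_reasonable(monitoring, o)]


# Toy stand-ins so the sketch runs end to end (not the real models).
def toy_predict(m: State) -> State:
    return {"plant_pressure": m["plant_pressure"] + 1.0}


def toy_to_orders(p: State) -> List[Order]:
    return [{"target_pressure": p["plant_pressure"]}]


def toy_check(m: State, o: Order) -> bool:
    return abs(o["target_pressure"] - m["plant_pressure"]) <= 2.0


pushed = scheduling_step({"plant_pressure": 30.0},
                         toy_predict, toy_to_orders, toy_check)
```

In deployment such a step would be invoked on a fixed schedule with fresh monitoring data; the final decision still rests with the operator.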
Data and case description
Water distribution system description
In addition, there are two pump stations in the area, XinFengXi and HuaXiang (hereinafter referred to as XFX and HX). The HX reservoir has a capacity of 15,000 m3, and its pumping station has five pumps. The XFX reservoir has a capacity of 7,500 m3, and its pump station also has five pumps. Many SCADA monitoring points are arranged in the area, in two main categories: flow monitoring points and pressure monitoring points. There are 24 flow meters, which record the pressure and flow data of the main pipelines with diameters of DN 500–1,000, and 27 pressure monitoring points, which record the flow and pressure data of the pressure measurement nodes.
Data description and preprocessing
Data evaluation and training dataset construction
In deep learning algorithms, the model's understanding of the system properties of the learning object is directly related to the selection of the training dataset (Shokri & Shmatikov 2015). In the historical monitoring data of WDNs, differences in experience among operators can also lead to variations in the scheduling performance of the WDNs. To extract excellent scheduling experience from historical monitoring data, it is crucial to evaluate the quality of the WDNs' operation status. Therefore, this study proposes a data evaluation method, including pressure satisfaction and historical state similarity. This method evaluates the WDNs operation status from the perspectives of safety, rationality, and optimization, and selects data samples representing excellent scheduling experience as the training dataset.
For each moment $t$, the operation status of the WDNs, $S_t$, is represented as a vector that includes water plant pressure and flow ($WPP_t$, $WPF_t$), pump station pressure and flow, and monitoring point pressure. Each component is discretized into $n$ state intervals using $f$. An indicator function is then used to calculate the proportion of historical data that share the same operating status as the current WDNs in the total historical data. Finally, the values are normalized to a range between 0 and 1.
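The historical state similarity can be sketched as follows. This is a minimal sketch under assumptions that the text does not specify: equal-width intervals between the historical minimum and maximum of each component, and min-max normalization of the resulting proportions. The function names and sample values are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def discretize(state: Sequence[float], lows: Sequence[float],
               highs: Sequence[float], n: int) -> Tuple[int, ...]:
    """Map each component to one of n equal-width intervals (index 0..n-1)."""
    bins = []
    for x, lo, hi in zip(state, lows, highs):
        idx = int((x - lo) / (hi - lo) * n)
        bins.append(min(max(idx, 0), n - 1))   # clamp the upper edge into the last bin
    return tuple(bins)


def similarity_scores(history: List[Sequence[float]], n: int) -> List[float]:
    """Share of history in the same discretized state, scaled to [0, 1]."""
    dims = len(history[0])
    lows = [min(s[d] for s in history) for d in range(dims)]
    highs = [max(s[d] for s in history) for d in range(dims)]
    # Guard against a constant component (zero-width range).
    highs = [hi if hi > lo else lo + 1.0 for lo, hi in zip(lows, highs)]
    keys = [discretize(s, lows, highs, n) for s in history]
    counts = Counter(keys)                     # plays the role of the indicator sum
    props = [counts[k] / len(history) for k in keys]
    p_min, p_max = min(props), max(props)
    if p_max == p_min:
        return [1.0] * len(props)
    return [(p - p_min) / (p_max - p_min) for p in props]


# Hypothetical (pressure, flow) samples: three similar states and one outlier.
history = [(30.0, 100.0), (30.2, 101.0), (30.1, 100.5), (35.0, 150.0)]
scores = similarity_scores(history, n=2)
```

States that recur often in the history receive scores near 1, while rare states score near 0.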
Deep learning model for scheduling
The working principle of the deep learning-based WDNs scheduling model is based on model predictive control (MPC), in which the deep learning model predicts the future changes in the scheduling decision variables and generates corresponding orders based on the predictions (Camacho & Bordons 2007; Katz et al. 2020). In the specific application scenario of this research, the main control variables of the WDNs include water plant pressure, pump station pressure, and tank water level. Firstly, for the two different scheduling facilities (water plants and pump stations), two neural network structures, MH-CGRU and MH-GRU, are designed separately for predicting the target control variables.
Additionally, thanks to the proportional integral derivative (PID) negative feedback control device (Johnson & Moradi 2005), water plants can directly adjust the operation frequency of water pumps by setting the control pressure; for pump stations, there are mainly two types of pumps: booster-type pumps and reservoir-type pumps. Booster-type pumps work by increasing the water pressure flowing through them, directly improving the water pressure in the connected pipes, and their on–off status can be reflected by the changes in the outlet pressure. Reservoir-type pumps operate by transporting water from a reservoir to the WDNs, and their on-off status can be reflected by changes in the reservoir water level. Therefore, to obtain the control orders for the pump station, it is necessary to first predict the future outlet pressure and reservoir water level that need to be achieved, and then map the prediction results into specific control orders.
Prediction model for water plant operation
In the proposed MH-CGRU neural network, a multi-head structure with different model inputs is employed. Pump station outlet pressure, main pipeline pressure, and monitoring point pressure over the past 3 h are fed into head 1 to perceive the current environmental state, while pump station outlet pressure, main pipeline pressure, monitoring point pressure, water plant outflow, and water plant pressure from the same period of the past 7 days are fed into head 2 to learn the periodic variation patterns of the WDN operation state. In each head, two Conv1D layers extract spatial correlations among the monitoring data, and a GRU layer models long-term temporal dependencies. Finally, the encoded features are concatenated and input into a fully connected layer, which generates the target control pressure for the water plant in the next hour.
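The two input windows can be illustrated for a single monitoring series. The sketch assumes 5-min sampling and takes one sample per day at the same time of day for the periodic head; both choices, as well as the function and constant names, are assumptions for illustration rather than details given in the text. It covers only the data slicing, not the Conv1D or GRU layers.

```python
from typing import List, Sequence, Tuple

STEP_MIN = 5                              # assumed sampling interval (minutes)
STEPS_PER_DAY = 24 * 60 // STEP_MIN       # 288 samples per day
RECENT_STEPS = 3 * 60 // STEP_MIN         # 36 samples in the past 3 h
PERIODIC_DAYS = 7


def build_head_inputs(series: Sequence[float],
                      t: int) -> Tuple[List[float], List[float]]:
    """Return (recent window for head 1, same-time samples for head 2)."""
    if t < PERIODIC_DAYS * STEPS_PER_DAY:
        raise ValueError("need at least 7 days of history before index t")
    recent = list(series[t - RECENT_STEPS:t])
    # Oldest day first, so the sequence is in chronological order.
    periodic = [series[t - k * STEPS_PER_DAY]
                for k in range(PERIODIC_DAYS, 0, -1)]
    return recent, periodic


# Hypothetical series whose value equals its own index, for easy checking.
series = [float(i) for i in range(8 * STEPS_PER_DAY)]
recent, periodic = build_head_inputs(series, t=7 * STEPS_PER_DAY)
```

In a multivariate setting the same slicing would be applied to every monitoring channel before the windows are passed to the respective heads.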
Prediction model for water pump station
Pump stations are typically used for local pressurization to ensure the water supply pressure at the most disadvantageous consumption node (MDCN). Therefore, when predicting the pump station outlet pressure and tank water level, the relationship between the pressure at the MDCN and the pump station's attributes (such as tank water level, outlet pressure, and flow) needs to be considered. In this section, a network called MH-GRU that takes these factors into account is designed for predicting the pump station outlet pressure and water level, with the network structure illustrated in Figure 3(b).
Similar to MH-CGRU, the input of MH-GRU is also divided into two parts: the past 3 h of input data is used to perceive the current network operation state, while the data from the same period 7 days before are used for learning the periodic patterns of the prediction targets. However, the MH-GRU network does not include Conv1D blocks. This is attributable to the fact that for water plants that need to supply water to the entire region, the model must consider the spatial correlations of global input features, while for pump stations mainly used for local pressurization, the pressure of the MDCN and the station's own state are the primary considerations.
After obtaining the predicted values for the pump station outlet pressure and tank water level, a method is needed to convert these predictions into control signals. Therefore, this study implements a threshold-triggered method to convert the predicted pump station pressure and tank water level into pump station control orders. The main principle of this triggering method is to set specific trigger thresholds for different control scenarios and to jointly consider the predicted variable value and the change in the predicted variable. This is because capturing the timing of pump station switching requires considering both the current state of the predicted variable and its transient changes.
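A threshold-triggered conversion for a booster-type pump might look like the sketch below. It assumes that a predicted outlet pressure that is both high and rising indicates that a pump should be switched on, and that a low and falling prediction indicates a switch-off. The threshold values, the minimum change, and the function name are hypothetical, and the actual thresholds used by the system are not given in the text.

```python
from typing import Optional


def booster_pump_order(pred_pressure: float, prev_pressure: float,
                       on_threshold: float, off_threshold: float,
                       min_change: float) -> Optional[str]:
    """Map a predicted outlet pressure and its change to a control order."""
    change = pred_pressure - prev_pressure
    # Both conditions must hold: the level of the prediction and its change.
    if pred_pressure >= on_threshold and change >= min_change:
        return "switch_on"
    if pred_pressure <= off_threshold and change <= -min_change:
        return "switch_off"
    return None   # no order is triggered


# Hypothetical pressures (m) with illustrative thresholds.
rising = booster_pump_order(32.0, 30.0, on_threshold=31.5,
                            off_threshold=28.5, min_change=1.0)
falling = booster_pump_order(28.0, 30.0, on_threshold=31.5,
                             off_threshold=28.5, min_change=1.0)
steady = booster_pump_order(31.6, 31.4, on_threshold=31.5,
                            off_threshold=28.5, min_change=1.0)
```

Requiring a minimum change in addition to the threshold avoids issuing an order every time a slowly drifting prediction sits just above a threshold.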
Model validation
In the practical application of intelligent scheduling systems for WDNs, it is crucial to ensure the rationality of control orders from a programming perspective. To evaluate control orders without actual execution, hydraulic simulation is clearly the best choice. Therefore, this section proposes an evaluation method for scheduling schemes based on hydraulic model simulation. Combined with the evaluation indicators of the operating status of the WDNs, the method assesses the rationality of orders and filters them automatically according to the changes before and after simulation.
Real-time hydraulic modeling of the WDNs
Firstly, a hydraulic model was established based on EPANET 2.0, shown in Figure 2 (Rossman 2000). The hydraulic model, after calibration, has a mean absolute error (MAE) of less than 2 m for the pressure of the scheduled facility (water plant and pump station) and pressure monitoring points, and a relative error of less than 10% for the scheduled facility flow rate. The specific errors can be found in Appendix A.
An online hydraulic model calculation interface was then developed based on WNTR. This calculation interface accepts real-time water monitoring data of the WDNs and control orders generated by the neural network model as inputs and performs a single-step simulation to obtain the operating status of the WDNs after the execution of the control orders, and the specific input and output items are shown in Table A.5 and Table A.6.
In this formula, the observed data at time t are combined with the simulated data at time t and at the next time step. The reason is that the error in the simulation result for the next step comes from two sources: the simulation error at time t and the simulation error of the change between the two steps. Here, the simulation error at time t is eliminated by replacing the simulated value at time t with the observed value.
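One way to read this correction is that the simulated change between the two steps is added to the observed value at time t. The sketch below shows that interpretation; it is an assumption based on the description, not the exact formula, and the function name and numbers are hypothetical.

```python
def corrected_forecast(observed_t: float, simulated_t: float,
                       simulated_next: float) -> float:
    """Anchor the simulated one-step change to the observed value at time t."""
    simulated_change = simulated_next - simulated_t
    # The offset between simulation and observation at time t cancels out.
    return observed_t + simulated_change


# Hypothetical pressures (m): the model is biased by +1.0 m at time t.
forecast = corrected_forecast(observed_t=30.0, simulated_t=31.0,
                              simulated_next=32.5)
```

With this reading, a constant model bias no longer affects the forecast, and only the error in the simulated change remains.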
Orders filtering
Variable frequency control
Due to the presence of variable frequency pumps in the pumping station, in order to more accurately optimize the scheduling performance, it is not only necessary to predict the on/off status of each pump but also to set the optimal frequency for each variable frequency pump. In this study, the intelligent scheduling system uses the neural network model to predict the pump switching signals and combines the online hydraulic model to simulate the operation status of the WDNs after the order execution under different frequency conditions. Finally, based on the evaluation of the operation status of the WDNs, the frequency with the highest score is selected.
Apart from the pump's switching signals, during the scheduling operation, the order trigger of the variable frequency pump is subject to negative feedback control from the pressure at the MDCN. When this pressure exceeds or falls below the set thresholds, the aforementioned operation is repeated to simulate the operation state of the WDN under different frequencies, thereby generating the optimal control frequency.
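The frequency selection and its feedback trigger can be sketched as a search over candidate frequencies. The `simulate` and `score` callables stand in for the online hydraulic model and the operation status evaluation; the toy versions below, the candidate frequencies, and the pressure band are all hypothetical.

```python
from typing import Callable, Dict, Iterable, Optional

State = Dict[str, float]


def select_frequency(candidates: Iterable[float],
                     simulate: Callable[[float], State],
                     score: Callable[[State], float]) -> Optional[float]:
    """Simulate each candidate frequency and return the best-scoring one."""
    best_freq, best_score = None, float("-inf")
    for freq in candidates:
        s = score(simulate(freq))
        if s > best_score:
            best_freq, best_score = freq, s
    return best_freq


def needs_retune(mdcn_pressure: float, lower: float, upper: float) -> bool:
    """Negative feedback trigger: re-run the search outside the pressure band."""
    return mdcn_pressure < lower or mdcn_pressure > upper


# Toy stand-ins: pressure grows linearly with frequency; target is 22.5 m.
def toy_simulate(freq: float) -> State:
    return {"mdcn_pressure": 0.5 * freq}


def toy_score(state: State) -> float:
    return 100.0 - 10.0 * abs(state["mdcn_pressure"] - 22.5)


best = select_frequency([40.0, 45.0, 50.0], toy_simulate, toy_score)
```

An exhaustive search is adequate here because only a handful of candidate frequencies need to be simulated for each order.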
From the above, it is clear that the frequency control method relies on three components: the real-time calculation of the intelligent scheduling model, the multi-condition hydraulic simulation of the online hydraulic model, and the evaluation of the operation status. It further optimizes the scheduling performance beyond the generation of pump on/off signals and is a crucial component of pump station scheduling. It may also enrich the control methods of the intelligent scheduling system under various operating conditions and make the pressure regulation of pipeline networks more stable and reasonable.
RESULTS AND DISCUSSION
Optimization of data evaluation weights
In consideration of the data evaluation method mentioned in section 2.3, one of the challenges is to find an optimal combination of weights to calculate scores and filter data, such that the model trained has the best performance on the validation dataset. To address this, we designed a set of experiments using four combinations of weights to filter the data. Then, we tested the performance of the trained model on the validation dataset (from 23 May 2021 to 31 December 2021), and the specific results are shown below. It should be noted that from 20 July 2022 to 11 August 2022, and 14 September 2022 to 25 October 2022, the system was temporarily suspended and switched to manual scheduling due to pump cutting and saltwater intrusion, respectively.
From Table 1, it can be observed that the performance of the model improves as the weight of pressure satisfaction increases from 0.2 to 0.6, and then degrades slightly at 0.8. The best performance is achieved with the weight combination of (0.6, 0.4), which results in the lowest RMSE, MAE, and MAPE. This suggests that pressure satisfaction plays a significant role in filtering suitable data samples for training the model.
Weight combination (pressure satisfaction, historical state similarity) | RMSE | MAE | MAPE |
---|---|---|---|
(0.2, 0.8) | 19.10 | 17.38 | 7.58 |
(0.4, 0.6) | 14.96 | 11.04 | 4.79 |
(0.6, 0.4) | 10.45 | 7.92 | 3.32 |
(0.8, 0.2) | 10.49 | 8.06 | 3.37 |
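For reference, the three error metrics reported for the weight combinations follow their standard definitions, sketched below. The sample values are hypothetical and unrelated to the validation data.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted))
                     / len(actual))


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a)
                       for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical control pressures.
actual = [30.0, 32.0, 28.0]
predicted = [31.0, 31.0, 29.0]
errors = (rmse(actual, predicted), mae(actual, predicted),
          mape(actual, predicted))
```

RMSE penalizes large deviations more heavily than MAE, so a gap between the two points to occasional large prediction errors.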
Scheduling system operation performance
In order to accurately evaluate the actual performance of the intelligent scheduling system, we continuously collected scheduling order data from June 2022 to November 2022 and analyzed the adoption rate, evaluation scores, and energy consumption per unit.
Order adoption rate
The order adoption rate is the ratio of the actual number of executed orders to the number of orders pushed to the operator by the intelligent scheduling system. A higher adoption rate means that professionals have a higher appreciation of the reasonability and reliability of the orders generated by the intelligent scheduling system. The data of the order adoption rate from June 2022 until November 2022 have been summarized in Table 2.
Facility | Number of orders generated | Number of orders executed | Orders adoption rate (%) |
---|---|---|---|
Water plant | 224 | 223 | 99.55 |
Pump station | 1,446 | 1,385 | 95.78 |
Total | 1,670 | 1,608 | 96.29 |
The result shows that the total adoption rate of all the orders generated by the intelligent scheduling system is 96.29%, including 99.55% for water plant orders and 95.78% for pump station orders, which indicates that the orders generated by the intelligent scheduling system are reasonable and can meet the water supply demand of the service area most of the time. The very few scheduling orders that were not adopted were mainly false triggers due to unstable monitoring data. The control algorithm should be further modified in the future to avoid the influence of monitoring noise on scheduling orders.
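The adoption rates can be recomputed directly from the order counts in Table 2, using the definition of executed orders divided by generated orders. The function name is illustrative.

```python
def adoption_rate(executed: int, generated: int) -> float:
    """Adoption rate in percent, rounded to two decimals."""
    return round(100.0 * executed / generated, 2)


# Order counts from Table 2 (June to November 2022).
water_plant = adoption_rate(223, 224)
pump_station = adoption_rate(1385, 1446)
total = adoption_rate(223 + 1385, 224 + 1446)
```

The total is weighted by order counts, so it sits close to the pump station rate, which accounts for most of the orders.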
Model evaluation score
According to the statistics of the evaluation scores for the scheduling orders, 89% of the scores after executing the scheduling orders are greater than 75, which is considered the ‘excellent’ level. The low-scoring periods are mainly concentrated in the periods when the intelligent scheduling system was shut down, while the scheduling scores are higher in other periods. This indicates that, compared with manual scheduling, the scheduling orders generated by the intelligent scheduling system perform better in terms of rationality, optimization, stability, and accuracy.
Unit energy consumption
In Figure 8, the dark blue line represents the change in unit energy consumption data in 2022, the light blue line represents the change in unit energy consumption data in 2021, and the portion marked in red represents the period when the intelligent scheduling system was out of operation due to water system maintenance and the impact of salty tides. The results show that the unit energy consumption in 2021 is higher than that in 2022 during the time when the intelligent scheduling system was in operation, and the overall unit energy consumption is reduced by 3.05% on average compared to the historical period. Meanwhile, within the 2022 data, the energy consumption of the WDNs increases noticeably when the intelligent scheduling system is switched to manual scheduling, which indicates that the scheduling orders generated by the intelligent scheduling system can effectively reduce the energy consumption of the WDNs compared with manual scheduling.
CONCLUSIONS
This paper proposes a novel real-time intelligent scheduling system for large-scale WDNs based on deep learning models. The workflow of the real-time intelligent scheduling system includes: (1) screening samples from historical data through the data evaluation method; (2) constructing two novel deep learning neural networks (MH-CGRU and MH-GRU) to generate scheduling orders by building the mapping relationship between the operating status of the WDNs and the decision-making variables of the pump station; and (3) combining with the hydraulic model to further simulate the execution performance of the scheduling orders to ensure their reasonableness.
The proposed framework is successfully applied to real-time scheduling in the Qingdong region of Shanghai, and more than 96% of the system-generated scheduling orders were adopted by manual operators during the actual operation period, which demonstrates the reasonableness of the generated orders. In addition, the framework has proven to have good tracking performance, with a tracking error of less than 3%. More importantly, the energy consumption of the water supply system was reduced compared to the manual scheduling data of the previous year, which demonstrates the optimization capability of the intelligent scheduling system.
In future work, reinforcement learning algorithms can be further incorporated to construct intelligent agents that fully interact with the hydraulic model, with reward functions designed in conjunction with the existing scheduling evaluation mechanism, to expand the application of the scheduling system under various abnormal conditions (e.g., pump shutdowns and water supply accidents).
DATA AVAILABILITY STATEMENT
Data cannot be made publicly available; readers should contact the corresponding author for details.
CONFLICT OF INTEREST
The authors declare there is no conflict.