ABSTRACT
This study investigates rapid dynamic pressure variations in water distribution networks due to critical incidents such as pipe bursts and valve operations. We developed and implemented a machine learning (ML)-based methodology that surpasses traditional slow cycles of pressure data acquisition, facilitating the efficient capture of transient phenomena. Employing the Orion ML library, which features advanced algorithms including long short-term memory dynamic threshold, autoencoder with regression, and time series anomaly detection using generative adversarial networks, we engineered a system that dynamically adjusts data acquisition frequencies to enhance the detection and analysis of anomalies indicative of system failures. The system's performance was extensively tested using a pilot-scale water distribution network across diverse operational conditions, yielding significant enhancements in detecting leaks, blockages, and other anomalies. The effectiveness of this approach was further confirmed in real-world settings, demonstrating its operational feasibility and potential for integration into existing water distribution infrastructures. By optimizing data acquisition based on learned data patterns and detected anomalies, our approach introduces a novel solution to the conventionally resource-intensive practice of high-frequency monitoring. This study underscores the critical role of advanced ML techniques in water network management and explores future possibilities for adaptive monitoring systems across various infrastructural applications.
HIGHLIGHTS
Developed a machine learning (ML)-based system to dynamically monitor pressure in water networks.
Utilized advanced Orion ML library algorithms for real-time anomaly detection.
Enhanced operational efficiency by optimizing data acquisition rates.
Demonstrated effectiveness in pilot and real-world water distribution settings.
Integrated Orion ML library with advanced algorithms to improve infrastructure resilience.
INTRODUCTION
In water distribution networks, events such as pipe bursts and valve operations manifest through the dynamics of pressure waves. However, the slow cycles of pressure data acquisition, relative to the rapid propagation of pressure waves, pose challenges in capturing these dynamics effectively. Ongoing research has been directed toward overcoming this limitation in actual water distribution systems (WDSs). For instance, Choi et al. (2015) analyzed the complexities of pressure monitoring in water supply systems, particularly challenges arising from valve operations. They identified the constraints of existing supervisory control and data acquisition (SCADA) systems in recording rapid transient events and stressed the importance of strategic data sampling locations and intervals. Furthermore, Starczewska et al. (2015) examined the occurrence and impacts of transient events on water networks, highlighting the need for a thorough analysis of transient phenomena within complex network configurations.
Considering the increasing complexities in monitoring and managing WDSs, recent research has significantly enhanced our understanding and capabilities. Dai et al. (2024a) conducted a comparative assessment of global sensitivity approaches to elucidate the uncertainty in water resources models, revealing crucial insights into the constraints and possibilities of current hydrological models under varying parameters. Furthermore, Dai et al. (2024b) introduced a novel two-step Bayesian network-based process for sensitivity analysis in complex nitrogen reactive transport modeling, providing a new perspective on managing the intricacies of nutrient transport in water networks. Complementing this research, Dai et al. (2023) undertook experimental and numerical studies on the mechanisms of ground collapse resulting from underground drainage pipe leakage, which directly impacts the structural integrity and operational reliability of urban water systems. Additionally, Hu et al. (2023) have provided advanced techniques for enhancing defect feature purification in multilabel sewer defect classification, pushing the boundaries of technology in precisely identifying infrastructure anomalies. These studies collectively underscore the critical role of integrating advanced analytical methodologies and the latest machine learning (ML) techniques to improve the robustness and accuracy of anomaly detection and management in water distribution networks. Based on these recent advancements, this study proposes an innovative method to dynamically adjust data acquisition frequency based on ML-driven insights, thereby addressing both immediate and long-term challenges in water network management.
Insights obtained from recent studies underscore the critical operational benefits of event detection in water distribution networks. High-frequency, transient-flow pressure data facilitate the precise and cost-efficient identification of system conditions, such as leaks, variations in wave speed, blockages, and trapped air pockets. These detection methods typically rely on inverse transient analysis (ITA), as in the work of Colombo et al. (2009), Duan et al. (2011, 2014), Ferrante et al. (2014), and Vítkovský et al. (2007) on leak detection, and that of Covas & Ramos (2010), Kim (2011), and Kim et al. (2014) on identifying wave speeds, blockages, and air pockets. To detect such parameters with the ITA technique, a transient flow must first be generated intentionally; while effective, this can lead to a severe hydraulic accident in the WDS. Therefore, attempts have been made to inject a small pressure wave into the pipeline network and record the reflected pressure signal without high risk (Brunone et al. 2021; Lee et al. 2021). Recent advancements have shown the effectiveness of high-resolution pressure sensors for accurate leak localization (Levinas et al. 2021). Furthermore, time-series analysis using multiple pressure sensors has been utilized to improve leakage detection in WDSs.
Transient-flow events commonly occur in WDSs owing to the periodic operation of hydraulic components. However, these events are rarely observed by WDS data acquisition systems because pressure is sampled at a low frequency. Starczewska et al. (2015) argued that the current UK regulation for pressure monitoring in WDSs, at 15-min intervals, is not fast enough to capture pressure changes due to transient flow. The authors collected high-frequency (100 Hz) pressure data from various points in a real WDS and identified a number of severe transient-flow events that could not have been detected via low-frequency data collection. Choi et al. (2015) conducted a similar study, recording a valve-induced transient event in a real WDS at a 1-s time interval (1 Hz). This event could not have been recorded by the existing SCADA system of the WDS, which logs the pressure signal every 1 min (1/60 Hz).
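The sampling intervals quoted above can be compared with a short calculation. The sketch below counts how many samples are guaranteed to fall inside a transient of a given duration; the 10-s event duration and the function name are illustrative assumptions, not values from the cited studies.

```python
def guaranteed_samples(event_ms: int, interval_ms: int) -> int:
    """Minimum number of samples that land inside an event of length
    event_ms when one sample is taken every interval_ms (integer
    milliseconds are used to avoid floating-point rounding)."""
    return event_ms // interval_ms

EVENT_MS = 10_000  # hypothetical 10-s transient, for illustration only

# Sampling intervals mentioned in the text, in milliseconds.
intervals_ms = {
    "100 Hz": 10,
    "1 Hz": 1_000,
    "1 min (SCADA)": 60_000,
    "15 min (UK regulation)": 900_000,
}

for label, interval in intervals_ms.items():
    print(f"{label:>24}: {guaranteed_samples(EVENT_MS, interval)} sample(s)")
```

With these assumptions, the 1-min and 15-min intervals guarantee no sample inside the event at all, which is the gap the cited studies describe.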
Thus, transient events, which can be used for the diagnosis of WDSs, cannot be observed with low-frequency data acquisition but can be observed with high-frequency sampling. However, few papers have proposed appropriate data sampling frequencies for WDSs. Ye & Fenner (2014) investigated the appropriate sampling interval of flow data for burst alarms in a WDS. In that study, various sampling intervals of the flow rate were applied to observe their impact on the accuracy of burst detection. The results indicated that burst events with long durations can be detected even with low-frequency data by applying an adaptive Kalman filter algorithm. Recent field tests have further validated the importance of high-frequency data acquisition for accurate pipeline monitoring (Brunone et al. 2024). However, there have been no studies on the appropriate data sampling frequency for observing transient-flow events in WDSs.
In response to these challenges, this study introduces a device engineered to dynamically adjust its data acquisition frequency based on detected network events. We developed and implemented an unsupervised ML algorithm to facilitate this dynamic adjustment. The efficacy of this device and the ML algorithm's performance were rigorously tested in a pilot-scale water distribution network experimental setup, as well as in actual operational networks.
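The idea of event-driven acquisition can be illustrated with a minimal sketch. It is not the firmware of the device described here: the class name, the hold time, and the choice of a 1 Hz base rate and a 100 Hz burst rate are assumptions made only to show the switching logic.

```python
from dataclasses import dataclass

@dataclass
class AdaptiveSampler:
    """Chooses a sampling rate from the latest anomaly flag (illustrative)."""
    base_hz: float = 1.0     # assumed standard rate (1-s acquisition cycle)
    burst_hz: float = 100.0  # assumed high-frequency rate
    hold_s: float = 60.0     # hypothetical time to stay at the high rate
    _burst_until: float = float("-inf")

    def update(self, now_s: float, anomaly: bool) -> float:
        """Return the sampling rate to use at time now_s."""
        if anomaly:
            # Each new detection extends the high-rate period.
            self._burst_until = now_s + self.hold_s
        return self.burst_hz if now_s < self._burst_until else self.base_hz

sampler = AdaptiveSampler()
for t, flag in [(0, False), (10, True), (50, False), (70, False)]:
    print(t, sampler.update(t, flag))
```

The hold time keeps the device at the high rate for a while after a detection, so that the tail of a pressure wave is recorded before the rate drops back.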
METHODS
Time-series anomaly detection using the Orion ML library
In this study, we utilized the Orion ML library, which was developed by the AI Laboratory at the Massachusetts Institute of Technology. This library is specifically tailored for the unsupervised detection of anomalies in time-series data and includes a suite of automated ML tools designed to handle a diverse array of datasets – from spacecraft telemetry signals to soil moisture levels and urban traffic patterns (Alnegheimish et al. 2022).
Anomaly detection in our study is defined as the process of identifying unexpected changes in water pressure that may signal critical failures, such as pipe bursts or abrupt valve closures within a water distribution network. These incidents are among the most detrimental that can occur in these systems.
The ML models provided by Orion follow several key steps. Initially, an ML algorithm is trained to recognize patterns within the data. Following the training phase, the model constructs a predictive time series of values derived from these learned patterns. This predictive series is then compared against actual observed data. Any significant deviations between the model's predictions and the actual data series are identified as anomalies.
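The predict-then-compare logic described above can be sketched in a few lines. This is a simplified stand-in, not Orion's implementation: a moving average replaces the trained model, and the fixed mean-plus-k-standard-deviations threshold, the `history` length, and `k` are assumptions chosen for illustration.

```python
from statistics import mean, stdev

def detect_anomalies(values, history=5, k=3.0):
    """Flag indices whose prediction error exceeds mean + k * std of all errors."""
    errors = []
    for i in range(history, len(values)):
        predicted = mean(values[i - history:i])  # stand-in for a learned predictor
        errors.append(abs(values[i] - predicted))
    threshold = mean(errors) + k * stdev(errors)
    # Shift back by `history` so the indices refer to the input series.
    return [i + history for i, e in enumerate(errors) if e > threshold]

# Synthetic pressure trace: steady at 1.0 with one sudden drop at index 30.
signal = [1.0] * 50
signal[30] = 0.2
print(detect_anomalies(signal))
```

In this toy trace only the sample at the pressure drop is flagged; the smaller errors that follow, caused by the drop entering the averaging window, stay below the threshold.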
ML models
Our study utilizes ML models from the Orion library, which have been shown to outperform traditional autoregressive integrated moving average techniques for processing time-series data (Wong et al. 2022). The employed models include the long short-term memory (LSTM) dynamic threshold, autoencoder with regression (AER), and time-series anomaly detection using generative adversarial networks (TadGAN).
The LSTM dynamic threshold method, specifically utilized by Hundman et al. (2018) for detecting anomalies in spacecraft, employs LSTM networks to dynamically adjust thresholds for pinpointing anomalies. This method combines the robust capabilities of LSTM networks with a non-parametric dynamic thresholding approach to achieve precise anomaly detection. Furthermore, the effectiveness of this technique has been validated with real spacecraft data, illustrating various strategies to enhance system performance under actual operational conditions.
The AER model integrates the strengths of both prediction- and reconstruction-based approaches, employing a joint objective function to train an autoencoder with a regression component. This design enables the generation of both reconstruction- and prediction-based anomaly scores. Wong et al. (2022) introduced this innovative architecture to enhance anomaly detection in time-series data by overcoming the limitations inherent in existing methodologies. The AER model yields more precise anomaly scores by executing bidirectional predictions and reconstructions concurrently. Their research further investigates various methods for combining prediction- and reconstruction-based scores, demonstrating that such integration significantly improves the performance of anomaly detection systems.
TadGAN utilizes generative adversarial networks (GANs) to reconstruct time-series data and detect anomalies through contextual error assessment. Introduced by Geiger et al. (2020), TadGAN represents a cutting-edge GAN-based framework for anomaly detection that employs both Generators and Critics, integrated with LSTM networks, to effectively capture and reconstruct time-series distributions. This architecture benefits from cycle consistency loss, notably enhancing its ability to detect anomalies by accurately reconstructing time-series data.
Hyperparameters
The Orion ML library facilitates the construction of ML pipelines that integrate time-series data preprocessing techniques with anomaly detection methods and ML models, aiming to optimize classification performance through the ideal combination of hyperparameters. Particularly, the ‘interval’ and ‘window_size_portion’ hyperparameters were identified as having a significant impact on the classification performance in our experimental settings.
The ‘interval’ parameter dictates the frequency at which signal preprocessing is executed. The data for validation were collected at a rate of 100 Hz, and since the Orion library is not configured to process such high-frequency data directly, we preprocessed this data into a Unix time format compatible with the Orion library.
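Orion reads a signal as a table of integer timestamps and values, so sub-second samples need integer time stamps before they can be passed in. The exact mapping used in this study is not stated; the sketch below shows one possible approach, in which each sample receives its own integer timestamp separated by a fixed step. The function name, `step`, and `start` are hypothetical.

```python
def to_timestamp_rows(pressures, step=1, start=0):
    """Pair each pressure sample with an integer timestamp.

    step  -- integer spacing between consecutive samples (assumed)
    start -- timestamp assigned to the first sample (assumed)
    """
    return [(start + i * step, p) for i, p in enumerate(pressures)]

# Three consecutive 100 Hz samples (values in MPa, made up for illustration).
rows = to_timestamp_rows([0.51, 0.52, 0.50], step=10, start=1_700_000_000)
for timestamp, value in rows:
    print(timestamp, value)
```

The resulting pairs can be loaded into a two-column table named `timestamp` and `value`. Whatever step is chosen, the 'interval' hyperparameter then has to be set in the same units.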
The ‘window_size_portion’ is critical in determining the accuracy with which the model interprets the discrepancy between predicted and actual time-series values. Anomalies are identified based on data within the specified window size, and the ‘window_size_portion’ represents the proportion of the window size relative to the total data size. A smaller ratio suggests a narrower window, enhancing the detection of rapidly evolving intricate patterns, whereas a larger ratio facilitates the observation of broader trends. However, an excessively small window size can result in false positives (FPs) by incorrectly classifying non-existent events, underscoring the necessity of selecting an appropriate ‘window_size_portion’.
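To make the proportion concrete, the sketch below converts a 'window_size_portion' into a window length in data points. It assumes the length is obtained by truncating the product to an integer, and it uses a 13,000-point series (a 130-s record at 100 Hz with no aggregation) purely as an illustration; after aggregation by 'interval' the number of points would be smaller.

```python
def window_length(n_points: int, window_size_portion: float) -> int:
    """Window length in data points for a given portion of the series."""
    return max(1, int(n_points * window_size_portion))

N_POINTS = 13_000  # illustrative: 130 s sampled at 100 Hz

for portion in (0.01, 0.13, 0.23, 0.33):
    print(portion, window_length(N_POINTS, portion))
```

Under these assumptions the smallest portion gives a window of 130 points and the largest a window of 4,290 points, which shows how strongly this single hyperparameter changes the amount of context used for thresholding.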
Other hyperparameters, which appeared to have minimal impact on the classification performance of our experimental data, were maintained at their default settings in the Orion library; for example, the Adam optimizer was used with a batch size of 64. However, the number of training epochs had to be reduced because of computational constraints and operational requirements. The number of training epochs directly affects computation time, and given the operational exigencies of real-world water distribution networks, the response time – from the occurrence of an event to its detection – must be within 1 h. Consequently, training and event detection needed to be completed within this specified timeframe.
Evaluation
Relative execution time is particularly significant in the typical contexts of water distribution networks, which may involve outdoor or underground settings where using high-specification computing devices is impractical. It offers insights into the operational feasibility of ML algorithms, ensuring that they are not only precise in anomaly detection but also operationally efficient. Such efficiency is indispensable for systems requiring real-time analysis and implementation in edge-computing scenarios.
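For reference, the recall and F1-score reported alongside relative execution time are assumed here to follow their standard definitions, where TP, FP, and FN denote true positives, false positives, and false negatives, respectively:

```latex
\begin{aligned}
\text{Precision} &= \frac{TP}{TP + FP}, \\
\text{Recall} &= \frac{TP}{TP + FN}, \\
F_1 &= \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
     = \frac{2\,TP}{2\,TP + FP + FN}.
\end{aligned}
```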
Pilot-scale WDS
Pipe ID | Node 1 | Node 2 | Diameter (mm) | Length (m) | Material |
---|---|---|---|---|---|
P1 | RUpstream | N1 | 300 | 64.50 | DCIP |
P2 | N1 | N2 | 300 | 15.94 | DCIP |
P3 | N2 | N3 | 300 | 70.05 | PVC |
P4 | N3 | N4 | 300 | 43.33 | PE |
P5 | N4 | N5 | 300 | 218.59 | PVC |
P6 | N5 | N6 | 300 | 134.26 | PVC |
P7 | N5 | N8 | 300 | 261.34 | PVC |
P8 | N6 | B | 300 | 135.08 | PVC |
P9 | B | N7 | 300 | 68.73 | PVC |
P10 | N7 | A | 300 | 21.09 | PE |
P11 | A | PDAQ | 300 | 29.45 | PE |
P12 | PDAQ | N8 | 300 | 22.32 | DCIP |
P13 | B8 | N9 | 300 | 27.98 | DCIP |
P14 | N9 | N10 | 300 | 50.36 | PVC |
P15 | N10 | N11 | 300 | 50.25 | SP |
P16 | N11 | N12 | 300 | 77.24 | PVC |
P17 | N12 | N13 | 300 | 34.11 | PVC |
P18 | N13 | N14 | 300 | 26.77 | PE |
P19 | N14 | N15 | 300 | 70.00 | SP |
P20 | N15 | RDownstream | 300 | 6.99 | PVC |
SP, steel pipe; DCIP, ductile iron pipe; PVC, polyvinyl chloride; PE, polyethylene.
To facilitate event generation within the network, two devices were strategically installed at designated Points A and B. Point A, fitted with a 50-mm ball valve connected to the 300 mm water main, is positioned 51.77 m away from the pressure data acquisition point. This setup is optimized for the direct detection of significant hydraulic events. Conversely, Point B, equipped with a 15-mm ball valve on a fire hydrant, is located 141.59 m from the data acquisition site, making it ideal for capturing smaller-scale events from a distance.
The experimental data were gathered at a site 801.02 m downstream from the upstream reservoir (Point of DAQ). To ensure the precision of data acquisition, high-accuracy pressure sensors were employed. These sensors, Model PXJ409-1.0 MGI from Omega Engineering Inc., feature a measurement range of 0–1.0 MPa and boast an accuracy of 0.08%. Data acquisition was facilitated by the NI-9253 module from National Instruments Inc., capable of capturing pressure and flow data at a sampling rate of up to 1,000 Hz. Additionally, a software routine was developed in LabVIEW to configure and manage the data acquisition system effectively.
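The quoted distances can be cross-checked against the pipe table by summing pipe lengths along a path. The sketch below is only a bookkeeping aid; the dictionary and function names are illustrative, and the path choices are an interpretation of the table rather than something stated in the text.

```python
# Pipe lengths (m) for P1 to P12, copied from the pilot-scale pipe table.
PIPE_LENGTH_M = {
    "P1": 64.50, "P2": 15.94, "P3": 70.05, "P4": 43.33,
    "P5": 218.59, "P6": 134.26, "P7": 261.34, "P8": 135.08,
    "P9": 68.73, "P10": 21.09, "P11": 29.45, "P12": 22.32,
}

def path_length(pipes):
    """Total length in metres of a sequence of pipes, rounded to 2 decimals."""
    return round(sum(PIPE_LENGTH_M[p] for p in pipes), 2)

# Upstream reservoir to node PDAQ via N5, N6, B, N7, and A.
print(path_length(["P1", "P2", "P3", "P4", "P5", "P6", "P8", "P9", "P10", "P11"]))
# Point A to node N8 (through PDAQ).
print(path_length(["P11", "P12"]))
# Point B to node N8 (through N7, A, and PDAQ).
print(path_length(["P9", "P10", "P11", "P12"]))
```

The first sum reproduces the 801.02 m quoted for the acquisition point, while the 51.77 m and 141.59 m quoted for Points A and B correspond to the sums that include P12, that is, distances measured to node N8.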
Field validation
Pipe ID | Node 1 | Node 2 | Diameter (mm) | Length (m) | Material |
---|---|---|---|---|---|
P1 | N1 | N2 | 400 | 257.63 | DCIP |
P2 | N2 | Site A | 200 | 94.13 | DCIP |
P3 | Site A | N3 | 100 | 47.10 | DCIP |
P4 | N2 | N4 | 400 | 172.01 | DCIP |
P5 | N4 | Site B | 200 | 18.77 | DCIP |
DCIP, ductile iron pipe.
In this study, the sensors installed at each site within the actual water distribution network were outfitted with a broad temperature compensation range of −40 to 105 °C, rendering them highly suitable for environments experiencing significant diurnal and seasonal temperature fluctuations. The employed pressure sensor, Model SPT-I2 from Prignitz Inc., boasts a measurement range of 0–1.0 MPa and an accuracy of 0.5%.
RESULTS AND DISCUSSION
Detection of simple events in the pilot-scale water distribution network
At Point A, a controlled event was simulated by the rapid opening and subsequent slow closing of a 50-mm ball valve, which induced two pronounced rapid pressure drops, effectively demonstrating the transient-flow characteristic of a rupture event. However, the pressure wave behavior resulting from the valve closure was less discernible. Given the potential for serious safety incidents or pipe damage within the 300-mm network when a 50-mm valve is suddenly closed, the valve was intentionally closed slowly to mitigate the generation of large pressure waves. The distinct pressure waves prompted by valve-induced ruptures contrast sharply with the sinusoidal waves produced by the pumps, indicating that the detection of such anomalies could be facilitated by setting specific thresholds. Notably, the pressure wave behavior associated with the valve opening was recorded twice during the 130-s interval, specifically from 13.32 to 28.467 s and from 74.32 to 89.45 s, each instance triggered by the rapid activation of the valve.
At Point B, operations analogous to those at Point A were conducted, with events initiated by the rapid opening and closing of a smaller 15-mm ball valve. Unlike the events at Point A, these operations induced only minor pressure fluctuations, even though the valve was operated rapidly to generate more severe events. Consequently, the fluctuations are difficult to distinguish visually from the continuous background of pump-induced sinusoidal waves.
The dynamics of the pressure waves resulting from valve operations at Point B were recorded during four distinct events within the 130-s test period. Specifically, a rapid opening of the valve from 45.23 to 56.32 s triggered one pressure wave, while a subsequent rapid closing of the already-open valve from 67.13 to 78.21 s resulted in another. Furthermore, another rapid valve opening from 90.32 to 101.40 s and a rapid closing from 116.23 to 127.32 s each produced additional pressure waves.
We employed the LSTM dynamic threshold model to assess the detectability of rupture events at Points A and B. The settings for the performance evaluation were configured as follows: the interval and the window_size_portion were set to 86,000 s and 0.33, respectively, with all other hyperparameters remaining at their default values. Figure 3(b) displays the outcomes of the event detection. The actual event periods are marked with blue shading, while the red shading indicates the periods of detected anomalies using the dynamic threshold model. At Point A, the model effectively recognized distinct pressure waves resulting from valve operations, accurately identifying the first event from 14.78 to 25.85 s and the second event from 75.45 to 88.30 s, consistent with the actual timings of the valve operations. Although the rupture events at Point B were visually subtle, the model successfully detected the first valve opening event from 45.68 to 54.43 s, the subsequent valve closing from 68.89 to 77.40 s, and the next valve opening from 91.87 to 101.89 s. However, the model failed to recognize the final valve-closing event. The rapid opening and closing of the 15-mm valve at Point B, which coincided with a peak of the pump-induced sinusoidal wave, posed detection challenges, possibly because the configured interval was significantly longer than the actual input data interval. Furthermore, the default hyperparameter settings, such as the window size, might not have been optimal for capturing the specific characteristics of the anomalies targeted in our study.
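The detection results above can be turned into event-level scores. The sketch below uses the reported event and detection intervals and counts a true event as detected when any detected interval overlaps it; this overlap rule is an assumption, since the scoring rule used in the study is not stated here, and the function names are illustrative.

```python
def overlaps(a, b):
    """True if the two (start, end) intervals share any time."""
    return a[0] < b[1] and b[0] < a[1]

def score(true_events, detected):
    """Return (recall, precision, F1) using event-level overlap matching."""
    tp = sum(any(overlaps(t, d) for d in detected) for t in true_events)
    fn = len(true_events) - tp
    fp = sum(not any(overlaps(d, t) for t in true_events) for d in detected)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return recall, precision, f1

# Intervals in seconds, as reported in the text.
true_a = [(13.32, 28.467), (74.32, 89.45)]
det_a = [(14.78, 25.85), (75.45, 88.30)]
true_b = [(45.23, 56.32), (67.13, 78.21), (90.32, 101.40), (116.23, 127.32)]
det_b = [(45.68, 54.43), (68.89, 77.40), (91.87, 101.89)]

print("Point A:", score(true_a, det_a))
print("Point B:", score(true_b, det_b))
```

Under this rule, Point A scores a recall and F1 of 1.0, while the missed final closing event at Point B lowers recall to 0.75 and F1 to about 0.86.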
Model selection and hyperparameter tuning
The primary objective was to identify the most suitable ML model and optimal hyperparameter settings based on the experimental outcomes, with specific attention given to the interval and window_size_portion. Table 3 provides a detailed summary of the model and hyperparameter combinations tested in the experiments. We evaluated 48 unique combinations, assessing each one's performance based on recall, F1-score, and relative execution time. These metrics were crucial for determining each combination's effectiveness in accurately detecting events while also considering execution efficiency and speed.
Model | Interval (s) | Window_size_portion |
---|---|---|
LSTM dynamic threshold | 21,600 | 0.01 |
AER | 28,800 | 0.13 |
TadGAN | 32,400 | 0.23 |
– | 36,000 | 0.33 |
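The 48 combinations follow from crossing the three models with the four interval values and the four window_size_portion values listed in the table. The enumeration below is a minimal sketch; the model strings are used only as labels, and the training and scoring of each combination is left out.

```python
from itertools import product

MODELS = ["LSTM dynamic threshold", "AER", "TadGAN"]
INTERVALS = [21_600, 28_800, 32_400, 36_000]
WINDOW_SIZE_PORTIONS = [0.01, 0.13, 0.23, 0.33]

# Full Cartesian product: every model with every hyperparameter pair.
grid = list(product(MODELS, INTERVALS, WINDOW_SIZE_PORTIONS))

print(len(grid))  # 3 models x 4 intervals x 4 portions
print(grid[0])
```

Each tuple in `grid` would correspond to one training and detection run whose recall, F1-score, and relative execution time are then recorded.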
Model | Recall | F1-score | Relative execution time |
---|---|---|---|
AER | 0.188 ± 0.252 | 0.215 ± 0.234 | 0.6 ± 0.1 |
LSTM dynamic threshold | 0.688 ± 0.162 | 0.802 ± 0.114 | 1.2 ± 0.1 |
TadGAN | 0.365 ± 0.172 | 0.416 ± 0.166 | 13.2 ± 3.3 |
Field validation: assessing operational performance under real-world conditions
CONCLUSIONS
In this study, we developed and implemented a method using ML algorithms to detect pressure waves generated by various anomalies (such as pipe bursts and valve operations) in water distribution networks. We rigorously evaluated and selected the most effective ML model and hyperparameter combinations based on experiments conducted within a pilot-scale network. To validate the applicability of our methodology, we installed and operated a device in an actual water distribution network, which adjusted the data acquisition cycle variably according to the outputs from the ML model. This approach enabled the detection of nuanced behaviors of pressure waves, which were challenging to discern with the standard 1-s pressure acquisition cycle in the network. The direct application of the developed pressure acquisition device in real water distribution networks is anticipated to provide critical operational insights. These insights are expected to be instrumental in transient-flow analysis and other related applications. Our methodology also promotes data storage efficiency by maintaining the standard data acquisition frequency during periods devoid of detectable events and raising it only when events are detected, so that critical incidents are still captured.
Despite these advancements, the lack of a clear standard for modifying the data acquisition frequency based on the detection or absence of network events indicates a pressing need for further development of refined methodologies. Given the limitations of current pressure acquisition systems within water distribution networks, our study highlights the imperative of advancing and fine-tuning our ML-based anomaly detection system. This technology holds particular promise for regions prone to transient pressure disturbances and frequent pipeline ruptures – events that conventional systems often fail to detect. Implementing our system in such settings could reveal the root causes of previously undetected events, potentially leading to significant enhancements in network management and emergency response protocols. Future research will focus on establishing practical prediction intervals based on actual operational data – crucial for optimizing our methodology and ensuring its effective application in real-world settings.
ACKNOWLEDGEMENT
This work was supported by the Korea Planning & Evaluation Institute of Industrial Technology funded by the Ministry of the Interior and Safety (MOIS, Korea; Project Name: Development of water quality platform to prevent with tap/drinking water accidents/Project Number: 20025188).
DATA AVAILABILITY STATEMENT
Data cannot be made publicly available; readers should contact the corresponding author for details.
CONFLICT OF INTEREST
The authors declare there is no conflict.