Abstract
This research article presents a data-driven approach for detecting bursts in water distribution networks (WDNs). The framework combines spatiotemporal information from monitored pressure data with an unsupervised learning model. The approach has three stages: (1) benchmark dataset acquisition, (2) spatiotemporal information analysis, and (3) burst detection model construction. First, a benchmark dataset representing normal conditions is obtained from historical monitoring data with a clustering algorithm. Second, spatiotemporal features, namely distance and shape features, are extracted from multi-moment time windows of multiple sensors. Third, bursts are detected with the isolation forest technique. The method is evaluated on a WDN, and the results show that it can effectively detect bursts.
HIGHLIGHTS
Burst detection method based on the unlabeled monitoring data.
Multiple consecutive moments of data.
Spatial data information for multiple meters.
Two dimensions of data features.
The burst detection method trained with monitoring data.
INTRODUCTION
Urbanization, climate change, and global water scarcity pose threats to water supply security (Butler et al. 2014). Water distribution networks (WDNs) are among the main infrastructures of cities and support urban development and the livelihood of inhabitants (Gong et al. 2014; Oliker & Ostfeld 2014; Yan et al. 2019). The safe, efficient, and economic operation of WDNs is essential in modern society (Jensen & Jerez 2018). Water loss is primarily caused by bursts and leaks in WDNs, which often occur in aging and deteriorating infrastructure. For example, China lost approximately 7.85 × of water in 2017 (MOHURD 2017). Bursts occur at high to moderate flow rates, and their impact can persist for a few hours when reported or potentially several days when unreported. Leaks, on the other hand, are characterized by lower flow rates and longer durations and are often found throughout WDNs. Background leakage refers to the small leaks at service connections whose flow rates are too low to be physically detected (Farley & Trow 2003). In addition, burst events may shorten the life span of water supply infrastructure and cause environmental problems (Xu et al. 2014) and other social losses, such as floods and road collapses (Fox et al. 2016; Laucelli et al. 2016, 2017). Timely burst detection and mitigation of the resulting hazards are therefore crucial research tasks for WDNs. A burst event consists of unawareness, detection, location, and repair periods (Mounce et al. 2010; Wu & Liu 2017). Consequently, the performance of the burst detection method directly affects the duration of a burst event: identifying bursts in a timely and effective manner shortens the event and reduces the volume of water lost (Tornyeviadzi et al. 2023).
Previous research on burst detection has focused on transient-based, model-based, and data-driven detection methods. Transient-based methods detect bursts by analyzing monitoring data in the time or frequency domain (Datta & Sarkar 2016). One technique directly detects the transient negative pressure waves caused by bursts; burst detection is then completed by analyzing the differences between the simulation results of the transient model and the transient measurement data (Colombo et al. 2009; Srirangarajan et al. 2013; Lee et al. 2016). However, this approach requires collecting large amounts of high-frequency data, which can be expensive to transmit. Moreover, transient signals due to bursts can be masked by background noise and other events. Another technique analyzes the propagation and reflection of the negative pressure wave caused by the burst (Mpesha et al. 2001; Gong et al. 2013, 2014). However, actual WDNs are complex and include valves, tanks, and other components, any one of which may severely attenuate transients. So far, most tests have been carried out in simple WDNs under strictly controlled experimental conditions (Wu & Liu 2017). Applying transient-based methods in actual WDNs, especially large and complex ones, therefore remains difficult.
Model-based methods (Meseguer et al. 2014; Jensen & Jerez 2019; Sophocleous et al. 2019; Li et al. 2021; Zhang et al. 2021a, 2021b) detect bursts by analyzing the correlation between the estimates given by a hydraulic model and the measurements obtained from the Supervisory Control and Data Acquisition (SCADA) system. Their performance is influenced by the availability and accuracy of hydraulic models (Savic et al. 2009; Sanz et al. 2016; Menapace et al. 2018; Huang et al. 2022). Obtaining a reliable hydraulic model involves both model construction and model calibration. Calibration determines the unmeasurable parameters, including nodal demand, pipe roughness, and pipe diameter, from measurable parameters such as pipe length, pipe flow rate, and nodal head. Hydraulic models are subject to errors in model construction, nodal demand uncertainties, and measurement noise (Blesa & Perez 2018). Furthermore, high computational costs hinder the application of model-based methods (Romero et al. 2022).
The new era of artificial intelligence and big data has brought innovative methods for sustainable management (Savic 2019; Wu et al. 2021). Where sufficient acquisition equipment is installed, a widely studied option is to detect bursts by analyzing the measured data collected by the SCADA system (Hu et al. 2021). Data-driven approaches that draw on large amounts of data are highly effective at detecting bursts (Zaman et al. 2020) and do not rely on the accuracy of a hydraulic model. They can be categorized as classification methods, prediction–classification methods, statistical methods, and unsupervised clustering methods (Wu & Liu 2017; Hu et al. 2021). Classification methods such as artificial neural networks (ANNs) (Mounce et al. 2014; Pérez-Pérez et al. 2021), graph neural networks (Zanfei et al. 2022), convolutional neural networks (Shukla & Piratla 2020), and k-nearest neighbor (Bermúdez et al. 2020; Tariq et al. 2021) distinguish burst data from normal data. Prediction–classification methods detect bursts by evaluating the residuals between predicted and actual values against certain criteria. Examples include ANN (Mounce et al. 2002, 2010; Romano et al. 2014; Fang et al. 2019), long short-term memory (Wang et al. 2020), linear Kalman filter (Ye & Fenner 2011), polynomial functions based on weighted least squares (Ye & Fenner 2014), and random forest (Huang et al. 2018; Zhang et al. 2021a). The accuracy of these two approaches depends on the accuracy of the prediction or classification model (Hu et al. 2021). Statistical approaches detect bursts based on statistical process control instead of prediction or classification models. Widely used examples include Western Electric Company (WEC) rules (Ahn & Jung 2019), the Hotelling control chart (Palau et al. 2012; Hashim et al. 2020), the cumulative sum method (Ahn & Jung 2019), and the exponentially weighted moving average (Nam et al. 2019). However, statistical methods typically assume that the monitoring data follow a Gaussian distribution, and their performance deteriorates when this assumption is inappropriate (Wu & Liu 2017). Statistical methods can nevertheless serve as an important step within other detection methods to improve burst detection performance. In addition, unsupervised clustering algorithms allow features, such as cosine distances (Wu et al. 2018), to be extracted from available data for burst detection.
The aforementioned classification and prediction–classification methods are based on supervised binary classification models. An important prerequisite for applying them successfully is a large amount of labeled data, and their major limitation is that accurately labeled burst data are often unavailable. The application of unsupervised models is therefore promising and worth investigating (Wu & Liu 2017). In addition, the statistical approach can be combined with other methods as one step of burst detection (Wu & Liu 2017; Hu et al. 2021).
Burst events cause pressure variations of different amplitudes at multiple sensors and a sustained reduction in pressure over a period of time. Consequently, pressure data from different sensors are temporally and spatially correlated. In some previous research, however, bursts were detected from a single time step (Wang et al. 2020). Monitoring data from multiple sensors are also more reliable than data from a single sensor, and correlation features of the monitoring data from multiple pressure sensors have been extracted to detect pipe bursts (Wu et al. 2018; Xu et al. 2020). Extracting spatiotemporally correlated pressure information can therefore improve the accuracy of burst detection.
In recent years, district metering areas (DMAs) have been widely adopted for the control and management of leakage and bursts in WDNs. WDNs are typically divided into smaller areas called DMAs, which are convenient for independent metering (Moors et al. 2018; Zhang et al. 2019). However, burst detection remains difficult because of unpredictable variations in consumer demand, measurement noise, and seasonal effects. The DMA-based burst detection method in this article was developed to fill the aforementioned gaps. Its main idea is to extract pressure data features and detect bursts with an unsupervised model. The three major contributions are as follows: (1) a normal pressure benchmark dataset is established from historical monitoring data to provide a reference for timely data analysis; (2) data features carrying temporal and spatial information are extracted to maximize the distinction between normal and burst data; and (3) an unsupervised burst detection model based on multidimensional spatiotemporal information is developed to improve the effectiveness of burst detection. A case network is used to demonstrate and analyze the overall burst detection strategy from a hydraulic perspective.
METHODOLOGY
Benchmark dataset acquisition
The benchmark data are the monitoring pressure data filtered to exclude burst data based on the distribution characteristics of historical monitoring data. Pressure changes serve as the main basis for burst monitoring. During the burst events, the monitoring data values are smaller than those of normal moments and substantially deviate from the distribution of normal pressure data. Given this feature, the benchmark data are calculated through clustering. In addition, the historical monitoring pressure data of each pressure sensor in WDNs are periodic, meaning that pressure data from the same pressure sensor at the same moment on different days are similar (Wang et al. 2020). These data can be classified into one dataset for clustering and obtaining the benchmark pressure data at that moment. Each time moment for each pressure sensor corresponds to a benchmark pressure data point.
The normal data cluster has a high density, indicating a concentrated distribution, whereas burst data are dispersed and have uneven density. The density-based spatial clustering of applications with noise (DBSCAN) algorithm can adapt to datasets with different distribution patterns and identify low-density and high-density clusters in unevenly distributed datasets (Schubert et al. 2017). In this research, the two DBSCAN parameters, Eps and MinPts, were set by controlling the proportion β of the total data that falls in the cluster with the maximum internal density. The two parameters describe the tightness of the sample distribution in a neighborhood: Eps is the threshold for the neighborhood distance of a point, and MinPts is the threshold for the number of data points within Eps. A data point with at least MinPts points within its Eps neighborhood is a core point, and the points within that neighborhood form a cluster around it. With MinPts unchanged, a large Eps causes most data to fall in the same cluster, whereas a small Eps splits clusters and causes data that belong to the same cluster to be marked as outliers. Similarly, a very small MinPts produces a large number of core points.
The acquisition process of the benchmark data is carried out through the following steps:
- (a) The historical monitoring data are collected from the pressure data acquired by the SCADA system.
- (b) The data are clustered as two-dimensional points for the sake of clustering feasibility and operational efficiency. In this research, the time window for clustering was set to two steps, so each point contains the data from two consecutive moments.
- (c) The initial values of Eps and MinPts are set. If a selected data point is a core point, then the data points within its Eps range form a cluster.
- (d) If the selected data point is not a core point, then another data point is selected.
- (e) Steps (c) and (d) are repeated until all points are processed.
- (f) The resulting cluster division is evaluated to check whether β is met; if not, Eps and MinPts in (c) are reset, and steps (c)–(f) are repeated.
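The acquisition steps can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names, the default values of `beta`, `eps`, `eps_step`, and `min_pts`, the rule of enlarging Eps until β is met, and the choice of the largest cluster's mean as the benchmark point are all assumptions made for illustration, and the data are synthetic.

```python
import math


def region_query(points, idx, eps):
    """Indices of all points within eps of points[idx] (including itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[idx], q) <= eps]


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns one label per point, -1 marks noise."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:  # step (d): not a core point, move on
            labels[i] = -1
            continue
        cluster += 1  # step (c): a core point seeds a new cluster
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == -1:  # former noise becomes a border point
                labels[j] = cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = region_query(points, j, eps)
            if len(reachable) >= min_pts:  # expand only through core points
                queue.extend(reachable)
    return labels


def benchmark_point(points, beta=0.85, eps=0.05, eps_step=0.05, min_pts=5, max_rounds=50):
    """Enlarge Eps until the largest cluster holds at least a share beta of the data.

    Returning the mean of that cluster as the benchmark is an assumption.
    """
    for _ in range(max_rounds):
        labels = dbscan(points, eps, min_pts)
        clustered = [lab for lab in labels if lab != -1]
        if clustered:
            largest = max(set(clustered), key=clustered.count)
            members = [p for p, lab in zip(points, labels) if lab == largest]
            if len(members) / len(points) >= beta:  # step (f): beta is met
                dims = range(len(points[0]))
                return tuple(sum(p[d] for p in members) / len(members) for d in dims)
        eps += eps_step  # step (f): otherwise reset Eps and cluster again
    raise ValueError("no cluster reached the requested share beta")


# Synthetic data: 45 "normal" two-moment points near 28.5 m plus 5 low-pressure points.
normal = [(28.5 + 0.01 * (i % 10), 28.5 + 0.01 * ((3 * i) % 10)) for i in range(45)]
bursts = [(27.0, 26.8), (26.5, 26.9), (25.9, 26.0), (27.2, 25.5), (26.0, 27.3)]
benchmark = benchmark_point(normal + bursts)
print(benchmark)
```

In this toy dataset the five low-pressure points are labeled as noise, so the benchmark is computed from the dense normal cluster only. A library implementation of DBSCAN could replace the hand-written one in practice.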








Spatiotemporal information analysis of the pressure data
Temporal information extraction of the pressure data
- (1) The monitoring dataset and the previously determined benchmark dataset are divided into time windows of s time steps, so that the pressure data from s consecutive moments are placed in the same time window. For pressure sensor k, each time window is a pressure data vector containing s pressure data points, and the time windows of the ith day together form the time window data matrix of sensor k on that day.



Schematic of the time window division of pressure data for sensor k.
Schematic of the time window division of benchmark pressure data for sensor k.
- (2) For pressure sensor k at time j, the monitoring pressure time window is compared along the time axis with the corresponding benchmark pressure time window, and the features that distinguish burst data from normal data are extracted. In this research, the distance and shape features of the time windows at time j are calculated.
- (3) Step (2) is repeated to acquire the time window data features of all pressure sensors at all moments on all days in the historical and latest monitoring datasets from the SCADA system.
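The window division and the two features can be sketched as follows. The exact feature definitions are not given in the text above, so this sketch assumes the Euclidean distance as the distance feature and the cosine distance as the shape feature; the function names and the sample readings are hypothetical.

```python
import math


def sliding_windows(series, s):
    """Return every window of s consecutive readings; window j ends at index j."""
    return [series[j - s + 1 : j + 1] for j in range(s - 1, len(series))]


def distance_feature(window, benchmark_window):
    """Euclidean distance between a monitored window and its benchmark window
    (assumed definition of the distance feature)."""
    return math.dist(window, benchmark_window)


def shape_feature(window, benchmark_window):
    """Cosine distance, 1 - cosine similarity, between the two windows
    (assumed definition of the shape feature)."""
    dot = sum(a * b for a, b in zip(window, benchmark_window))
    norm_w = math.sqrt(sum(a * a for a in window))
    norm_b = math.sqrt(sum(b * b for b in benchmark_window))
    return 1.0 - dot / (norm_w * norm_b)


# Hypothetical readings (m) for one sensor: the last three monitored values drop.
benchmark_series = [28.39, 28.65, 28.42, 28.54, 28.32, 28.34]
monitored_series = [28.41, 28.63, 28.40, 27.90, 27.60, 27.55]

s = 3  # time window length in steps
pairs = zip(sliding_windows(monitored_series, s), sliding_windows(benchmark_series, s))
features = [(distance_feature(w, b), shape_feature(w, b)) for w, b in pairs]
for j, (dist, shape) in enumerate(features, start=s - 1):
    print(f"time {j}: distance={dist:.3f}, shape={shape:.6f}")
```

The distance feature grows as more of the window falls inside the pressure drop, which is the kind of separation between burst and normal windows that step (2) aims for.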



Spatial information extraction of pressure data
To achieve burst detection within DMAs, monitoring data from multiple pressure sensors are combined. Sensors in such monitoring networks are grouped based on their distance to reduce the dimensionality of the data feature and improve the method efficiency. The number of groups and the average number of pressure sensors are determined based on the specific WDNs. Closer pressure sensors are grouped together to form the same group.
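One simple way to group nearby sensors is sketched below. The grouping rule (link any two sensors whose gap does not exceed a threshold, then take the connected groups), the function name, the threshold, and the coordinates are all assumptions for illustration; the sensor labels are borrowed from the case study only as names.

```python
import math


def group_sensors(coords, max_gap):
    """Group sensors that are linked by a chain of gaps no larger than max_gap."""
    names = list(coords)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the group representative, shortening the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if math.dist(coords[a], coords[b]) <= max_gap:
                parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical planar coordinates (m) for five pressure sensors.
sensor_coords = {
    "J39": (0.0, 0.0),
    "J204": (100.0, 0.0),
    "J417": (1000.0, 1000.0),
    "J7": (1100.0, 1000.0),
    "J156": (1050.0, 1100.0),
}
sensor_groups = group_sensors(sensor_coords, max_gap=300.0)
print(sensor_groups)
```

The time window features of the sensors in one group can then be combined into a single feature vector per group, which keeps the feature dimension lower than when all sensors are combined at once.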
Burst detection model
IF model
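As a hedged illustration of how an isolation forest (IF) scores observations, the sketch below builds random isolation trees on unlabeled feature vectors and converts the average path length into an anomaly score. It follows the standard IF formulation rather than any settings from this study: the number of trees, the subsample size, the synthetic (distance, shape) features, and all function names are assumptions.

```python
import math
import random


def average_path_length(n):
    """Expected path length of an unsuccessful search in a binary tree of n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def build_tree(rows, rng, depth, max_depth):
    """Recursively split rows on a random feature at a random value."""
    if depth >= max_depth or len(rows) <= 1:
        return ("leaf", len(rows))
    dim = rng.randrange(len(rows[0]))
    low = min(r[dim] for r in rows)
    high = max(r[dim] for r in rows)
    if low == high:
        return ("leaf", len(rows))
    split = rng.uniform(low, high)
    left = [r for r in rows if r[dim] < split]
    right = [r for r in rows if r[dim] >= split]
    return ("node", dim, split,
            build_tree(left, rng, depth + 1, max_depth),
            build_tree(right, rng, depth + 1, max_depth))


def path_length(x, tree, depth=0):
    """Depth at which x lands in a leaf, plus a correction for unsplit points."""
    if tree[0] == "leaf":
        return depth + average_path_length(tree[1])
    _, dim, split, left, right = tree
    return path_length(x, left if x[dim] < split else right, depth + 1)


def fit_forest(rows, n_trees=100, sample_size=64, seed=0):
    """Build n_trees isolation trees, each on a random subsample of rows."""
    if len(rows) < 2:
        raise ValueError("need at least two training rows")
    rng = random.Random(seed)
    size = min(sample_size, len(rows))
    max_depth = math.ceil(math.log2(size))
    trees = [build_tree(rng.sample(rows, size), rng, 0, max_depth) for _ in range(n_trees)]
    return trees, size


def anomaly_score(x, forest):
    """Score in (0, 1); values closer to 1 indicate points that are easier to isolate."""
    trees, size = forest
    mean_depth = sum(path_length(x, t) for t in trees) / len(trees)
    return 2.0 ** (-mean_depth / average_path_length(size))


# Synthetic (distance, shape) features standing in for normal time windows.
demo_rng = random.Random(1)
normal_features = [(abs(demo_rng.gauss(0.1, 0.03)), abs(demo_rng.gauss(0.001, 0.0003)))
                   for _ in range(300)]
forest = fit_forest(normal_features)
normal_score = anomaly_score((0.1, 0.001), forest)
burst_score = anomaly_score((1.5, 0.02), forest)
print(normal_score, burst_score)
```

A window whose features lie far from the bulk of the training data is isolated after fewer splits and therefore receives a higher score; a threshold on this score would then separate suspected burst windows from normal ones.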


Evaluation criteria


CASE STUDY
Two monitoring pressure datasets were mainly included: a benchmark dataset and a dataset for model training and testing. The former specifically referred to the 1,000-day historical monitoring dataset. The latter consisted of a historical monitoring dataset and a latest monitoring pressure database. The training dataset was composed of 1,200-day pressure data, while the test dataset consisted of 400-day pressure data.
EPANET 2.2 was used to simulate the hydraulic data so that working conditions with different external noise levels and bursts could be set conveniently. The simulation involved several steps. First, a demand noise disturbance was added to the initial water demand at each node to generate the water demand for 1 day. Second, burst events were defined by their position, flow rate, and occurrence time, and the pressure data at each monitoring point were generated. Third, a pressure noise disturbance was added to the pressure data to generate the pressure data of burst events. Finally, the aforementioned steps were repeated to generate multiday pressure data. In this context, the demand noise indicates the random change in users' daily water consumption caused by weather, temperature, and other factors (Wu et al. 2016); the actual water consumption is determined by multiplying the demand noise by the initial water consumption. The pressure noise denotes the measurement errors during the pressure acquisition process; the monitoring pressure is obtained by adding the pressure noise to the EPANET-simulated pressure. The pressure noise is included because it can mask the pressure drops caused by bursts, making detection more difficult (Xu et al. 2020). In this article, both the demand noise and the pressure noise follow a normal distribution (Xu et al. 2020).
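The two noise steps can be sketched as below. The hydraulic simulation itself is not reproduced here; the sketch only shows a multiplicative demand disturbance and an additive pressure disturbance drawn from normal distributions. The means (1 for the demand factor, 0 for the measurement error), the standard deviations, the function names, and the sample values are assumptions.

```python
import random


def perturb_demands(base_demands, sigma, rng):
    """Multiply each initial nodal demand by a factor drawn from N(1, sigma)."""
    return [d * rng.gauss(1.0, sigma) for d in base_demands]


def add_measurement_noise(pressures, sigma, rng):
    """Add a zero-mean Gaussian measurement error to each simulated pressure."""
    return [p + rng.gauss(0.0, sigma) for p in pressures]


rng = random.Random(42)

# Hypothetical initial demands (L/s) at three nodes for one time step.
base_demands = [1.2, 0.8, 2.5]
noisy_demands = perturb_demands(base_demands, sigma=0.05, rng=rng)

# Hypothetical pressures (m) that a hydraulic solver would return for those demands.
simulated_pressures = [28.39, 46.20, 29.60]
monitored_pressures = add_measurement_noise(simulated_pressures, sigma=0.1, rng=rng)

print(noisy_demands)
print(monitored_pressures)
```

Repeating the two steps for every time step and every simulated day, with the hydraulic simulation run in between, would yield a multiday pressure dataset of the kind described above.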
Results and discussion
Pressure dataset for obtaining the benchmark dataset at sensor J204 (unit: m)
Days | 1 | 2 | 3 | 4 | 5 | … | 998 | 999 | 1,000 |
---|---|---|---|---|---|---|---|---|---|
Time step 1 | 28.390 | 28.666 | 28.556 | 28.512 | 28.527 | … | 28.592 | 28.646 | 28.652 |
Time step 2 | 28.651 | 28.654 | 28.413 | 28.434 | 28.566 | … | 28.545 | 28.366 | 28.415 |
Benchmark database for all sensors in DMA1 WDNs (unit: m)
 | J39 | J204 | … | J417 | … | J7 | J156 |
---|---|---|---|---|---|---|---|
Time step 1 | 28.390 | 46.203 | … | 29.600 | … | 46.298 | 43.673 |
Time step 2 | 28.651 | 46.580 | … | 29.631 | … | 46.546 | 43.555 |
Time step 3 | 28.422 | 46.459 | … | 29.612 | … | 46.734 | 43.746 |
Time step 4 | 28.537 | 47.252 | … | 29.633 | … | 47.121 | 43.778 |
Time step 5 | 28.318 | 47.293 | … | 29.672 | … | 47.175 | 43.965 |
Time step 6 | 28.341 | 47.374 | … | 29.691 | … | 46.956 | 43.677 |
Time step 7 | 28.419 | 48.925 | … | 30.651 | … | 47.147 | 43.915 |
… | … | … | … | … | … | … | … |
Time step 93 | 31.365 | 54.177 | … | 39.844 | … | 51.209 | 45.634 |
Time step 94 | 31.359 | 54.085 | … | 39.972 | … | 51.602 | 45.787 |
Time step 95 | 31.576 | 54.292 | … | 40.126 | … | 51.694 | 45.754 |
Time step 96 | 31.517 | 54.393 | … | 40.471 | … | 51.873 | 45.892 |
Schematic of the clustering distribution for the six time windows at sensor J204.
- (1) Detection effect analysis under different time window lengths
Information of the selected pipe burst events
 | Start date | Start time | End time | Location | Flow rate |
---|---|---|---|---|---|
Burst event 1 | 4 | 9:15 | 12:00 | J189 | 7 |
Burst event 2 | 6 | 2:00 | 5:30 | J180 | 16 |
Burst event 3 | 27 | 13:45 | 17:15 | J188 | 10 |
Burst event 4 | 29 | 16:30 | 19:15 | J4 | 17 |
Burst event 5 | 34 | 9:00 | 12:45 | J1157 | 19 |
Burst event 6 | 36 | 16:15 | 20:00 | J337 | 21 |
Burst event 7 | 58 | 8:15 | 11:30 | J226 | 24 |
Burst event 8 | 133 | 8:15 | 20:00 | J226 | 24 |
Burst event 9 | 136 | 1:45 | 2:45 | J376 | 19 |
Burst event 10 | 142 | 13:30 | 16:00 | J432 | 7 |

















- (2) Detection performance analysis under a single and multiple pressure sensors
Burst detection performance for different values of s under the four noise conditions.
Optimal s intervals for the testing dataset with four noise conditions
Noise condition 1 | | | | Noise condition 2 | | | | Noise condition 3 | | | | Noise condition 4 | | | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
s | TPR (%) | FPR (%) | DT (15 min) | s | TPR (%) | FPR (%) | DT (15 min) | s | TPR (%) | FPR (%) | DT (15 min) | s | TPR (%) | FPR (%) | DT (15 min) |
10 | 93.33 | 2.95 | 1.67 | 7 | 83.33 | 4.95 | 3.00 | 7 | 85.00 | 2.77 | 2.77 | 6 | 76.67 | 4.83 | 3.85 |
9 | 93.33 | 2.87 | 1.70 | 8 | 81.67 | 3.92 | 3.3 | 8 | 85.00 | 2.80 | 2.85 | 10 | 76.67 | 2.94 | 4.48 |
11 | 93.33 | 2.99 | 1.76 | 9 | 81.67 | 4.44 | 3.45 | 9 | 85.00 | 2.86 | 2.9 | 7 | 75.00 | 5.38 | 4.12 |
12 | 93.33 | 2.97 | 1.98 | 10 | 81.67 | 4.52 | 3.53 | 10 | 85.00 | 2.89 | 3.12 | 8 | 75.00 | 2.84 | 4.40 |
13 | 93.33 | 2.99 | 2.00 | 11 | 81.67 | 2.91 | 3.83 | 11 | 85.00 | 2.94 | 3.10 | 9 | 75.00 | 2.86 | 4.45 |
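The three indexes reported in the table (TPR, FPR, and DT) could be computed along the following lines. The definitions used here are assumptions, since they are not spelled out in the text above: TPR is taken as the share of burst events that raise at least one alarm, FPR as the share of normal time steps that raise an alarm, and DT as the mean delay, in 15-min time steps, between the start of a detected burst and its first alarm. The function name and the sample data are hypothetical.

```python
def evaluate(alarms, events):
    """Compute (TPR %, FPR %, mean detection delay in time steps).

    alarms: one 0/1 flag per time step.
    events: (start, end) index pairs of burst events, end exclusive.
    """
    in_event = [False] * len(alarms)
    delays = []
    for start, end in events:
        for t in range(start, end):
            in_event[t] = True
        hits = [t for t in range(start, end) if alarms[t]]
        if hits:  # the event counts as detected at its first alarm
            delays.append(hits[0] - start)
    normal_steps = in_event.count(False)
    false_alarms = sum(1 for t, flag in enumerate(alarms) if flag and not in_event[t])
    tpr = 100.0 * len(delays) / len(events)
    fpr = 100.0 * false_alarms / normal_steps
    dt = sum(delays) / len(delays) if delays else float("nan")
    return tpr, fpr, dt


# Hypothetical 20-step record with two burst events; only the first is detected.
alarms = [0] * 20
for t in (3, 4, 5, 17):
    alarms[t] = 1
events = [(2, 6), (10, 14)]

tpr, fpr, dt = evaluate(alarms, events)
print(f"TPR={tpr:.2f}%  FPR={fpr:.2f}%  DT={dt:.2f} steps")
```

Under these assumed definitions, a DT of 1.67 would mean that detected bursts are flagged on average about 25 min after they start.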
Best performance in each case for the different noise conditions
 | Case 1 | Case 2 | Case 3 | Case 4 | Case 5 | Case 6 | Case 7 | Case 8 | Case 9 | Case 10 |
---|---|---|---|---|---|---|---|---|---|---|
TPR (%) | 65.00 | 38.33 | 40.00 | 38.33 | 83.33 | 56.67 | 65.00 | 50.00 | 96.67 | 71.67 |
FPR (%) | 2.36 | 2.45 | 2.19 | 2.44 | 2.66 | 4.72 | 2.54 | 2.57 | 2.89 | 4.83 |
DT (15 min) | 6.37 | 8.33 | 8.47 | 8.35 | 3.55 | 6.25 | 6.06 | 7.15 | 2.05 | 4.32 |
 | Case 11 | Case 12 | Case 13 | Case 14 | Case 15 | Case 16 | Case 17 | Case 18 | Case 19 | Case 20 |
TPR (%) | 81.67 | 66.67 | 100.00 | 81.67 | 88.33 | 78.33 | 100.00 | 86.67 | 96.67 | 85.00 |
FPR (%) | 2.74 | 4.72 | 2.85 | 2.92 | 2.86 | 2.88 | 2.85 | 2.99 | 3.11 | 4.90 |
DT (15 min) | 4.16 | 5.15 | 0.8 | 4.38 | 2.47 | 4.78 | 0.8 | 2.73 | 2.13 | 2.92 |

Burst detection performance of the different numbers of sensors in DMA1.
Burst detection performance of different numbers of sensors at various flow rates.
DISCUSSION
The method employed in this study detects sudden pipe burst events in WDNs by capturing changes in pressure data. However, it cannot detect pre-existing background leaks with very low flow rates. The monitoring data of each pressure sensor obtained by the SCADA system were used to extract data features for pipe burst detection. Pipe burst events can be identified efficiently and in a timely manner when the data features that distinguish burst data from normal data are extracted effectively. The signature of a burst can be amplified to a certain extent by combining the spatiotemporal correlations of the pressure data (Zhang et al. 2021b). When a single pressure sensor is used, however, the pressure reduction caused by a pipe burst is easily masked by noise, which decreases the accuracy of burst detection. To address this issue, multiple pressure sensors were integrated so that detection accuracy does not hinge on a single sensor. Overall, using multi-time data from multiple pressure sensors contributes to improved burst detection accuracy. In this research, an approach for pipe burst detection was proposed that rapidly extracts the data features revealing burst events from large amounts of data and detects bursts based on these features. The burst detection framework is constituted as follows: the spatiotemporal feature information of the time windows from different pressure sensor groups is extracted using the benchmark data estimated from historical data as the reference, and the IF-based unsupervised learning model is then used to detect bursts. The pressure monitoring data within a certain time window length were used to monitor bursts comprehensively across the whole DMA, leading to satisfactory detection as indicated by the three indexes used in this article.
Utilizing the spatiotemporal information of multiple sensors together with data from multiple time moments is highly effective for the timely and efficient detection of bursts (Zhang et al. 2021b). Moreover, detecting bursts in a timely manner to minimize damage is crucial, particularly when they occur at night or in rural areas (Xu et al. 2020).
CONCLUSIONS
The proposed method can rapidly detect burst events and thereby help minimize the damage caused by pipe bursts. This study makes extensive use of spatiotemporal information, and the detection results are reliable, with TPR values of up to 93.33, 83.33, 85.00, and 76.67% for the four noise cases at time window lengths ranging from 6 to 13 time steps. The main findings based on the case study are as follows:
- 1. The pressure change patterns of burst events are entirely different from those under normal conditions. On this basis, the proposed method can extract the essential information features from space and time, providing a viable solution for burst detection.
- 2. The proposed method effectively detected bursts based on the time window data of each pressure sensor in the DMA. Multiple pressure sensors with time window data in a certain interval achieve the best detection performance, as shown by the comparison of the detection performance of a single, two, and multiple pressure sensors under different time window lengths. This addresses the low reliability and detection rate associated with single-time-step and single-sensor data in the relevant literature.
- 3. The proposed method, which utilizes unlabeled monitoring pressure data obtained in the DMA, has promising prospects and guiding significance for practical engineering. Moreover, the method avoids sole reliance on flowmeters and hydraulic model accuracy and overcomes the lack of data with burst and nonburst labels in practical engineering.
As the data-driven method is heavily dependent on monitoring data from sensors, the accuracy of the method depends on the sensor arrangement scheme, which is also its main limitation. To address this, the combination of model-based and data-driven methods will be investigated in the future. Currently, it is challenging to distinguish between abnormal events caused by pipe bursts, significant water usage by large consumers, and operational changes in the pipeline network. In the future, more detailed distinctions will be made between these events through further research and development of advanced detection and identification methods. In addition, the proposed method utilizes steady-state pressure data, which may not perform as well in real-time processing compared to transient-based methods. To improve the timely performance of this method, we plan to couple transient and steady-state data in future studies. Another issue beyond the scope of this study is burst localization. Future research will focus on the accurate burst localization based on pressure after the burst detection.
ACKNOWLEDGEMENTS
This work was supported by the National Key Research and Development Program of China (2022YFF06069004), National Natural Science Foundation of China (52070167), and Zhejiang Provincial Natural Science Foundation of China (LHY22E080003).
DATA AVAILABILITY STATEMENT
All relevant data are available from an online repository or repositories. See https://github.com/zhangxiangqiu/burst-detection.
CONFLICT OF INTEREST
The authors declare there is no conflict.