Model-based leakage localisation in water distribution networks requires accurate estimates of nodal demands to correctly simulate hydraulic conditions. While digital water meters installed at household premises can be used to provide high-resolution information on water demands, questions arise regarding the necessary temporal resolution of water demand data for effective leak localisation. In addition, how do temporal and spatial data gaps affect leak localisation performance? To address these research gaps, a real-world water distribution network is first extended with the stochastic water end-use model PySIMDEUM. Then, more than 700 scenarios for leak localisation assessment characterised by different water demand sampling resolutions, data gap rates, leak size, time of day for analysis, and data imputation methods are investigated. Numerical results indicate that during periods with high/peak demand, a fine temporal resolution (e.g., 15 min or less) is required for the successful localisation of leakages. However, regardless of the sampling frequency, leak localisation with a sensitivity analysis achieves a good performance during periods with low water demand (localisation success is on average 95%). Moreover, improvements in leakage localisation might occur depending on the data imputation method selected for data gap management, as it can mitigate random/sudden temporal and spatial fluctuations of water demands.

  • A real-world water distribution network is extended with stochastically generated demands to represent spatial and temporal water demand fluctuations.

  • The effectiveness of leakage localisation varies throughout the day and depends on the leakage size and the water demand sampling resolution.

  • Leakage localisation performance improves if the sum of estimated nodal demands matches the actual water demand.

Water losses (or nonrevenue water), as the difference between the total water input into a water distribution network and billed water demand, are estimated to account for approximately 30% of water system input volumes across the world, with country-specific differences, which reach peak levels higher than 50% (EPA 2013; EurEau 2017; Liemberger & Wyatt 2019). Besides revenue losses, water losses can generate several undesired cascading effects, including induced water scarcity (Zyoud et al. 2016), an unnecessary increase in energy demand for pumping (Colombo & Karney 2002), and damages to piping infrastructure due to erosion of the pipe beds (Mora-Rodríguez et al. 2013). Therefore, water losses represent a major challenge for the operation of water distribution networks and accurate water loss management is of the highest interest for the operators and authorities (Oberascher et al. 2020).

Water loss management involves the timely detection and localisation of water leakages (Puust et al. 2010). Several leak detection and localisation methods have been developed and demonstrated in the literature, recently fostered by the Battle of the Leakage Detection and Isolation Methods (BattLeDIM; Vrachimis et al. 2022). These include both data-driven (Daniel et al. 2022; Romero-Ben et al. 2022) and model-based methods (Steffelbauer et al. 2022b). While both types of methods achieve comparable results in leak detection, model-based algorithms are so far more reliable for leak localisation, as they account for the spatial structure of a water distribution network by means of a calibrated hydraulic computer model of the drinking water network.

Model-based methods for leak management are further subdivided into calibration-based, sensitivity analysis-based, and classification-based methods (Li et al. 2015; Hu et al. 2021; Wan et al. 2022; Romero-Ben et al. 2023). However, the basic concept is similar for all of them: a numerical model of the real network is first used to simulate hydraulic parameters (e.g., pressure at hydrants). The simulated values are then compared to pressure measurements gathered via sensors distributed in the networks and their difference is used to determine the spatial location of a leakage. An accurately calibrated hydraulic model of the water distribution network is a requirement to accomplish this task, providing a precise illustration of the pressure conditions. Therefore, water demand data from the different water users are required as model input to compute the nodal demands in the hydraulic model. Up to now, mainly quarterly or annual readings from mechanical water meters are downscaled to the desired simulation time step (Oberascher et al. 2022), representing one of the main uncertainties in numerical models (Sanz & Pérez 2014). Furthermore, as a recent review by Mohan Doss et al. (2023) concludes, most existing leak localisation approaches using a hydraulic numerical model rely on the assumption that water demand is deterministic, whereas in reality, water demands can be highly stochastic.

Digital water meters provide detailed information on water demands, as they can measure individual water demand at the household level and at high temporal resolution (e.g., sub-daily), enabling a representation of different water demand patterns throughout the water distribution network. They can thus provide essential data for the improvement of model-based leak detection algorithms. With the increasing deployment of digital water meters, recent research has focused on their application potential at the network scale (Antzoulatos et al. 2020; Farah & Shahrour 2017; Huang et al. 2020; Jun et al. 2021; Spedaletti et al. 2022). These studies show that high-resolution water demand data improve various aspects of water loss management (e.g., water balance, leakage detection, and leakage localisation) compared to the current state of the art. However, digital water meters can record and transmit water demand data at different temporal resolutions, and a higher resolution also comes with costs and drawbacks (e.g., shorter meter battery life). This raises the following question: ‘What temporal resolution (sampling interval) of water demand data is required for effective leakage localisation in model-based approaches?’ While similar investigations have already been carried out for applications at the household level based on synthetic and measured water demand data (Cominola et al. 2018; Heydari et al. 2022), to the best of the authors' knowledge, this question remains unanswered for model-based leakage localisation. In this context, it is assumed that the resolution of water demand data, and thus the knowledge of water demands as input parameters of the hydraulic model, influences model-based leakage localisation.

Furthermore, while digital water meters allow the recording of detailed information on water demand, water demand data are subject to temporal and spatial data gaps in practice. For example, wireless communication technologies such as Wireless M-Bus or technologies associated with Low Power Wide Area Networks operate in public frequency bands, so data losses can be expected (Oberascher et al. 2022). High-resolution water demand data also fall under the European General Data Protection Regulation (2016/679/EU 2016), which requires the consumer's consent for the installation and operation of digital water meters. Consequently, the water demand in an area might not be entirely known, and imputation methods are required to replace the missing values.

The detailed objectives of this work can be described as follows:

  • To investigate the effectiveness of a hydraulic model-based method for leakage localisation subject to high temporal and spatial water demand fluctuations.

  • To determine an optimal temporal resolution of household water demand data for effective model-based leakage localisation.

  • To analyse the impact of spatial and temporal data gaps in household water demand data on the leakage localisation performance and identify opportunities for their mitigation with different imputation methods.

To address these research gaps, first, the hydraulic model of a real-world case study is extended with synthetic demand data simulated with PySIMDEUM (Steffelbauer et al. 2022a), i.e., the Python implementation of the state-of-the-art stochastic end-use model SIMDEUM (Blokker et al. 2010, 2017). SIMDEUM allows the generation of a unique water demand pattern for each household at a 1 min interval, so that water demands with highly varying spatio-temporal patterns are represented across the case study. These demand patterns are used for the simulation of the observed pressure data in a baseline scenario, providing an ideal setting for the robust evaluation of a model-based method for leakage localisation. Subsequently, the generated high-resolution time series are resampled to different temporal sampling intervals and applied as nodal demands to assess the sensitivity of the model-based leakage localisation to water demand data resolution. Finally, temporal and spatial data gaps are randomly included in the digital water meter readings, and the effectiveness of different imputation methods to mitigate data gap impacts on leakage localisation is investigated.

This work builds on the case that, as part of a smart city project, digital water meters are installed at the household level and the received data are used to implement an early warning system for detection and localisation of new leakages in the water distribution network. A typical early warning system would operate with three main sequential modules: (1) a real-time data-driven approach for leakage detection and leak size estimation, (2) a real-time hydraulic model-based leakage localisation, and (3) a subsequent fine search on site. The focus of this work is on the model-based leakage localisation phase as highlighted in Figure 1, and the experiments are implemented using data from a real-world network extended with synthetic water demand data.
Figure 1

Overview of the modules of a typical early warning system for leakage detection and localisation and focus of this work.


High-resolution water demand generation

To consider fluctuations in water demands during the day, a unique high-resolution water demand pattern is generated for each household in the case study. PySIMDEUM, i.e., the Python implementation of SIMDEUM (Steffelbauer et al. 2022a), is used for this purpose, given its ability to generate stochastic water end uses and water demand patterns (Blokker et al. 2010). In this work, the total water demand of each household is generated by randomly setting its number of occupants (one person, two persons, or family households) and by using the default profile. The default profile is based on household statistics from the Netherlands (e.g., household type, gender, age, employment), survey information about the usage of household appliances (e.g., frequency of use, duration, intensity), and information on the water-using appliances (e.g., signatures of water flow). Based on this information, a time series of hot and cold-water demand at 1 s resolution is generated based on probability functions for each household appliance, which is aggregated to the household level and to a temporal resolution of 1 min. For more information about the functionality of SIMDEUM and the default user profile, refer to Blokker et al. (2010) and Blokker et al. (2017). The original time series with a temporal resolution of 1 min is assumed to represent the real water demand behaviour and to simulate the observed data in a baseline scenario. Later, it is further processed to account for different sampling intervals and data gaps in digital water meter readings.

Water demand scaling at different sampling intervals

To investigate different sampling intervals, the original time series of water demands are resampled to simulate digital water meters with different resolutions. Figure 2 shows the normalised daily pattern of the water demand for the entire case study for the investigated day. In this work, sampling intervals of 5 min, 15 min, 1 h, 2 h, and 4 h are investigated. In addition, the quarterly water demand (i.e., the demand over a billing period, here assumed to be one quarter) and the quarterly water demand scaled to the inflow are computed as comparative scenarios.
Figure 2

Normalised daily water demand pattern of the entire water distribution network at different resampled resolutions. The four times of day considered for analysis are highlighted with dashed red lines.
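
The resampling step can be illustrated with a minimal sketch, assuming that a digital water meter with a coarser sampling interval reports the mean demand over each interval. The function name `resample_mean` and the demand values are hypothetical and are not taken from the case study.

```python
from statistics import mean


def resample_mean(demand_1min, interval_min):
    """Average a 1 min demand series over consecutive blocks.

    Every sample of a block is replaced by the block mean, mimicking a
    meter that reports one averaged value per sampling interval.
    """
    if interval_min < 1:
        raise ValueError("interval_min must be >= 1")
    out = []
    for start in range(0, len(demand_1min), interval_min):
        block = demand_1min[start:start + interval_min]
        out.extend([mean(block)] * len(block))
    return out


# Hypothetical 1 min household demand in l/s (illustrative values only)
demand = [0.0, 0.0, 0.12, 0.12, 0.0, 0.0, 0.0, 0.2, 0.0, 0.0]
demand_5min = resample_mean(demand, 5)
```

The resampled series preserves the total volume but flattens short peaks: the 0.2 l/s pulse in the second block appears as 0.04 l/s over 5 min. This is why an averaged demand can differ from the instantaneous demand at the moment a pressure reading is taken.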


Furthermore, different times of day are also examined to analyse the influence of total water demand in relation to the leakage size. Based on the pattern profile, the times of day selected for performance assessment are 03:00 (minimum night flow), 07:45 (morning peak), 19:00 (average daily demand), and 23:00 (evening/night peak).

Data gaps and data reconstruction

As described in the Introduction, the collected water demand data are subject to temporal and spatial data gaps. In this work, the effect of data gaps is investigated for a sampling interval of 15 min. This interval was chosen because preliminary experiments showed good leakage localisation performance for all times of day and leakage sizes; it was therefore assumed that data losses would have a high impact at this sampling resolution. The temporal data losses are assumed to be between 0 and 40% per household, meaning that the number of successfully transmitted data packets is between 96 (no losses) and 58 (40% losses) per day for a sampling interval of 15 min. Furthermore, the degree of digital water meter penetration varies between 60 and 100%, corresponding to 40% of the households without a digital water meter and full penetration, respectively. Various combinations of spatial and temporal data gaps are thereby randomly implemented.
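
A minimal sketch of how such gaps could be generated is given below. It assumes that a fixed share of readings is lost per household and that metered households are drawn uniformly at random; the text only states that gaps are implemented randomly, so both choices, as well as the function names, are assumptions.

```python
import random


def inject_gaps(readings, temporal_loss, rng):
    """Replace a random share of readings with None to mimic lost packets."""
    n_lost = round(len(readings) * temporal_loss)
    lost = set(rng.sample(range(len(readings)), n_lost))
    return [None if i in lost else r for i, r in enumerate(readings)]


def select_metered(household_ids, penetration, rng):
    """Randomly pick the households equipped with a digital water meter."""
    n_metered = round(len(household_ids) * penetration)
    return set(rng.sample(household_ids, n_metered))


rng = random.Random(0)   # fixed seed for a reproducible illustration
day = [0.01] * 96        # 96 readings per day at 15 min (placeholder values)
gappy = inject_gaps(day, 0.40, rng)
received = sum(r is not None for r in gappy)            # 58 packets remain
metered = select_metered(list(range(160)), 0.60, rng)   # 96 of 160 households
```

With 40% temporal losses, 38 of the 96 daily readings are dropped, which leaves the 58 packets stated above.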

Three different approaches are applied for data imputation of the temporal and spatial data gaps. The first two methods (zero values, historical mean) are taken from Jun et al. (2021). In ‘zero values’, the missing values are filled by zero values, assuming that there is no water demand during this sampling interval. In ‘historical mean’, the missing value ($\hat{d}_{i}$) is replaced with the average demand of the past time steps on a weekly basis over the available time series length:
$$\hat{d}_{i} = \frac{1}{n}\sum_{j=1}^{n} d_{i,j} \qquad (1)$$
where $n$ is the number of weeks in the available time series length and $d_{i,j}$ is the water demand at the $i$th time step in the $j$th week. For example, if Monday 1 p.m. is missing, this value is replaced by the mean demand of all historical Mondays at 1 p.m.
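
The ‘historical mean’ imputation can be sketched as follows. The data layout (a list of weeks, each holding one reading per weekly time slot) and the function name are illustrative assumptions. The sketch also skips historical readings that are themselves missing, which goes beyond the definition above.

```python
def historical_mean(history, slot):
    """Mean of the available readings at `slot` over all historical weeks.

    history: list of weeks, each a list with one reading per weekly slot;
             None marks a historical reading that is itself missing.
    """
    values = [week[slot] for week in history if week[slot] is not None]
    if not values:
        raise ValueError("no historical reading available for this slot")
    return sum(values) / len(values)


# Three hypothetical weeks with four slots each (demand in l/s)
history = [
    [0.02, 0.10, 0.05, 0.01],
    [0.03, 0.14, 0.04, 0.01],
    [0.02, 0.12, None, 0.02],
]
filled = historical_mean(history, 1)   # mean of 0.10, 0.14 and 0.12
```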

While an advantage of the above two methods for data imputation is their simplicity, they do not consider the current inflow, which creates a mismatch between the measured inflow and the applied nodal demand in the hydraulic model. Therefore, a third method is developed here and adapted to the specification of the case study using the inflow measurements as a reference value as follows. First, the water demand is measured cumulatively in the considered water distribution network, meaning that the total water demand over a certain period is known even if intermediate (localised) values are missing. Subsequently, the total water demand of each household is scaled in relation to the inflow during this period to fill the missing temporal values. Second, the households are classified into different clusters based on the water demand per billing period to fill the spatial data gaps. Thereby, it is assumed that all households in a cluster have similar behaviour regarding water demand over the day. Under this assumption, the missing water demand patterns of the households are determined by averaging the water demand of the other known households in each cluster. The values are further adjusted to the total water demand of each household and the missing residual quantity, i.e., inflow minus known demand and estimated loss quantity due to leakage and background losses, to comply with the mass balance. The number of clusters should be high enough to cover a wide range of household characteristics, but a higher number of clusters can also cause no real-time values of households to be present in the cluster due to spatial and temporal data losses. Since the aim here is to test the applicability of this approach, four clusters are assumed for simplicity for this case study using the k-means clustering method with the default setting of the Python library scikit-learn (Pedregosa et al. 2011).
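
The spatial part of this inflow-adapted imputation can be sketched under simplifying assumptions. Cluster labels are taken as given (in the study they come from k-means on the demand per billing period), the per-household adjustment to the billing-period demand is omitted, and all names and numbers are hypothetical.

```python
def impute_spatial(known, clusters, inflow, estimated_losses):
    """Fill unmetered households and close the mass balance.

    known: dict household -> demand in l/s (metered households only)
    clusters: dict household -> cluster label (all households)
    inflow: measured inflow in l/s
    estimated_losses: leakage plus background losses in l/s
    """
    # Step 1: mean demand of the metered households in each cluster
    sums, counts = {}, {}
    for h, d in known.items():
        c = clusters[h]
        sums[c] = sums.get(c, 0.0) + d
        counts[c] = counts.get(c, 0) + 1
    # Step 2: preliminary estimate for every unmetered household
    missing = {}
    for h, c in clusters.items():
        if h not in known:
            if c not in counts:
                raise ValueError(f"cluster {c} has no metered household")
            missing[h] = sums[c] / counts[c]
    # Step 3: scale estimates so that known + estimated + losses = inflow
    residual = max(inflow - estimated_losses - sum(known.values()), 0.0)
    total_est = sum(missing.values())
    scale = residual / total_est if total_est > 0 else 0.0
    return {**known, **{h: d * scale for h, d in missing.items()}}


known = {"A": 0.10, "B": 0.20, "C": 0.06}
clusters = {"A": 0, "B": 0, "C": 1, "D": 0, "E": 1}
filled = impute_spatial(known, clusters, inflow=0.80, estimated_losses=0.02)
```

In this example, the cluster means give preliminary estimates of 0.15 and 0.06 l/s for households D and E. Both are then doubled so that all nodal demands plus losses equal the measured inflow. The `ValueError` branch corresponds to the case noted above, where a cluster contains no household with real-time values.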

Leakage localisation

A sensitivity analysis method for model-based leakage localisation is implemented here by comparing measured and hydraulically modelled pressure data to identify possible leakage regions. The pressure residual vector ($\mathbf{r} = [r_{j}]$) is calculated as
$$r_{j} = p_{j} - \hat{p}_{j,0} \qquad (2)$$
where $p_{j}$ is the measured pressure at sensor node $j$ and $\hat{p}_{j,0}$ is the simulated pressure at sensor node $j$ for the leakage-free scenario. The sensitivity matrix ($\mathbf{S} = [s_{i,j}]$) is then created using the first-order estimations of pressure (Perez et al. 2014):
$$s_{i,j} = \hat{p}_{j,i} - \hat{p}_{j,0} \qquad (3)$$
where $\hat{p}_{j,i}$ is the simulated pressure at sensor node $j$ for the model-based placement of the leakage at node $i$.
Afterwards, the leakage localisation is determined by correlating the pressure residual vector with each row of the sensitivity matrix containing all possible leakage scenarios, based on the Pearson correlation coefficient (used, e.g., in Perez et al. (2014) and Steffelbauer et al. (2022b)). Thereby, the highest match is used for the selection of the candidate nodes for the leakage localisation. In this work, the model-based leakage localisation is intended to identify the region where leakage is likely to occur, thus facilitating the subsequent pinpointing by fine search on site. The 5% of nodes with the highest match are selected and identified as the leakage region. If the leakage point is within these selected nodes, the leakage is classified as successfully localised (true positive, TP), otherwise as not localised (false negative, FN). For the comparison of the different scenarios investigated, the metric ‘leakage localisation success’ (LLS) is calculated:
$$\mathrm{LLS} = \frac{TP}{TP + FN} \qquad (4)$$
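
The localisation procedure can be sketched as follows. The pressures would in practice come from hydraulic simulations; here they are hypothetical values for three sensors and three candidate nodes, and the function names are illustrative. The sketch assumes that the sensitivity rows are plain pressure differences and that at least one candidate node is always returned.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def localise(p_meas, p_sim_noleak, p_sim_leak, share=0.05):
    """Return the `share` of candidate nodes that best match the residual."""
    residual = [m - s for m, s in zip(p_meas, p_sim_noleak)]
    scores = {}
    for node, p_leak in p_sim_leak.items():
        # Row of the sensitivity matrix for a leak placed at `node`
        row = [pl - s for pl, s in zip(p_leak, p_sim_noleak)]
        scores[node] = pearson(residual, row)
    n_top = max(1, round(share * len(scores)))
    return sorted(scores, key=scores.get, reverse=True)[:n_top]


def leakage_localisation_success(hits):
    """Share of leak scenarios whose true node is in the candidate set."""
    return sum(hits) / len(hits)


p0 = [50.0, 48.0, 45.0]                       # leakage-free simulation
p_leak = {"n1": [49.0, 47.8, 44.9],
          "n2": [49.8, 47.0, 44.8],
          "n3": [49.9, 47.9, 44.0]}
candidates = localise([49.75, 47.05, 44.8], p0, p_leak)
lls = leakage_localisation_success([True, True, False, True])
```

The measured pressures drop most at the second sensor, matching the pattern simulated for a leak at `n2`, so `n2` is returned as the candidate node. Because the Pearson coefficient is invariant to scaling, the magnitude of the simulated leak does not change the ranking in this sketch.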

Experimental setup

Case study

The implementation area of the smart city project used here as a case study is an area predominantly occupied by single-family houses. A total of 160 households are identified as possible installation sites for digital water meters. For the project, the area was redesigned as a district metering area and is connected to the main water distribution network via two supply points. At each supply point, the inflow and water pressure are measured with a temporal resolution of 15 min. The total pipe length of the case study is 9.3 km, and six hydrants can be used for the installation of pressure sensors for leakage localisation. Figure 3 provides an overview of the simplified representation of the existing water distribution network including the spatial distribution of household and hydrant nodes over the area.
Figure 3

Schematic illustration of the existing water distribution network used as case study, highlighting the households with digital water meters (blue) and the hydrants (orange).


The hydraulic model of the network is created in EPANET 2.2 (Rossman et al. 2020) and calibrated with data from a measurement campaign run in the summer of 2021. Afterwards, a unique water demand time series is created for each household using PySIMDEUM and assigned to the respective hydraulic model node (see the 160 household nodes in Figure 3). The generated time series has a duration of 4 months: the first 3 months are used for statistical analyses and feature extraction (quarterly readings, clustering and classification of the households, historical mean demand), and a random day from the last month is selected for leak localisation performance assessment. The Python package WNTR (Klise et al. 2017) is utilised for the hydraulic simulations.

Sampling interval and data gap scenarios

Table 1 summarises the parameters varied to build the scenarios for leakage localisation performance assessment, along with their considered ranges. In total, 140 scenarios are initially simulated to analyse the influence of different sampling intervals of water demands, leakage sizes, and analysis times of day, without any spatial and temporal data gaps. Further, for the investigation of data gaps and data imputation effects, leakage size, sampling interval, and analysis time of day are fixed to 2 l/s (0.002 m³/s in SI units), 15 min, and morning peak, respectively, whereas the spatial and temporal data gaps are reconstructed with the three different imputation methods detailed above. To reduce the influence of the randomness of data gap implementation on the results, each configuration is simulated 10 times, resulting in an additional 600 scenarios. All simulations are performed on a computer with Windows 10 Enterprise 64-bit as the operating system, an Intel® Core™ i7-7000 processor at 3.60 GHz, and 16.4 GB of working memory.

Table 1

Parameters considered for scenario building and leakage localisation performance assessment

| Parameter            | Range/values                                                                                                          |
|----------------------|-----------------------------------------------------------------------------------------------------------------------|
| Leakage size         | 1, 2, 3, 4, and 5 l/s                                                                                                 |
| Sampling interval    | 5 min, 15 min, 1 h, 2 h, 4 h, one quarter (av), one quarter scaled to the inflow (avscal)                             |
| Analysis time of day | Minimum night flow (03:00), morning peak (07:45), average daily demand (19:00), and evening/night peak (23:00)        |
| Spatial data gaps    | 0, 10, 20, 30, and 40%                                                                                                |
| Temporal data gaps   | 0, 10, 20, 30, and 40%                                                                                                |
| Imputation method    | Zero values, historical mean, adapted to inflow                                                                       |

Note: Parameter names and their range/values are reported.
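
As a quick check of the scenario count, the 140 gap-free scenarios correspond to the full factorial combination of the first three parameters in Table 1. The short sketch below enumerates them; the variable names are illustrative.

```python
from itertools import product

leak_sizes = [1, 2, 3, 4, 5]                                  # l/s
sampling_intervals = ["5 min", "15 min", "1 h", "2 h", "4 h", "av", "avscal"]
times_of_day = ["03:00", "07:45", "19:00", "23:00"]

# Full factorial design: 5 leak sizes x 7 intervals x 4 times of day
gap_free_scenarios = list(product(leak_sizes, sampling_intervals, times_of_day))
n_gap_free = len(gap_free_scenarios)   # 140
n_total = n_gap_free + 600             # plus the data gap scenarios
```

Together with the 600 data gap scenarios, this gives 740 scenarios in total.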

Key assumptions

As the focus of this work is on the use of digital water meters for leakage localisation, rather than on leakage simulation or hydraulic model development, the following simplifications are assumed:

  • There is only one leakage at a time, with perfect detection (i.e., leakages are always detected) and estimation of the leakage size.

  • A calibrated hydraulic model of the water distribution network is available, without having model uncertainties (e.g., roughness, background losses).

  • The digital water meters and the pressure sensors at hydrants have no measurement errors.

  • A regular household water demand is used without considering season-dependent water demands due to, e.g., garden irrigation or swimming pool filling.

  • Temporal data gaps are equally probable for all digital water meters, without having spatially concentrated losses due to poor radio coverage.

Influence of water demand sampling intervals

Figure 4 shows the influence of different water demand sampling intervals of the digital water meters on the LLS, using an exemplary scenario with a leakage size of 3 l/s at the analysis time of day ‘evening/night peak’. Thereby, a successfully localised leakage at a network node is marked with a green dot, whereas nodes represented in orange indicate leaks that were not correctly localised. As can be observed, there is a clear correlation between the water demand sampling resolution and the leak localisation success, with the localisation success decreasing as water demand measurements become coarser. In this regard, most of the leakages can be successfully localised with a temporal resolution of 5 min (Figure 4(a)). However, even with this resolution, there are a few areas that cannot be successfully localised with the present sensor locations. The LLS is 94% for the 5 min resolution, which is reduced to 79% for a sampling interval of 15 min (Figure 4(b)). As expected, the localisation success decreases further with a coarser resolution of the sampling time, being 65 and 60% for a temporal resolution of 1 and 4 h (Figure 4(c) and (d)), respectively. Interestingly, these temporal resolutions are only slightly better than using the quarterly water demand over one quarter (Figure 4(e)) or the quarterly water demand scaled to the inflow (Figure 4(f)), both with 59%.
Figure 4

Successfully localised leakages in comparison with not localised leakages for a sample scenario characterised by a leakage with a size of 3 l/s at the evening/night peak. Different sampling intervals of the digital water meters are considered: (a) 5 min, (b) 15 min, (c) 1 h, (d) 4 h, (e) quarterly demand, and (f) quarterly demand scaled to the inflow.

These evaluations were carried out for all scenarios characterised by different leakage sizes and sampling intervals at the four analysis times of the day. A summary of the results obtained across all 140 scenarios is shown with heatmaps in Figure 5, where the leakage size and the water demand sampling rate are shown on the x-axis and y-axis, respectively, for each heatmap. Thereby, the LLS is between 0.55 and 0.96 for the minimum night flow (Figure 5(a)) and between 0.51 and 0.92 for the average daily demand (Figure 5(c)). With higher water demands, corresponding to higher flows within the water distribution network, the LLS generally decreases and is between 0.30 and 0.91 for the morning peak (Figure 5(b)) and between 0.24 and 0.99 for the evening/night peak (Figure 5(d)).
Figure 5

LLS for different leakage sizes and sampling intervals for four analysis times of day: (a) minimum night flow, (b) morning peak, (c) average daily demand, and (d) evening/night peak.


In general, the best performances are attained at finer sampling intervals, i.e., 5 and 15 min across all scenarios considered, as they rely on high-resolution knowledge of water demand. The LLS decreases with a coarser sampling interval (e.g., 1–4 h), and the worst performance is shown for the scenarios that rely on quarterly water demand, without considering any spatial and temporal variations. Interestingly, the quarterly water demand scaled to the inflow performs well across all scenarios and has one of the best performances for low (minimum night flow, Figure 5(a)) and medium (average daily demand, Figure 5(c)) water demands. In general, larger leakages can be better localised than smaller leakages for all analysis times and sampling intervals. Larger leakages increase the flow in the pipes and thus the friction losses, resulting in higher pressure changes and a clearer identification of the correct candidate nodes.

In addition, the LLS varies with the analysis time of day and achieves, on average, the best results at the minimum night flow and the worst results at the morning and evening peaks. For example, the sampling interval of 15 min has a localisation success of 88% for a leakage size of 2 l/s at the minimum night flow, which decreases to 70% at the evening peak. In general, good results can be achieved at the minimum night flow for all sampling intervals, even for small leakages. In contrast, a finer sampling interval is required during the demand peaks to achieve reasonable results.

Interestingly, a finer sampling interval does not always result in a better localisation of the leakages. For example, a sampling interval of 15 min has the poorest performance (except the quarterly water demand) at the analysis point ‘minimum night flow’. This can be explained by the fact that all sampling intervals represent a mean value of the actual water demand over a specific period of time, whereas the pressure measurements are conducted at exactly the time of analysis in this work. The real water demand of the case study is 0.19 l/s at 03:00, whereas the measured average water demand is 0.14, 0.10, and 0.20 l/s for sampling intervals of 5 min, 15 min, and 1 h, respectively. Similarly, at the morning peak, the 15 min sampling interval is closest to the real water demand, followed by 1 h and 5 min, and the localisation success rate follows the same order. Consequently, the localisation success is higher if the average measured water demand is closer to the actual water demand at that specific time point.

Influence of data gaps

For the influence of data gaps on leakage localisation, the sampling interval of 15 min is used, as this resolution corresponds to the planned sampling interval of the digital water meters in the considered case study. A leakage size of 2 l/s at the analysis point morning peak is further applied for the analysis, as the LLS shows a high variation in the sampling intervals at this time point and thus a high impact of data gaps on the results is assumed. As a reference value for benchmarking, this scenario has an LLS of 73% without any temporal and spatial data gaps.

Figure 6 shows the LLS for different rates of missing data for the three imputation methods for data gap management. The label ‘combined’ refers to equal amounts of missing temporal and spatial data, while the labels ‘temporal’ and ‘spatial’ refer to only one type of data gap.
Figure 6

LLS with varying ranges of temporal and spatial data gaps for three different imputation methods (a) zero values, (b) historical mean, and (c) adapted to inflow in comparison with the benchmarking (73%).

As expected, overall performance decreases as the amount of missing data grows, because knowledge of the water demand becomes less reliable. However, only small differences between temporal and spatial data gaps are noticeable, whereas the combined scenarios usually lead to lower performance. The largest decrease is observed for the imputation method ‘zero values’, with LLS values between 38 and 66%. In this method, all missing values are replaced with zeros, which increases the difference between the total water demand of the case study and the water demand used in the model. This affects the hydraulic simulation and weakens the correlation between measured and simulated pressure residuals. The LLS of the imputation method ‘historical mean’ varies between 51 and 88%, whereas the imputation method ‘adapted to inflow’ yields a success rate between 51 and 94%.

Interestingly, some scenarios show an increase in the LLS compared to the reference value without any data gaps. One possible explanation is that, once the temporal and spatial data gaps are filled, the average water demands used in the model approximate the real water demands at the time of the pressure measurement (exactly 07:45) better than the complete 15 min sampling interval does. To illustrate this situation, Figure 7 compares the different water demand data used (in l/s). The water demand with 1 min resolution applied for simulating the observed pressure measurements (Figure 7(a)) is concentrated on a few network nodes, but with higher nodal demands of up to 0.2 l/s. At this specific point in time, the sum of the nodal demands has a temporary peak of 2.18 l/s. In contrast, the average nodal demand over 15 min is spread more widely across the network and is lower, with a total nodal demand of 1.97 l/s (Figure 7(b)). Through the imputation adapted to the inflow, the water demand per sampling interval is temporally redistributed (Figure 7(c)), and the total nodal demand increases to 2.21 l/s, which is similar to the real water demand at this point in time. For better differentiation, Figure 7(d) shows the difference between the 15 min sampling rate without data gaps and the 15 min sampling rate including the imputation of the data gaps.
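The effect of the three imputation methods on the total nodal demand can be illustrated with a small sketch. It is not the implementation used in this study: the function names, the use of `None` for a missing reading, and in particular the rule that only the imputed values are rescaled so that the nodal demands sum to the measured inflow are assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def impute_zero(readings: Sequence[Optional[float]]) -> List[float]:
    """'Zero values': every missing reading (None) becomes 0."""
    return [0.0 if r is None else r for r in readings]


def impute_historical_mean(
    readings: Sequence[Optional[float]], historical_mean: Sequence[float]
) -> List[float]:
    """'Historical mean': a missing reading is replaced by that node's mean."""
    return [h if r is None else r for r, h in zip(readings, historical_mean)]


def impute_adapted_to_inflow(
    readings: Sequence[Optional[float]],
    historical_mean: Sequence[float],
    inflow: float,
) -> List[float]:
    """'Adapted to inflow' (assumed variant): fill gaps with historical means,
    then rescale only the filled values so that all nodal demands sum to the
    inflow. 'inflow' is assumed to be the demand share of the measured inflow.
    """
    known_total = sum(r for r in readings if r is not None)
    gap_total = sum(h for r, h in zip(readings, historical_mean) if r is None)
    remaining = max(inflow - known_total, 0.0)
    factor = remaining / gap_total if gap_total > 0 else 0.0
    return [h * factor if r is None else r
            for r, h in zip(readings, historical_mean)]


# Hypothetical nodal demands in l/s; None marks a data gap.
readings = [0.2, None, 0.1, None]
historical_mean = [0.1, 0.1, 0.1, 0.3]
inflow = 1.1

for name, values in [
    ("zero values", impute_zero(readings)),
    ("historical mean", impute_historical_mean(readings, historical_mean)),
    ("adapted to inflow",
     impute_adapted_to_inflow(readings, historical_mean, inflow)),
]:
    print(f"{name}: total demand = {sum(values):.2f} l/s")
```

In this toy example only the inflow-adapted variant reproduces the total demand, which mirrors the observation that localisation benefits when the sum of estimated nodal demands matches the actual water demand.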
Figure 7

Spatial distribution of water demand for four different scenarios: morning peak for (a) 1 min demand data used for simulated observed pressure measurements, (b) 15 min sampling interval without data gaps, (c) 15 min sampling interval with temporal gaps of 20% and spatial data gaps of 0%, respectively, and (d) the difference between these 15 min sampling intervals.

These results also indicate that, even if complete time series are available, it is favourable to temporally smooth the water demand over several sampling intervals (e.g., with rolling averages) to make leakage localisation more robust against random/sudden temporal and spatial fluctuations in the water demand. Alternatively, pressure measurements recorded with a finer sampling interval could be averaged to match the sampling interval of the digital water meters, or averaged over multiple time steps.
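Both smoothing options can be sketched in a few lines. The helper names, the trailing window, and the example values below are illustrative assumptions, not part of the study's workflow.

```python
from typing import List, Sequence


def rolling_mean(values: Sequence[float], window: int) -> List[float]:
    """Trailing rolling mean over `window` samples.

    The first window-1 outputs average over the samples available so far.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    out: List[float] = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the sample leaving the window
        out.append(running / min(i + 1, window))
    return out


def block_mean(values: Sequence[float], block: int) -> List[float]:
    """Average consecutive, non-overlapping blocks of `block` samples,
    e.g. fifteen 1 min pressure values into one 15 min value."""
    if block < 1:
        raise ValueError("block must be at least 1")
    chunks = [values[i:i + block] for i in range(0, len(values), block)]
    return [sum(chunk) / len(chunk) for chunk in chunks]


# Hypothetical 15 min nodal demand (l/s), smoothed over two intervals.
print(rolling_mean([1, 2, 3, 4], window=2))

# Hypothetical 1 min pressure heads (m), averaged in blocks of two samples.
print(block_mean([1, 3, 5, 7, 9], block=2))
```

A trailing window is used here so that the smoothed value at a given time only depends on past readings, which is what an online early warning system would have available; a centred window would be an equally valid choice for offline analysis.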

Computational effort, limitations, and outlook

The applied sensitivity analysis compares measured with simulated data (Wan et al. 2022), requiring a hydraulic simulation for the leakage-free scenario as well as one hydraulic simulation for each possible leakage node for each leakage scenario. Due to the fine resolution of the network, 316 hydraulic nodes were selected as possible leakage nodes, resulting in 317 hydraulic simulations with EPANET, whereby the average computational time per leakage scenario was around 90 s. Thus, the sensitivity analysis is computationally efficient compared to other model-based leakage localisation methods, and the computational time strongly correlates with the number of possible leakage nodes.
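The structure of this procedure, and why its cost scales with the number of candidate nodes, can be sketched as follows. The hydraulic simulations are not run here: the pressure vectors are assumed to have been computed beforehand (e.g., with EPANET), and the function names, node labels, and pressure values are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if std_x == 0 or std_y == 0:
        return 0.0
    return cov / (std_x * std_y)


def rank_leak_candidates(
    measured: Sequence[float],
    leak_free: Sequence[float],
    leak_scenarios: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Rank candidate nodes by the correlation between the observed pressure
    residuals (measured minus leak-free simulation) and the simulated
    residuals (leak-at-node simulation minus leak-free simulation)."""
    observed = [m - b for m, b in zip(measured, leak_free)]
    scores = {}
    for node, pressures in leak_scenarios.items():
        simulated = [p - b for p, b in zip(pressures, leak_free)]
        scores[node] = pearson(observed, simulated)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def n_simulations(n_candidates: int) -> int:
    """One leak-free run plus one run per candidate leakage node."""
    return n_candidates + 1


# Hypothetical pressures (m) at four sensors.
leak_free = [50.0, 48.0, 45.0, 47.0]
leak_scenarios = {
    "N1": [49.0, 47.5, 44.9, 46.8],
    "N2": [49.9, 47.8, 44.0, 46.5],
}
measured = [49.1, 47.4, 44.9, 46.8]

ranking = rank_leak_candidates(measured, leak_free, leak_scenarios)
print(ranking[0][0])        # best-matching candidate node
print(n_simulations(316))   # simulations needed for 316 candidates
```

The highest-ranked nodes would then define the candidate region for a subsequent fine search.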

This work is based on the following assumptions, influencing the results of model-based leakage localisation:

  • First, a perfectly calibrated hydraulic model of the water distribution network and no measurement errors are assumed. However, in reality, the calibrated hydraulic model and the required pressure measurements are subject to uncertainties, potentially increasing the distance between identified leakage regions and the real leakage place (Marzola et al. 2022).

  • Second, a perfect detection and estimation of the size of simulated leaks was assumed, without considering different characteristics of leaks (e.g., abrupt or incipient). However, since this assumption is directly incorporated into the leakage localisation method, it is expected that the LLS will decrease in case of incorrect estimations of the leakage size.

  • Third, only one model-based method was applied. As shown in the literature, the results may differ for other techniques (Casillas Ponce et al. 2014), which suggests that combining multiple techniques could make leakage localisation more robust.

  • Fourth, a single household water demand profile was applied, showing similar total water demand patterns throughout the year. A more realistic case study would include heterogeneous water demand profiles, covering domestic, commercial, touristic, and agricultural water users, which either increase (e.g., season-dependent outdoor household water end uses, such as garden irrigation or swimming pool filling) or decrease (e.g., constant consumption) seasonal and daily water demand fluctuations. Future research could therefore investigate the effect of different water demand profiles, including their combinations, on the leakage localisation efficiency.

  • Finally, an equal probability of temporal data gaps was assumed for all digital water meters, whereas in reality data losses are likely to be spatially clustered due to poor radio coverage and could therefore affect the localisation performance differently. This topic could be addressed by future research.

Therefore, the results obtained under perfect conditions in this work can be seen as an upper limit, and the LLS is expected to decrease in reality or when the above assumptions are relaxed. The various uncertainty parameters should be considered together within a comprehensive sensitivity/robustness analysis; a systematic error propagation still requires further research (Mohan Doss et al. 2023). This will be especially relevant for small leakage sizes, as the pressure changes caused by the leakage are minor compared to the mentioned uncertainties.

Nonetheless, it is clear from the results of this work that the applied nodal demand estimation has a major influence on the effectiveness of model-based leakage localisation. Therefore, the available resolution of the water demand data and the time of analysis for leakage localisation need to be carefully coordinated, as the achievable efficiency varies throughout the day. On the other hand, experiences from the real-world implementation of the digital water meters in the case study showed that the effort required to install and operate such a system is still quite high (Oberascher et al. 2024). Consequently, sufficient benefits are required to compensate for the initial investment; these depend on the case study (e.g., water shortage, pumping or treatment required) and require an individual and detailed quantitative assessment.

As part of the above-mentioned smart city project, one or more simultaneous leakages will be simulated in reality to estimate the potential of model-based leakage localisation in a real-world environment and test how the results of this study will change in such settings.

Water losses are a major challenge for the operation of water distribution networks, and a timely detection and localisation of leakages is of greatest interest for network operators and municipalities. Hydraulic model-based methods can be applied for leak localisation. They estimate the leak location by minimising the differences between simulated and measured pressure time series. Yet, they require nodal demands as an input for the numerical model. Therefore, high-resolution water meter readings can be utilised to enhance the performance of leakage localisation. However, data from digital water meters can be recorded with different sampling resolutions due to hardware/software constraints and are also subject to temporal (e.g., packet losses during data communication) and spatial (e.g., consumers' agreement for the installation due to privacy regulations) data gaps.

In this work, an existing water distribution network is first extended with synthetic demand data simulated with PySIMDEUM (Steffelbauer et al. 2022a), i.e., the Python implementation of the state-of-the-art stochastic end-use model SIMDEUM (Blokker et al. 2010), to obtain high temporal and spatial variation of water demand across the nodes in the network. Afterwards, artificial leakages with different leakage sizes are implemented at each possible network node, and the candidate region for the subsequent fine search is determined by using a sensitivity-based approach with Pearson correlation. Using these settings, the aim was to address the following questions to support a real-world implementation of an early warning system for leakage detection and localisation: ‘What sampling resolution is required for water demand data recording to achieve an effective leakage localisation?’ and ‘What is the impact of temporal and spatial data gaps?’

Based on the obtained results, the following conclusions can be made in reply to the above questions:

  • If leak localisation is run at times of day with low water demand, nearly every tested sampling interval showed a high success rate for leakage localisation. For example, there were only minor differences between coarser (e.g., from 1 h to quarterly readings) and finer (e.g., 5–15 min) temporal resolutions at the minimum night flow.

  • However, a finer temporal resolution (e.g., 15 min or less) is required for the successful localisation of leakages during periods with higher demands or even at peak demands, and the performance improves with the leakage size.

  • If the sum of applied nodal demand estimations corresponds to the real water demand of the case study, the LLS increases (e.g., from 0.73 without any data gaps to 0.81 for a temporal data gap of 20% for a leakage size of 2 l/s). In this context, quarterly readings scaled to the inflow show a good performance also during the day and represent a good alternative in case of missing high-resolution demand data.

  • Temporal and spatial data gaps of demand data generally decrease the performance with an increasing amount of missing data. However, the choice of a data imputation method strongly influences the result, as two out of the three tested methods show an increase in the LLS even beyond the reference value without any data gaps.

  • These findings also suggest that temporal averaging of water demand data (e.g., rolling average) is favourable even if complete time series are available to become more robust against random/sudden temporal and spatial fluctuations of water demand magnitudes and patterns.

  • Likewise, a similar behaviour is expected if the pressure measurements with a finer sampling interval are resampled to the sampling interval of the digital water meters or averaged over multiple time steps.

This publication was produced as part of the ‘REWADIG’ project. This project was funded by the Climate and Energy Fund and is part of the programme ‘Smart Cities Demo – Boosting Urban Innovation 2020’ (project 884788).

Data cannot be made publicly available; readers should contact the corresponding author for details.

The authors declare there is no conflict.

2016/679/EU 2016 Regulation on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation). European Parliament.
Antzoulatos G., Mourtzios C., Stournara P., Kouloglou I. O., Papadimitriou N., Spyrou D., Mentes A., Nikolaidis E., Karakostas A., Kourtesis D., Vrochidis S. & Kompatsiaris I. 2020 Making urban water smart: The SMART-WATER solution. Water Sci. Technol. 82 (12), 2691–2710. https://doi.org/10.2166/wst.2020.391.
Blokker E. J. M., Vreeburg J. H. G. & Dijk J. C. v. 2010 Simulating residential water demand with a stochastic end-use model. J. Water Resour. Plann. Manage. 136 (1), 19–26. https://doi.org/10.1061/(ASCE)WR.1943-5452.0000002.
Blokker M., Agudelo-Vera C., Moerman A., van Thienen P. & Pieterse-Quirijns I. 2017 Review of applications for SIMDEUM, a stochastic drinking water demand model with a small temporal and spatial scale. Drinking Water Eng. Sci. 10 (1), 1–12. https://doi.org/10.5194/dwes-10-1-2017.
Casillas Ponce M. V., Garza Castañón L. E. & Cayuela V. P. 2014 Model-based leak detection and location in water distribution networks considering an extended-horizon analysis of pressure sensitivities. J. Hydroinf. 16 (3), 649–670. https://doi.org/10.2166/hydro.2013.019.
Colombo A. F. & Karney B. W. 2002 Energy and costs of leaky pipes: Toward comprehensive picture. J. Water Resour. Plann. Manage. 128 (6), 441–450. https://doi.org/10.1061/(ASCE)0733-9496(2002)128:6(441).
Cominola A., Giuliani M., Castelletti A., Rosenberg D. E. & Abdallah A. M. 2018 Implications of data sampling resolution on water use simulation, end-use disaggregation, and demand management. Environ. Modell. Software 102, 199–212. https://doi.org/10.1016/j.envsoft.2017.11.022.
Daniel I., Pesantez J., Letzgus S., Khaksar Fasaee M. A., Berglund E., Mahinthakumar G. & Cominola A. 2022 A sequential pressure-based algorithm for data-driven leakage identification and model-based localization in water distribution networks. J. Water Resour. Plann. Manage. 148. https://doi.org/10.1061/(ASCE)WR.1943-5452.0001535.
EPA 2013 Water Audits and Water Loss Control for Public Water Systems.
EurEau 2017 Europe's Water in Figures – An Overview of the European Drinking Water and Waste Water Sectors.
Farah E. & Shahrour I. 2017 Leakage detection using smart water system: Combination of water balance and automated minimum night flow. Water Resour. Manage. 31 (15), 4821–4833. https://doi.org/10.1007/s11269-017-1780-9.
Heydari Z., Cominola A. & Stillwell A. S. 2022 Is smart water meter temporal resolution a limiting factor to residential water end-use classification? A quantitative experimental analysis. Environ. Res. Infrastruct. Sustainability 2 (4). https://doi.org/10.1088/2634-4505/ac8a6b.
Hu Z., Chen B., Chen W., Tan D. & Shen D. 2021 Review of model-based and data-driven approaches for leak detection and location in water distribution systems. Water Supply 21 (7), 3282–3306. https://doi.org/10.2166/ws.2021.101.
Huang Y., Zheng F., Kapelan Z., Savic D., Duan H. F. & Zhang Q. 2020 Efficient leak localization in water distribution systems using multistage optimal valve operations and smart demand metering. Water Resour. Res. 56 (10). https://doi.org/10.1029/2020wr028285.
Jun S., Jung D. & Lansey K. E. 2021 Comparison of imputation methods for end-user demands in water distribution systems. J. Water Resour. Plann. Manage. 147 (12). https://doi.org/10.1061/(asce)wr.1943-5452.0001477.
Klise K. A., Bynum M., Moriarty D. & Murray R. 2017 A software framework for assessing the resilience of drinking water systems to disasters with an example earthquake case study. Environ. Modell. Software 95, 420–431. https://doi.org/10.1016/j.envsoft.2017.06.022.
Li R., Huang H., Xin K. & Tao T. 2015 A review of methods for burst/leakage detection and location in water distribution systems. Water Supply 15 (3), 429–441. https://doi.org/10.2166/ws.2014.131.
Liemberger R. & Wyatt A. 2019 Quantifying the global non-revenue water problem. Water Supply 19 (3), 831–837. https://doi.org/10.2166/ws.2018.129.
Marzola I., Alvisi S. & Franchini M. 2022 A comparison of model-based methods for leakage localization in water distribution systems. Water Resour. Manage. 36 (14), 5711–5727. https://doi.org/10.1007/s11269-022-03329-4.
Mohan Doss P., Rokstad M. M., Steffelbauer D. & Tscheikner-Gratl F. 2023 Uncertainties in different leak localization methods for water distribution networks: A review. Urban Water J. 1–15. https://doi.org/10.1080/1573062x.2023.2229301.
Mora-Rodríguez J., Delgado-Galván X., Ramos H. M. & López-Jiménez P. A. 2013 An overview of leaks and intrusion for different pipe materials and failures. Urban Water J. 11 (1), 1–10. https://doi.org/10.1080/1573062x.2012.739630.
Oberascher M., Möderl M. & Sitzenfrei R. 2020 Water loss management in small municipalities: The situation in Tyrol. Water 12, 12. https://doi.org/10.3390/w12123446.
Oberascher M., Rauch W. & Sitzenfrei R. 2022 Towards a smart water city: A comprehensive review of applications, data requirements, and communication technologies for integrated management. Sustainable Cities Soc. 76. https://doi.org/10.1016/j.scs.2021.103442.
Oberascher M., Maussner C., Hinteregger P., Knapp J., Halm A., Kaiser M., Gruber W., Truppe D., Eggeling E. & Sitzenfrei R. 2024 Experiences from a Large-Scale Implementation of Digital Water Meters Used for Improved Leakage Management. EGU General Assembly, Vienna, Austria, 14–19 Apr 2024, EGU24-8150. https://doi.org/10.5194/egusphere-egu24-8150.
Pedregosa F., Varoquaux G., Gramfort A., Michel V., Thirion B., Grisel O., Blondel M., Prettenhofer P., Weiss R. & Dubourg V. 2011 Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830.
Perez R., Sanz G., Puig V., Quevedo J., Escofet M. A. C., Nejjari F., Meseguer J., Cembrano G., Tur J. M. M. & Sarrate R. 2014 Leak localization in water networks: A model-based methodology using pressure sensors applied to a real network in Barcelona. IEEE Control Syst. Mag. 34 (4), 24–36. https://doi.org/10.1109/MCS.2014.2320336.
Puust R., Kapelan Z., Savic D. A. & Koppel T. 2010 A review of methods for leakage management in pipe networks. Urban Water J. 7 (1), 25–45. https://doi.org/10.1080/15730621003610878.
Romero-Ben L., Alves D., Blesa J., Cembrano G., Puig V. & Duviella E. 2022 Leak localization in water distribution networks using data-driven and model-based approaches. J. Water Resour. Plann. Manage. 148. https://doi.org/10.1061/(ASCE)WR.1943-5452.0001542.
Romero-Ben L., Alves D., Blesa J., Cembrano G., Puig V. & Duviella E. 2023 Leak detection and localization in water distribution networks: Review and perspective. Annu. Rev. Control 55, 392–419. https://doi.org/10.1016/j.arcontrol.2023.03.012.
Rossman L. A., Woo H., Tryby M., Shang F., Janke R. & Haxton T. 2020 EPANET 2.2 User Manual Water Infrastructure Division. Center for Environmental Solutions and Emergency Response, U.S. Environmental Protection Agency, Cincinnati, OH.
Sanz G. & Pérez R. 2014 Demand pattern calibration in water distribution networks. Procedia Eng. 70, 1495–1504. https://doi.org/10.1016/j.proeng.2014.02.164.
Spedaletti S., Rossi M., Comodi G., Cioccolanti L., Salvi D. & Lorenzetti M. 2022 Improvement of the energy efficiency in water systems through water losses reduction using the district metered area (DMA) approach. Sustainable Cities Soc. 77. https://doi.org/10.1016/j.scs.2021.103525.
Steffelbauer D., Hillebrand B. & Blokker E. 2022a pySIMDEUM – An open-source stochastic water demand end-use model in Python. In: 2nd International Joint Conference on Water Distribution Systems Analysis & Computing and Control in the Water Industry. Universitat Politècnica de València, Valencia, Spain.
Steffelbauer D. B., Deuerlein J., Gilbert D., Abraham E. & Piller O. 2022b Pressure-leak duality for leak detection and localization in water distribution systems. J. Water Resour. Plann. Manage. 148 (3), 04021106. https://doi.org/10.1061/(ASCE)WR.1943-5452.0001515.
Vrachimis S., Eliades D., Taormina R., Kapelan Z., Ostfeld A., Liu S., Kyriakou M., Pavlou P., Qiu M. & Polycarpou M. 2022 Battle of the leakage detection and isolation methods. J. Water Resour. Plann. Manage. 148, 04022068. https://doi.org/10.1061/(ASCE)WR.1943-5452.0001601.
Wan X., Kuhanestani P. K., Farmani R. & Keedwell E. 2022 Literature review of data analytics for leak detection in water distribution networks: A focus on pressure and flow smart sensors. J. Water Resour. Plann. Manage. 148, 10. https://doi.org/10.1061/(asce)wr.1943-5452.0001597.
Zyoud S. H., Kaufmann L. G., Shaheen H., Samhan S. & Fuchs-Hanusch D. 2016 A framework for water loss management in developing countries under fuzzy environment: Integration of fuzzy AHP with fuzzy TOPSIS. Expert Syst. Appl. 61, 86–105. https://doi.org/10.1016/j.eswa.2016.05.016.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).