Experiment-based comparative evaluation of machine learning techniques for leak detection in water distribution systems

Leakage in water distribution systems is a significant, long-standing problem owing to the huge economic and ecological losses it causes. Leak detection studies using different types of technologies and data have been examined in the literature. Although machine learning techniques have achieved tremendous progress in outlier detection, they remain limited in water leak detection applications. This research aims to improve leak detection performance by refining the choice of learning data and techniques. From this perspective, commonly used techniques for leak detection are assessed in this paper, and the characteristics of hydraulic data are investigated. Four intelligent algorithms are compared, namely k-nearest neighbors, support vector machines, logistic regression, and the multi-layer perceptron. The study focuses on six experiments based on identifying outliers in various packages of pressure and flow data (yearly data, seasonal data, night data, and flow difference data) to detect leakage in water distribution networks. Different scenarios of realistic water demand in two networks from the benchmark dataset LeakDB are used. Results demonstrate that the leak detection accuracy varies between 30% and 100%, depending on the experiment and the choice of algorithms and data.


INTRODUCTION
Water stress is a prominent issue across the globe. Among the most plausible reasons are the critical condition of water distribution networks and the uncontrolled, untreated leaks and breaks in distribution pipelines all over the world. According to Kanakoudis et al. (2015), around 50% of the water volume entering a water distribution system is lost. Therefore, inspecting and monitoring water distribution networks to detect and repair leaks can reduce this huge water loss (Kanakoudis & Tolikas 2001). Leakage detection and location have become the central focus of many research works in recent years. Several software-based leak detection methods have been presented in the literature. Some of them are based on transient analysis of pressure wave reflection using physical equations (Srirangarajan et al. 2013); others use statistical equations and algorithms grounded in pressure point analysis (bin Md Akib et al. 2011).
Further methods based on model conception encompass all the hydraulic characteristics of a real distribution network using mathematical equations (Adedeji et al. 2017). Recently, with the emergence of big data and artificial intelligence algorithms, data-driven methods have appeared, particularly prediction and classification methods resting on machine learning-based outlier detection (Wu & Liu 2017). These methods rely on data analysis to find abnormal features that do not conform to the expected behavior. In the leak detection literature, such methods have been developed to detect leaks as abnormal patterns in water distribution networks by analyzing hydraulic data, mainly pressure and flow. Different data types and machine learning techniques have been used to build intelligent and efficient leak detection solutions. Some research works were based on historical datasets from test networks, while others worked with datasets generated from modeled real networks.
The key to the efficiency of outlier detection methods, as a learning category, lies in the quality of the input data and their compatibility with the machine learning techniques (Ahmed et al. 2016). However, consistent noise in the data tends to resemble actual outliers, which makes them difficult to distinguish and classify. For example, in water distribution networks, the variations of pressure and flow data are closely related to changes in water consumption demands. These consumptions vary periodically, according to seasons and days, and randomly due to unpredictable demands or repairs. Therefore, a leak can appear as an unpredictable increase in demand, which makes it difficult to classify as an outlier. Furthermore, an increase in demand that is not associated with leakage in the training data can deeply disturb the learning quality and, subsequently, the detection efficiency. Within this framework, this study presents six experiments, each using different pressure and flow training data: yearly data, seasonal or night data, flow difference data, and data with incipient or abrupt leaks. All these packages are trained separately to eliminate the related noise and to evaluate the consequent effect in each experiment. This work aims to compare and evaluate four machine learning algorithms for leak detection applications: artificial neural networks (ANN), k-nearest neighbors (KNN), support vector machines (SVM), and logistic regression (LR). This experimental comparison rests upon the benchmark dataset LeakDB, using different scenarios of realistic water demand.
The chief objective of this investigation is to enhance the performance of leak detection methods by evaluating the behavior of learning techniques. Different experiments were conducted to examine the effect of the choice of learning data on the efficiency of machine learning algorithms. Additionally, this research represents, to our knowledge, the first comparative work that uses the common benchmark dataset LeakDB. The results can contribute considerably to the development of robust leak detection methods for water distribution networks based on artificial intelligence techniques.
The next section reviews the related literature. The general research methodology is then identified, and the proposed dataset and algorithms are introduced. The paper then presents the comparative experiments and the obtained results, and finally concludes the work and provides perspectives for future research.

RELATED WORKS
Artificial intelligence methods are frequently applied in the leak detection field to improve the performance and efficiency of software solutions, which are mostly either very complex and expensive or inefficient. Numerous machine learning algorithms have been adopted. Unsupervised methods are used more frequently than supervised ones because the latter require a labeled dataset for training, which is not always available and is particularly expensive to produce. Caputo & Pelagagge (2002) presented a prediction-based leak detection method using ANN on generated data. They compared three architectures, the probabilistic neural network (PNN), the radial basis function (RBF) network, and the multi-layer perceptron (MLP), and found that MLP gives the best results in predictive capacity and noise rejection. Mounce et al. (2011) applied SVM for anomaly detection using flow and pressure data. Kang et al. (2017) introduced a water leakage detection system using acoustic noise data; they combined a one-dimensional convolutional neural network (CNN) with SVM, and the method achieved an accuracy of 99.3%. Chen et al. (2004) highlighted a pressure transient wave-based method, using SVM to detect negative pressure waves in the pressure curve; their method offered better performance than other transient wave-based ones. Zhou et al. (2019) also developed a leak identification method based on machine learning techniques, analyzing real recorded transient pressure wave data. They used a CNN to extract the informative texture features of transient wave samples and a deep neural network (DNN) to train the neural weights and produce a leak event indication. It has been reported that this method achieves a failure rate lower than 6 × 10⁻⁴ when the signal-to-noise ratio is 0 dB.
Ayadi et al. (2019) adopted a combined kernelized leak detection technique based on pressure data, using Kernel Fisher Discriminant Analysis as a dimensionality reduction technique. Additionally, they compared the One-Class Support Vector Machine (OCSVM) and KNN classifiers; results revealed that OCSVM was more efficient than KNN. Oliveira et al. (2018) proposed a classification method for leak detection using the LR algorithm, based on defining the threshold between anomalies and normal data; they found that the choice of threshold is a compromise between improving accuracy and decreasing false alarms. Kayaalp et al. (2017) analyzed pressure data with KNN for real-time leakage detection and location. Bjerke (2019) used generated data from the LeakDB dataset and compared two recurrent neural network architectures for leak detection. Results disclosed good accuracies depending on the chosen scenarios. However, that study did not use the original data from the LeakDB dataset, and incipient leaks, which are hard to detect, were neglected.
The previous works centered on improving the efficiency of detection and classification and on optimizing the parameters and functions of the proposed algorithms. Most of them neglected the difficulties related to the data type, the flow behavior of water, or the disturbances of consumption demands. In fact, they rarely specified the architectures and sizes of their water distribution networks, or the sizes and types of the injected leaks. Furthermore, they adopted an unsupervised learning mode, which further complicates the validation of their proposed models.
The present work offers a clear comparative study of different algorithms within a supervised learning framework, using the benchmark dataset LeakDB. An experimental implementation using the same dataset and the same types of networks and leaks is used to compare the performance and efficiency of the intelligent techniques and to evaluate their behavior when applied to hydraulic data. The methodology of the study and the various experiments are detailed in the next sections.

RESEARCH METHODOLOGY
Software methods for detecting leaks in water distribution networks are mainly based on analyzing hydraulic data: pressure, flow, or acoustic noise data. The variations of these data depend on the architecture of the installations, their dimensions, shapes, and positions, and on the water runoff in the pipelines, which relies on water demands. These demands vary periodically according to the usual consumption and sometimes randomly due to unpredictable demands or repairs. The leakage phenomenon can be regarded as a sudden or gradual increase in water demand, which complicates its classification as an outlier in certain cases. In fact, gradual increases in water demand may reflect seasonal changes, intensive water demand, or the existence of large or progressively growing leaks. These factors, in addition to ordinary measurement disturbances, distribution pipe faults, and even environmental changes, yield noisy flow and pressure data. This can greatly affect the behavior and performance of detection techniques, especially machine learning techniques.
These findings motivated this research to propose an experimental study that compares and evaluates the most-used intelligent learning techniques, namely KNN, SVM, LR, and MLP. These techniques are applied in a series of experiments, each using a different package of pressure and flow data, such as yearly, seasonal, night, or flow difference data, or data with incipient or abrupt leaks. Each package choice allows us to eliminate the linked noise and to assess the consequent effect on the detection results. Figure 1 shows the general process of this evaluation. The packages are created from the LeakDB benchmark database, which generates pressure and flow data using a realistic demand scenario for two different networks: the Hanoi Water Distribution Network (WDN) and Net-1. The central objective of this research work is to identify anomalies in each data package.

EVALUATION SUPPORTS
To test the behavior of different types of intelligent techniques for leak detection applications, we compare four analytical techniques belonging to different categories of learning (KNN, SVM, LR, and MLP) using the benchmark dataset LeakDB.

LeakDB dataset
LeakDB is a benchmark dataset of simulated hydraulic data created in 2018 by Vrachimis et al. (2018) using the Python library Water Network Tool for Resilience (Klise et al. 2017) and EPANET, which is widely used in the analysis of the hydraulic characteristics of water distribution networks (Cheung et al. 2005).
This dataset rests on two different architectures of water distribution networks: Net-1 and Hanoi. Net-1 is a simple, small, virtual WDN architecture used for small-scale testing. It consists of nine nodes, one reservoir, and one tank. Hanoi is a simplified architecture of the real WDN of the city of Hanoi (Vietnam). It consists of one reservoir, 32 nodes, and 34 pipes. Figure 2 depicts these architectures. For each WDN, 1,000 scenarios were created, with structural parameters modified from the original values based on an uncertainty value fixed at 25%. Demand nodes were simulated for 1 year based on a model of historical real data from water utilities. The model relies on three signal components, namely a yearly component, a weekly periodic component, and a random component (Eliades & Polycarpou 2012). The yearly component describes variations in water consumption due to seasonal change over the course of a year. The weekly periodic component displays changes in demand signals throughout a week based on the social and economic characteristics of consumers. Both are approximated using Fourier series (Yamauchi & Huang 1977). The random component describes random variations due to unpredictable demands or repairs. It was created using a normal distribution with zero mean and a standard deviation fixed at 0.33 in this dataset. Figure 3 illustrates an example of the three components. Flow and pressure data were generated every 30 min and stored as .csv files. Leaks occur randomly in terms of date, duration, type, position, and size. A scenario may contain 0 to 2 leaks, incipient or abrupt, with a diameter varying in the range of [2 cm, 20 cm]. A leak can last from a few hours to several months.
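As a rough illustration of this demand model, the sketch below combines a yearly term and a weekly term (each reduced to a single Fourier harmonic) with the Gaussian random component of standard deviation 0.33; the amplitudes and base demand are hypothetical, since the dataset fits its periodic components to historical utility data.

```python
import numpy as np

def synthetic_demand(n_days=365, samples_per_day=48, base=1.0, seed=0):
    """Toy version of the LeakDB three-component demand model: a yearly
    seasonal term and a weekly periodic term (single-harmonic stand-ins
    for the Fourier series) plus Gaussian noise with zero mean and a
    standard deviation of 0.33."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_days * samples_per_day) / samples_per_day  # time in days
    yearly = 0.2 * np.cos(2 * np.pi * t / 365.0)   # seasonal variation
    weekly = 0.1 * np.cos(2 * np.pi * t / 7.0)     # weekly consumption pattern
    random = rng.normal(0.0, 0.33, size=t.size)    # unpredictable demands/repairs
    return base * (1.0 + yearly + weekly) + random

demand = synthetic_demand()
print(demand.shape)  # → (17520,)  one year of 30-minute samples
```

Summing the components sampled every 30 minutes, as here, mirrors how LeakDB stores one value per node per half hour over a year.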

Machine learning algorithms
KNN is a classification algorithm categorized as a lazy learning algorithm (Zhang 2016). It classifies new input data directly according to their distances from previously classified data, without building a predictive model. In our case, there are two classes: data related to normal consumption and abnormal data due to a leak. KNN calculates the distances to, and stores, the k nearest neighbors (k = 3) of each sample of pressure or flow input, and then deduces the class of a new input from the majority class among those neighbors.
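A minimal sketch of this majority vote with k = 3, using scikit-learn and hypothetical one-dimensional pressure samples (label 1 = leak, 0 = normal), not actual LeakDB values:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical pressure samples: label 1 = leak, 0 = normal.
X_train = np.array([[50.1], [49.8], [50.3], [42.0], [41.5], [41.9]])
y_train = np.array([0, 0, 0, 1, 1, 1])

# k = 3, as in the study: a new sample takes the majority class
# among its three nearest training samples.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)
print(knn.predict([[41.7], [50.0]]))  # → [1 0]
```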
SVM is an algorithm belonging to the category of linear classifiers. The process resides in separating the different input data classes with a hyper-plane that maximizes the margin between them; the training points closest to this hyper-plane are called support vectors. SVM plots each sample of pressure and flow data as a point in n-dimensional space, then estimates the most plausible location of the hyper-plane between the classes. The algorithm is thus trained to predict the class of a new input. SVM relies on its kernel function, which transforms the input vectors from a low-dimensional space to a high-dimensional one. In this study, the radial basis function (RBF) kernel was chosen based on the best results reported in the literature (Schölkopf 2018).
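The same two-class setup with an RBF kernel can be sketched as follows; the (pressure, flow) values are hypothetical illustrations, not dataset measurements:

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical (pressure, flow) pairs: label 1 = leak, 0 = normal.
X_train = np.array([[50.0, 100.0], [49.5, 102.0], [50.5, 99.0],
                    [42.0, 130.0], [41.5, 128.0], [42.5, 131.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])

# RBF kernel, as retained in the study; the margin is maximized in the
# kernel-induced feature space.
svm = SVC(kernel="rbf", gamma="scale")
svm.fit(X_train, y_train)
print(svm.predict([[42.0, 129.0], [50.0, 101.0]]))  # → [1 0]
```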
MLP is a category of ANN which models a mathematical relationship between input training data and their output labels. An MLP consists of input and output layers, in addition to one or more non-linear hidden layers. This study uses an architecture with one hidden layer of 100 units and a rectified linear (ReLU) activation function.
LR is a linear statistical algorithm that estimates a linear relationship between input data and their output labels to predict the classification of test data. For the binary classification presented in this study, the sigmoid activation function of logistic regression is the most appropriate (Rymarczyk et al. 2019).
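The MLP and LR configurations described above can be sketched together as follows; the two synthetic (pressure, flow) clusters are hypothetical stand-ins for normal and leaky readings:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.linear_model import LogisticRegression

# Two hypothetical, well-separated (pressure, flow) clusters; label 1 = leak.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal([50.0, 100.0], 0.5, size=(50, 2)),   # normal
               rng.normal([42.0, 130.0], 0.5, size=(50, 2))])  # leaky
y = np.array([0] * 50 + [1] * 50)

# MLP as described in the study: one hidden layer of 100 units, ReLU.
mlp = MLPClassifier(hidden_layer_sizes=(100,), activation="relu",
                    solver="lbfgs", max_iter=2000, random_state=0).fit(X, y)
# LR applies the sigmoid (logistic) function to a linear score.
lr = LogisticRegression().fit(X, y)

print(mlp.predict([[42.1, 129.5]])[0], lr.predict([[42.1, 129.5]])[0])  # → 1 1
```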

Metrics
Detection capacity, false alarm rate, and accuracy were calculated at every node using the standard classification metrics of the confusion matrix: true positive rate (TPR), false positive rate (FPR), and accuracy (ACC). TPR, also called sensitivity, measures the proportion of data labeled as leaks that are correctly identified; it clearly demonstrates the detection capacity. FPR measures the proportion of data labeled as non-leaks that are incorrectly identified as leaks (it is the complement of specificity); it indicates the false alarm rate. ACC measures the proportion of correctly labeled data. Here, true positive (TP) indicates the number of leaky data correctly identified; false positive (FP) corresponds to the number of non-leaky data incorrectly identified as leaky; false negative (FN) stands for the number of leaky data incorrectly identified as non-leaky; and true negative (TN) refers to the number of non-leaky data correctly identified.
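In terms of these counts, the three metrics reduce to the usual confusion-matrix ratios, sketched below with made-up example counts:

```python
def classification_metrics(tp, fp, fn, tn):
    """TPR (sensitivity), FPR (false-alarm rate), and accuracy,
    computed from the four confusion-matrix counts."""
    tpr = tp / (tp + fn)              # proportion of leaks correctly detected
    fpr = fp / (fp + tn)              # proportion of non-leaks flagged as leaks
    acc = (tp + tn) / (tp + fp + fn + tn)
    return tpr, fpr, acc

# Example: 80 of 100 leaks detected, 10 false alarms among 200 non-leaks.
tpr, fpr, acc = classification_metrics(tp=80, fp=10, fn=20, tn=190)
print(tpr, fpr, acc)  # → 0.8 0.05 0.9
```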

EMPIRICAL VALIDATION & RESULTS
Considering the large size of the LeakDB dataset and the similarity of events, this study did not use all scenarios. In fact, unrealistic scenarios, with leaks lasting only a few hours or several months, were ignored. The set of selected scenarios includes the different categories of leaks and demands in each network. Two reduced datasets were defined. The first consists of 54 leakage scenarios from the Hanoi network, containing 34 incipient leaks and 35 abrupt leaks. The second consists of 13 scenarios from the Net-1 network, containing nine incipient leaks and nine abrupt leaks. Tables 1 and 2 list all used scenarios and the characteristics of their leaks: leak nodes, types, durations, and leak demand ranges in cubic meters per hour (CMH). This choice ensures the presence of both types of leaks, with different sizes, at all node positions, with durations extending from 7 days to 8 months.
Subsequently, a series of implementations was carried out using a kit of seven scenarios for training and three scenarios for testing from each reduced dataset. In this section, the undertaken experiments as well as the obtained results are presented in detail.
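A kit of this kind can be formed as in the following sketch; the scenario identifiers and the random draw are illustrative assumptions, since the paper does not state how the seven training and three testing scenarios were selected:

```python
import random

def build_kit(scenario_ids, n_train=7, n_test=3, seed=0):
    """Form one training/testing kit: seven scenarios for training and
    three for testing, drawn without replacement from a reduced
    dataset of scenario identifiers."""
    rng = random.Random(seed)
    picked = rng.sample(list(scenario_ids), n_train + n_test)
    return picked[:n_train], picked[n_train:]

train_ids, test_ids = build_kit(range(1, 55))  # e.g. 54 Hanoi scenario numbers
print(len(train_ids), len(test_ids))  # → 7 3
```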

Leaky and non-leaky scenarios
In the first validation, all types of scenarios were tested. The 64 leaky scenarios from the Hanoi network were mixed with 40 randomly chosen non-leaky scenarios, and the 22 leaky scenarios from the Net-1 network were mixed with 15 randomly chosen non-leaky scenarios. Mixing leaky and non-leaky scenarios, and using different kits of scenarios instead of a single scenario, guaranteed the evaluation of different types of demands and of leaks of different sizes. Table 3 summarizes the results of applying the artificial intelligence (AI) algorithms for leak detection using pressure and flow data from Net-1 and Hanoi.

Only leaky scenarios
In this experiment, to evaluate the effect of labeled data on leak detection learning, all non-leaky scenarios were eliminated, and the training and testing kits were constructed only from the reduced datasets of scenarios from Net-1 and Hanoi. The results are outlined in Table 4.

Abrupt and incipient leaks
In water distribution networks, leaks fall into two types: abrupt leaks and incipient leaks. Figure 4 displays an example of the flow and pressure variation for each type. According to Vrachimis et al. (2018), incipient leaks are small and grow gradually; they are the hardest to detect. Abrupt leaks, in contrast, are large and start with peaks. This third experiment assesses the performance of the AI algorithms according to the type of leak. Two types of kits are used: one contains only incipient leaky scenarios for training and testing, while the other contains only abrupt leaky scenarios. The results are shown in Tables 5 and 6.

Night data
Among the methods used to improve leak detection performance, overnight measurement of pressure and flow minimizes the disturbances related to consumption. In this fourth experiment, pressure and flow data from 00:00 to 04:00 over 1 year were used for each scenario in the train and test sets. Table 7 shows the results obtained for Net-1 and Hanoi.

To eliminate the disturbances due to seasonal changes within a year, the fifth experiment used pressure and flow data taken from leaky scenarios over a period of 1 to 2 months. Each leaky scenario was divided into two parts, one for training and the other for testing. In the first test, the algorithms were applied to the pressure and flow data of all nodes in the Net-1 and Hanoi networks. In the second test, the algorithms were applied only to the leaky zone in the Hanoi network. The obtained results are shown in Tables 8 and 9.
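The night-window selection used in the fourth experiment can be sketched as follows; the series layout and the random values are hypothetical stand-ins for a LeakDB node file, not the actual dataset format:

```python
import numpy as np
import pandas as pd

# Hypothetical one-year pressure series sampled every 30 minutes.
idx = pd.date_range("2017-01-01", periods=365 * 48, freq="30min")
pressure = pd.Series(np.random.default_rng(0).normal(50.0, 1.0, idx.size),
                     index=idx)

# Keep only the overnight window (00:00 to 04:00 inclusive), when
# consumption disturbances are minimal.
night = pressure.between_time("00:00", "04:00")
print(len(night))  # 9 half-hour samples per day over 365 days
```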

Flow data difference
The study of the LeakDB dataset by tracing flow and pressure data curves showed that when leaks were very small or demands were unpredictable, the variations in pressure and flow during a leak were not remarkable. However, the difference between incoming and outgoing flows showed large variations. Figure 5 reports an example of flow and pressure data variations, and of the flow difference variation, during a leakage period. The sixth experiment used the flow difference data to detect leaks. The difference between incoming and outgoing flows was computed at each node of each chosen scenario from the Hanoi network. Some scenarios were used for training and others for testing. The results are shown in Table 10.
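The per-node flow difference can be sketched as below; the incidence-matrix representation and the toy network are illustrative assumptions, and nodal consumption is ignored for simplicity:

```python
import numpy as np

def node_flow_difference(flows, incidence):
    """Flow difference (incoming minus outgoing) at each node.

    `flows` is a (time, n_pipes) array of pipe flows and `incidence` a
    hypothetical (n_nodes, n_pipes) matrix with +1 where a pipe enters
    a node and -1 where it leaves it.  By mass balance the result is
    near zero at healthy nodes, so a persistent deficit flags a leak."""
    return flows @ incidence.T

# Toy 3-node line network: pipe 0 feeds node 1, pipe 1 carries on to node 2.
incidence = np.array([[-1, 0],   # node 0: pipe 0 leaves it
                      [1, -1],   # node 1: pipe 0 enters, pipe 1 leaves
                      [0, 1]])   # node 2: pipe 1 enters
flows = np.array([[10.0, 10.0],  # balanced: nothing lost at node 1
                  [10.0, 8.0]])  # 2 CMH unaccounted for at node 1
print(node_flow_difference(flows, incidence)[:, 1])  # → [0. 2.]
```

The leak appears only in the column of the leaky node, which is consistent with the observation in the discussion that the flow difference localizes the variation to the affected node.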

RESULTS DISCUSSION
In all experiments, the application of the AI algorithms for leak detection in the Net-1 network provides better results than in the Hanoi network. This can be explained by comparing the architectures and sizes of the two networks: the disruption of data due to unforeseen demands in the Hanoi network is clearly more intense than in the Net-1 network. This also explains the higher performance obtained with pressure data from Net-1 and flow data from Hanoi. Pressure data is the best parameter for leak detection, but it is more sensitive to disturbance than flow data. This finding is corroborated by comparing the fourth experiment to the second. When using night pressure and flow data, no improvement in performance was recorded compared to using pressure data from the Net-1 network; however, for the Hanoi network, the accuracy improved substantially, especially with pressure data. Furthermore, the four algorithms yielded almost the same ACC results in both experiments, except for LR, which provided the minimum TPR, and KNN, which gave the maximum FPR.
The first and second experiments demonstrated that using only leakage scenarios decreased the TPR by 10% for all algorithms: when leaky and non-leaky scenarios were used together for training and testing, the leak events became more distinguishable by the AI algorithms. The results of the third experiment indicated that the type and size of leaks have the greatest effect on leak detection efficiency, and that incipient leaks are more difficult to detect than abrupt leaks. When using abrupt leaky scenarios from Net-1, the TPRs of all algorithms were greater than 90%; with incipient leaky scenarios, TPRs did not exceed 45%. The TPR results of the abrupt leaky scenarios from the Hanoi network suggested that SVM and LR are the algorithms most affected by noisy data.
In the fifth experiment, by eliminating the disturbances due to seasonal change and using the same scenario for training and testing, the performance of the different algorithms improved in both networks compared to the second experiment. Indeed, better accuracy, more correctly labeled detections, and fewer false alarms were reported, especially with SVM. Additionally, performance improved in the Hanoi network when the algorithms were applied only in the leak areas.
In the sixth experiment, a new parameter was defined for leak detection: the flow difference. This experiment yielded different results depending on the test and train scenarios. As portrayed in Table 10, the best performance was achieved in scenario 111, with 100% TPR, 100% ACC, and 0% FPR for KNN, LR, and MLP; for the other scenarios, accuracies of about 50% and FPRs of about 75% were obtained. Furthermore, when computing the difference between the incoming and outgoing flows, the variation triggered by the leak can be observed only at the leaky node. The results of these last two experiments can help identify leak localization using AI algorithms in future research.
It is noteworthy that the classification algorithms KNN and SVM provide the best performance compared to the LR and MLP algorithms. Moreover, with night data and seasonal data, KNN offers the highest TPR but also the highest false alarm rate, which makes the SVM results more efficient and reliable. As a supervised comparison study, the major problem with the LeakDB dataset is that it assigns the same label, for each scenario, to all the network nodes. This may be plausible for a large leak that disturbs the whole distribution system; however, for small or incipient leaks, the nodes in the leak area are individually affected, and to different degrees, in large networks like Hanoi.
This work is, to our knowledge, the first to use a common benchmark dataset for different AI algorithms. It offers a clear comparison of the behavior of different algorithms in leak detection using flow and pressure data, and highlights the role of the choice of learning data in enhancing detection performance.
The obtained observations and results shed light on the behavior of supervised learning techniques, allowing deeper insight to better explore leak detection applications. This research work is promising and valuable in terms of opening further lines of investigation and paving the way for constructive future research directions. It lays the ground for future research into novel solutions to the problem of leak detection and localization through the use of other state-of-the-art unsupervised techniques and deep-learning architectures.