The identification and localization of water pipeline leakages based on ground penetrating radar (GPR) technology are gradually becoming a research hotspot. Current methods mostly focus on exploring the patterns of B-Scan images and rely heavily on the subjective experience of detection personnel, which can lead to misjudgments. Moreover, the large amount of data makes manual processing difficult. Therefore, a method based on wavelet transform (WT) and ResNet-50 is proposed to identify the time–frequency characteristics of GPR data, thereby achieving intelligent localization of pipeline leakages. The B-Scan images from GPR are transformed into time–frequency scale images using WT, and the features in both time and frequency domains are combined to enhance the representation of leakages. Subsequently, ResNet-50 is employed for feature extraction and leakage identification. Additionally, a deviation correction mechanism is proposed to improve the clarity of the prediction results. Experimental results demonstrate that ResNet-50 achieves an accuracy of 0.917 and a recall of 0.998 on the time–frequency dataset, detecting almost all leakages, with a recognition efficiency of 0.0165 s per data trace. The comprehensive method is validated in the field, indicating its capability to accurately identify and localize pipeline leakages.

  • A GPR time–frequency scale image conversion method based on wavelet transform is proposed.

  • This method comprehensively considers waveform anomalies in the time domain and peak attenuation in the frequency domain.

  • Deep learning is used to accurately identify the leakage area of time–frequency images.

  • The proposed method can efficiently provide clear corrected leakage identification results.

Pipeline leakage has long been a challenge for water supply systems, and research on leakage identification has always been of interest to scholars. Currently, common leakage identification techniques mainly include model-based methods and equipment-based methods (Chen et al. 2023). Model-based methods use monitoring data from instruments such as flow meters and pressure meters to determine whether pipelines are leaking and to preliminarily locate leakage areas in the pipe section between valves. Equipment-based methods (Hao et al. 2012; Juliano et al. 2013; Karthikeyan et al. 2014; Jinguuji & Yokota 2022) can achieve meter-level leakage localization and mainly include the acoustic method, correlators, tracers, infrared thermography, and the resistivity method. The simple, low-investment, and flexible acoustic method (Shirajuddin et al. 2022) is the most commonly used. Detection personnel analyze the leakage situation with listening rods while advancing along the pipeline on the road surface, so the acoustic method is easily disturbed by environmental noise and depends on manual judgment. Therefore, researchers have begun to search for other methods that can replace manual labor in order to achieve intelligent leakage recognition.

In recent years, ground penetrating radar (GPR) (Klewe et al. 2021; Cao & Al-Qadi 2022; Hou et al. 2022; Luo et al. 2023) has been applied to the localization of leaks in water supply pipelines. It is based on the principle of electromagnetic waves to detect changes in the dielectric constant of underground media, and its efficient and comprehensive data collection method can serve as an effective supplement to the acoustic method. As shown in Figure 1, GPR moves along the survey line, and each time it passes a certain distance, the transmitting antenna emits radar waves, and the receiving antenna completes the reception. The received echo signal is the absolute amplitude of the reflected electromagnetic wave, resulting from the convolution of the pulsed electromagnetic wave and the reflection coefficient of the strata, containing information such as strata structure, lithology, and the distribution of potential buried bodies in the underground space. In the profile map of radar data, the diffraction signal formed by the pipeline target is a hyperbola described by the following equation:
$$t = \sqrt{t_0^{2} + \frac{4\,(x - x_0)^{2}}{v^{2}}} \tag{1}$$
where $x$ is the antenna position along the survey line, $t$ is the two-way-travel-time recorded at that position, $x_0$ is the abscissa of the vertex of the hyperbola, which is the distance from the starting position of the survey line, $t_0$ is the ordinate of the vertex of the hyperbola, which is the two-way-travel-time, and $v$ is the propagation speed of electromagnetic waves in the medium. For high-frequency electromagnetic waves emitted by GPR, the medium can generally be simplified as an isotropic medium. According to the scanning dimension of reflected waves, radar data can be divided into A-Scan signals and B-Scan images. The discrimination of leakage characteristics in radar imaging is mostly based on the analysis of B-Scan profiles from a temporal perspective, and the water-filled cavities formed around leakage points in the soil are represented as downward-opening hyperbolic signals in radar profiles. Gołębiowski (2023) conducted numerical simulations of pipeline leakage under various typical ground conditions, determining the selection of radar antenna frequencies under different ground interference amplitudes and the possibility of using radar to detect leakages. Cheung & Lai (2019) found that the vertices of pipeline hyperbolas in leakage areas shift downward, and that an increase in soil moisture content causes radar wave diffraction. Lau et al. (2021) found that, due to the high dielectric constant of clay with high moisture content, the radar electromagnetic wave velocity in leakage areas decreases by 20–30%. Existing research and experimental observations show that the dielectric constant of soil ranges from 4 to 10, whereas that of water at room temperature reaches 81. Such a large difference produces three characteristics of pipeline leakage areas in GPR B-Scan images. First, in the survey lines collected longitudinally along the pipeline, the pipeline signal at the leakage point is no longer continuous and shifts downwards. Second, in the survey lines collected across the pipeline, the hyperbolic vertex of the pipeline in the leakage area is shifted downwards. Third, the electromagnetic wave intensity in the leakage area decays significantly, causing distortion and deformation of the image. According to these characteristics, B-Scan images can be used for pipeline leakage identification. However, mutual interference between multiple pipelines and irrelevant targets in the surrounding soil introduces uncertainty to the judgment of leakage signals. Additionally, velocity analysis and comparison of multiple images are required to support leakage identification. Therefore, analyzing leakage situations solely based on B-Scan images has limitations.
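
The downward shift of the hyperbola vertex in wet soil can be illustrated numerically. The sketch below is not taken from the study: it assumes the common point-reflector form of the diffraction hyperbola, t = sqrt(t0^2 + 4(x - x0)^2 / v^2), and the low-loss relation v = c / sqrt(eps_r). The function names and the permittivity values used in the usage note are illustrative assumptions.

```python
import math

C_M_PER_NS = 0.299792458  # speed of light in vacuum, metres per nanosecond


def wave_velocity(rel_permittivity):
    """Velocity (m/ns) in a low-loss, non-magnetic medium: v = c / sqrt(eps_r)."""
    return C_M_PER_NS / math.sqrt(rel_permittivity)


def apex_time(depth_m, rel_permittivity):
    """Two-way travel time (ns) at the hyperbola vertex for a reflector at depth_m."""
    return 2.0 * depth_m / wave_velocity(rel_permittivity)


def hyperbola_twt(x, x0, t0, v):
    """Two-way travel time (ns) at antenna position x (m) for a vertex at (x0, t0).

    Assumes a point reflector (pipe radius neglected) and a constant velocity v (m/ns).
    """
    return math.sqrt(t0 ** 2 + 4.0 * (x - x0) ** 2 / v ** 2)
```

Calling `apex_time(0.7, 6.0)` and `apex_time(0.7, 20.0)` (an assumed dry and an assumed wet permittivity) shows the vertex arriving later in the wetter medium, which is the shift exploited for leakage identification.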
Figure 1

The principle of GPR imaging.


The B-Scan image of GPR is actually composed of multiple ordered A-Scan signals, and spectral analysis can be performed on the A-Scan signals. Spectral analysis is commonly used in fields such as pollutant migration investigation, cave exploration, and pavement assessment (Marcak & Gołębiowski 2008; Szymczyk & Szymczyk 2015; Rodés et al. 2020), and can be used to distinguish different types of defects based on the sensitivity characteristics of materials to the spectrum. At present, there is limited research on the evaluation of pipeline leakages based on frequency domain analysis. Benedetto & Benedetto (2011) found that the degree to which radar waves of different frequencies are absorbed by water varies, and the frequency of scattered waves shifts toward lower frequencies with increasing moisture content. Therefore, the stratum moisture content can be evaluated based on the changes in spectral peaks, but it also requires the signal to have high resolution in spectral distribution, which limits the effectiveness of Fast Fourier Transform (FFT) (Benedetto & Tosti 2013). An effective alternative method is wavelet transform (WT) (Zhang et al. 2023), which has the properties of frequency analysis, highlighting frequency signals in A-Scan data, and reflecting the two-way-travel-time of target reflections. This allows for the extraction of changes in specified time and frequency from 1D signals simultaneously. By combining time domain B-Scan images with frequency domain A-Scan signals, leakage identification can be studied from a new perspective.

However, the vast amount of time–frequency data obtained from GPR requires the development of automated high-precision recognition methods capable of processing data in batches. Machine learning classification methods such as support vector machine (SVM) (El-Mahallawy & Hashim 2013) cannot extract detailed features due to their simple decision boundaries, and their classification accuracy depends on the manual selection of features. Recently, deep learning models represented by convolutional neural networks (CNNs) have been widely used in the non-invasive recognition of underground targets or damage based on GPR. According to the characteristics of the recognition object, this work can be mainly divided into 1D signal recognition and 2D B-Scan image recognition. Generally, the accuracy of signal-based recognition is higher than that of image-based recognition (Tong et al. 2020; Li et al. 2022), but due to the strong interpretability of images, more research is conducted using B-Scan profiles as training and recognition targets. Li et al. (2016) evaluated the application of the randomized Hough transform in root target recognition, locating underground tree roots based on hyperbolic diffraction waves. Zhang et al. (2020) intelligently identified abnormal reflections in B-Scan images using deep learning and incremental random sampling, thereby assessing damage in asphalt pavement. Qin et al. (2021) used ResNet-101 and feature pyramid networks (Lin et al. 2017) to extract features from B-Scan images, and then used Mask R-CNN (He et al. 2017) to detect defects in the steel ribs, voids, and initial linings of the tunnel. Regarding pipeline leakage, Liu et al. (2023) used CNNs and transfer learning to identify pipeline acoustic emission data, achieving optimal performance among multiple models. Choi & Im (2023) used CNNs to identify leakages in vibration data of water pipes, demonstrating a significant performance improvement compared to SVM. Xie et al. (2023) automatically detected leakages in infrared thermal imaging data using Faster R-CNN (Ren et al. 2015), achieving leakage identification in complex backgrounds. Typically, only 1D signals or 2D images are recognized independently in these studies. Deep neural networks such as R-CNN and Faster R-CNN benefit from multiple hidden layers and non-linear structures, allowing them to uncover fine features in data and achieve intelligent classification of 1D–2D comprehensive time–frequency data. Compared to these networks, ResNet-50 (He et al. 2016) has a deeper structure and introduces residual connections that alleviate the problem of vanishing or exploding gradients, while its bottleneck blocks keep the number of parameters moderate. Furthermore, ResNet-50 is an end-to-end neural network that, compared to the two-stage models of the R-CNN series, has higher efficiency and is suitable for pipeline leakage recognition tasks. Therefore, this study chose ResNet-50 to intelligently identify the time–frequency scale images obtained through WT, thus achieving intelligent interpretation of the joint time–frequency domain features for water supply pipeline leakage identification.

It should be noted that the research scope of this paper is limited to pipelines with known information regarding construction location, diameter, and burial depth, a situation not uncommon in practical engineering and with considerable engineering demand. Additionally, the groundwater level should not inundate the pipeline, and areas with saturated soil medium are not within the scope of this study. The structure of this paper is organized as follows: Section 2 introduces the methodology, including the formation of GPR data, the principles of WT, and the structure of ResNet-50. Section 3 elaborates on the process of dataset establishment, the features of the data, and the preprocessing results. Section 4 explains the training and testing results of ResNet-50. Section 5 concludes and provides future prospects.

The time and frequency domain of GPR

The raw data of GPR are divided into A-Scan and B-Scan according to the scanning dimension. An A-Scan is a 1D profile produced by the propagation of electromagnetic waves along the longitudinal axis, with the signal intensity varying with time. It is essentially a sequence of signed 16-bit binary numbers, with values ranging from −32,768 to 32,767. A set of A-Scans is arranged in order along the horizontal axis with a certain trace spacing to form a B-Scan. The B-Scan is usually transformed into a grayscale image for display according to a certain mapping method.
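
The relationship between A-Scans, the B-Scan, and the grayscale display can be sketched as follows. This is a minimal illustration, not the acquisition software's actual format: the little-endian byte order, the linear gray mapping, and all function names are assumptions.

```python
import struct


def decode_ascan(raw_bytes):
    """Decode signed 16-bit samples (little-endian assumed) into a list of ints."""
    count = len(raw_bytes) // 2
    return list(struct.unpack("<%dh" % count, raw_bytes[: 2 * count]))


def stack_bscan(ascans):
    """Arrange equal-length A-Scans side by side.

    Rows of the result are time samples, columns are traces along the survey line.
    """
    return [list(row) for row in zip(*ascans)]


def to_grayscale(bscan):
    """Linearly map amplitudes in [-32768, 32767] to gray levels 0..255."""
    return [[(sample + 32768) * 255 // 65535 for sample in row] for row in bscan]
```

A linear mapping is only one possible choice; display software often applies gain or clipping before the conversion.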

The temporal characteristics of radar data are reflected in the reflection wave group. When there is an electrical difference in the detected target medium, a reflection signal will be generated in the radar record. The amplitude, continuity, and phase coherence of the reflection waves are the focus of attention. A-Scan data contain waveform information of reflection waves, while B-Scan images provide a more intuitive comparison of the time domain signals of each trace data. Detection personnel can ultimately complete the geological interpretation based on processed GPR profile features such as amplitude, phase axis, and frequency.

The frequency domain characteristics of radar data are reflected in the spectrum changes. Electromagnetic waves do not dissipate energy when propagating in ideal media, but real soil media are not ideal media. Therefore, the ultra-wideband high-frequency electromagnetic waves emitted by radar antennas undergo energy attenuation during the propagation process in the underground strata. According to the 1D wave equation of electromagnetic waves, the coefficient α, which governs the propagation velocity, and the energy attenuation coefficient β are both related to the frequency of electromagnetic waves and the dielectric constant of the medium, and are calculated by the following equations:
$$\alpha = \omega\sqrt{\frac{\mu\varepsilon}{2}\left(\sqrt{1+\left(\frac{\sigma}{\omega\varepsilon}\right)^{2}}+1\right)} \tag{2}$$
$$\beta = \omega\sqrt{\frac{\mu\varepsilon}{2}\left(\sqrt{1+\left(\frac{\sigma}{\omega\varepsilon}\right)^{2}}-1\right)} \tag{3}$$
where σ represents conductivity, ω represents angular frequency, ε represents dielectric constant, and μ represents magnetic permeability. Equations (2) and (3) theoretically show that media with different electrical parameters have different attenuation effects on electromagnetic waves. Soil is a good dielectric with $\sigma/(\omega\varepsilon) \ll 1$; in this case, $\alpha \approx \omega\sqrt{\mu\varepsilon}$ and $\beta \approx \frac{\sigma}{2}\sqrt{\mu/\varepsilon}$. The propagation velocity $\omega/\alpha$ is then independent of frequency, so soil can be treated as a non-dispersive medium. Through the Fourier transform (Fu et al. 2022), the reflected wave can be decomposed into a series of sinusoidal electromagnetic harmonics with different frequencies, thereby obtaining the spectral distribution of each A-Scan signal. Based on the frequency spectrum of each trace, the distribution of underground media can be inverted, especially for exploring water-rich areas, and the location of leakages can then be determined.
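
The good-dielectric approximation can be checked numerically. The sketch below assumes the standard plane-wave expressions for a lossy medium, with α taken as the coefficient that sets the velocity and β as the attenuation coefficient, matching the naming in the text. The conductivity of 1 mS/m and relative permittivity of 6 used in the usage note are illustrative assumptions, not measured values from the study.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m
MU0 = 4.0e-7 * math.pi   # vacuum permeability, H/m


def propagation_coefficients(sigma, freq_hz, eps_r, mu_r=1.0):
    """Return (alpha, beta, loss_tangent) for a lossy medium.

    alpha: phase coefficient (rad/m), which sets the velocity omega / alpha.
    beta: attenuation coefficient (Np/m).
    """
    omega = 2.0 * math.pi * freq_hz
    eps = eps_r * EPS0
    mu = mu_r * MU0
    loss_tangent = sigma / (omega * eps)
    root = math.sqrt(1.0 + loss_tangent ** 2)
    alpha = omega * math.sqrt(0.5 * mu * eps * (root + 1.0))
    beta = omega * math.sqrt(0.5 * mu * eps * (root - 1.0))
    return alpha, beta, loss_tangent


def low_loss_coefficients(sigma, freq_hz, eps_r, mu_r=1.0):
    """Good-dielectric approximation, valid when sigma / (omega * eps) << 1."""
    omega = 2.0 * math.pi * freq_hz
    eps = eps_r * EPS0
    mu = mu_r * MU0
    return omega * math.sqrt(mu * eps), 0.5 * sigma * math.sqrt(mu / eps)
```

For example, `propagation_coefficients(1e-3, 500e6, 6.0)` and `low_loss_coefficients(1e-3, 500e6, 6.0)` agree closely because the loss tangent is far below 1 at 500 MHz for these assumed parameters.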

Wavelet transform

In the B-Scan images before and after the occurrence of leakage, diffraction wave signals of the water supply pipeline will be displayed. Although the hyperbolic curves of the pipeline will shift downward after leakage, using this as a basis for discrimination may mistakenly identify deeply buried pipelines and underground unidentified buried objects in the inspection area as leakages. Therefore, only extracting the high-frequency details and low-frequency backgrounds of the B-Scan images without accurately perceiving the spectral changes in the images will not achieve accurate leakage identification. Leakage identification of pipelines must consider both the abnormal reflection waves in the time domain and the attenuation changes in the frequency domain.

Based on the characteristic difference between leakage signals and normal pipeline signals in the time–frequency domain, this study uses WT to analyze the A-Scan signals, thereby obtaining multi-resolution wavelet time–frequency scale images, which provide the data foundation for subsequent intelligent identification. WT is one of the most effective methods for constructing time–frequency scale images. It shifts wavelets of different frequencies along the time axis and performs inner product operations with the original signal during the shift. In the computation, large scale factors correspond to wide, low-frequency wavelets, from which the low-frequency component information in the original signal can be obtained. Conversely, small scale factors correspond to narrow wavelets, from which high-frequency component information can be obtained. WT uses finite-length and decaying wavelet bases as basis functions, and the wavelet basis is described by the following equation:
$$\int_{-\infty}^{+\infty} \psi(t)\,\mathrm{d}t = 0 \tag{4}$$
This function has alternating positive and negative volatility, with a mean of 0. After scaling and translating wavelets, a family of wavelet functions described by the following equation can be obtained:
$$\psi_{a,b}(t) = \frac{1}{\sqrt{|a|}}\,\psi\!\left(\frac{t-b}{a}\right) \tag{5}$$
where a is the scale factor, which is inversely proportional to the frequency of the wavelet and is taken from a power series of 2, and b is the translation factor that controls the wavelet translation along the time axis. Due to the similarity between the Morlet wavelet (Shyu & Sun 2002) and typical radar pulses, the Morlet wavelet was chosen as the mother wavelet for analysis. The Morlet wavelet is a single-frequency complex harmonic function with a Gaussian amplitude, expressed by the following equation:
$$\psi(t) = e^{i\omega_0 t}\,e^{-t^{2}/2} \tag{6}$$
where $\omega_0$ is the frequency of the complex harmonic function. The continuous wavelet transform (CWT) of any signal f(t) is defined by the following equation:
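$$W_f(a,\tau) = \frac{1}{\sqrt{a}}\int_{-\infty}^{+\infty} f(t)\,\psi^{*}\!\left(\frac{t-\tau}{a}\right)\mathrm{d}t \tag{7}$$
where τ represents the displacement transformation coefficient of the wavelet basis and $\psi^{*}$ denotes the complex conjugate of ψ. In computer simulation, due to the difficulty of expressing continuity, signals are usually discretized. The discrete wavelet transform (DWT) is defined by the following equations: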
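$$a = a_0^{m} \tag{8}$$
$$\tau = n\,b_0\,a_0^{m} \tag{9}$$
$$W_f(m,n) = a_0^{-m/2}\int_{-\infty}^{+\infty} f(t)\,\psi^{*}\!\left(a_0^{-m}t - n b_0\right)\mathrm{d}t \tag{10}$$
where $a_0$ is a power of 2 greater than 1, $b_0$ is a constant greater than 0, and m and n are integers. A small change in the value of m produces a significant change in a, thereby achieving scale discretization.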
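
To make the scale-and-shift computation concrete, the following sketch evaluates a discretized Morlet CWT directly from its definition, one inner product per scale and translation. It is a plain-Python illustration rather than the implementation used in the study: the dimensionless centre frequency `OMEGA0 = 6`, the truncation of the wavelet at five standard deviations, and all function names are assumptions.

```python
import cmath
import math

OMEGA0 = 6.0  # assumed Morlet centre frequency (dimensionless); not given in the text


def morlet(u):
    """Morlet mother wavelet: complex harmonic with a Gaussian envelope."""
    return cmath.exp(1j * OMEGA0 * u) * math.exp(-0.5 * u * u)


def scale_for_frequency(freq):
    """Scale whose wavelet oscillates at `freq` (scale is inversely proportional to frequency)."""
    return OMEGA0 / (2.0 * math.pi * freq)


def cwt_morlet(signal, dt, freqs):
    """Return |W(a, tau)| as rows (one per frequency) by columns (one per sample time).

    dt and freqs must use consistent units, for example ns and GHz.
    """
    n = len(signal)
    image = []
    for freq in freqs:
        a = scale_for_frequency(freq)
        half = int(5.0 * a / dt)  # beyond |u| = 5 the Gaussian envelope is negligible
        row = []
        for k in range(n):  # translation tau = k * dt
            acc = 0j
            for i in range(max(0, k - half), min(n, k + half + 1)):
                acc += signal[i] * morlet((i - k) * dt / a).conjugate()
            row.append(abs(acc) * dt / math.sqrt(a))
        image.append(row)
    return image
```

Each row of the returned matrix is the response at one frequency and each column corresponds to one two-way-travel-time sample, so the matrix can be rendered as a time–frequency scale image. The direct double loop is slow for long signals; an FFT-based convolution would be the usual choice in practice.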

ResNet-50 and prediction process

The continuous updates and iterations of radar equipment have made data collection more and more efficient, but the interpretation of images still relies on experienced professional inspectors, leading to extremely high costs and very low efficiency. In recent years, various deep learning frameworks have been developed in the field of artificial intelligence, such as CNNs and transformers (Vaswani et al. 2017; Han et al. 2023), which can be used for the automatic interpretation of radar return images.

The recognition target of this study is the 2D time–frequency scale image, which exhibits extremely subtle feature changes. Therefore, the residual neural network model ResNet-50 is used for training to obtain high-level features in deeper networks without encountering gradient vanishing or exploding issues. The network structure, as shown in Figure 2, is modified from the VGG19 network and consists of a total of 50 layers. The input layer is a three-channel image with a resolution of 224 × 224, which sequentially passes through a 7 × 7 convolutional layer and a max-pooling layer. Subsequently, a residual connection is made every three convolutional layers, with convolutional kernels of sizes 1 × 1, 3 × 3, and 1 × 1, respectively. Finally, the stack of large fully connected layers used in VGG is replaced by an average pooling layer followed by a single fully connected classification layer, and the results are output.
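
The count of 50 layers and the shrinking feature maps can be verified with simple arithmetic. The sketch assumes the standard ResNet-50 configuration of He et al. (2016), with 3, 4, 6, and 3 bottleneck blocks in the four stages and one final fully connected layer. Whether the study modified this configuration is not stated, and the function names are illustrative.

```python
BLOCKS_PER_STAGE = (3, 4, 6, 3)  # bottleneck blocks in the four residual stages
CONVS_PER_BLOCK = 3              # 1x1, 3x3, and 1x1 convolutions per block


def weighted_layer_count():
    """Count the weighted layers: stem conv + bottleneck convs + final classifier."""
    stem = 1
    bottleneck_convs = CONVS_PER_BLOCK * sum(BLOCKS_PER_STAGE)
    classifier = 1
    return stem + bottleneck_convs + classifier


def feature_map_sizes(input_size=224):
    """Spatial size of the feature map leaving each residual stage."""
    size = input_size // 2  # 7x7 convolution with stride 2
    size = size // 2        # 3x3 max pooling with stride 2
    sizes = [size]
    for _ in BLOCKS_PER_STAGE[1:]:
        size //= 2          # each later stage halves the resolution once
        sizes.append(size)
    return sizes
```

For a 224 × 224 input, the last stage produces 7 × 7 feature maps, which the average pooling layer reduces to a single feature vector before classification into the normal, pipeline, and leakage classes.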
Figure 2

The structure of ResNet-50.

The leakage identification process of this study is illustrated in Figure 3. Firstly, GPR data are collected, and A-Scan signals are extracted trace by trace. Then, WT is applied to transform the A-Scan signals into time–frequency scale images of 224 × 224 × 3 size. Subsequently, ResNet-50 is used to recognize the time–frequency scale images. Afterwards, deviation correction is applied to the results of adjacent traces based on prior conditions. Finally, the predicted results are presented in the B-Scan image using background masking, assisting detection personnel in diagnosing leakage defects.
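
The text does not spell out the deviation correction rule at this point. One plausible reading, shown purely as an assumption, is that isolated predictions that disagree with their neighbouring traces are overwritten, since a leakage zone spans many consecutive traces. The sliding-window majority vote, the window length, and the function name below are hypothetical.

```python
from collections import Counter


def correct_deviations(labels, window=5):
    """Replace each per-trace label by the majority label in a centred window.

    Hypothetical rule: the study's actual correction mechanism may differ.
    """
    half = window // 2
    corrected = []
    for i in range(len(labels)):
        neighbourhood = labels[max(0, i - half): i + half + 1]
        corrected.append(Counter(neighbourhood).most_common(1)[0][0])
    return corrected
```

With a window of five traces, a single stray "leakage" prediction inside a run of "pipeline" traces is removed, while a contiguous run of five leakage traces is preserved.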
Figure 3

Leakage identification process.


Compared with leakage identification methods based on GPR and deep learning networks, traditional model-based or data-based methods (Zhou et al. 2019) require obtaining parameters such as flow and pressure at multiple points and training hydraulic calculation models to determine, through parameter changes, the pipeline segment where the leak occurs. The localization is coarse and cannot achieve meter-level accuracy, resulting in considerable excavation work and a significant impact on the surrounding environment. The advantage of GPR lies in its independence from the variations of parameters within the pipeline system. It is applicable to pipes of any length and diameter, achieving meter-level precision in localization. In addition, current mainstream acoustic methods (Kang et al. 2018) are sensitive to environmental noise and require the processing of abnormal data. If the leak in the pipeline is small, resulting in weak vibration signals, sensors may fail to transmit data accurately, thus affecting the recognition accuracy. In contrast, GPR combined with deep learning networks is much less sensitive to the size of leakage points and to environmental noise. Therefore, the method proposed in this study has significant advantages, and the relevant results will be presented in Section 4.

Data acquisition

All experiments or field data collection processes in this study were conducted under good weather conditions and dry soil. The experimental platform is located at the Yuquan Campus of Zhejiang University. The experimental site and soil samples are shown in Figure 4(a). The soil above the pipeline is the undisturbed soil of the experimental site. The measured soil dielectric constant ranges from 4 to 10 depending on the precipitation situation, which can simulate the environment of most pipelines laid under green belts. After each experiment, the covering soil is excavated entirely, left to dry thoroughly, and then manually backfilled. The site is leveled with each 0.2 m backfill to ensure the covering soil is spread as evenly as possible. If abnormal data collection occurs, the experiment needs to be repeated. The site model diagram is shown in Figure 4(b), with a site size of 3 m × 4 m. At a depth of 0.7 m, a 3 m long DN150 ductile iron pipe is buried and connected to the main pipeline. The normal water supply pressure is 0.45 MPa, and the pipeline connection is a valve that was tightly closed before the experiment. A 5 mm small hole is opened on the side of the ductile iron pipe 1.8 m away from the valve as the leakage point, and the occurrence of leakage can be simulated by the opening and closing of the valve.
Figure 4

(a) Realistic view and (b) schematic diagram of the experimental site.

GPR was used to scan the entire site along grid-like survey lines, whose layout is shown in Figure 5. 35 and 25 survey lines are, respectively, arranged on the X and Y axes, with an interval of 10 cm between lines and a length of 300 cm each. Experimental data were collected in a grid format to expand the amount of data (limited by site size) and to observe the data characteristics comprehensively. The GPR host model is Mala ProEx, the center frequency of the shielded antenna is 500 MHz, and the distance between the transmitting antenna and the receiving antenna is 18 cm. The signal is triggered by a ranging wheel with a triggering interval of 1 cm, so each 300 cm survey line yields 300 A-Scans. The sampling frequency is 7,035 MHz, with 256 sampling points and 4 stacking times. Data collection was conducted along the grid survey lines before and after the leakage occurred, giving a total of (35 + 25) × 2 = 120 acquisitions and 120 × 300 = 36,000 A-Scans. A total of 4,000 sets of experimental site samples were selected, with a data ratio of 1:2:5 for the no pipeline area (normal), buried pipeline area (pipeline), and leakage area (leakage). Pipeline data were collected at the same proportion in the Zijingang campus of Zhejiang University and surrounding residential areas. In the process of actual data collection, the radar advances along the pipeline direction at a speed of about 1 m/s, and after completing the initial collection of any pipeline section, returns to the starting point and repeats the collection five to seven times along survey lines parallel to the pipeline, which are approximately 10 cm away from the pipeline. A total of 4,800 actual samples were collected. Then, all samples were annotated and a dataset containing 8,800 sets of samples was established. WT was employed to convert the A-Scan data of each set of samples into time–frequency scale images of size 224 × 224 × 3. Finally, the dataset was randomly divided into training, validation, and test sets in a ratio of 6:2:2, and the proportions of normal, pipeline, and leakage were maintained in all three subsets.
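
The dataset bookkeeping described above can be reproduced with a short sketch. It assumes that the 1:2:5 ratio applies exactly to both the 4,000 site samples and the 4,800 field samples, and that the 6:2:2 split is applied per class. The function names, the random seed, and the use of integer sample IDs are illustrative assumptions.

```python
import random


def class_counts(total, ratio):
    """Split `total` samples across classes according to an integer ratio."""
    unit = total // sum(ratio)
    return [unit * part for part in ratio]


def stratified_split(samples_by_class, ratio=(6, 2, 2), seed=0):
    """Shuffle each class separately, then split it train/validation/test by `ratio`."""
    rng = random.Random(seed)
    splits = {"train": [], "val": [], "test": []}
    for label, samples in samples_by_class.items():
        shuffled = list(samples)
        rng.shuffle(shuffled)
        n_train = len(shuffled) * ratio[0] // sum(ratio)
        n_val = len(shuffled) * ratio[1] // sum(ratio)
        splits["train"] += [(label, s) for s in shuffled[:n_train]]
        splits["val"] += [(label, s) for s in shuffled[n_train:n_train + n_val]]
        splits["test"] += [(label, s) for s in shuffled[n_train + n_val:]]
    return splits


LABELS = ("normal", "pipeline", "leakage")
site = class_counts(4000, (1, 2, 5))   # experimental-site samples
field = class_counts(4800, (1, 2, 5))  # field samples, same proportion
totals = [s + f for s, f in zip(site, field)]
dataset = {label: range(count) for label, count in zip(LABELS, totals)}
splits = stratified_split(dataset)
```

Splitting each class separately is what keeps the normal, pipeline, and leakage proportions identical in the training, validation, and test subsets.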
Figure 5

GPR survey line layout.


The collected data include soil cover, pipeline, and pipeline leakage information, as well as various noises, which need to be weakened or eliminated to improve the signal-to-noise ratio. Since the pipeline is shallowly buried and the signal characteristics of the pipeline are already very clear, no signal enhancement was applied, to avoid image distortion. After processing steps such as trace editing, removing the DC component, static correction, and average filtering, the contrast of the radar 2D profile was improved. In practical engineering, the burial depth of water supply pipelines is typically between 0.7 and 2.0 m, with soil types mainly consisting of gravel, cobble, sand, and plain fill. The radar antennas used in this study, ranging from 500 to 800 MHz, can meet the penetration depth requirements of the above-mentioned soil types, and this has been validated in subsequent field tests.
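
Two of the preprocessing steps named above can be sketched on a single trace. The exact filters used in the study are not specified, so this is an assumed reading: DC removal is implemented as subtraction of the trace mean, and average filtering as a centred moving average with a shrinking window at the trace ends. Function names and the window length are illustrative.

```python
def remove_dc(trace):
    """Subtract the mean of the trace so that it oscillates around zero."""
    mean = sum(trace) / len(trace)
    return [sample - mean for sample in trace]


def moving_average(trace, window=3):
    """Smooth a trace with a centred moving average (shorter window at the edges)."""
    half = window // 2
    smoothed = []
    for i in range(len(trace)):
        segment = trace[max(0, i - half): i + half + 1]
        smoothed.append(sum(segment) / len(segment))
    return smoothed
```

Applying `remove_dc` before the wavelet transform avoids a constant offset leaking into the lowest-frequency rows of the time–frequency image.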

Data feature analysis

B-Scan time domain features

The B-Scan images in the Y direction before and after the leakage can be seen in Figure 6(a) and 6(b), respectively. It can be observed from Figure 6(a) that before the pipeline leakage, all the apexes of the hyperbolas were located at the position of 10 ns. As the leakage experiment proceeded, it is evident from Figure 6(b) that the positions of the hyperbolas near the moist area shifted downwards. Among them, the center of the leakage point was the most noticeable, with the apex of the hyperbolas shifting downwards to the position of 16 ns, and the curvature of the hyperbolas also increased. After on-site excavation confirmation, it was determined that the middle area of the measurement lines Y13–Y21 was the moist region of the leakage. There exist hyperbolic reflection waves in each measurement line, which are diffraction wave signals obtained from the reflection of electromagnetic waves on the upper sidewall of the ductile iron pipe. In moist soil, the dielectric constant of the medium increases (Lee et al. 2013), resulting in a decrease in the propagation velocity of electromagnetic waves therein, and an increase in the two-way-travel-time. Additionally, due to the strong conductivity of highly waterlogged soil, the attenuation of electromagnetic waves during propagation in the medium is more severe. The signal strength of the hyperbolas at the leakage measurement lines is significantly weakened, causing the image to become distorted and blurred, but the positions of the hyperbolas can still be identified.
Figure 6

Sectional view of each survey line in the Y and X directions before and after leakage, including (a) before leakage in the Y direction, (b) after leakage in the Y direction, (c) before leakage in the X direction, and (d) after leakage in the X direction.


The B-Scan images in the X direction before and after the leakage can be seen in Figure 6(c) and 6(d), respectively. In Figure 6(c), on the measurement lines X16 and X19, there are bright black bands at the interface between the buried soil and the pipeline before the leakage occurs, appearing at 12 ns. This is because there is a significant difference in the dielectric constant between the buried soil and the pipeline boundary, leading to the reflection of electromagnetic waves at the interface. It can be observed from the results in Figure 6(d) that after the leakage, the previously continuous black bands are interrupted, and the two-way-travel-time of the reflection signal is prolonged, manifesting as an upward-opening hyperbola. This is because the previously uniform underground medium is disturbed by high-pressure water flow, resulting in a gradient change in soil moisture content at the same depth, causing the propagation rate of electromagnetic waves to vary.

A-Scan frequency domain features

Figure 7(a)–7(c) depicts typical A-Scan time domain charts of the experimental site in the normal area, leakage area, and pipeline area, respectively. Figure 7(d) shows the frequency spectrum chart after FFT processing. As depicted in Figure 7(a)–7(c), in the normal area, the energy is strong within 0–5 ns, and there is no obvious reflected wave signal after 15 ns, so no reflected wave of the pipeline is detected. In the pipeline area, the energy within 0–5 ns is also strong, and the reflected signal of the pipeline can be detected within 10–15 ns. In the leakage area, the reflected energy within 0–5 ns is weak, and the spectral bandwidth has increased. The reflected signal of the pipeline can be detected within 15–18 ns, lagging behind the non-leakage buried pipe signal by approximately 5 ns. From Figure 7(d), it can be observed that there is no significant difference in the frequency domain distribution between the normal area and the pipeline area. The main frequency is approximately 220 MHz, with a main frequency energy of about 25,000 dB. The main frequency of the A-Scan signal in the leakage area decreases by 40 MHz, reaching approximately 180 MHz, and the main frequency energy decreases by 10,000 dB, reaching approximately 15,000 dB. Compared to the pipeline area, the main frequency in the leakage area is slightly lower, and the energy amplitude is significantly reduced. This is due to the high water content in the leakage area, which absorbs some high-frequency electromagnetic waves.
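
The main frequency of an A-Scan can be read off its amplitude spectrum, as in the following sketch. It uses a direct discrete Fourier transform instead of an FFT library to stay self-contained, and it is an illustration rather than the study's processing chain. With the acquisition settings reported earlier (256 samples at 7,035 MHz), the time window is about 36 ns and the frequency resolution about 27.5 MHz. The function names are assumptions.

```python
import cmath
import math


def amplitude_spectrum(trace, fs_mhz):
    """Return (frequencies in MHz, amplitudes) for the non-negative frequency bins."""
    n = len(trace)
    mean = sum(trace) / n
    centred = [sample - mean for sample in trace]  # drop the DC component first
    freqs, amps = [], []
    for k in range(n // 2 + 1):
        acc = sum(centred[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        freqs.append(k * fs_mhz / n)
        amps.append(abs(acc))
    return freqs, amps


def dominant_frequency(trace, fs_mhz):
    """Frequency (MHz) of the bin with the largest amplitude."""
    freqs, amps = amplitude_spectrum(trace, fs_mhz)
    peak = max(range(len(amps)), key=amps.__getitem__)
    return freqs[peak]
```

Because of the roughly 27.5 MHz bin spacing, a 40 MHz shift of the main frequency between the pipeline and leakage areas corresponds to only one or two bins, which is one reason the text turns to the wavelet transform for a finer joint description.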
Figure 7

A-Scan time-domain images and frequency spectrum of three types of areas, including (a) normal area, (b) leakage area, (c) pipeline area, and (d) the spectrum after FFT processing.


Wavelet time–frequency scale images

Leakage defects are reflected differently in the time domain and frequency domain. For example, the phenomenon of the downward shift of hyperbolas in B-Scan images is intuitive, but it is prone to interference in areas where multiple pipelines are buried. On the other hand, comparing the main frequency distribution in the spectrum makes it easier to identify the presence of water, but it cannot determine whether there are pipelines in the area without excavation, which is not conducive to pipeline localization. Based on the advantages and disadvantages of the time domain and frequency domain mentioned earlier, this study uses the WT algorithm to construct time–frequency scale images, thereby conducting a time–frequency comprehensive analysis to identify pipelines and leakages.

Figure 8(a)–8(c) show the wavelet time–frequency scale images corresponding to Figure 7(a)–7(c). Each wavelet time–frequency scale image is a 2D pseudo-color image generated by the jet mapping method in Matlab. The horizontal axis represents the two-way travel time of the reflected wave, and the vertical axis represents frequency. A color table over the mapping range 0 to 10,000 represents the intensity of each frequency component: 0 corresponds to deep blue, indicating the weakest energy, and 10,000 corresponds to yellow, indicating the strongest energy. For a radar antenna with a center frequency of 500 MHz, the frequency range is from 150 to 800 MHz.
Figure 8

Time-frequency scale images of three types of areas, including (a) normal area, (b) leakage area, and (c) pipeline area.
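
As a rough illustration of how one A-Scan becomes a time–frequency scale image, the sketch below computes the magnitude of a continuous wavelet transform by direct convolution. It is not the Matlab routine used in the study: the complex Morlet wavelet with a center angular frequency of 6, the 50 MHz frequency step, the 0.2 ns sampling interval, the function name `morlet_scalogram`, and the synthetic 300 MHz burst are all assumptions. Only the 150–800 MHz band is taken from the text.

```python
import cmath
import math

OMEGA0 = 6.0  # Morlet centre angular frequency; a common default, assumed here


def morlet_scalogram(trace, dt_ns, freqs_mhz):
    """Return |CWT| as rows: one row per frequency, one column per sample."""
    n = len(trace)
    rows = []
    for f_mhz in freqs_mhz:
        # Scale (in ns) whose Morlet wavelet is centred on f_mhz.
        scale = OMEGA0 / (2.0 * math.pi * f_mhz * 1e-3)
        half = min(n - 1, math.ceil(4.0 * scale / dt_ns))  # truncate at 4 scales
        offsets = range(-half, half + 1)
        # Sampled complex-conjugate wavelet with the 1/sqrt(scale) factor.
        norm = math.pi ** -0.25 * dt_ns / math.sqrt(scale)
        kernel = [
            norm
            * cmath.exp(-1j * OMEGA0 * (m * dt_ns / scale))
            * math.exp(-0.5 * (m * dt_ns / scale) ** 2)
            for m in offsets
        ]
        row = []
        for tau in range(n):
            acc = 0j
            for m, w in zip(offsets, kernel):
                i = tau + m
                if 0 <= i < n:  # samples outside the trace count as zero
                    acc += trace[i] * w
            row.append(abs(acc))
        rows.append(row)
    return rows


# Synthetic A-Scan: a 300 MHz burst centred at 20 ns (illustrative values only).
DT_NS = 0.2
trace = [
    math.exp(-0.5 * ((i * DT_NS - 20.0) / 5.0) ** 2)
    * math.cos(2.0 * math.pi * 0.3 * (i * DT_NS - 20.0))
    for i in range(256)
]
freqs = list(range(150, 801, 50))  # 150-800 MHz band quoted in the text
image = morlet_scalogram(trace, DT_NS, freqs)

# Locate the strongest time-frequency cell.
peak, row_idx, col_idx = max(
    (value, r, c) for r, row in enumerate(image) for c, value in enumerate(row)
)
peak_freq_mhz = freqs[row_idx]
peak_time_ns = col_idx * DT_NS
print(peak_freq_mhz, "MHz at", round(peak_time_ns, 1), "ns")
```

Each row of `image` is one frequency and each column one time sample, so clipping the values to a fixed range and applying a color table would give a pseudo-color image of the kind described above.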

The time–frequency scale image covers comprehensive characteristic information of both time and frequency domains, allowing for a more intuitive observation of the frequency spectrum distribution at each moment. From Figure 8, it can be seen that the frequencies of the three areas are all between 280 and 550 MHz, and the signal is strongest in the areas close to the surface, with a main frequency of about 300 MHz. Because the soil in both the normal and pipeline areas is dry and undisturbed, their spectra are not affected by leakage water, so the frequency-domain distributions of these two areas are relatively close. In the leakage area, the signal strength at the main frequency is significantly reduced, and there are fewer high-frequency components. In the time domain, the pipeline reflection cannot be detected in the normal area, while both the leakage area and the pipeline area show an energy response, and the pipeline signal in the leakage area lags behind. Therefore, the time–frequency scale image obtained through WT exhibits more significant characteristic changes corresponding to the leakage and can be used for subsequent computer vision-based intelligent leakage identification.
Figure 9

Confusion matrix.


MODEL TRAINING AND VALIDATION

The features perceptible to the naked eye are only a small part of the time–frequency scale image, and larger-scale frequency variations or finer features are difficult to capture manually. Furthermore, the vast amount of radar data does not support all tasks being completed manually. This study uses the ResNet-50 model to identify and classify pipeline leakages in GPR images.

The training of ResNet-50 was conducted in Python version 3.6, using the TensorFlow framework. The training process was carried out on a computing workstation equipped with one CPU (Intel Xeon E5-2620V4 8 Core/2.1 GHz/20M), four GPUs (GeForce GTX 1080Ti), and 128GB of memory. The cross-entropy function was used as the loss function, and the loss is calculated by the following equation:
$$L = -\frac{1}{N}\sum_{i=1}^{N}\sum_{k=1}^{K} y_{ik}\,\log p_{ik} \tag{11}$$
When sample i belongs to category k, $y_{ik}$ is taken as 1; otherwise it is taken as 0. $p_{ik}$ represents the probability that sample i belongs to category k, K represents the number of categories, and N represents the number of samples. Adadelta (Matthew 2012) was selected as the optimizer, which can avoid low learning rates in the later stages of training. The initial learning rate was set to 0.01, and the number of epochs was set to 200. As the training progresses, the learning rate undergoes dynamic changes. After training completion, the recognition performance of the model is evaluated through accuracy, precision (Pr), recall (Re), and F1 score (F1), defined by the following equations, respectively:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{12}$$
$$Pr = \frac{TP}{TP + FP} \tag{13}$$
$$Re = \frac{TP}{TP + FN} \tag{14}$$
$$F1 = \frac{2 \times Pr \times Re}{Pr + Re} \tag{15}$$
where TP represents true positive samples, indicating leakage samples correctly predicted as leakages. FP represents false positive samples, indicating intact samples incorrectly predicted as leakages. FN represents false negative samples, indicating leakage samples incorrectly predicted as intact. TN represents true negative samples, indicating intact samples correctly predicted as intact. The confusion matrix of ResNet-50 is shown in Figure 9. Out of a total of 1,724 test samples, the overall accuracy of ResNet-50 reached 0.917.
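
The cross-entropy loss and the evaluation metrics defined above can be sketched in plain Python. The confusion-matrix counts below are hypothetical and are not the values in Figure 9. The function names, the one-vs-rest treatment of the three classes, and the use of correct predictions over all samples as overall accuracy are assumptions about how the binary definitions extend to this three-class task.

```python
import math

CLASSES = ["normal", "leakage", "pipeline"]


def cross_entropy(labels, probs, eps=1e-12):
    """Mean categorical cross-entropy.

    labels[i] is the true class index of sample i, and probs[i][k] is the
    predicted probability that sample i belongs to class k.
    """
    total = 0.0
    for label, row in zip(labels, probs):
        total -= math.log(max(row[label], eps))  # only the true class contributes
    return total / len(labels)


def evaluate(cm):
    """cm[i][j] = number of samples of true class i predicted as class j."""
    k = len(cm)
    n_samples = sum(sum(row) for row in cm)
    accuracy = sum(cm[c][c] for c in range(k)) / n_samples
    report = {}
    for c in range(k):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(k)) - tp  # other classes predicted as c
        fn = sum(cm[c]) - tp                       # class c predicted as others
        pr = tp / (tp + fp) if tp + fp else 0.0
        re = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2.0 * pr * re / (pr + re) if pr + re else 0.0
        report[CLASSES[c]] = (pr, re, f1)
    return accuracy, report


# Hypothetical counts for illustration only (rows: true class, columns: predicted).
cm = [
    [88, 12, 0],   # normal
    [0, 100, 0],   # leakage
    [2, 10, 88],   # pipeline
]
accuracy, report = evaluate(cm)
print(f"accuracy = {accuracy:.3f}")
for name, (pr, re, f1) in report.items():
    print(f"{name:8s} Pr={pr:.3f} Re={re:.3f} F1={f1:.3f}")
```

In this made-up matrix every leakage sample is found, so the leakage recall is 1.0, while normal and pipeline samples misread as leakage lower the leakage precision.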

The metrics achieved by ResNet-50 on the test set are shown in Table 1. The Pr of the normal area and the pipeline area reaches 0.990 and 0.968, respectively, indicating that the model can accurately determine the presence of underground pipelines. Due to significant changes in the moisture content at the leakage boundary, the Pr of the leakage area is lower, reaching only 0.812, indicating that data at the leakage boundary are prone to misdiagnosis. In terms of Re, it is the opposite, with the normal area and pipeline area reaching 0.882 and 0.880, respectively, while the leakage area is 0.998, indicating that almost all areas with leakages can be accurately identified by the model, which is crucial in practical leakage identification tasks. Therefore, based on the use of WT for image preprocessing, ResNet-50 can accurately predict time–frequency scale images, meeting the needs of engineering applications.

Table 1

Metrics achieved by ResNet-50 on the test set

Area      Pr     Re     F1
Normal    0.990  0.882  0.933
Leakage   0.812  0.998  0.895
Pipeline  0.968  0.880  0.922
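
As a quick consistency check, the F1 column of Table 1 can be recomputed from the Pr and Re columns as their harmonic mean. The snippet below uses only the values reported in the table; the variable and function names are illustrative.

```python
# (Pr, Re, F1) as reported in Table 1.
TABLE_1 = {
    "Normal": (0.990, 0.882, 0.933),
    "Leakage": (0.812, 0.998, 0.895),
    "Pipeline": (0.968, 0.880, 0.922),
}


def f1_score(pr, re):
    """Harmonic mean of precision and recall."""
    return 2.0 * pr * re / (pr + re)


recomputed = {area: f1_score(pr, re) for area, (pr, re, _) in TABLE_1.items()}
for area, (_, _, reported) in TABLE_1.items():
    print(f"{area:8s} reported={reported:.3f} recomputed={recomputed[area]:.3f}")
```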

Deviation correction

In order to present the prediction results more intuitively, three colored masks are used on B-Scan images to characterize the three types of areas, thereby achieving visual imaging. The imaging effect is shown in Figure 10, where green represents the normal area, red represents the leakage area, and blue represents the pipeline area. In Figure 10(a), most of the prediction results are continuous, but there are a few isolated A-Scan prediction results that are misclassified, which may also be local anomalies in the underground area. However, based on our prior information, leakage defects in the stratum generally cannot exist independently, and such accidental deviations should be corrected. A deviation correction strategy has been proposed, which involves counting the prediction results of the adjacent 20 traces for each A-Scan and selecting the type with the highest prediction frequency as the correction result for that trace. The imaging effect after correction is shown in Figure 10(b), which is more consistent with the actual situation and manual judgment results.
Figure 10

Deviation correction results, including (a) before correction and (b) after correction.
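
The deviation correction strategy amounts to a sliding majority vote over the per-trace predictions. The sketch below is one possible reading rather than the authors' implementation: it assumes that the adjacent 20 traces are the 10 on each side of the current A-Scan, that the current trace votes too, that votes are taken on the uncorrected predictions, and that a trace keeps its own label on a tie. The function name and the example sequence are hypothetical.

```python
from collections import Counter


def correct_deviations(labels, window=20):
    """Replace each label by the majority label among its neighbouring traces."""
    half = window // 2
    corrected = []
    for i, own in enumerate(labels):
        lo = max(0, i - half)
        hi = min(len(labels), i + half + 1)
        counts = Counter(labels[lo:hi])  # votes come from the original predictions
        best = max(counts.values())
        if counts[own] == best:
            corrected.append(own)  # keep the trace's own label on a tie
        else:
            corrected.append(counts.most_common(1)[0][0])
    return corrected


# Hypothetical survey line: one isolated "leakage" trace inside a normal
# stretch, followed by a genuine, continuous leakage zone.
predicted = ["normal"] * 30 + ["leakage"] + ["normal"] * 30 + ["leakage"] * 40
corrected = correct_deviations(predicted)
print(predicted[30], "->", corrected[30])
```

The isolated prediction is overruled by its neighbours, while the boundary of the continuous leakage zone stays where it was. A larger window smooths more aggressively, at the cost of suppressing anomalies narrower than about half the window.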


Efficiency experiments were conducted on the overall process of WT, ResNet-50 identification, and deviation correction. Experiments were conducted on 300 sets of data, and the final time spent was 4.95 s, with an average time of 0.0165 s per set, which fully meets the requirements of engineering inspection and can be used for real-time leakage diagnosis.

Field test of model performance

After completing the experiments, the proposed method was tested on an actual pipeline network. Leakage identification was carried out on a DN160 steel pipe in a steel enterprise plant. The pipeline is buried at a depth of 1.2 m, and the actual survey line is 7.5 m long. The survey used an antenna with a center frequency of 500 MHz, producing the B-Scan and spectrum diagrams shown in Figure 11. From Figure 11(a), it can be observed that within 1.0–3.0 m along the survey line, the echo bands are thin and dense, indicating high resolution. Within 5.5–7.3 m, however, the echo bands become coarse and sparse, and the resolution within the red box is significantly reduced, indicating a lower frequency at this location and suggesting a possible leakage. Signals were taken at the 1.0 and 7.0 m positions, and FFT was applied for spectral analysis, yielding the results shown in Figure 11(b). The spectrum of the normal area has peaks at 200 and 380 MHz, whereas the spectrum of the suspected leakage area has only one peak, at 200 MHz, because the accumulated water absorbs the high-frequency components.
Figure 11

(a) B-Scan image and (b) frequency spectrum of the DN160 steel pipe.

Figure 12(a) and 12(b) show the time–frequency scale images corresponding to the normal area and the leakage area, respectively. Wavelet analysis intuitively displays the energy distribution of the reflected radar signals across frequency bands and time windows. After ResNet-50 was applied to the time–frequency scale images and deviation correction was performed, the results in Figure 12(c) show a clear distribution of underground areas, with normal soil cover areas displayed in green and abnormal leakage areas displayed in red. The position of the red mask indicates that the leakage area may be located at the end of the survey line. Excavation confirmed that the actual location of the leakage is consistent with the prediction results of ResNet-50, indicating that the proposed comprehensive method can accurately identify underground pipelines and leakage.
Figure 12

Time-frequency scale images and leakage prediction results of the DN160 steel pipe, including (a) normal area, (b) leakage area, and (c) prediction results.


This study applies WT to convert GPR data into the time-frequency domain, then employs ResNet-50 to detect time-frequency scale images, and proposes a deviation correction mechanism to enhance result clarity at the visualization level. The combination of time domain B-Scan images and frequency domain A-Scan signals enables more accurate identification and localization of leakages, eliminating the uncertainty introduced by B-Scan images, and addressing the inability of A-Scan signals to locate leakage positions. The conclusions drawn from this study are summarized as follows:

  • (1) WT applied to GPR data yields time-frequency scale images, capable of extracting temporal and spectral information from raw radar data with extremely high resolution, visually displaying differences between pipeline leakage areas and normal areas in the time-frequency domain, thus avoiding the need for manual adjustment of mapping ranges in B-Scan charts to enhance image contrast.

  • (2) The trained ResNet-50 model achieved an overall accuracy of 0.917 and a leakage recall of 0.998, indicating that the model can recognize almost all leakages in the dataset, while also demonstrating robustness to complex underground terrain, with minimal impact from irrelevant interfering targets on the model.

  • (3) The proposed deviation correction method improves the clarity of leakage prediction results at the visual level. Furthermore, the intelligent leakage recognition method proposed in this study processes radar data at a speed of 0.0165 s per trace, meeting the requirements for real-time identification.

This study also has some limitations. Water supply pipelines are just one type of underground target to be detected. Other targets such as inspection wells, communication systems, heating, and power supply systems can also affect the data characteristics of GPR. Future research should consider the detection of leakages or damages in multiple types of underground targets. On the other hand, GPR data are actually 3D data. C-Scan images can also reflect the characteristics of pipelines and leakages. In future research, the utilization of GPR data should be improved to enhance the accuracy of leakage recognition from a 3D feature perspective.

We gratefully acknowledge the support of the National Key Research and Development Program (Grant No. 2022YFF06069003-03) and the ZJU-ZCCC Institute of Collaborative Innovation (Grant No. ZDJG2021009).

Data cannot be made publicly available; readers should contact the corresponding author for details.

The authors declare there is no conflict.

Benedetto A. & Benedetto F. 2011 Remote sensing of soil moisture content by GPR signal processing in the frequency domain. IEEE Sensors Journal 11 (10), 2432–2441. https://doi.org/10.1109/JSEN.2011.2119478
Benedetto F. & Tosti F. 2013 GPR spectral analysis for clay content evaluation by the frequency shift method. Journal of Applied Geophysics 97, 89–96. https://doi.org/10.1016/j.jappgeo.2013.03.012
Chen S. S., Wang Y., Zhang W., Zhang H. R. & He Y. C. 2023 Leak detection in water supply network using a data-driven improved graph convolutional network. IEEE Access 11, 117240–117249. https://doi.org/10.1109/ACCESS.2023.3326470
Cheung B. W. Y. & Lai W. W. L. 2019 Field validation of water-pipe leakage detection through spatial and time-lapse analysis of GPR wave velocity. Near Surface Geophysics 17 (3), 231–246. https://doi.org/10.1002/nsg.12041
El-Mahallawy M. S. & Hashim M. 2013 Material classification of underground utilities from GPR images using DCT-based SVM approach. IEEE Geoscience and Remote Sensing Letters 10 (6), 1542–1546. https://doi.org/10.1109/LGRS.2013.2261796
Fu S. C., Zhang D., Peng Y., Shi B., Yedili N. & Ma Z. 2022 A simulation of gas pipeline leakage monitoring based on distributed acoustic sensing. Measurement Science and Technology 33 (9), 095108. https://doi.org/10.1088/1361-6501/ac7633
Gołębiowski T. 2023 Theoretical aspects and numerical modelling of the GPR method to analyse its possibilities for the detection of leakages in urban water supply networks. Geology Geophysics and Environment 49 (4), 357–373. https://doi.org/10.7494/geol.2023.49.4.357
Han K., Wang Y. H., Chen H. T., Chen X. H., Guo J. Y., Liu Z. H., Tang Y. H., Xiao A., Xu C. J., Xu Y. X., Yang Z. H., Zhang Y. M. & Tao D. C. 2023 A survey on vision transformer. IEEE Transactions on Pattern Analysis and Machine Intelligence 45 (1), 87–110. https://doi.org/10.1109/TPAMI.2022.3152247
Hao T., Rogers C. D. F., Metje N., Chapman D. N., Muggleton J. M., Foo K. Y., Wang P., Pennock S. R., Atkins P. R., Swingler S. G., Parker J., Costello S. B., Burrow M. P. N., Anspach J. H., Armitage R. J., Cohn A. G., Goddard K., Lewin P. L., Orlando G., Redfern M. A., Royal A. C. D. & Saul A. J. 2012 Condition assessment of the buried utility service infrastructure. Tunnelling and Underground Space Technology 28, 331–344. https://doi.org/10.1016/j.tust.2011.10.011
He K. M., Zhang X. Y., Ren S. Q. & Sun J. 2016 Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Washington, USA. https://doi.org/10.1109/CVPR.2016.90
He K. M., Gkioxari G., Dollár P. & Girshick R. 2017 Mask R-CNN. In: 16th IEEE International Conference on Computer Vision (ICCV), Venice, Italy. https://doi.org/10.1109/ICCV.2017.322
Hou F. F., Rui X. Y., Fan X. Y. & Zhang H. 2022 Review of GPR activities in civil infrastructures: Data analysis and applications. Remote Sensing 14 (23), 5972. https://doi.org/10.3390/rs14235972
Juliano T. M., Meegoda J. N. & Watts D. J. 2013 Acoustic emission leak detection on a metal pipeline buried in sandy soil. Journal of Pipeline Systems Engineering and Practice 4 (3), 149–155. https://doi.org/10.1061/(ASCE)PS.1949-1204.0000134
Kang J., Park Y. J., Lee J., Wang S. H. & Eom D. S. 2018 Novel leakage detection by ensemble CNN-SVM and graph-based localization in water distribution systems. IEEE Transactions on Industrial Electronics 65 (5), 4279–4289. https://doi.org/10.1109/Tie.2017.2764861
Karthikeyan V. K., Khandekar S., Pillai B. C. & Sharma P. K. 2014 Infrared thermography of a pulsating heat pipe: Flow regimes and multiple steady states. Applied Thermal Engineering 62 (2), 470–480. https://doi.org/10.1016/j.applthermaleng.2013.09.041
Klewe T., Strangfeld C. & Kruschwitz S. 2021 Review of moisture measurements in civil engineering with ground penetrating radar-applied methods and signal features. Construction and Building Materials 278, 122250. https://doi.org/10.1016/j.conbuildmat.2021.122250
Lau P. K. W., Cheung B. W. Y., Lai W. W. L. & Sham J. F. C. 2021 Characterizing pipe leakage with a combination of GPR wave velocity algorithms. Tunnelling and Underground Space Technology 109, 103740. https://doi.org/10.1016/j.tust.2020.103740
Lee K. F., Wang T. K., Kang Y. M., Wang C. S. & Lin K. A. 2013 Identification of pipelines from the secondary reflect wave travel time of ground-penetrating radar waves. Journal of Marine Science and Technology 21 (4), 417–422. https://doi.org/10.6119/JMST-012-0522-2
Li W. T., Cui X. H., Guo L., Chen J., Chen X. H. & Cao X. 2016 Tree root automatic recognition in ground penetrating radar profiles based on randomized Hough transform. Remote Sensing 8 (5), 430. https://doi.org/10.3390/rs8050430
Li Y. S., Liu C. L., Yue G. H., Gao Q. & Du Y. C. 2022 Deep learning-based pavement subsurface distress detection via ground penetrating radar data. Automation in Construction 142, 104516. https://doi.org/10.1016/j.autcon.2022.104516
Lin T. Y., Dollár P., Girshick R., He K. M., Hariharan B. & Belongie S. 2017 Feature pyramid networks for object detection. In: 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Hawaii, USA. https://doi.org/10.48550/arXiv.1612.03144
Liu P. Q., Xu C. H., Xie J., Fu M. F., Chen Y. F., Liu Z. C. & Zhang Z. Y. 2023 A CNN-based transfer learning method for leakage detection of pipeline under multiple working conditions with AE signals. Process Safety and Environmental Protection 170, 1161–1172. https://doi.org/10.1016/j.psep.2022.12.070
Luo T., Zhu S., Yikeremu Y., Zhu J. S. & Genderen J. V. 2023 Ground penetrating radar applied to subsurface culverts. Geo-Spatial Information Science. https://doi.org/10.1080/10095020.2023.2238758
Marcak H. & Gołębiowski T. 2008 Changes of GPR spectra due to the presence of hydrocarbon contamination in the ground. Acta Geophysica 56 (2), 485–504. https://doi.org/10.2478/s11600-008-0003-4
Matthew D. Z. 2012 Adadelta: An Adaptive Learning Rate Method. Available from: http://arxiv.org/abs/1212.5701v1
Qin H., Zhang D. H., Tang Y. & Wang Y. Z. 2021 Automatic recognition of tunnel lining elements from GPR images using deep convolutional networks with data augmentation. Automation in Construction 130, 103830. https://doi.org/10.1016/j.autcon.2021.103830
Ren S. Q., He K. M., Girshick R. & Sun J. 2015 Faster R-CNN: Towards real-time object detection with region proposal networks. In: 29th Annual Conference on Neural Information Processing Systems (NIPS), Montreal, Canada. https://doi.org/10.48550/arXiv.1506.01497
Rodés J. P., Reguero A. M. & Pérez-Gracia V. 2020 GPR spectra for monitoring asphalt pavements. Remote Sensing 12 (11), 1749. https://doi.org/10.3390/rs12111749
Shirajuddin T. M., Muhammad N. S. & Abdullah J. 2022 Systematic review on research trends on sensor-based leak detection methods in water distribution systems. Jurnal Kejuruteraan 34 (2), 201–209. https://doi.org/10.17576/jkukm-2022-34(2)-03
Shyu H. C. & Sun Y. S. 2002 Construction of a Morlet wavelet power spectrum. Multidimensional Systems and Signal Processing 13 (1), 101–111. https://doi.org/10.1023/A:1013847512432
Szymczyk P. & Szymczyk M. 2015 Non-destructive building investigation through analysis of GPR signal by S-transform. Automation in Construction 55, 35–46. https://doi.org/10.1016/j.autcon.2015.03.022
Tong Z., Gao J. & Yuan D. D. 2020 Advances of deep learning applications in ground-penetrating radar: A survey. Construction and Building Materials 258, 120371. https://doi.org/10.1016/j.conbuildmat.2020.120371
Vaswani A., Shazeer N., Parmar N., Uszkoreit J., Jones L., Gomez A. N., Kaiser L. & Polosukhin I. 2017 Attention is all you need. In: 31st Annual Conference on Neural Information Processing Systems (NIPS), California, USA. https://doi.org/10.48550/arXiv.1706.03762
Xie J., Zhang Y. B., He Z. Y., Liu P. Q., Qin Y., Wang Z. L. & Xu C. H. 2023 Automated leakage detection method of pipeline networks under complicated backgrounds by combining infrared thermography and Faster R-CNN technique. Process Safety and Environmental Protection 174, 39–52. https://doi.org/10.1016/j.psep.2023.04.006
Zhang J., Yang X., Li W. G., Zhang S. B. & Jia Y. Y. 2020 Automatic detection of moisture damages in asphalt pavements from GPR data with deep CNN and IRS method. Automation in Construction 113, 103119. https://doi.org/10.1016/j.autcon.2020.103119
Zhang Z., Guo D. Y., Zhou S. Z., Zhang J. W. & Lin Y. 2023 Flight trajectory prediction enabled by time-frequency wavelet transform. Nature Communications 14 (1), 5258. https://doi.org/10.1038/s41467-023-40903-9
Zhou X., Tang Z., Xu W., Meng F., Chu X., Xin K. & Fu G. 2019 Deep learning identifies accurate burst locations in water distribution networks. Water Research 166, 115058. https://doi.org/10.1016/j.watres.2019.115058
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).