## Abstract

Pipe groups divided by physical attributes often include many individual pipes that are scattered over a large geographical area, meaning that replacing these pipes requires frequent service interruptions and incurs additional costs due to the scattered delivery of repair resources. To address this issue, this paper proposes a framework for pipe replacement optimization on pipe groups divided by spatial clustering, aiming to reduce the number of scattered individual pipes in the replacement scheme. The proposed framework integrates spatial autocorrelation analysis for spatial clustering of pipe groups as replacement candidates, a pipe failure model to predict potential failures of pipe groups, and a replacement optimization model of pipe groups. The optimization model aims to minimize the number of potential failures within the constraints of the annual budget. The framework was implemented in a real water distribution network (WDN), and the pipe replacement schemes obtained by the proposed framework were compared with those of two other methods, namely, pipe attribute clustering-based optimization and pipe risk-based ranking. The results show that the spatial clustering-based framework reduces the number of spatially scattered individual pipes by 36.7 and 64.6%, respectively, compared to the other two methods. The proposed framework is expected to provide more cost-effective schemes for pipe replacement.

## HIGHLIGHTS

The spatial clustering of pipe groups is integrated into the replacement optimization of water distribution pipes.

The spatial patterns of pipe failures are investigated by spatial autocorrelation analysis.

The spatial clustering of pipe groups is able to reduce the number of spatially scattered individual pipes in the replacement scheme.

## INTRODUCTION

The water distribution network (WDN) is one of the most critical infrastructures of the urban community (Wéber *et al.* 2020). In particular, pipes in WDN deteriorate over time, which leads to leakage, water contamination, and even pipe bursts, posing severe challenges to water safety (Ouyang 2014). To reduce water leakage and improve service performance, water utilities must carry out scheduled maintenance (i.e., repair or replacement) of pipes in a timely and efficient manner (Scholten *et al.* 2014). Therefore, how to arrange pipe replacement with a limited budget has been a topical issue in the asset management of WDN.

As for the strategies and assessment models for pipe replacement in WDN, many studies were performed targeting the level of individual pipes. Pipe failure models established according to the historical data of pipe failures are usually utilized to identify high-risk pipes and prioritize pipe replacement based on the likelihood of failure (Wilson *et al.* 2017; Robles-Velasco *et al.* 2020; Barton *et al.* 2022). For example, Kleiner *et al.* (1998) proposed an exponential failure model in which the pipe failure rate was modeled as a function of pipe age. Xu *et al.* (2013) developed a failure model to assess the failure propensity of individual pipes by genetic programming (GP). The abovementioned model provided explicit relationships between pipe failures and pipe attributes such as pipe diameter, age, and length. Zangenehmadar *et al.* (2020) used Artificial Neural Network to develop the implicit relationships between the occurrence of failures and pipe attributes, which has been used to predict pipe failures in the city of Montreal with an accuracy of 89.35%. Ismaeel & Zayed (2021) developed a performance-based budget allocation model for pipe replacement, where pipe performance was predicted by the random variable that follows the Weibull distribution. The performance of the pipe remains at its maximum level and slowly decreases at the early age of its service life, then rapidly decreases in middle age, and eventually reaches a slower decrease at the end of its service life. In practice, the usefulness of the replacement schemes targeted at the individual pipe level might be limited when the high-risk pipes are fragmented and spatially distributed in the WDN (Chen *et al.* 2021). 
For water utilities, replacing these spatially scattered individual pipes requires frequent service interruptions and additional costs, which consist of the cost of setting up job sites, marking adjacent infrastructure, and scattered delivery of maintenance resources (Nafi & Kleiner 2010). As a result, scheduling pipe replacement in groups is believed to be a better method for providing actionable replacement schemes.

In terms of scheduling pipe replacement by pipe groups, the grouping of pipes has initially focused on pipes with similar physical attributes such as pipe material, diameter, and age (Burn *et al.* 2003; Sægrov *et al.* 2003). Giustolisi *et al.* (2006) and Moglia *et al.* (2006) demonstrated that the complexity of the pipe replacement problem could be reduced by grouping pipes with common physical attributes. Giustolisi *et al.* (2006) formulated the decision problem of pipe replacement as a multi-objective optimization with the objectives simultaneously minimizing the number of potential failures and the replacement investment, while Moglia *et al.* (2006) established a single-objective optimization model that minimizes the aggregated predicted costs involved with pipe failures and the related maintenance and replacement. The weakness in pipes grouped by physical attributes is that they still result in many spatially scattered individual pipes within one group. Since spatially adjacent pipes often share similar performance properties and failure risks, only a single mobilization of maintenance resources is needed to address these connected pipes during replacement, yielding economies of large-scale construction (Ramos-Salgado *et al.* 2021). The spatial grouping methods adopted by some researchers usually aggregate adjacent pipes based on the topology of the pipe network. For instance, Li *et al.* (2011) presented an optimal pipe replacement scheme based on pipe groups divided by spatial proximity. The proposed optimization model aims to minimize the present value of the total cost, including the replacement cost of the pipe and repair costs of failure, subject to the annual budget. The spatial grouping method was applied to the pipe subset that had already been prioritized for replacement. 
Rokstad & Ugarelli (2015) proposed a pipe replacement optimization that aims to minimize the life cycle cost of the pipe groups, where the pipe groups were identified by assessing the life cycle of the connected pipes and 2.7% of the total cost was saved in the case study due to the economies of large-scale pipe groups. Due to the topology complexity of the WDN pipes, the spatial grouping of pipes based on WDN topology was only implemented in small-scale systems.

Another strategy for the spatial grouping of pipes is the use of planar areas in the service regions of WDN. Some studies investigated the spatial patterns of pipe failure data using density-based methods and spatial autocorrelation analysis (SAA) (Berkhin 2006; Liu *et al.* 2013). For example, De Oliveira *et al.* (2011) presented a method based on spatial scan statistics aiming to detect and localize spatial clusters of pipe failures in the WDN. This method detected clusters of noncompact shapes by scanning the entire region, and these clusters present a significantly higher failure density than the neighboring regions. Since the statistical significance of the spatial clusters identified by density-based methods cannot be directly verified (Abokifa & Sela 2019), some researchers utilized SAA to verify the statistical significance of the identified clusters based on Moran's *I* index (Zhang *et al.* 2008). Given the location of pipe failures, the layout, and the service region of the WDN, the service region is divided into different planar sectors before SAA, as this method focuses on the spatial clustering of pipe failures located in individual planar sectors. Abokifa & Sela (2019) used SAA to identify the spatial clustering pattern of pipe failures on square grids with sides of 1,000 feet. The pipes were grouped by the units of the grid, and the number of pipe failures per unit area was computed to provide information on pipe failure risk. Since the occurrence of pipe failures is generally induced by many factors, other factors, such as physical attributes, should be combined with the spatial information of pipe failures to provide comprehensive information for the decision-making process on pipe replacement.

As the reduction of spatially scattered individual pipes in the replacement scheme probably leads to fewer service interruptions and lower on-site construction costs of pipe replacement, and pipe failures usually present spatial clustering patterns, it is necessary to include spatial analysis in the decision-making process for pipe replacement. Existing studies on the spatial clustering analysis of pipe failures only provide the failure risk of pipes and have not been combined with pipe replacement optimization. To this end, this study proposes a framework to integrate the spatial clustering of pipes into the replacement optimization model of WDN pipes, aiming to reduce the number of scattered individual pipes in the replacement scheme. The pipe replacement optimization model aims to minimize the number of potential pipe failures subject to the annual budget, and the replacement candidates of pipes are grouped by spatial clustering and physical attributes. The proposed framework is implemented in a real-world WDN in northern China and verified by comparing the obtained pipe replacement schemes with those obtained by two other methods, namely, attribute clustering-based grouping and risk-based ranking of pipes.

## METHODOLOGY

### Spatial autocorrelation analysis

SAA is a statistical test method to measure the spatial distribution characteristics of elements and their interdependence (Legendre 1993), which is generally categorized into global autocorrelation and local autocorrelation. The global Moran's *I* index (*MI*) is utilized to investigate the spatial autocorrelation in pipe failure data.

The pipe failure intensity $x_i$ of sector $i$ is computed by Equation (1), where $B_i^{sum}$ is the number of pipe failures in sector $i$; $S_i$ and $L_i^{sum}$ denote the planar area of sector $i$ and the total length of pipes in sector $i$, respectively.

*MI* is computed as (Liu *et al.* 2013):

$$MI = \frac{N}{\sum_{i=1}^{N}\sum_{j=1}^{N} w_{ij}} \cdot \frac{\sum_{i=1}^{N}\sum_{j=1}^{N} w_{ij}\,(x_i-\bar{x})(x_j-\bar{x})}{\sum_{i=1}^{N}(x_i-\bar{x})^{2}} \tag{2}$$

where $N$ is the number of spatial sectors in the WDN; $w_{ij}$ is the spatial weight assigned to the connection between sectors $i$ and $j$; $x_i$ and $x_j$ are the pipe failure intensities in sectors $i$ and $j$, respectively, which are computed by the ratio of the annual number of pipe failures and the sector area, see Equation (1); and $\bar{x}$ is the mean pipe failure intensity across all sectors. The value of *MI* lies in the range [−1, 1], where *MI* > 0 indicates positive correlation, i.e., similar pipe failure intensities in adjacent sectors. The larger the *MI* value, the stronger the positive spatial correlation, whereas *MI* < 0 arises when adjacent sectors have different failure intensities.

To test the statistical significance of *MI*, a standardized *Z*-score is computed as a test statistic (Liu *et al.* 2013):

$$Z = \frac{MI - E[MI]}{\sqrt{V[MI]}} \tag{3}$$

where *E*[*MI*] and *V*[*MI*] are the mean and variance of the *MI* under the distribution of spatial randomness, respectively. If the absolute value of the *Z*-score is close to 0, it indicates that the spatial pattern is approximately random. Otherwise, if the absolute value of the *Z*-score is large, the spatial data are considered statistically significant (Abokifa & Sela 2019).
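As an illustration of the global Moran's *I* and its *Z*-score, the following is a minimal sketch rather than the implementation used in this study. It assumes binary rook-contiguity weights on a rectangular grid and estimates *E*[*MI*] and *V*[*MI*] by random permutation; both choices, the function names, and the failure intensities are hypothetical.

```python
import numpy as np

def rook_weights(n_rows, n_cols):
    """Binary rook-contiguity weights for the cells of an n_rows x n_cols grid."""
    n = n_rows * n_cols
    w = np.zeros((n, n))
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            if r + 1 < n_rows:                 # neighbor in the next row
                j = (r + 1) * n_cols + c
                w[i, j] = w[j, i] = 1.0
            if c + 1 < n_cols:                 # neighbor in the next column
                j = r * n_cols + c + 1
                w[i, j] = w[j, i] = 1.0
    return w

def morans_i(x, w):
    """Global Moran's I for sector values x and spatial weight matrix w."""
    x = np.asarray(x, dtype=float)
    dev = x - x.mean()
    return len(x) / w.sum() * (dev @ w @ dev) / (dev @ dev)

def z_score(x, w, n_perm=999, seed=0):
    """Z-score of MI; mean and variance are estimated from random permutations (assumption)."""
    rng = np.random.default_rng(seed)
    observed = morans_i(x, w)
    simulated = np.array([morans_i(rng.permutation(x), w) for _ in range(n_perm)])
    return (observed - simulated.mean()) / simulated.std(ddof=1)

# Hypothetical failure intensities on a 4 x 4 grid: high values clustered on the left.
intensity = np.array([[5, 4, 0, 0],
                      [6, 5, 1, 0],
                      [5, 6, 0, 1],
                      [4, 5, 0, 0]], dtype=float).ravel()
w = rook_weights(4, 4)
mi = morans_i(intensity, w)
z = z_score(intensity, w)
print(f"MI = {mi:.3f}, Z = {z:.2f}")
```

A clustered layout such as this one yields a positive *MI*, whereas a checkerboard layout on the same grid yields *MI* = −1.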

The best spatial grid size, i.e., the one with the most apparent spatial aggregation of pipe failures, can be determined by repeating the analysis over various grid sizes. Finally, the pipes in each spatial grid can be further divided into smaller groups according to physical attributes such as pipe diameter and pipe age. These groups are used as replacement candidates for the optimization model. For each candidate group, the total length is the sum of the lengths of the pipes in the group, and the predicted failures are weighted by pipe length.
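The grid-size comparison can be sketched as follows. This is only an illustration under stated assumptions: the failure coordinates are synthetic, the service area is taken as a 4 km × 4 km square, and `grid_counts` is a hypothetical helper that counts failures per square cell so that the share of empty cells can be compared across sizes.

```python
import numpy as np

def grid_counts(xy, cell_size, extent):
    """Count points per square cell; extent = (xmin, ymin, xmax, ymax) in the units of cell_size."""
    xmin, ymin, xmax, ymax = extent
    n_cols = int(np.ceil((xmax - xmin) / cell_size))
    n_rows = int(np.ceil((ymax - ymin) / cell_size))
    # Points lying exactly on the upper boundary are assigned to the last cell.
    cols = np.minimum(((xy[:, 0] - xmin) // cell_size).astype(int), n_cols - 1)
    rows = np.minimum(((xy[:, 1] - ymin) // cell_size).astype(int), n_rows - 1)
    counts = np.zeros((n_rows, n_cols), dtype=int)
    np.add.at(counts, (rows, cols), 1)         # accumulate repeated cell indices
    return counts

# Synthetic failure locations (km): one cluster plus uniformly scattered background points.
rng = np.random.default_rng(1)
extent = (0.0, 0.0, 4.0, 4.0)
failures = np.vstack([rng.normal([1.0, 1.0], 0.3, (60, 2)),
                      rng.uniform(0.0, 4.0, (40, 2))])
failures = np.clip(failures, 0.0, 4.0)

for size in (0.25, 0.5, 1.0):
    counts = grid_counts(failures, size, extent)
    empty_share = (counts == 0).mean() * 100
    print(f"{size} km grid: {counts.size} cells, {empty_share:.1f}% without failures")
```

The per-cell counts, divided by cell area, could then serve as the failure intensities used in the autocorrelation test for each candidate size.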

### Pipe failure model

According to Barton *et al.* (2019), the factors influencing pipe failure occurrences may be divided into intrinsic (pipe diameter, age, material, etc.), operational (inner pressure, water velocity, etc.), and environmental (weather temperature, soil corrosion, etc.) aspects. As the focus of this study is to investigate the replacement optimization based on spatial clustering rather than developing the pipe failure prediction model with the best performance, the models and parameters that have been implemented in actual WDN and shown better performance in previous research will be selected in this study. Berardi *et al.* (2008) developed a failure model considering pipe diameter, pipe age, and pipe length by evolutionary polynomial regression (EPR) to assess the failure propensity of individual pipes. Sattar *et al.* (2016) utilized Gene Expression Programming (GEP) to develop explicit relationships between the time to failure and controlling variables, including material, diameter, and length. Therefore, the three major factors affecting pipe failure, namely, pipe diameter (*D*), pipe age (*A*), and pipe length (*L*), are selected to develop the failure model of pipes by the GEP method in this study.

*et al.* 2017; Amiri-Ardakani & Najafzadeh 2021). For the pipe failure model in this study, GEP is utilized to establish an explicit expression relating the input variables {*D*, *A*, *L*} to the output *BR* (number of pipe failures/km/year):

$$BR = f(D, A, L) \tag{4}$$

The failure rate (*BR*) in the next year can then be predicted by the regression model, as shown in Equation (4). The prediction accuracy of the failure model is measured by the coefficient of determination (*CoD*) and the root mean square error (RMSE), defined in Equations (5) and (6). A greater value of *CoD* and a smaller value of RMSE correspond to better prediction accuracy. In these equations, $y_i$ and $\hat{y}_i$ represent the predicted and actual data values at the $i$th subgroup, respectively; $\bar{y}$ is the mean of the predicted values; and $n$ is the number of pipe groups.
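For readers who want to reproduce the two accuracy measures, a minimal sketch follows. It uses the common definitions, with the *CoD* taken as one minus the ratio of the squared error to the total variation about the mean of the actual values; this form is an assumption and may differ in detail from the study's own expressions. The sample values are hypothetical.

```python
import math

def rmse(predicted, actual):
    """Root mean square error between predicted and actual values."""
    n = len(predicted)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)

def cod(predicted, actual):
    """Coefficient of determination, 1 - SSE/SST, with SST about the mean of the actual values (assumed form)."""
    mean_actual = sum(actual) / len(actual)
    sse = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    sst = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - sse / sst

# Hypothetical failure rates (failures/km/year) for five pipe groups.
actual = [0.10, 0.25, 0.40, 0.30, 0.15]
predicted = [0.12, 0.22, 0.37, 0.33, 0.14]
print(f"CoD = {cod(predicted, actual):.3f}, RMSE = {rmse(predicted, actual):.3f}")
```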

### Pipe replacement optimization based on spatial clustering

#### Pipe grouping based on spatial grids

According to the pipe maintenance experience of water utilities, replacing a pipe segment usually requires five pieces of mechanical equipment, including one pipe material transport truck, one muck truck, one truck-mounted crane, one excavator, and one water truck. The replacement of the clustered pipe groups, as shown in Figure 2(b), requires a centralized utilization of equipment and the arrangement of repair crews and thus improves the utilization efficiency of equipment and labor, which produces a smaller replacement cost per pipe length than the replacement cost of scattered pipes.

Based on the pipe replacement records of B city in north China, the pipe replacement cost per unit length of a few records is shown in Table 1. Twelve replacement records are divided into four sets according to diameter and pipe length. The records in Sets {A, C} correspond to a smaller pipe length in one replacement activity and represent the scattered pipe replacement, while the records in Sets {B, D} correspond to a larger pipe length in one replacement activity and stand for the clustered pipe replacement.

Diameter (mm) | Set | Record | Pipe length (m) | Total cost (10,000 RMB) | Unit cost per length (10,000 RMB/m) | Set cost per length (10,000 RMB/m) |
---|---|---|---|---|---|---|
DN600 | A | A1 | 413.4 | 360.6 | 0.87 | 0.75 |
DN600 | A | A2 | 562 | 393.4 | 0.70 | |
DN600 | A | A3 | 564 | 395 | 0.70 | |
DN600 | B | B1 | 1,249.7 | 601.2 | 0.48 | 0.52 |
DN600 | B | B2 | 1,456 | 702.7 | 0.48 | |
DN600 | B | B3 | 2,322.5 | 1,332.6 | 0.57 | |
DN400 | C | C1 | 27.1 | 15.1 | 0.56 | 0.71 |
DN400 | C | C2 | 94.3 | 60.6 | 0.64 | |
DN400 | C | C3 | 174.4 | 135.4 | 0.78 | |
DN400 | D | D1 | 1,039.5 | 411.9 | 0.40 | 0.36 |
DN400 | D | D2 | 1,169 | 440 | 0.38 | |
DN400 | D | D3 | 1,295.8 | 426.8 | 0.33 | |

It can be seen from the set costs in Table 1 that a longer pipe group corresponds to a smaller replacement cost per unit length. Compared with the set costs of {7,500, 7,100} RMB/m for the shorter-length pipe Sets {A, C}, the set costs of the longer-length pipe Sets {B, D} are reduced to {5,200, 3,600} RMB/m, corresponding to cost reduction rates of {30.7%, 49.3%}. The three records in the same set may still fluctuate in unit cost; for example, the unit costs of records {B1, B2, B3} range from 4,800 to 5,700 RMB/m. This fluctuation usually arises from differences in on-site construction conditions and service requirements. For example, the B1 replacement took place on green land, while the B3 replacement was conducted at the roadside, which limits the available working time each day. As a result, the construction period of the B3 replacement was significantly prolonged and its unit cost was greater than that of B1. These data illustrate that the unit cost of replacement is affected by both the pipe length and the local environment. Despite the varied environments of individual records, the set cost of each record set still shows that a longer pipe group reduces the cost per unit length, which reflects economies of scale.
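The set costs and reduction rates quoted above can be checked directly from the records in Table 1. The short sketch below assumes that the set cost is the length-weighted average, i.e., the total cost of a set divided by its total length, which is consistent with the tabulated values; the variable and function names are illustrative only.

```python
# (pipe length in m, total cost in 10,000 RMB) for each record in Table 1
records = {
    "A": [(413.4, 360.6), (562.0, 393.4), (564.0, 395.0)],
    "B": [(1249.7, 601.2), (1456.0, 702.7), (2322.5, 1332.6)],
    "C": [(27.1, 15.1), (94.3, 60.6), (174.4, 135.4)],
    "D": [(1039.5, 411.9), (1169.0, 440.0), (1295.8, 426.8)],
}

def set_cost(rows):
    """Length-weighted cost per metre of one set (10,000 RMB/m)."""
    total_length = sum(length for length, _ in rows)
    total_cost = sum(cost for _, cost in rows)
    return total_cost / total_length

def reduction_rate(scattered, clustered):
    """Relative cost reduction (%) of clustered versus scattered replacement."""
    return (scattered - clustered) / scattered * 100

set_costs = {name: round(set_cost(rows), 2) for name, rows in records.items()}
print(set_costs)
print(round(reduction_rate(set_costs["A"], set_costs["B"]), 1))  # DN600: Set A vs Set B
print(round(reduction_rate(set_costs["C"], set_costs["D"]), 1))  # DN400: Set C vs Set D
```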

*et al.* (2021) have emphasized the benefits of replacing pipes in groups of connected pipes, which would reduce the service disruption and construction cost caused by individual pipe replacement. However, their studies gave no quantitative comparison of the cost benefits between the replacement of individual pipes and that of clustered pipes. To quantitatively measure the cost–benefit of grouped replacement, Nafi & Kleiner (2010) proposed a discount model for the cost savings achieved by replacing large-scale pipe groups as a whole. The cost discount is expressed in Equation (7), where $C_i$ is the replacement cost of pipe $i$ and $DC_i$ is the discounted cost for pipe $i$. $DC_i$ is evaluated by Equation (8), where $Cond_{max}$ is the maximum discount; $l_{max}$ and $l_{min}$ are the maximum and minimum pipe lengths defined by holders according to actual replacement investment records, respectively; and $l_i$ is the total length of the connected pipes scheduled to be replaced.

It should be noted that, even with the discount method in Equations (7) and (8), it is difficult to determine the parameters {$Cond_{max}$, $l_{max}$, $l_{min}$} for evaluating the cost benefits between the replacement of individual pipes and that of clustered pipes without a large amount of investment records and data. Unfortunately, in the WDN case of this research, the investment data of all kinds of pipes are currently unavailable. Since the data sets in Table 1 demonstrate that reducing the number of individual pipes indeed lowers the cost of pipe replacement, the number of individual pipes is used as the indicator in the rest of the article.

#### Replacement optimization of pipe groups

This section develops an annual pipe replacement decision model, for which it is reasonable to assume that the deterioration process of pipes is monotonic (Giustolisi & Berardi 2009). The structural failure of pipes is usually caused by pipe deterioration and probably leads to further failures in hydraulic and water quality service. The consequence of a pipe failure event usually includes direct losses (e.g., pipe repair or replacement), indirect losses (e.g., service interruptions), and social losses (e.g., water contamination) (Kammouh *et al.* 2021). As it is difficult to quantify the indirect and social losses in coordination with the direct losses in the WDN case utilized in this study, the number of pipe failures is used instead. Decreasing the number of pipe failures is therefore a critical means of reducing potential losses and is used as the objective to determine pipe replacement schemes.

*et al.* 2013; Phan *et al.* 2019), the pipe diameter is an important indicator of the hydraulic and functional impacts of pipes. The objective of pipe replacement optimization is to minimize the annual number of pipe failures weighted by pipe diameter, Equation (9), subject to the annual budget, Equation (10). In these equations, $BR_i$ indicates the number of failures of the $i$th pipe group predicted by the failure model in Equation (4); $n$ is the number of pipe groups provided by the spatial clustering; $w_i = D_i/D_{max}$ is the normalized weight of the $i$th pipe group regarding pipe diameter; $D_i$ is the average pipe diameter of the $i$th pipe group weighted by pipe length; $D_{max}$ indicates the maximum value of $D_i$ in all groups; $I_{ik} = \{1, 0\}$ is a binary variable representing whether pipe $i$ is replaced in the replacement plan $k$; $C_i$ is the replacement cost of pipe $i$; and $[C]$ is the annual budget for pipe replacement in the WDN.
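To make the structure of this budget-constrained selection concrete, the sketch below enumerates all replacement plans for a handful of hypothetical pipe groups and keeps the feasible plan with the fewest remaining diameter-weighted failures. It is not the solver used in this study: exhaustive search only works for a few groups, and the assumption that a replaced group contributes no further failures is a simplification introduced here.

```python
from itertools import product

# Hypothetical candidate groups: (predicted failures BR_i, mean diameter D_i in mm, cost C_i)
groups = [
    (4.0, 100, 30.0),
    (2.5, 300, 55.0),
    (3.0, 150, 25.0),
    (1.0, 600, 80.0),
    (2.0, 200, 40.0),
]
budget = 100.0  # hypothetical annual budget [C], same unit as C_i

d_max = max(d for _, d, _ in groups)
weights = [d / d_max for _, d, _ in groups]   # w_i = D_i / D_max

def remaining_failures(plan):
    """Diameter-weighted failures of the groups that are NOT replaced (assumed objective)."""
    return sum(w * br for (br, _, _), w, chosen in zip(groups, weights, plan) if chosen == 0)

def plan_cost(plan):
    """Total replacement cost of the groups selected in the plan."""
    return sum(c for (_, _, c), chosen in zip(groups, plan) if chosen == 1)

# Enumerate every 0/1 plan, keep those within budget, and pick the best one.
feasible = [p for p in product((0, 1), repeat=len(groups)) if plan_cost(p) <= budget]
best_plan = min(feasible, key=remaining_failures)
print(best_plan, plan_cost(best_plan), round(remaining_failures(best_plan), 3))
```

For realistic numbers of candidate groups the search space grows as 2^n, which is why a heuristic such as a genetic algorithm is a natural choice for solving the model.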

*et al.* 2002) and is adopted in this study as Equation (11), where $C_i$ and $D_i$ are the replacement cost (RMB) and pipe diameter (mm) of pipe $i$, respectively. Parameters $a$, $b$ and $\alpha$ are the regression coefficients obtained by fitting actual pipe costs to Equation (11). According to the reference costs listed in the water supply and drainage engineering design manual of China (Xu *et al.* 2013), the regression coefficients of pipe replacement costs are shown in Table 2.

Pipe material | a | b | α | R² |
---|---|---|---|---|
CI | 271.6 | 0.006 | 1.88 | 0.99 |
DI | 291.5 | 0.001 | 2.18 | 0.99 |

## CASE STUDY

### Data collection and preliminary analysis

A real-world WDN of B city in northern China was selected for the study. The downtown service area of the WDN is 50.7 km² and the WDN consists of more than 90,000 pipes with a total length of about 845.8 km. About 21% of the pipes have a service age of over 30 years. The pipe failure data were obtained from the case study network and cover the years 2009–2018, with a total of 1,226 failure records. Table 3 gives the statistical information of the case WDN. More than 90% of the pipes in the WDN are cast iron (CI) and ductile iron (DI) pipes. Since 2000, the CI pipes have been gradually replaced by DI pipes with rubber joints for better performance and stability. Accordingly, CI pipes scheduled for replacement will be replaced by DI pipes of the same diameter.

Features | CI | DI | Other |
---|---|---|---|
Years laid | 1948–2005 | 1998–2018 | 1948–2018 |
Pipe diameter (mm) | 75–600 | 75–800 | 15–1,600 |
Total length (km) | 363.1 | 420.7 | 62.0 |
Average pipe length (m) | 8.8 | 8.3 | 16.4 |
Number of pipe sections | 41,255 | 50,983 | 3,784 |
Number of pipe failures | 1,091 | 96 | 37 |

*D*) and pipe age (*A*). For the pipe diameter factor, the number of failures of pipes with a diameter below DN150 is higher, which may be related to the thin wall of small-diameter pipes and their shallow burial depth. Pipes with a diameter larger than DN300 are installed and operated more carefully and therefore have a relatively lower number of failures.

Regarding the age of the pipes in the failure records, Figure 3(d) shows that the failure number of CI pipes increases during the first 20 years, decreases rapidly between 20 and 45 years, and finally holds at a steady level from 46 to 60 years. The failure number of DI pipes changes with pipe age in a similar way. The relation between the number of failures and the service age in Figure 3(d) closely resembles the probability density function of the Weibull distribution, which is consistent with findings in the literature that the annual number of pipe failures follows the Weibull distribution (Ramirez *et al.* 2020; Snider & McBean 2021). In addition, since there are few failure records for CI pipes older than 62 years and DI pipes older than 22 years, the annual failure rate per km of these pipes fluctuates dramatically in Figure 3(c); these abnormal failure rates should be excluded when establishing the failure model.

To achieve statistical significance, pipe failures are aggregated into homogeneous groups by diameter and pipe age to establish the pipe failure model. The total length of the pipe and the number of failures in each group are summed accordingly. Then the data of every group {*D*, *A*, *L*, *BR*} constitutes the data set to establish and verify the failure model of pipes by the GEP method. The explicit expression of the failure model with a concise structure and better fitting accuracy is selected from the candidates provided by GEP training. The selected failure models for the CI and DI pipes are shown in Table 4.
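The aggregation step can be sketched as follows. This is a hedged illustration only: the pipe records are hypothetical, the 2-year age bin is an assumed bin width, and the failure rate is simply the number of failures divided by group length and by the 10-year observation window (2009–2018), ignoring the ageing of pipes within that window.

```python
from collections import defaultdict

OBSERVATION_YEARS = 10   # failure records span 2009-2018
AGE_BIN = 2              # assumed width of the age classes, in years

# Hypothetical pipe records: (diameter in mm, age in years, length in km, failures recorded)
pipes = [
    (100, 30, 0.5, 3),
    (100, 31, 1.5, 5),
    (200, 12, 2.0, 1),
    (200, 13, 2.0, 3),
]

def build_groups(pipes):
    """Aggregate pipes by (diameter, age class) and return one {D, A, L, BR} row per group."""
    totals = defaultdict(lambda: [0.0, 0])           # key -> [total length in km, total failures]
    for diameter, age, length_km, failures in pipes:
        key = (diameter, age // AGE_BIN * AGE_BIN)   # lower bound of the age class
        totals[key][0] += length_km
        totals[key][1] += failures
    rows = []
    for (diameter, age), (length_km, failures) in sorted(totals.items()):
        br = failures / length_km / OBSERVATION_YEARS   # failures/km/year
        rows.append({"D": diameter, "A": age, "L": length_km, "BR": br})
    return rows

dataset = build_groups(pipes)
for row in dataset:
    print(row)
```

Rows of this form would then be split into training and validation subsets for fitting the regression model.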

Pipe material | Expression of the failure model | CoD | RMSE |
---|---|---|---|
CI | | 0.85 | 1.37 |
DI | | 0.76 | 1.94 |

Since the pipe failure model, whether explicitly expressed by GEP or implicitly represented by a machine learning method, is established from the statistics of historical failure records, its validity rests on a common assumption: the potential pipe failures in the case WDN are expected to present characteristics similar to those of the historical failure records (Asnaashari *et al.* 2013; Sattar *et al.* 2016). This is reasonable when the pipes operate under similar pressures, soil conditions, and maintenance strategies. Therefore, the failure model established for this case WDN cannot be directly used in another case, and there is no universal model applicable to different cases. If the operational and environmental factors of the case WDN change significantly, the pipe failure behavior will change accordingly, and the pipe failure model should be adjusted based on updated failure records that reflect the changed characteristics.

### Spatial clustering and pipe grouping

*MI* and *Z*-scores were computed under different grid sizes, together with the percentage of grids without pipe failure occurrence.

The *MI* values are positive for all grid sizes, indicating that the spatial distribution of pipe failures presents a clustering pattern. As the grid size increases, the *MI* value grows from 0.07 at 0.25 km × 0.25 km to 0.2 at 0.5 km × 0.5 km, after which it fluctuates but remains below 0.2. The normalized *Z*-score is also compared to verify whether the *MI* at each grid size could have arisen from spatial randomness. According to the *Z*-scores in Figure 4, the *Z*-score reaches a maximum of 5.59 at 0.5 km × 0.5 km. In addition, the percentage of grids without pipe failure data gradually decreases with increasing grid size; the grid size of 0.5 km × 0.5 km is the inflection point, after which the decrease slows down. Therefore, the optimal size of the planar grid in the case WDN is set as 0.5 km × 0.5 km, which divides the service area of the WDN into 230 individual spatial grids, as shown in Figure 5(b).

Figure 5(a) shows the spatial hot spots of pipe failure locations in the case WDN, where the spatial clustering of pipe failures is noticeable. The spatial clustering of pipe failures may be caused by a variety of factors, since spatially close pipes tend to share similar water pressure, above-ground loads, soil erosion, and ground settlement. According to the categories of influencing factors listed in Section 2.2, the environmental and operational factors probably lead to the spatial clustering of pipe failures.

According to the spatial grid partition in Section 3.2, the pipes in each spatial grid are further divided into smaller groups by their attributes. Therefore, each grid consists of a few pipe groups and the pipe groups in all grids make up the replacement candidates.

A moderate number of groups is capable of achieving better clustering of replacement pipes and thus resulting in less replacement cost and service interruption. For pipe group partition coupled with spatial clustering and attributes, considering that the spatial grid presents the spatial clustering, only the pipe diameter is taken as the grouping criterion to divide subgroups in each grid. To make a comparison between the spatial clustering and existing models, the results of attribute clustering-based pipe grouping are also presented in this section. For the attribute clustering, both the pipe diameter and pipe age are taken as grouping criteria to get a sufficient number of pipe groups as the replacement candidates for the optimization model. The grouping criteria for the spatial and attribute clustering are briefly presented in Table 5.

Clustering method | Criteria | Parameters | Number of groups |
---|---|---|---|
Spatial clustering | Planar grids | 0.5 km × 0.5 km | 1,226 |
Spatial clustering | Pipe diameter | (0,150)/[150,300]/(300,800) | |
Attribute clustering | Pipe age | Every 2 years | 116 |
Attribute clustering | Pipe diameter | (0,150)/[150,300]/(300,800) | |

### Pipe replacement schemes

Taking the pipe groups in Table 5 as the decision variables, the pipe replacement optimization model shown in Equations (9) and (10) is solved separately for the spatial clustering and attribute clustering pipe groups. A genetic algorithm (GA) is used to solve the optimization model, with the population size set to 4–6 times the number of decision variables. In accordance with the actual operation and maintenance practice of the water utility in B city, the annual replacement budget is set at 5% of the total construction cost of the WDN; a similar budget ratio was also adopted by Zhou (2018).

Indicator | Method | CI pipes | DI pipes | Total |
---|---|---|---|---|
Potential failure reduction (%) | Spatial clustering | 11.32 | 0.12 | 11.44 |
Potential failure reduction (%) | Attribute clustering | 11.53 | 0.09 | 11.62 |
Potential failure reduction (%) | Risk-based ranking | 13.26 | 0.41 | 13.67 |
Total length of replacement (km) | Spatial clustering | 66.48 | 1.25 | 67.73 |
Total length of replacement (km) | Attribute clustering | 65.82 | 0.56 | 66.38 |
Total length of replacement (km) | Risk-based ranking | 64.52 | 1.44 | 65.96 |
Number of individual pipes | Spatial clustering | 890 | 64 | 954 |
Number of individual pipes | Attribute clustering | 1,410 | 96 | 1,506 |
Number of individual pipes | Risk-based ranking | 2,108 | 587 | 2,695 |

In terms of the reduction of potential pipe failures, the risk-based ranking scheme performs best among the three schemes, reducing potential pipe failures by 13.67%. The attribute clustering scheme achieves a reduction of 11.62%, which is 0.18 percentage points more than the 11.44% of the spatial clustering scheme. The risk-based ranking scheme probably outperforms the two optimization schemes because it selects individual pipes, which helps to locate pipe candidates precisely, whereas the optimization schemes select pipe groups using the average information of the pipes in each group.

In terms of the number of individual pipes in the replacement scheme, Table 6 shows that spatial clustering significantly reduces the number of individual pipes. The spatial clustering scheme contains 954 individual pipes, while the schemes obtained by attribute clustering and risk-based ranking contain 1,506 and 2,695, respectively. The spatial clustering method therefore achieves reductions of 36.7 and 64.6% in the number of individual pipes compared to attribute clustering and risk-based ranking, respectively. In terms of the total length of pipes in the replacement scheme, the data in Table 6 show that the three schemes have similar results (65.96–67.73 km). However, the composition and spatial distribution of the pipes in the three replacement schemes differ. For example, the pipes selected by spatial clustering in Figure 6(a) are spatially aggregated; only a small number of DI pipes are selected by attribute clustering optimization, as shown in Table 6 and Figure 6(b); and the pipes selected by risk-based ranking are spatially dispersed in Figure 6(c) and contain more DI pipes.
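The reduction percentages quoted above follow directly from the individual pipe counts in Table 6. The short check below reproduces the arithmetic; the function and variable names are illustrative only.

```python
def percent_reduction(baseline, value):
    """Relative reduction of `value` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - value) / baseline


# Number of individual pipes per scheme (Table 6, "Total" column).
individual_pipes = {"spatial": 954, "attribute": 1506, "risk_ranking": 2695}

vs_attribute = percent_reduction(individual_pipes["attribute"],
                                 individual_pipes["spatial"])
vs_risk = percent_reduction(individual_pipes["risk_ranking"],
                            individual_pipes["spatial"])

print(f"vs. attribute clustering: {vs_attribute:.1f}%")  # 36.7%
print(f"vs. risk-based ranking:   {vs_risk:.1f}%")       # 64.6%
```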

Overall, the spatial clustering method yields results similar to those of the risk-based ranking method in terms of potential failure reduction and total length of replaced pipes, while greatly reducing the number of individual pipes in the replacement scheme. On the whole, the spatial clustering method therefore performs better than the risk-based ranking method.

To investigate why the spatial clustering scheme contains fewer individual pipes, the differences in pipe distribution are analyzed by comparing the overlapping pipes of the three methods; Table 7 presents the detailed data. The lengths of non-overlapping pipes in Table 7 show that the spatial clustering scheme differs considerably from the other two schemes: its overlapping pipe lengths with the attribute clustering and risk-based ranking schemes are 35.69 and 34.54 km, corresponding to overlapping pipe ratios of 52.7 and 51.0%, respectively. In contrast, the attribute clustering and risk-based ranking schemes differ only slightly, with a non-overlapping pipe length of 13.03 km, which means that the overlapping pipe ratio of these two schemes reaches 80.4%. This high overlapping ratio is probably because the pipe groups in the attribute clustering method are divided by pipe diameter and pipe age, and the pipe failure model also predicts the pipe failure risk (BR) from diameter and age. The pipes in each group therefore share one length-averaged BR value, and this group-averaged BR value is close to the BR of the individual pipes predicted by the failure model.
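The overlapping pipe ratios can be recomputed from the lengths in Table 7. The sketch below assumes that the ratio is the overlapping length divided by the sum of the overlapping and non-overlapping lengths, with CI and DI pipes added together; this reading reproduces the reported 52.7, 51.0, and 80.4%, but the exact definition is an assumption.

```python
def overlap_ratio(overlap_km, non_overlap_km):
    """Overlapping length as a percentage of overlapping + non-overlapping."""
    overlapping = sum(overlap_km)
    total = overlapping + sum(non_overlap_km)
    return 100.0 * overlapping / total


# (CI, DI) lengths in km from Table 7: overlapping first, then non-overlapping.
table7 = {
    "spatial_vs_attribute": ((35.66, 0.039), (30.82, 1.210)),
    "spatial_vs_risk": ((34.49, 0.051), (31.99, 1.200)),
    "attribute_vs_risk": ((53.28, 0.068), (12.54, 0.493)),
}

ratios = {name: overlap_ratio(*lengths) for name, lengths in table7.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}%")
```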

| Comparative features | Spatial clustering vs. attribute clustering: CI pipe | Spatial clustering vs. attribute clustering: DI pipe | Spatial clustering vs. risk-based ranking: CI pipe | Spatial clustering vs. risk-based ranking: DI pipe | Attribute clustering vs. risk-based ranking: CI pipe | Attribute clustering vs. risk-based ranking: DI pipe |
|---|---|---|---|---|---|---|
| Number of overlapping pipes | 3,973 | 18 | 3,352 | 68 | 5,600 | 82 |
| Number of non-overlapping pipes | 4,522 | 341 | 5,143 | 291 | 2,047 | 79 |
| Length of overlapping pipes (km) | 35.66 | 0.039 | 34.49 | 0.051 | 53.28 | 0.068 |
| Length of non-overlapping pipes (km) | 30.82 | 1.210 | 31.99 | 1.200 | 12.54 | 0.493 |

Due to the constraint of the annual budget, the spatial clustering method increases the total length of pipes replaced in Grid 64, which reduces the pipes replaced in other grids, as shown for Grid 15 in Figures 7(d)–7(f). Note that Figure 7(d) contains fewer individual pipes than Figures 7(e) and 7(f). Moreover, the replacement scheme of spatial clustering (Figure 6(a)) contains many grids similar to Grids 64 and 15. As a result, spatial clustering produces a smaller number of individual pipes even though the total length of pipes replaced by the three schemes is similar.

## DISCUSSION

### Cost–benefit analysis

As shown in Figure 8, when the percentage of pipe failure reduction is below 17% (Point A), the replacement costs of the three methods rank as follows: spatial clustering-based optimization ≈ attribute clustering-based optimization > pipe risk-based ranking. The risk-based ranking method prioritizes individual pipes with a greater BR value, whereas the two optimization methods select pipes based on the average BR of pipe groups, and the larger pipe length in each group results in a higher cost. Regarding the two optimization methods, after Point A the spatial clustering method requires less replacement cost than the attribute clustering method as the pipe failure reduction increases. This is because the spatial clustering method has more pipe groups, which provides more solution candidates for the optimization under the budget constraint.

Although the risk-based ranking method has a lower replacement cost than the spatial clustering method for an equivalent pipe failure reduction, the cost difference between the two methods is minimal. When the pipe failure reduction ratio is 50% (Point B), the normalized replacement costs of spatial clustering and risk-based ranking are 28.6 and 27.0%, respectively, so the cost difference between the two methods is only about 1.6%. After Point B, the cost difference between spatial clustering and risk-based ranking gradually decreases as the cumulative replacement pipe length increases (Point C). Overall, the difference between the two methods ranges from 0 to 1.8% in Figure 8.

It should be pointed out that the replacement cost in Figure 8 was computed from the cost per unit length, so that the cost increases in proportion to the pipe length, without considering the cost discount brought by clustered pipe replacement. According to the analysis in Section 2.3.1, replacing pipes at a larger length scale does reduce the unit replacement cost. Therefore, if the discounted cost of clustered pipe replacement is considered, the cost comparison between the risk-based ranking and spatial clustering methods in Figure 8 should be made between Cost_RR (risk-based ranking) and (1 − Con*D*) × Cost_SC (spatial clustering), where Con*D* is the quantity discount for clustered replacement, with Con*D* = 30.7–49.3% in Table 1. Even with the 0–1.8% higher replacement cost of the spatial clustering method relative to the risk-based ranking method, once the additional discount achieved by spatial clustering through the reduction of scattered pipes is counted, the replacement scheme obtained by the spatial clustering method presents the largest benefit.
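To make the discounted comparison concrete, the sketch below applies the expression (1 − Con*D*) × Cost_SC to the normalized costs at Point B (28.6% for spatial clustering and 27.0% for risk-based ranking). The discount values 0.307 and 0.493 are the bounds of the unit cost reduction cited in the Conclusions; applying a single discount to the entire spatial clustering cost is a simplifying assumption, and the names used are illustrative.

```python
def discounted_cost(cost_sc, discount):
    """Spatial clustering cost after a quantity discount (fraction in [0, 1])."""
    return (1.0 - discount) * cost_sc


cost_rr = 27.0  # normalized cost of risk-based ranking at Point B (%)
cost_sc = 28.6  # normalized cost of spatial clustering at Point B (%)

# Smallest discount at which spatial clustering becomes the cheaper option.
break_even = 1.0 - cost_rr / cost_sc
print(f"break-even discount: {100 * break_even:.1f}%")  # 5.6%

for discount in (0.307, 0.493):
    cost = discounted_cost(cost_sc, discount)
    print(f"discount {discount:.1%}: {cost:.2f}% vs. {cost_rr:.1f}%")
```

Under these assumptions, any discount above roughly 5.6% is enough for the spatial clustering scheme to cost less than the risk-based ranking scheme at Point B.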

### Sensitivity analysis to pipe grouping criteria

As shown in Figure 9, under the same budget constraint, the reduction of potential failures under grouping Criterion 2 (Grid + D + A) is slightly higher than that under the original grouping Criterion 1 (Grid + D), with the increase ranging from 0.4 to 0.6%. The reason is that Criterion 2 generates more pipe groups (2,203) as optimization candidates than Criterion 1 (1,226). However, the improvement in the reduction of potential pipe failures from adding pipe age to the grouping criterion is not significant, so considering only the pipe diameter is sufficient for pipe replacement optimization based on spatial clustering. This sensitivity analysis thus helps to select grouping criteria for replacement optimization.

## CONCLUSIONS

In the practice of pipe replacement in WDNs, reducing the number of spatially scattered individual pipe segments is believed to help decrease service interruptions and the replacement cost of pipes. To reduce the number of scattered individual pipes in the replacement scheme, this study proposes a pipe replacement optimization framework that selects replacement candidates from pipe groups divided by the spatial clustering method. The optimization model aims to minimize potential pipe failures in the WDN subject to the annual replacement budget. The framework was implemented in an actual WDN, and the replacement schemes obtained by three methods for pipe candidate preparation, namely, spatial clustering-based pipe grouping, attribute clustering-based pipe grouping, and risk-based ranking of pipes, were compared and investigated. The following findings can be drawn:

As for the number of spatially scattered individual pipes in the replacement schemes, the spatial clustering-based method reduced the number of scattered individual pipes by 36.7 and 64.6% compared to the attribute clustering method and the pipe risk-based ranking method, respectively, which shows the merits of the spatial clustering-based pipe replacement strategy in selecting pipe groups with a larger length scale and similar failure risks. The three methods present similar results regarding the total length of pipes in the replacement schemes. Although the risk-based ranking method achieves a slightly better reduction of potential pipe failures due to its precise selection of individual pipes, it also yields the largest number of spatially scattered individual pipes.

Although the actual unit cost of pipe replacement is affected by many factors, including the scale of the pipe set and the regional environment, the actual records of pipe replacement cost in the WDN case show that the average unit cost of replacing longer-length pipe sets is 30.7 to 49.3% lower than that of shorter-length pipe sets, which demonstrates the cost benefits of reducing scattered individual pipes in the replacement schemes. Without considering the unit cost reduction of longer-length and spatially clustered pipe groups in the replacement scheme, the cost–benefit analysis of the three methods shows that the spatial clustering method achieves the same reduction in potential pipe failures as the risk-based ranking method at an additional cost of up to 1.8%. If the further 30.7% cost reduction achieved by the spatial clustering method is counted, the replacement scheme obtained by the spatial clustering method presents the largest benefit.

Notable spatial clustering of pipe failures is observed according to the SAA, which also reflects the influence of environmental and operational factors on pipe failures. The spatial clustering method therefore also provides a reasonable supplement to the failure risk assessment of pipes where environmental and operational parameters are insufficient. The spatial grid size for autocorrelation analysis and clustering was varied using the identical failure data of the WDN case. When the spatial clustering method is used for pipe grouping, a moderate number of groups is capable of achieving better clustering of pipes in replacement schemes.

This study used the number of spatially scattered pipe segments to compare the pipe replacement schemes obtained by different methods, but this indicator is not reflected in the optimization model of pipe replacement. Further study should develop a quantitative model to identify the additional construction cost of spatially scattered pipe segments relative to spatially grouped pipes. Moreover, to quantify the reduction of service interruption achieved by spatially grouped pipe replacement, the number of affected users and the water demand or pressure loss caused by pipe replacement construction should be evaluated. Water distribution networks are intimately linked to other infrastructures, such as road networks, where pipe replacement affects land use and traffic. Further cost savings can be achieved by coordinating the replacement scheme with that of other infrastructure.

## ACKNOWLEDGEMENT

This research work was supported by the National Natural Science Foundation of China (NSFC) (Grant No. 51978023).

## DATA AVAILABILITY STATEMENT

Data cannot be made publicly available; readers should contact the corresponding author for details.

## CONFLICT OF INTEREST

The authors declare there is no conflict.