## Abstract

Metropolitan governments and water operators face growing challenges in evaluating risks and optimizing investment in the rehabilitation of the aging buried infrastructure of water distribution systems (WDS). Proper asset management and efficient rehabilitation planning require monitoring, condition assessment, degradation risk analysis and a data-based degradation forecasting model to support investment decision-making and reduce the infrastructure rehabilitation cost. This paper presents a statistical and stochastic spatial data analysis of failure records of the WDS of the City of Wattrelos, France. The research objective is to develop and demo-illustrate the application of an operator's experience-based Risk Assessment Method (RAM) for prioritizing rehabilitation/replacement works by network micro-zone in order to optimize preemptive asset management. The data used is a 74-year historical dataset; the database includes approximately 424 observed failures for the period 1991–2004. The data analysis demonstrates that stochastic modeling of the relationship between Failure Rate (*F*_{R}), Age (*T*) and the Probability (or Risk) of exceeding a specific Failure Rate (*P*_{r}(*F*_{R})) of a micro-zone can effectively support the operator's assessment, risk management and prioritization in the maintenance and rehabilitation of the WDS.

## INTRODUCTION

Beginning in the 1980s, a number of attempts have been made to statistically model the failure risk of water pipelines (Vanrenterghem-Raven *et al*. 2004; Jafar 2006; Jafar *et al*. 2007, 2010; Alegre *et al*. 2011; Alegre & Coelho 2012; Mamo 2013; Mamo *et al*. 2013, 2014). The development of models for the prediction of failure in the water distribution system (WDS) encounters major difficulties, mainly due to the lack of data on both the WDS and pipe breakage history (Jafar *et al*. 2010; Alegre & Coelho 2012). The assessment of pipe system degradation requires information on the WDS including physical, environmental, and operational parameters that have impacted the pipe failure rate (FR) during its life-cycle (Le Gat & Eisenbeis 2000; Alegre & Coelho 2012). Pipe system aging modeling has included statistical models for predicting water pipe failure using historical data, survival data analysis and stochastic models (Jafar 2006; Jafar *et al*. 2007, 2010; Mamo 2013; Mamo *et al*. 2013, 2014) such as artificial neural networks (ANN) for forecasting failure occurrence in the network.

The purpose of this research is to establish a Decision Support System (DSS) for optimizing infrastructure asset management of WDS through a statistical analysis of the WDS FR using historical data for degradation risk assessment. To support the operator's decision-making in optimizing the network rehabilitation, the proposed statistical analysis method yields:

- For a pre-established Failure Rate (*F*_{R}) of a cluster of pipes, defined by the pipe characteristics (i.e. year of installation, materials, diameter), the Probability (*P*_{r}(*F*_{R})) of exceeding the selected *F*_{R} value as a function of the Age (*T*) of the pipes.
- For an Acceptable Probability (or Risk) of exceeding a pre-selected *F*_{R} value, the variation of the *F*_{R} value as a function of the Age (*T*) of the pipes.
- An operator's experience-based Risk Matrix to identify the risk of exceeding a pre-selected *F*_{R} value at a specific Age (*T*).

## METHODOLOGY

The proposed Risk Assessment Method (RAM) is based upon the use of both statistical and stochastic models for the network micro-zone degradation rate assessment to enable optimization and prioritization of asset management and rehabilitation works. The use of Geographical Information Systems (GIS) for the construction of the database presents a specific interest in facilitating access to information on the water network (Jafar 2006; Jafar *et al*. 2007, 2010; Mamo 2013; Mamo *et al*. 2013, 2014). For this demo-illustration the RAM application used data on pipe failures collected in the City of Wattrelos, France.

The Failure Rate (*F*_{R}) of the selected micro-zone indicates the number of failures per unit length of pipes (km) per year, as given by Equation (1):

$$F_{R} = \frac{\text{number of failures}}{\text{pipe length (km)} \times \text{time period (years)}} \qquad (1)$$

For a selected age (*T*) the likelihood of failure is defined by the frequency of the *F*_{R} values, which is equivalent to the return period of the selected *F*_{R} value. For a sample of pipes with an average sample age (*T*) the Probability of a selected *F*_{R} value over the selected time period is defined by the frequency distribution of the *F*_{R} values:

$$P_{i} = \frac{N_{i}}{N_{F}}$$

where:

- *N*_{i}/*N*_{F} is the frequency of a selected *F*_{Ri} value (or range of values)
- *N*_{F} is the total number of years for the selected time period
- *i* is the serial number of a selected *F*_{Ri} value in a time series of *N*_{F} values.
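The two definitions above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names, the bin width and the yearly failure counts are hypothetical and are not data from the Wattrelos study.

```python
from collections import Counter


def failure_rate(n_failures, length_km, n_years=1.0):
    """Failure rate F_R in failures per km of pipe per year."""
    if length_km <= 0 or n_years <= 0:
        raise ValueError("length_km and n_years must be positive")
    return n_failures / (length_km * n_years)


def frequency_distribution(yearly_rates, bin_width=0.5):
    """Return {lower bin edge: N_i / N_F} for a series of yearly F_R values.

    N_F is the number of years in the series and N_i the number of years
    whose F_R value falls in bin i.
    """
    n_f = len(yearly_rates)
    if n_f == 0:
        raise ValueError("at least one yearly failure rate is required")
    counts = Counter(int(fr // bin_width) * bin_width for fr in yearly_rates)
    return {edge: counts[edge] / n_f for edge in sorted(counts)}


# Hypothetical yearly failure counts for a 4 km cluster (illustration only).
yearly_failures = [0, 2, 4, 1, 0, 6, 3, 0, 5, 2]
rates = [failure_rate(n, length_km=4.0) for n in yearly_failures]
distribution = frequency_distribution(rates, bin_width=0.5)
```

With these invented counts, four of the ten years fall in the lowest *F*_{R} bin, so its frequency is 0.4, and the frequencies over all bins sum to 1.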

For a selected age (*T*), a risk matrix, indicating a risk level on a color-coded scale (i.e. 1 to 4, corresponding to increasing deterioration severity level), is established based upon the Failure Rate (*F*_{Ri}(*T*)) for the cluster of pipes under consideration and the acceptable risk *P*_{rA} of exceeding the selected *F*_{Ri} value.

The Weibull distribution function is used to model the relationship between the Failure Rate (*F*_{R}), Age (*T*) and the Probability (or Risk) of exceeding a specific Failure Rate (*P*_{Ri}(*F*_{R})) of a selected micro-zone. The three-parameter Weibull Cumulative Distribution Function (CDF) is given by:

$$F(t) = 1 - \exp\left[-\left(\frac{t - \gamma}{\eta}\right)^{\beta}\right]$$

where:

- *η* is the scale parameter (or characteristic life-cycle). For a selected Age (*T*), the Weibull CDF of the FR is obtained by substituting *t* = *F*_{Ri} and using a scale parameter of *F*_{R0} = 1 failure per km per year.
- *β* is the shape parameter (or degradation rate parameter).
- *γ* is the location parameter (or failure-free life-cycle), representing a threshold of zero frequency (i.e. *P*_{i} = 0).
- *t* is the random variable under consideration.
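A short sketch of how the three-parameter Weibull CDF can be evaluated for a selected failure rate is given below. It uses the standard form of the distribution; the function and argument names are assumptions made for illustration, and the defaults (scale of 1 failure per km per year, zero location) follow the substitution described above.

```python
import math


def weibull_cdf(t, shape, scale=1.0, location=0.0):
    """Three-parameter Weibull CDF evaluated at t.

    Returns 0 at or below the location parameter (the zero-frequency threshold).
    """
    if shape <= 0 or scale <= 0:
        raise ValueError("shape and scale must be positive")
    if t <= location:
        return 0.0
    return 1.0 - math.exp(-(((t - location) / scale) ** shape))


def weibull_exceedance(fr, shape, scale=1.0, location=0.0):
    """Probability of exceeding the failure rate fr (reliability function)."""
    return 1.0 - weibull_cdf(fr, shape, scale, location)


# Cumulative probability at F_Ri = 1 with shape 2 and unit scale: 1 - exp(-1).
p_cumulative = weibull_cdf(1.0, shape=2.0)
p_exceed = weibull_exceedance(1.0, shape=2.0)
```

Keeping the exceedance probability as a separate function mirrors the way the analysis reads the same curve either as a cumulative probability or as a risk of exceeding a selected *F*_{Ri} value.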

### Data collection description of the study area

The study area is the City of Wattrelos, France, covering 1,362 hectares with a population of approximately 43,000 inhabitants. The WDS is approximately 162 km long and was first laid out in 1891. The data used in this research is a 74-year historical dataset.

The pipe network database, including (i) water network characteristics and operational parameters (e.g. variance of pressure) and (ii) failure information with about 424 observations for the period 1991–2004, was compiled by Jafar (2006) and Jafar *et al.* (2007, 2010) using the GIS.

The pilot WDS consists of 200 pipes with a total length of 17.18 km. The oldest pipe was installed in 1927 and the youngest in 2002. The first recorded breakage dates to 1937 and the last to 2005. The dataset selected for the pilot study comprises 51 cast iron pipes with a total length of 4 km, installed between 1927 and 1936 (average group installation year 1931), with diameters in the range 40 mm < *D* < 100 mm. To analyze the influence of the age (*T*) on the Failure Rate (*F*_{R}), the recorded breakage data from 1937 to 2004 were divided into nine age groups, each covering a 20-year time period.

## RESULTS AND DISCUSSION

### Statistical data analysis

Figure 1(a) illustrates the Probability Distribution (*P*_{i}) of the *F*_{Ri} values and Figure 1(b) the cumulative probability function for the selected cluster of cast iron pipes and a 20-year group age (1937 to 1956, average group age 15 years). Figure 1(c) shows the reliability function for this group age with increasing *F*_{R} values. Exponential regression is used to establish the Weibull shape factor *β* for the statistical data analysis, with *β* = 2 yielding the best-fit curve. Figure 1(d) shows that the Maximum Probability Density (*P*_{dmax}) remains practically constant with the group age and that its corresponding Maximum Probability (*P*_{max}) value is consistent with the value calculated from Equation (8) for *β* = 2. Figure 1(a) shows that the Weibull function with *β* = 2 yields *F*_{Rmax} (Equation (7)) and *P*_{max} (Equation (8)) values that correspond fairly well to the characteristic values of the *F*_{R} probability distribution.
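As a worked illustration of how a shape factor of *β* = 2 fixes the position and height of the density peak, the derivation below gives the mode of a Weibull probability density. It assumes a zero location parameter and writes the scale parameter as *η* (a symbol chosen here for illustration). It also assumes that *F*_{Rmax} and *P*_{max} denote the abscissa and ordinate of that peak; Equations (7) and (8) may be expressed differently.

$$f(t) = \frac{\beta}{\eta}\left(\frac{t}{\eta}\right)^{\beta - 1}\exp\left[-\left(\frac{t}{\eta}\right)^{\beta}\right]$$

$$\frac{df}{dt} = 0 \;\Rightarrow\; t_{\max} = \eta\left(\frac{\beta - 1}{\beta}\right)^{1/\beta}, \qquad \beta > 1$$

$$\beta = 2: \quad t_{\max} = \frac{\eta}{\sqrt{2}}, \qquad f(t_{\max}) = \frac{\sqrt{2}}{\eta}\,e^{-1/2} \approx \frac{0.858}{\eta}$$

Under these assumptions, with a scale of 1 failure per km per year, the density would peak near *F*_{R} ≈ 0.71 regardless of the group age, which is one way a constant shape factor can produce a nearly constant peak.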

Figure 2 presents the cumulative probability functions for the *F*_{Ri} values for the selected cluster of pipes and the available group ages. It illustrates the relationships among three parameters, including *F*_{Ri}, Age (*T*) and Cumulative Probability (*P*_{Ci}). For a constant *F*_{Ri} value (e.g. *F*_{Ri} = 1) the Cumulative Probability (*P*_{Ci}) as well as the Probability (or Risk) of exceeding the selected *F*_{Ri} value (i.e. *P*_{Ri} = 1 − *P*_{Ci}) are therefore a function of the Group Age (*T*).

For example, the data analysis shows that for a selected value of *F*_{Ri} = 1, a 9-year delay in the network rehabilitation, from an average group age of 55 years to 64 years, increases the probability of exceeding the selected value from *P*_{Ri} = 25% to 60%.

Similarly, for a constant Cumulative Probability (*P*_{Ci}) value, the *F*_{Ri} value is a function of the average group age (*T*), as illustrated in Figure 2 for *P*_{Ci} = 0.6. The relationships between these three fundamental aging parameters can be used for risk assessment to enable the operator to optimize the WDS asset management.
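The relationship *P*_{Ri} = 1 − *P*_{Ci} can be illustrated with an empirical calculation on yearly failure rates. The sketch below is an assumption-laden example: the function names are hypothetical and the two series of yearly *F*_{R} values are invented to keep the arithmetic easy to follow; they are not taken from the Wattrelos records.

```python
def cumulative_probability(yearly_rates, fr_selected):
    """P_C: share of years in the group with F_R at or below fr_selected."""
    if not yearly_rates:
        raise ValueError("at least one yearly failure rate is required")
    at_or_below = sum(1 for fr in yearly_rates if fr <= fr_selected)
    return at_or_below / len(yearly_rates)


def exceedance_probability(yearly_rates, fr_selected):
    """P_R = 1 - P_C: share of years with F_R above fr_selected."""
    return 1.0 - cumulative_probability(yearly_rates, fr_selected)


# Invented yearly failure rates (failures/km/year) for two group ages.
rates_by_group = {
    "younger group": [0.25, 0.50, 0.75, 1.50],
    "older group": [0.50, 1.25, 1.50, 2.00],
}
risk_by_group = {
    name: exceedance_probability(rates, fr_selected=1.0)
    for name, rates in rates_by_group.items()
}
```

For a fixed *F*_{Ri} = 1, the exceedance probability rises from 0.25 in the younger group to 0.75 in the older one, which is the kind of age dependence read off the cumulative curves.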

### Stochastic data modeling for forecasting Aging (*T*) effect on Failure Rate (*F*_{R})

The ‘Casses’ software, developed by CEMAGREF (Casses Software; Le Gat & Eisenbeis 2000) for analyzing failure data and forecasting breakage risks in water distribution pipelines, was used for the data analysis following its three steps of calibration, validation and forecasting. The ‘Casses’ algorithm uses the Poisson model for stochastic data analysis. The model calibration used recorded data up to 2004, including the validation period of 2000 to 2004, to forecast the behavior trend over a period extending up to 2055.
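To give a feel for what a Poisson count model implies for yearly breakages, the sketch below computes the probability of observing more than a given number of failures on a pipe cluster. It is a plain homogeneous Poisson model with a constant rate, written under that assumption for illustration only; it is not the ‘Casses’ algorithm, and the function names are hypothetical.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k events for a Poisson count with the given mean."""
    if k < 0 or mean < 0:
        raise ValueError("k and mean must be non-negative")
    return math.exp(-mean) * mean ** k / math.factorial(k)


def prob_more_than(k, rate_per_km_year, length_km, years=1.0):
    """Probability of more than k failures on a cluster over the period.

    The expected count is rate x length x duration (constant-rate assumption).
    """
    mean = rate_per_km_year * length_km * years
    return 1.0 - sum(poisson_pmf(j, mean) for j in range(k + 1))


# A 4 km cluster at 0.5 failures/km/year has an expected 2 failures per year.
p_at_least_one = prob_more_than(0, rate_per_km_year=0.5, length_km=4.0)
```

Under this simplified model, a 4 km cluster at 0.5 failures per km per year has about an 86% chance of at least one breakage in a given year.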

Figures 3(a) and 3(b) show, respectively, the probability distribution of the *F*_{R} values and the reliability function forecast for the 20-year time-period group age of 2035 to 2055 (average group age of *T* = 114 years) for the cast iron pipes.

Figure 3(b) illustrates that the reliability function for this group age corresponds fairly well to the Weibull reliability function with *β* = 2. Figure 3(a) shows that the Weibull function with *β* = 2 and = 2.8 yields consistent *F*_{Rmax} (Equation (7)) and *P*_{max} (Equation (8)) values of the *F*_{R} probability distribution.

Figure 3(c) shows the forecast cumulative probability functions obtained for all the average group ages analyzed, indicating that as the average group age increases from 73 years to 113 years, the probability of exceeding a selected *F*_{Ri} value of 3 increases from *P*_{Ri} = 0% to 90%.

### Risk Assessment Methodology for a Decision Support System (RAM-DSS)

The integration of the statistical and stochastic data analyses of the recorded and forecast breakage rates enables the operator to establish a Risk Assessment Methodology (RAM) for optimizing WDS asset management. Figure 4(a) illustrates, for constant Cumulative Probability (*P*_{Ci}) levels of 20%, 40%, 60% and 80%, the variation of the *F*_{Ri} value with the Average Group Age (*T*) over a time period of 123 years. It shows that, for the cast iron pipes, the aging effect starts to increase significantly after 55 years.

### Probability (*P*_{Ri}) of exceeding a selected *F*_{Ri} value

Figure 4(b) illustrates for selected *F*_{Ri} values of 0, 1, 2, 3, and 4, the risk increase with the network aging through variation of the Probability (*P*_{Ri}) of exceeding the selected *F*_{Ri} value vs the Average Group Age (*T*). For example, for a selected value of *F*_{Ri} = 1 a delayed rehabilitation of 9 years (from 55 to 64 years) will increase the risk of exceeding the selected value of *F*_{Ri} = 1 from 25% to 60%.

The purpose of the proposed Decision Support System (RAM-DSS) is to enable the operator to assess the delayed rehabilitation investment impacts on the risk of exceeding a selected *F*_{Ri} value and its cost consequences. For this purpose a risk matrix is defined as the product of the likelihood of exceeding the selected *F*_{R} value and the consequences of its occurrence.

For optimizing asset management the consequences could be defined by the selected *F*_{R} values for different severity levels of Insignificant, Minor, Moderate, Major, etc. For this pilot study of a 4 km long WDS, five severity levels of consequences were identified from *F*_{R} ranges and the related average number of breakages per month, as illustrated in Table 1. The event likelihood of exceeding a selected *F*_{R} value is identified by its frequency of occurrence over the selected time period (or its return period) (Table 2).

Table 1. Severity levels of consequences by *F*_{R} range for the 4 km pilot WDS

| *F*_{R} (failures/km/year) | Consequence | Number of breakages per month |
|---|---|---|
| <0.5 | Insignificant | 0.17 (1 per 6 months) |
| 0.5–1 | Minor | 0.33 (1 per 3 months) |
| 1–2 | Moderate | 0.67 (1 per 1.5 months) |
| 2–3 | Major | 1.00 (1 per month) |
| >3 | Severe | >1.00 (more than 1 per month) |
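The monthly breakage figures in Table 1 follow from multiplying the failure rate by the pipe length and dividing by 12 months. A minimal check of that arithmetic, with a hypothetical function name and the 4 km length of the pilot cluster as the default:

```python
def breakages_per_month(fr, length_km=4.0):
    """Average breakages per month for a failure rate in failures/km/year."""
    if fr < 0 or length_km <= 0:
        raise ValueError("fr must be non-negative and length_km positive")
    return fr * length_km / 12.0


# Upper bound of each F_R range in Table 1 mapped to breakages per month.
monthly = {fr: round(breakages_per_month(fr), 2) for fr in (0.5, 1.0, 2.0, 3.0)}
```

The values reproduce the table when each range is evaluated at its upper bound: 0.17, 0.33, 0.67 and 1.00 breakages per month.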

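Table 2. Likelihood of exceeding a selected *F*_{R} value by frequency of occurrence

| Frequency of exceeding the selected *F*_{R} value | Likelihood |
|---|---|
| >80% | Highly probable |
| 60%–80% | Probable |
| 40%–60% | Occasional |
| 20%–40% | Remote |
| <20% | Improbable |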

Figure 5(a) presents the color-coded risk matrix for the operator's experience-based DSS used for the demo-illustration in this pilot study. As illustrated in Figure 5(b), its deployment can be used for graphic color-coded risk visualization of the Probability (*P*_{Ri}) of exceeding a selected *F*_{Ri} value. The color-coded risk visualization can be efficiently integrated with multilayer input data on a GIS platform (Jafar 2006; Jafar *et al*. 2007, 2010) to support risk-based optimization and prioritization of investments in infrastructure repair and rehabilitation for pre-emptive asset management.
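A risk matrix built on the consequence and likelihood classes of Tables 1 and 2 can be sketched as follows. The class boundaries come from the two tables, with the assumption that a value lying exactly on a boundary falls in the higher class. The scoring (product of the two class ranks) and the cut-offs mapping scores to risk levels 1 to 4 are hypothetical; the actual cell assignment of the operator's matrix in Figure 5(a) may differ.

```python
from bisect import bisect_right

# Class boundaries from Table 1 (F_R ranges) and Table 2 (likelihood ranges).
FR_EDGES = [0.5, 1.0, 2.0, 3.0]
CONSEQUENCES = ["Insignificant", "Minor", "Moderate", "Major", "Severe"]
P_EDGES = [0.20, 0.40, 0.60, 0.80]
LIKELIHOODS = ["Improbable", "Remote", "Occasional", "Probable", "Highly probable"]

# Hypothetical cut-offs on the 1..25 score, giving risk levels 1..4.
LEVEL_EDGES = [5, 10, 17]


def consequence_class(fr):
    """Return (rank 1..5, label) for a failure rate in failures/km/year."""
    if fr < 0:
        raise ValueError("fr must be non-negative")
    rank = bisect_right(FR_EDGES, fr) + 1
    return rank, CONSEQUENCES[rank - 1]


def likelihood_class(p_exceed):
    """Return (rank 1..5, label) for a probability of exceedance in [0, 1]."""
    if not 0.0 <= p_exceed <= 1.0:
        raise ValueError("p_exceed must lie between 0 and 1")
    rank = bisect_right(P_EDGES, p_exceed) + 1
    return rank, LIKELIHOODS[rank - 1]


def risk_level(fr, p_exceed):
    """Risk level 1..4 from the product of consequence and likelihood ranks."""
    score = consequence_class(fr)[0] * likelihood_class(p_exceed)[0]
    return bisect_right(LEVEL_EDGES, score) + 1


# Selected F_R = 1 with the exceedance probability rising from 25% to 60%.
level_before = risk_level(1.0, 0.25)
level_after = risk_level(1.0, 0.60)
```

Under these illustrative cut-offs, the rise in exceedance probability from 25% to 60% for *F*_{Ri} = 1 moves the cluster from risk level 2 to risk level 3. An operator would replace the cut-offs with the ones encoded in their own matrix.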

## CONCLUSION

The analysis of the relationships between the Failure Rate (*F*_{R}) of a water distribution network (or sub-network), the Age (*T*) of its classified pipes and the Risk (or probability – *P*_{R}) of exceeding the selected *F*_{R} value has been used in this study to develop and demo-illustrate a Risk Assessment Method Decision Support System (RAM-DSS) for optimizing asset management, forecasting the consequences of delayed rehabilitation and prioritizing the required investments.

The Weibull distribution function, often used for relatively small samples, is used for the statistical data analysis to model the network life-cycle and the risk of exceeding a selected Failure Rate (*F*_{R}) as a function of the Age (*T*). For this case study, the shape factor, representing the degradation rate, remains practically constant with the Group Age (*T*) and consistently yields the *F*_{Rmax} and *P*_{max} values of the Probability Density function of the *F*_{R} values.

An operator's experience-based risk management matrix is proposed for optimizing WDS asset management. The research demo-illustrates the value of integrating statistical and stochastic data analyses in modeling the vulnerability of the system at a certain age of its life-cycle and forecasting its degradation for preemptive asset management and rehabilitation investment prioritization.

## ACKNOWLEDGEMENTS

The authors thank Eric Macfarlane, Deputy Commissioner of NYC-DDC, for the fruitful discussions and expert advice during this study as well as Prof. Isam Shahrour for sharing data and our graduate students who contributed to this research.

## REFERENCES

*Modélisation de la dégradation des réseaux d'eau en vue d'une gestion prévisionnelle*. Doctoral Dissertation.

*Virtual District Meter Area Municipal Water Supply Pipeline Leak Detection and Classification Using Advance Pattern Recognizer Multi-Class Support Vector Machine for Risk-Based Asset Management*. Doctoral Dissertation.