## Abstract

Intermittent water supply (IWS) is established temporarily or continuously in many water distribution networks (WDNs) worldwide due to prolonged drought, low rainfall periods, water scarcity and high levels of leakage. IWS has several adverse consequences for network operation, resulting in ineffective supply and demand management. This paper presents a survival analysis of network elements, including water mains, service connections and valves, using the Kaplan-Meier approach to determine the survival probability and the probability of failure of each element type. The objective is to explore the changes in failure rates of network elements after implementing an IWS scheme. The non-parametric survival method is applied to a large zone (Zone-5) of the WDN in Tehran (Iran) based on the frequency of failures before, during and after the implementation of an IWS regime. The results show that failure rates increase significantly after implementing the IWS scheme and can remain elevated for several years afterwards, even when the network returns to continuous water supply (CWS). The results of this study can help utility managers understand the detrimental effects of IWS on failure rates.

## HIGHLIGHTS

Survival analysis of different network elements after implementing an IWS regime is investigated based on the Kaplan-Meier approach.

The proposed approach is applied to a large zone in Tehran, Iran.

Survival curves and hazard functions are derived to estimate survival time and probability of failure for water mains, service connections and valves.

The results can assist utility managers to better understand detrimental effects of IWS schemes on increasing failure rates of network elements.


## INTRODUCTION

Managing water demand for safe, sufficient, and continuous water supply is always a challenge for water utilities worldwide. The issue of water scarcity and intermittent supply affects around 40% of the world's population (Bivins *et al.* 2017; Charalambous & Laspidou 2017). In an intermittent water supply (IWS) system, water utilities deliver water to their consumers for less than 24 hours per day. Many regions around the world, mostly in Low and Middle Income (LAMI) countries, have to use IWS systems due to water shortage, insufficient hydraulic network capacity and severe deterioration of the network (Vairavamoorthy *et al.* 2007; Charalambous & Laspidou 2017). There are many problems in IWS systems in comparison to continuous water supply (CWS) systems due to their adverse consequences. IWS may cause economic and operational problems like ineffective supply and demand management, increased frequency of pipe failures, water contamination, high leakage level, illegal connections, meter tampering and customer dissatisfaction (Vairavamoorthy *et al.* 2007; De Marchis *et al.* 2011; Kumpel & Nelson 2013).

Several studies on various aspects of IWS systems have been reported in the literature, covering the accuracy of water meters (Walter *et al.* 2017; Mastaller & Klingel 2018), water contamination (Coelho *et al.* 2003; Petkovic *et al.* 2011), leakage management (Tamari & Ploquet 2012), and economic and social analysis (Hastak *et al.* 2017; Al-Washali *et al.* 2019). Bivins *et al.* (2017) used quantitative microbial risk analysis to show that about 17 million infections are generated annually under IWS systems. Burt *et al.* (2018) studied the costs and benefits of policies to exit an IWS system and attain a CWS system. They concluded that IWS policies increase costs and that the best way to reach CWS from IWS is the sectorization of networks through district metered areas (DMAs). Anand (2011) investigated the social issues of IWS systems and argued that managing networks under IWS conditions is influenced more by people's rebellion than by a rational program. Taylor *et al.* (2019) determined the ideal distribution time of intermittent supply for demand satisfaction in a particular area, under a set of assumptions such as topological conditions and leakage rates. Al-Washali *et al.* (2019) investigated the influence of the system input volume (SIV) on the reported level of non-revenue water (NRW) in an IWS system. Haider *et al.* (2019) developed a framework to establish an economic level of leakage in IWS systems.

Operating pressure in water supply systems plays an important role, and the level of leakage is significantly affected by it (Moslehi & Jalili-Ghazizadeh 2020; Moslehi *et al.* 2020). In IWS systems, pressure transients and significant variations in pressure may cause the deterioration of water networks, frequent pipe failures and increasing water losses due to leakage. This may lead to substantial operation and maintenance costs and further reduction of the water supply (Charalambous & Laspidou 2017).

Several failure prediction models have been developed to assist water utilities in developing their asset management programs and improving the reliability of their networks. Predictive models may be classified into physically based and statistically based models (Martinez-Codina *et al.* 2015; Snider & McBean 2020; Moslehi & Jalili-Ghazizadeh 2020). Statistical models are based on historical data of failure records and intend to predict future failures. These event-based data are facing the challenge of censoring. If observation time is limited or the observation is missed, the event of interest is not observed within the data set, and a censoring issue occurs. To overcome the issue of censored data, survival analysis may be applied for predicting time-to-event (Snider & McBean 2020). A wide range of statistical predictive models has been developed using survival analysis. The methods of survival analysis may be classified into parametric, semi-parametric and non-parametric methods. Over the last several decades, several researchers applied different survival methods for failure analysis like the Kaplan–Meier estimator (Christodoulou 2011), Poisson regression (Boxall *et al.* 2007; Asnaashari *et al.* 2009), Weibull model (Park *et al.* 2008a; Dridi *et al.* 2009, 2005), multivariate exponential model (Mailhot *et al.* 2000; Kleiner & Rajani 2002), Cox proportional hazard model (Cox-PHM) (Vanrenterghem-Raven *et al.* 2004; Park *et al.* 2011, 2008b), and Bayesian inference/analysis (Watson *et al.* 2004; Dridi *et al.* 2005, 2009; Economou *et al.* 2007, 2008). The vast majority of previous studies were conducted in water distribution networks (WDNs) under continuous conditions. However, there are many WDNs around the world that are intermittent and do not provide water continuously. Few studies have been focused on failure analysis using survival analysis under IWS operating conditions.

The most important component of a decision support system for replacement and renewal of WDNs is a model to predict failure rates. The predictive model will enable utility managers to estimate the timing and costs of repairing, maintaining, and replacing water mains or other network elements like service connections and valves. In this paper, the survival analysis of different network elements after implementing an IWS regime has been investigated based on the Kaplan-Meier approach. The proposed approach is applied to a large zone (Zone-5) in Tehran, Iran. The survival curves and hazard functions are derived to estimate the survival time and the probability of failure rate for water mains, service connections and valves. The results can assist utility managers in better understanding the detrimental effects of IWS schemes on increasing failure rates of network elements.

## METHODOLOGY

Survival analysis is a set of statistical methods dealing with deterioration and failure of network elements over time to determine the survival time. This time is the elapsed time between an initiating event and a terminal event (Christodoulou 2011). The event could be water main breaks or failures, service connection failures, valve failures or any outcome of interest.

Survival analysis provides a supportable mechanism for the prediction of failures that can be included in a risk-based asset management model. These models improve understanding of how covariates influence the failures in WDNs by using them to differentiate the failure distributions without splitting the failure data (Park *et al.* 2008b; Osman & Bainbridge 2011).

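Let *T* denote the time until the failure of a network element (*T* is a non-negative random variable in this case). The survival function gives the probability that the element survives beyond a time *t*; that is, until the time of the failure. The probability of surviving in $[0, t]$, $S(t)$, may be written as follows (Kaplan & Meier 1958; Kalbfleisch & Prentice 2011):

$$S(t) = P(T > t) = 1 - F(t) \tag{1}$$

where $F(t)$ denotes the Cumulative Distribution Function (CDF) as $F(t) = P(T \le t)$.

The hazard function, $h(t)$, is the instantaneous rate of failure at time *t*, given that the element has survived up to time *t*. Thus, the hazard function is defined as follows (Cox 1972):

$$h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t} = \frac{f(t)}{S(t)} \tag{2}$$

where $f(t)$ denotes the Probability Density Function (PDF) as $f(t) = \mathrm{d}F(t)/\mathrm{d}t$. The hazard $h(t)$ is always non-negative and has no upper bound. Based on Equation (2), there is a clear relationship between $h(t)$ and $S(t)$, such that one can be derived from the other (Cox 1972; Kleinbaum & Klein 2010).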
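The relationship between the hazard and survival functions can be written out explicitly. The short derivation below is the standard textbook result rather than a formula quoted from this paper. It assumes the usual definitions $S(t) = 1 - F(t)$, $f(t) = \mathrm{d}F(t)/\mathrm{d}t$ and $S(0) = 1$, and the symbol $H(t)$ for the cumulative hazard is introduced here only for convenience.

```latex
\begin{aligned}
h(t) &= \frac{f(t)}{S(t)}
      = -\frac{1}{S(t)}\,\frac{\mathrm{d}S(t)}{\mathrm{d}t}
      = -\frac{\mathrm{d}}{\mathrm{d}t}\ln S(t), \\
S(t) &= \exp\!\left(-\int_0^t h(u)\,\mathrm{d}u\right)
      = \exp\bigl(-H(t)\bigr).
\end{aligned}
```

Read in one direction, a known survival curve gives the hazard by differentiating $-\ln S(t)$. Read in the other, a known hazard gives the survival curve by integration.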

The most widely used non-parametric model of the survival function is the Kaplan-Meier curve (Christodoulou & Agathokleous 2012). In this approach, an empirical curve derived from failure records is used to estimate the probability that network elements survive beyond a certain time (Kleinbaum & Klein 2010).

### Kaplan-Meier (product limit) approach
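As a point of reference for this section, the product-limit estimate in its usual textbook form is $\hat{S}(t) = \prod_{t_i \le t} (1 - d_i / n_i)$, where $d_i$ is the number of failures at time $t_i$ and $n_i$ is the number of elements still at risk just before $t_i$. The sketch below is a minimal illustration of that form, not the authors' implementation. The function name `kaplan_meier` and the sample data are hypothetical.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(t, S(t))] at each distinct failure time (product-limit estimate).

    durations: time to failure or to censoring for each element
    observed:  True if the failure was observed, False if censored
    """
    failures = Counter(t for t, e in zip(durations, observed) if e)
    leaving = Counter(durations)  # failures plus censored elements leaving at t
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = failures.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical months-to-failure for six elements (not data from the study).
durations = [6, 6, 13, 20, 20, 27]
observed = [True, False, True, True, True, False]
curve = kaplan_meier(durations, observed)
```

Elements censored at a given time are counted as at risk at that time, which is the usual convention. With no censoring, as reported for the dataset in this study, the estimate reduces to the fraction of elements that have not yet failed.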

It is important to note that the first derivative of the cumulative hazard function, $H(t)$, equals the hazard function at the point *t*. Thus, the second derivative of the cumulative hazard function equals the first derivative of the hazard function. This means that if the cumulative hazard curve is concave downward, so that the slope of the curve continually decreases ($\mathrm{d}h(t)/\mathrm{d}t < 0$), the lifetime distribution has a decreasing failure rate. Otherwise, it has an increasing failure rate.
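The concavity argument can be checked numerically on a sampled survival curve. The sketch below assumes the standard relation $H(t) = -\ln S(t)$ and an evenly spaced time grid, and uses second differences of $H$ as a stand-in for the second derivative. The function names and the two sample curves are hypothetical.

```python
import math


def cumulative_hazard(survival):
    """H(t) = -ln S(t) for each survival probability in the curve."""
    return [-math.log(s) for s in survival]


def failure_rate_trend(survival):
    """Classify the trend from second differences of H on an evenly spaced grid."""
    H = cumulative_hazard(survival)
    second = [H[i + 1] - 2 * H[i] + H[i - 1] for i in range(1, len(H) - 1)]
    if all(d > 0 for d in second):
        return "increasing"  # H is convex, so the hazard is rising
    if all(d < 0 for d in second):
        return "decreasing"  # H is concave downward, so the hazard is falling
    return "mixed"


# Hypothetical survival probabilities at equally spaced times (illustrative only).
convex_case = [1.0, 0.95, 0.85, 0.70, 0.50]
concave_case = [1.0, 0.70, 0.55, 0.46, 0.40]
```

On real failure records the second differences are noisy, so some smoothing would usually be needed before reading a trend from them.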

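The Log-Rank test is used to compare the survival curves of groups of network elements. For a comparison of two groups, the expected number of failures may be written as:

$$e_{if} = \left(\frac{n_{if}}{n_{1f} + n_{2f}}\right)\left(m_{1f} + m_{2f}\right)$$

For network element *i*, this equation computes the expected number of failures at the time *f*, where $m_{if}$ denotes the number of failures for element *i* and $n_{if}$ is the number of subjects at risk for element *i*. The null hypothesis of the test is that there is no difference between the survival curves.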

Under the null hypothesis, the Log-Rank statistic is approximately chi-square distributed with k − 1 degrees of freedom, where k represents the number of comparison groups of network elements. The critical value for the Log-Rank test at the chosen significance level is therefore obtained from tables of the chi-square distribution. If the test statistic exceeds the critical value (equivalently, if the *P*-value is below the significance level), the null hypothesis is rejected (Bland & Altman 2004).
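The test described above can be sketched for two groups as follows. This is a minimal illustration in the standard textbook form, assuming the hypergeometric variance estimate and one degree of freedom (k = 2). It is not the authors' code, and the function name `log_rank` and the sample failure times are hypothetical. The *P*-value uses the identity that the upper tail of a chi-square distribution with one degree of freedom equals `erfc(sqrt(x / 2))`.

```python
import math


def log_rank(times1, events1, times2, events2):
    """Two-group Log-Rank test; returns (statistic, p_value) with 1 degree of freedom."""
    failure_times = sorted(
        {t for t, e in zip(times1, events1) if e}
        | {t for t, e in zip(times2, events2) if e}
    )
    o_minus_e = 0.0
    variance = 0.0
    for f in failure_times:
        n1 = sum(t >= f for t in times1)  # at risk in group 1 at time f
        n2 = sum(t >= f for t in times2)  # at risk in group 2 at time f
        m1 = sum(bool(e) and t == f for t, e in zip(times1, events1))
        m2 = sum(bool(e) and t == f for t, e in zip(times2, events2))
        n, m = n1 + n2, m1 + m2
        o_minus_e += m1 - n1 * m / n  # observed minus expected for group 1
        if n > 1:
            variance += n1 * n2 * m * (n - m) / (n * n * (n - 1))
    if variance == 0.0:
        return 0.0, 1.0  # no information to separate the groups
    statistic = o_minus_e ** 2 / variance
    p_value = math.erfc(math.sqrt(statistic / 2.0))  # chi-square upper tail, 1 dof
    return statistic, p_value


# Hypothetical months-to-failure for two small groups, all failures observed.
group_a = [5, 8, 12, 15]
group_b = [10, 18, 22, 30]
statistic, p_value = log_rank(group_a, [True] * 4, group_b, [True] * 4)
```

For a pairwise comparison the statistic is compared with the chi-square critical value for one degree of freedom, which is about 3.84 at the 5% level.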

## CASE STUDY

The proposed methodology is applied to a large zone (Zone-5) of the WDN in Tehran, the capital city of Iran, which lies at approximately 1,500 meters above sea level. The WDN of Zone-5 consists of three subzones: D1, D2 and D3. Figure 1 depicts the selected zone, which consists of approximately 1,080 km of mains and 169,461 service connections. The general characteristics of Zone-5 are summarized in Table 1. The water mains materials in Zone-5 are asbestos cement (AC), polythene (PE) and ductile iron (DI). As shown in Table 1, the predominant mains materials are DI (77%) and AC (13%). Therefore, the analysis is based on the data collected for DI and AC mains. Several types of valves have been installed in the network, including air release, gate, fire landing, pressure relief and butterfly valves. The service connection failures considered are those occurring between the tapping point and the point of customer metering.

Table 1. General characteristics of Zone-5

| Material | Length by material (km) | % | Age (year) | Length by age (km) | % |
|---|---|---|---|---|---|
| DI | 830.3 | 76.88 | 0–20 | 9.08 | 11.05 |
| PE | 51.7 | 4.79 | 20–40 | 118.6 | 24.63 |
| AC | 143.8 | 13.32 | 40–60 | 95.5 | 60.17 |
| Not defined | 54.2 | 5.01 | 60 < | 16.9 | 4.15 |
| Total | 1,080 | 100 | | 515.9 | 100 |


## RESULTS AND DISCUSSION

Table 2 presents the frequency of failures for the different network elements, including water mains, service connections and valves, over the period from 2000 to 2007. There are no censored failures in the collected dataset. The network was operated under an IWS regime for seven months, from April 2001 to November 2001, due to severe drought conditions. As shown in Table 2, the failure rates of each network element increased significantly after the establishment of IWS conditions. The average growth rate of failures for water mains, service connections and valves was around 40, 24 and 12%, respectively. Moreover, the failure rates for network elements remained high after the return to normal conditions, from 2002 to 2006. Although the network was operated under IWS conditions for only a short period (7 months), this significantly affected the failure rates for several years. The table also shows that the failure rates of water mains and service connections increased immediately after implementing the IWS scheme, whereas the rise in valve failures was delayed. This might be due to the valves' resistance to deterioration and final failure.

Table 2. Frequency of failures for different network elements from 2000 to 2007

| Year | Service connection failures (No./1000 connections/year) | Growth rate | Valve failures (No./100 km/year) | Growth rate | Water main failures (No./100 km/year) | Growth rate |
|---|---|---|---|---|---|---|
| 2000 | 120.5 | | 38.6 | | 34.6 | |
| 2001 | 149.2 | 0.24 | 33.5 | −0.13 | 52.1 | 0.51 |
| 2002 | 147.5 | 0.22 | 64.3 | 0.66 | 49.1 | 0.42 |
| 2003 | 152.5 | 0.27 | 62.9 | 0.63 | 49.9 | 0.44 |
| 2004 | 155.5 | 0.29 | 47.9 | 0.24 | 51.2 | 0.48 |
| 2005 | 157.7 | 0.31 | 46.8 | 0.21 | 50.9 | 0.48 |
| 2006 | 158.4 | 0.31 | 26.8 | −0.31 | 44.7 | 0.29 |
| 2007 | 120.7 | 0.003 | 20.8 | −0.46 | 41.4 | 0.19 |
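The growth rates in Table 2 appear to be measured against the failure rate in 2000, the year before the IWS scheme. That reading is inferred from the tabulated numbers and is not stated in the text. Under that assumption, the sketch below recomputes the growth rates and their averages from the Table 2 values. The variable and function names are illustrative.

```python
# Failure rates transcribed from Table 2 (2000 is taken as the pre-IWS base year).
years = [2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007]
rates = {
    "service connections": [120.5, 149.2, 147.5, 152.5, 155.5, 157.7, 158.4, 120.7],
    "valves": [38.6, 33.5, 64.3, 62.9, 47.9, 46.8, 26.8, 20.8],
    "water mains": [34.6, 52.1, 49.1, 49.9, 51.2, 50.9, 44.7, 41.4],
}


def growth_vs_base(series):
    """Relative change of each later year with respect to the first (base) year."""
    base = series[0]
    return [(value - base) / base for value in series[1:]]


# Mean of the yearly growth rates for 2001 to 2007, per element type.
average_growth = {
    name: sum(growth_vs_base(series)) / (len(series) - 1)
    for name, series in rates.items()
}
```

The averages come out close to the rounded figures quoted in the text for water mains, service connections and valves. Small differences in the last digit of individual entries are consistent with rounding in the published table.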


Figure 2 illustrates the average monthly failure rates of the three network elements. As can be seen, the failure rates increased significantly after the implementation of the IWS scheme, especially over the warmer months, and decreased only after six years. Moreover, the failure rates follow a similar trend in each year from 2001 to 2006, which indicates the same hydraulic behaviour in the water distribution network.

Figure 3 shows the total failure rates before, during and after implementing the IWS scheme from 2000 to 2007 in Zone-5. As can be seen, the total failure rates follow the same seasonal trend each year: the frequency is lower in the cold months and higher in the warmer months, from July to September. As may be observed in Figure 3, the total failure rates are relatively low before the implementation of the IWS scheme and increase significantly from its implementation in April 2001 until January 2007. The average growth rate of total failures was found to be around 24% during this period. This may be due to surges and large pressure variations, which are a prime cause of failures under IWS conditions. The results also suggest that the adverse consequences of IWS conditions can persist for a long time.

It is necessary to identify the network element most affected by implementing IWS conditions. Thus, the survival probability is calculated with the Kaplan-Meier approach using the failure rates of each network element.

Figure 4 shows the empirically derived survival curves, using Equation (3), for network elements, grouped by water mains, service connections, and valves. In Figure 4, time and survival probability are shown on the X-axis and Y-axis respectively. These curves are stepped functions that can be used to estimate the probability of a new element surviving until a particular time period after implementing the IWS scheme. The curves also allow the comparison between different failure rates over the time period of the analysis. Figure 4 shows that the probability of survival of network elements starts to decrease after the establishment of IWS conditions. The survival probability reaches 65% after 30 months from the implementation of the IWS regime in the study area. The median survival time can be estimated from survival curves, where the survival probability is 0.5 for each element. For this case study, the median survival time of valves, water mains and service connections was found to be 46, 50 and 51 months, respectively. The figure also shows that all network elements will have to be eventually replaced after approximately 80 months from implementation of the IWS scheme. Figure 4 demonstrates that the network operating conditions significantly influence the survival time of network elements. The estimated survival probability could assist water utilities in understanding the detrimental effects of IWS on the network assets.
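Reading the median survival time off a stepped curve, as described above, amounts to finding the first time at which the estimated survival probability is at or below 0.5. The sketch below illustrates this. The function name and the sample curve are hypothetical and do not reproduce the Figure 4 data.

```python
def median_survival_time(curve):
    """Return the first time at which the stepped survival curve is at or below 0.5.

    curve: list of (time, survival_probability) pairs sorted by time.
    Returns None if the curve never reaches 0.5 within the observation window.
    """
    for time, probability in curve:
        if probability <= 0.5:
            return time
    return None


# Hypothetical stepped curve in months after the IWS scheme (not the Figure 4 data).
example_curve = [(12, 0.90), (24, 0.74), (36, 0.61), (48, 0.49), (60, 0.30)]
median_months = median_survival_time(example_curve)
```

Returning `None` when the curve stays above 0.5 keeps the function from reporting a median that the observation window cannot support.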

Figure 4 also indicates that the survivor functions for water mains and service connections are very close together throughout the period of analysis. However, the survivor function for the valves decreases rapidly from November to December 2002, by around 10% over that period. Afterward, it lies consistently below the functions for mains and service connections. The difference between the survival curve of valves and those of the two other elements increases from December 2002 until December 2005 and then decreases until the end of December 2007. This difference demonstrates that the probability of surviving is significantly affected by the detrimental effects of IWS conditions and by some influential parameters, including material and age.

The Log-Rank test was applied to compare the survival functions in pairs: service connections and water mains, service connections and valves, and water mains and valves. In this application, the Log-Rank test assesses whether there is a difference in survival (or cumulative incidence of the event) between the elements. The Log-Rank statistic is based on the summed observed minus expected failure scores for a given group and its variance estimate. Table 3 shows the results of the non-parametric test at the 5% significance level (95% confidence level) for the studied network. The test results indicate that the *P*-values are below 0.05 for all pairs, and therefore the null hypothesis of the Log-Rank test is rejected. Moreover, according to the estimated test statistics, the failure rates of valves are significantly different from those of water mains and service connections during the period of the analysis. This suggests that valves are more affected by implementing IWS conditions in the study area.

Table 3. Results of the Log-Rank test for the studied network

| Group | Group name | Comparison level | Number of failures observed | Number of failures expected | Log-Rank test statistic | Chi-squared test (accepted) |
|---|---|---|---|---|---|---|
| 1 | Mains | 1 vs 2 | 4,039 | 4,028.4 | 0.081 | No |
| 2 | Service connections | | 196,910 | 195,525.6 | | |
| 1 | Mains | 1 vs 3 | 4,039 | 4,330.9 | 57.102 | No |
| 3 | Valves | | 3,691 | 3,358.1 | | |
| 2 | Service connections | 2 vs 3 | 196,910 | 196,109.7 | 109.101 | No |
| 3 | Valves | | 3,691 | 3,111.3 | | |


Figure 5 presents the hazard rate functions calculated using Equation (4) for water mains, service connections and valves. The hazard rate may be used to identify the instantaneous probability of failure. As can be seen in Figure 5, the probability of failure increases rapidly for all network elements after implementing the IWS scheme in the study area, as indicated by the convex form of the hazard rate function. The figure also shows that the hazard rate for valves is larger than the hazard rates of both mains and service connections. This indicates that IWS conditions have a greater influence on the probability of failure of valves.

## CONCLUSIONS

This study presented a survival analysis of network elements under intermittent water supply conditions using the Kaplan-Meier approach (a non-parametric method). The results showed a dramatic increase in the failure rates of the different network elements, including water mains, service connections and valves, after the implementation of the IWS scheme. The total failure rate grew by around 24%. Additionally, the average increase in total failure rates was around 30% over the period affected by IWS conditions in the studied network (about six years). The results also demonstrate that the detrimental effects of IWS conditions on the network assets could remain for several years (six years for this study), even when the network returns to CWS. The survival curves derived from the Kaplan-Meier approach showed that the median survival time of valves, water mains and service connections is 46, 50 and 51 months, respectively, showing the severe effects of IWS conditions on water network elements. The calculated hazard functions indicated that the probability of failure increased rapidly for all network elements after implementing an IWS scheme in the study area. For the case study, the behaviour of the failure rates of mains and service connections was almost the same. Additionally, valves were more vulnerable than mains and service connections. This indicates that high-quality valves should be used if the water network operates under intermittent supply conditions. Although an intermittent-supply policy may be a fast way to reduce water consumption (if it does so at all), the reliability of the network may decrease significantly due to increasing failure rates. The estimated survival probability could assist water utilities in understanding the detrimental effects of IWS on the network assets and in establishing efficient asset management strategies.

## ACKNOWLEDGEMENTS

The authors appreciate the help provided by the Tehran Water and Wastewater Company (TWWC) in provision of data used in this work.

## DATA AVAILABILITY STATEMENT

All relevant data are included in the paper or its Supplementary Information.