An alternative risk assessment method, known as failure mode effects and criticality analysis (FMECA), is demonstrated on a regional water supply system (RWSS) in Tucson, AZ, USA, that combines delivery of potable and reclaimed water and conveyance of wastewater for a developing area within the Tucson RWSS. The goal of FMECA is to examine the volumetric severity of a component failure on the overall system function by modeling the system under alternative failure modes. Within FMECA, the Risk Priority Number (RPN) is applied to compare the risk criticality of component failures. To complete FMECA, the Tucson RWSS is represented as a network flow model that optimally allocates flows between sources and demand points to minimize operational costs. Potential failure mode consequences are evaluated from the flow model as the volume of water not delivered to users if the component is unavailable. The volumetric severity of the failure event is converted to an ordinal value using stakeholder judgment. The likelihood of each failure mode is similarly defined by stakeholders on a 1–10 scale. The RPN is then computed as the product of the severity and likelihood. RPN values for all failure modes are then ranked to identify the most critical elements. Alternative system configurations are examined to assess the value of redundancies for Tucson RWSS resilience.
INTRODUCTION
Given their importance to community health and societal well-being, regional water supply systems (RWSS) must be robust and resilient. Infrastructure robustness is defined as the ability to perform under extreme conditions, and resilience is the ability to mitigate and recover from failure events (Lansey 2012). Robustness contributes to system resilience and reduces failure severity (Bruneau 2005). For example, the Pentagon's structure was redundant and robust, so damage from the September 11th terrorist attack was limited and only a small part of the building collapsed (Mlakar et al. 2003). Further, during the Japan earthquake in March 2011, the robust internet network survived although two-thirds of the Northern Japan mobile stations were impaired (Bijan et al. 2011). Lack of resilience, on the other hand, is often recognized only after a failure, such as the extended recovery time from Hurricane Katrina.
Acts of terrorism and natural disasters are not the only risk factors that undermine system functionality. Deterioration, aging, increased demands, and diminished operations and maintenance funding further imperil infrastructure systems. Much effort has been devoted to ensure infrastructure system resilience. The Department of Homeland Security (DHS) has defined resilience as one of three foundational concepts for a comprehensive approach to homeland security in the 2010 Quadrennial Homeland Security Review (DHS 2010). The Homeland Security Studies and Analysis Institute initiated several studies to form a framework for combining resilience into the nation's critical infrastructure (Kahan et al. 2009).
Engineers and decision-makers apply risk management techniques to identify potential failure modes (events) and expose potential losses. Results from these studies allow decision-makers to rank and prioritize the potential failure conditions through a risk assessment. Analysis techniques include failure mode effects and criticality analysis (FMECA), Fault Tree Analysis, Goals Means Task Analysis, and Markov analysis methods (Fredriksen et al. 2002). However, these methodologies have had limited water resources applications and no risk management methods with the exception of FMECA (Marlow et al. 2007) have been applied to RWSS.
The objective of this study is to provide a structured and practical method for conducting risk management for RWSSs. Of particular interest here is the failure of large system components with very low, and often undefined, probabilities of occurrence. To better inform decision-makers on potential resilience vulnerabilities, a stepwise risk management technique known as FMECA can be performed to categorize risks and their consequences. The methodology is applied to a portion of the Tucson, AZ, USA RWSS that is being expanded to meet future growth. Weaknesses and critical risks can be identified early in the design process when modifications and corrective actions are still possible. The secondary objective is to investigate the resilience of different Tucson RWSS configurations: (1) centralized and decentralized wastewater treatment and reuse systems, and (2) use of the Central Well Field (CWF) as a backup potable source.
LITERATURE REVIEW
The definition of resilience has been tailored to specific fields/applications. Bruneau (2005) defined resilience from the earthquake engineering perspective. Holling (1973) and Folke (2006) used the resilience concepts to understand the dynamics of ecological and social-ecological systems, respectively. Norris et al. (2008) defined resilience from the community perspective. Although little effort has been placed on RWSS resilience, much work has been conducted in the water management field. For example, Wang & Blackmore (2009) proposed an approach to quantify the resilience of water resource systems; Hagos et al. (2014) introduced methodology for pipe leakage detection to improve water supply system (WSS) resilience; Zhuang et al. (2013) demonstrated adaptive pump operations for enhancing the WSS resilience when pipe failure occurs; Lansey (2012) defined and described the characteristics of resilience from a water resource perspective. Different definitions of resilience are commonly linked to the adaptive capacity of a system, resistance to disturbance, and ability to effectively recover from any disturbance.
The National Science Foundation (NSF) supported EFRI RESIN project teams (Lansey 2012) defined resilient infrastructures as having ‘the ability to gracefully degrade and subsequently recover from a potentially catastrophic disturbance that is internal or external in origin’. Resilience is a function of the system functionality loss and the failure event duration. The loss of functionality is dependent on the redundancy and robustness of the system while the recovery time is related to the system resourcefulness that promotes the rapidity of responses (Bruneau & Reinhorn 2006). Scholz et al. (2012) distinguish between generalized resilience that deals with rare events whose probability is not known and specified resilience due to more predictable failures (e.g., pipe breaks in a water distribution system). Here, we consider generalized resilience.
Risk analysis, risk assessment, and risk management are often used in an interchangeable way to describe a variety of the techniques and processes involved in the overall management of risk (Nicolosi et al. 2009). Risk management implies a process of identifying, analyzing, assessing, and communicating risk and accepting, avoiding, or controlling the risk to an acceptable level at an acceptable cost (Rausand 2013) (Figure 1). Risk assessment is a product or process that collects information and assigns values to risks for the purpose of informing priorities, developing or comparing courses of action, and informing decision-making (World Health Organization 2004; United States Department of Homeland Security Risk Steering Committee 2008).
FMECA is easy to understand relative to other risk management techniques, and it allows decision-makers to identify the most vulnerable components in terms of failure likelihood and consequence without defining the failure probabilities of rare events. Furthermore, it is a useful tool for assessing failure modes, associated corrective actions, and improved designs (Dhillon 2006). However, FMECA is limited to single component failures and is not suitable for multi-failure systems, which are better assessed using fault tree analysis (Rausand & Hoyland 2004).
FMECA has been used in a wide range of applications. Bertolini et al. (2006) presented an application of FMECA to the production process in the farming and food industries to identify the critical points of the system and to propose improvements in the traceability system. Kim et al. (2009) modified the standard FMECA steps to be more applicable to railroad systems. FMECA has also been implemented in the aircraft equipment industry (Li & Xu 2012), health care (Scorsetti et al. 2010; Nguyen et al. 2013), LED manufacturing (Sawant & Christou 2012), diesel engines (Wei et al. 2012), information systems (Signor 2002), and power systems (Bevilacqua et al. 2000).
Many RWSS components are quite large and costly. Failure rates of these components tend to be very low and, often, sufficient data are not available to statistically estimate failure probabilities. Failure of these components, however, can result in substantial negative impacts. To develop a resilient RWSS, it is crucial to recognize the critical points of the RWSS and propose system redundancies to mitigate failure effects. FMECA is well suited for problems of this type. Previous studies demonstrated that FMECA is a feasible and effective risk management method for enhancing the operational system or equipment reliability in a water system. Sydney Water applied FMECA to asset management by evaluating the risk associated with each system asset using the Risk Priority Number (RPN) to determine asset maintenance or renewal requirements (Marlow et al. 2007). This study aims to express RWSS component criticality in qualitative terms within an FMECA framework to provide guidance on the RWSS design scale to enhance system resilience.
The main challenges in applying FMECA are: (1) developing appropriate system representations to assess the severity of component failure; and (2) working with local experts to identify possible failure modes and representative scales for failure occurrence and severity, which are used to calculate RPNs based on the deficit of water relative to the water demand. A unique element compared to many FMECA applications is the ability to store water throughout the network that can be used as backup sources during component failures.
METHODOLOGY
Regional water supply system
A typical RWSS consists of various sources (surface and ground water), infrastructures (water/wastewater treatment plants, pump stations, and surface water recharge facilities), and users (agricultural, industrial, and municipal). System elements are connected and interrelated in complex networks. RWSS components are designed to dependably deliver high quality water through pipelines, canals, and the distribution system to users. RWSS capacities are generally determined based on anticipated future population growth, water source availability, and climatic conditions (Chung et al. 2009).
Component failures within a RWSS can cause acute impacts on agricultural, industrial, and municipal users. According to Lukas et al. (2012), failure implies the condition of not fulfilling objectives. For this study, the general RWSS objective is to continuously supply water that satisfies the user's demands.
Failure definition and resilience
For the case shown in Figure 2, the volumetric severity of the failure event is 0.6, denoted by the shaded area. Under normal conditions, system functionality equals 1 because all users are supplied adequate water. When a component fails at time t_o, functionality is reduced, here to 0.4; in other words, only 40% of the demanded water is delivered to users. Functionality can remain at the lower value until the failure is fully repaired at time t_f, or it can increase gradually as remedial measures partially repair the system. When the system is fully repaired, functionality returns to 1.
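The relationship between the functionality curve and volumetric severity can be illustrated with a minimal sketch. It assumes that severity is the area between full functionality and the functionality curve, normalized by the failure duration; the function name `volumetric_severity` and the sample series are hypothetical and are not taken from the study.

```python
def volumetric_severity(functionality, dt=1.0):
    """Average loss of functionality over the failure window.

    functionality: functionality values in [0, 1], one per time step
    between failure onset (t_o) and full repair (t_f).
    dt: length of one time step (assumed uniform).
    """
    if not functionality:
        raise ValueError("need at least one time step")
    lost_area = sum((1.0 - f) * dt for f in functionality)  # shaded area
    duration = len(functionality) * dt
    return lost_area / duration


# Functionality stays at 0.4 until the repair is complete.
constant_case = volumetric_severity([0.4, 0.4, 0.4, 0.4])

# Hypothetical gradual recovery from 0.4 back to 1.0.
gradual_case = volumetric_severity([0.4, 0.6, 0.8, 1.0])
```

Under this assumption, the constant case reproduces the severity of 0.6 described above, while a gradual recovery yields a smaller value because part of the demand is restored before the repair is complete.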
Failure mode effects and criticality analysis (FMECA)
FMECA consists of two sequential processes: failure mode and effects analysis (FMEA) and criticality analysis (CA). FMEA identifies potential failure modes and their consequences. CA then prioritizes (ranks) the failure modes according to their failure risk criticality (Figure 3). CA can be conducted either quantitatively or qualitatively. Quantitative CA is employed if failure rate, failure mode, and failure effect probabilities are available to calculate criticalities using risk relationships (Equation (1)).
Larger SS and OS values indicate a more severe failure consequence and a higher failure likelihood, respectively. DT is defined as the likelihood that the failure mode will be observed; higher numerical values of DT indicate lower failure detection probabilities. In FMECA, failure modes with higher RPNs are assumed to result in more severe and critical damage than failure modes that yield lower RPNs (Johnson & Niezgoda 2004) and to require a rapid response.
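As a minimal sketch of the qualitative calculation, the conventional RPN is the product of the severity, occurrence, and detection scale values. The function name `risk_priority_number`, the 1 to 10 range check on every scale, and the sample values are assumptions for illustration. Setting the detection value to 1 reduces the expression to the product of severity and likelihood.

```python
def risk_priority_number(ss, os_, dt=1):
    """Return SS x OS x DT for a single failure mode.

    ss:  severity scale value
    os_: occurrence scale value (trailing underscore avoids the `os` module name)
    dt:  detection scale value; dt=1 leaves only severity x occurrence
    All three are assumed to be ordinal values on a 1-10 scale.
    """
    for name, value in (("ss", ss), ("os_", os_), ("dt", dt)):
        if not 1 <= value <= 10:
            raise ValueError(f"{name} must be between 1 and 10, got {value}")
    return ss * os_ * dt


# Illustrative values: severe consequence, rare occurrence, immediate detection.
example_rpn = risk_priority_number(ss=8, os_=2)

# The same failure mode with poor detectability ranks far higher.
hard_to_detect_rpn = risk_priority_number(ss=8, os_=2, dt=5)
```

Because the scales are ordinal, the resulting numbers are useful for ranking failure modes against each other rather than as absolute measures of risk.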
Computing damage/severity – flow allocation model
Symbol | Definition |
---|---|
N | Set of nodes |
A | Set of arcs |
T | Number of time steps |
S | Set of storage nodes (e.g., recharge facility) |
NS | Set of non-storage nodes (e.g., treatment facilities) |
i, j | Upstream and downstream node identifiers, respectively |
Ci,j | Unit cost for arc i, j |
q | Decision variable |
b | Net water supply (+) or demand (−) |
RT | Treatment/recharge capacity |
P | Pump station capacity |
μ | Arc loss factor |
t | Time period |
V | Aquifer storage |
Q | Pipeline capacity |
Each node i ∈ NS is associated with a quantity bi that represents the net water supply or demand rate. bi is positive, negative, and zero for a supply node, demand node, and transit node, respectively. Vi denotes the aquifer storage for node i ∈ S. Pi denotes the pumping capacity for node i ∈ N, and RTi represents the treatment capacity if i is a treatment plant or the recharge capacity if i ∈ S. For each arc (i, j) ∈ A, Qijt denotes the pipeline capacity and μij denotes the loss multiplier that accounts for water leakage. The decision variables qijt denote the flow from node i to node j at time t, and cijt is the unit cost of carrying that flow.
The objective function denotes the total operational cost over the entire planning horizon (Equation (6)). Equations (7) and (8) represent conservation of mass for non-storage (pump stations, treatment facilities, etc.) and storage (aquifers and surface reservoirs) nodes, respectively, where Vi,t is the volume of storage at time t and aquifer i. Equations (9) and (10) are the capacity constraints limiting the total flow that may enter and/or leave certain nodes. Lastly, Equation (11) gives the lower and upper bounds constraints for the flow variables.
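For orientation, one generic multi-period network flow formulation that matches this description is sketched below. It is an assumed form built from the notation table, not a reproduction of the study's Equations (6) to (11). In particular, the time index on b, the application of the loss multiplier μ to flow arriving at a node, and the assignment of P to outflow and RT to inflow are assumptions.

```latex
\begin{align}
\min_{q} \quad & \sum_{t=1}^{T} \sum_{(i,j) \in A} c_{ijt}\, q_{ijt}
  && \text{total operational cost} \\
\text{s.t.} \quad
& \sum_{j:(i,j) \in A} q_{ijt} - \sum_{k:(k,i) \in A} \mu_{ki}\, q_{kit} = b_{i,t}
  && \forall i \in NS,\ \forall t \\
& V_{i,t} = V_{i,t-1} + \sum_{k:(k,i) \in A} \mu_{ki}\, q_{kit} - \sum_{j:(i,j) \in A} q_{ijt}
  && \forall i \in S,\ \forall t \\
& \sum_{j:(i,j) \in A} q_{ijt} \le P_i
  && \forall i \in N,\ \forall t \\
& \sum_{k:(k,i) \in A} \mu_{ki}\, q_{kit} \le RT_i
  && \forall i \text{ with treatment or recharge},\ \forall t \\
& 0 \le q_{ijt} \le Q_{ijt}
  && \forall (i,j) \in A,\ \forall t
\end{align}
```

In this form, a component failure can be represented by setting the corresponding capacity (P, RT, or Q) to zero, or to a fraction of its normal value, for the failure period.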
APPLICATION
In collaboration with the city of Tucson and Pima County, a bottom-up, qualitative FMECA approach was applied to a portion of the Tucson RWSS known as the RESIN RWSS (Figure 4). Since all failures have an obvious impact, the modified RPN equation was applied. The network flow model consists of surface water supply, water/wastewater treatment plants, recharge facilities, reservoirs, well fields, and users (municipal, agricultural, and industrial). As described in more detail below, the study area is composed of 19 pressure zones separated by 34 meters in elevation (Figure 4) and denoted in alphabetical order starting with Zone C. The network flow model computes the optimal allocation of potable and non-potable water that minimizes the operational cost over a 41-year period at a monthly time step (from 2010 to 2050). The model is solved using MATLAB.
Multi-year population estimates were taken from the Water and Wastewater Infrastructure, Supply and Planning Study (WISP) (Take et al. 2009). WISP was initiated by the local governments and gives the more conservative or ‘high’ population estimate. Since the majority of the population increase is expected in the lowest seven pressure zones (Zones C to I), only these zones were modeled, with demands from Zone J and higher added to Zone I's water consumption. From available demographic studies, the RESIN area population will increase from 41,000 to as many as 760,000 people over the planning period. Demands are calculated using Tucson Water's per capita usage of 511 liters per capita per day (lpcd) for municipal and commercial areas. The 2011 monthly demand data for Tucson were obtained from Tucson Water and converted to percentages of total annual water use (Figure 5).
Monthly demands are assumed to follow the 2011 pattern: each month's share of the annual demand equals its share in the 2011 records (Figure 5). Growth and demand projections are based on an ongoing regional planning study of the Tucson Active Management Area.
Wastewater returns are assumed to be a percentage of the potable use from municipal and industrial demand nodes. Fifty-two percent of the total demand is assumed to be for potable uses with the remainder non-potable consumption. These percentages are based on 76 lpcd for commercial outdoor uses, 265 and 170 lpcd for household indoor use and outdoor consumption, respectively. Outdoor water use and irrigation are assumed to be consumptive uses. Potable water sources can supply both potable and non-potable user demands depending upon water availability and cost. Non-potable supplies are only acceptable for non-potable uses.
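The demand figures quoted above can be cross-checked with a short calculation. The sketch below assumes that household indoor use is the potable component and that a year has 365 days; the function name `annual_demand_billion_liters` is hypothetical.

```python
# Per capita components quoted in the text (liters per capita per day).
COMMERCIAL_OUTDOOR_LPCD = 76
HOUSEHOLD_INDOOR_LPCD = 265
HOUSEHOLD_OUTDOOR_LPCD = 170

total_lpcd = (COMMERCIAL_OUTDOOR_LPCD
              + HOUSEHOLD_INDOOR_LPCD
              + HOUSEHOLD_OUTDOOR_LPCD)

# Assumption: indoor household use is the potable share of total demand.
potable_share = HOUSEHOLD_INDOOR_LPCD / total_lpcd


def annual_demand_billion_liters(population, lpcd=total_lpcd, days=365):
    """Annual demand in billions of liters for a given population."""
    return population * lpcd * days / 1e9


start_demand = annual_demand_billion_liters(41_000)    # start of planning period
end_demand = annual_demand_billion_liters(760_000)     # high population estimate
```

The components sum to the 511 lpcd used for demand calculations, and the indoor fraction rounds to 52%. The two populations give roughly 7.6 and 142 billion liters per year, respectively.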
Tucson RWSS configuration
The Tucson RWSS is depicted as a series of nodes and arcs (Figure 6). Major water pipes are regarded here as arcs; water sources such as reservoirs, water/wastewater treatment plants, and recharge facilities are regarded as supply nodes; and potable and non-potable water users are demand nodes. In the RESIN RWSS planning area, surface water from the Central Arizona Project (CAP) canal is delivered and recharged in the Central Avra Valley Storage and Recovery Project Facility (CAVSARP), the Southern Avra Valley Storage and Recovery Project Facility (SAVSARP), and the Pima Mine Road Terminal Storage Facility (PMR). The water from CAVSARP is chlorinated in the Hayden-Udall Water Treatment Plant (HUWTP) and sent to the Clearwell Reservoir (CWR) for storage and daily flow balancing. Flow leaves the CWR through a pipeline/pressure reduction valve to the Zone C interconnection. SAVSARP wells can deliver water to HUWTP or the Plant 9 Water Treatment Plant (Plant 9). Chlorinated water from Plant 9 is delivered through Martin Reservoir (MR) to the interconnection point. PMR is intended to be a long-term storage facility. However, the high recharge basin infiltration rates and rapid lateral movement have resulted in a significant volume of recharged water impacting the Santa Cruz Well Field (SCWF). The SCWF extracted water is chlorinated at the well head before being sent to MR. The Zone C interconnection point can deliver water to Zone C or to the northern portion of Zone F (Zone FN) through the Kolb Rd booster station.
Wastewater generated in the RESIN RWSS area flows through the interceptor lines (shown as black dashed lines in Figure 6) and is delivered to the Design-Build-Operate Project Facility (DBO) for treatment; flow in excess of the DBO capacity is diverted to the Ina Road WWTP. Only flow from the DBO is available for reuse. DBO effluent is de-chlorinated and discharged into the Santa Cruz River, while a portion of the DBO effluent is conveyed to the Tucson Water Reclamation Facility (TWRF) for tertiary treatment and non-potable reuse or is recharged at the Sweetwater Recharge Facility (SRF). When needed, the SRF water is extracted and mixed with the tertiary treated effluent at the TWRF. The TWRF is the main distribution infrastructure for the reclaimed water supply in Tucson, providing water to the RESIN RWSS Zone C via the Mission booster and to Zone FN.
Potable water is delivered to pressure zones in a stair-stepping process to serve the potable and non-potable demands. Reservoirs/pump stations are located at the lower end of each pressure zone. Pumps draw water from the reservoir and lift it through a zone to the next upper zone's reservoir. Zones F, G, and H are split into two sub-zones by a system divide created by Interstate 10 (I-10) (Figure 7). The lower portion of Figure 7 continues the network shown in Figure 6. The upper portion of Figure 7 shows the northern portion with flow entering at the Zone FN entrance. Non-potable water is delivered to this zone directly from the TWRF. Zone HN is the terminal zone on the north side of I-10.
RESULTS
Scenarios
The following assumptions were made in FMECA failure scenario analyses:
RESIN RWSS operations will be determined for 41 years on a monthly time step to analyze the overall system performance (system costs) over the 41 years.
Given the low likelihood of a failure, only one major component (recharge facility, water/wastewater treatment plant, pump station, reservoir, or major pipeline) fails at a given time.
For demonstration purposes and to assess failure conditions that would be addressed in the short term (next 20 years), all failures are assumed to occur in year 20 of the simulation period (2030), in June, the month with the highest demand. The simulation is continued until 2050 to allow operations to return to a stable condition without affecting the condition at the end of the failure. Extending the simulation in this way is very conservative but relatively inexpensive given the short computation times of the flow allocation model.
To remain consistent with the flow allocation model time step, all failures are assumed to last 1 month. Some failures will likely last less than 1 month, and some short-term modifications may be possible to reduce volumetric severity.
Annual demands are assumed to increase almost linearly from 7.6 to 142.1 billion liters per year over the planning horizon.
User demand patterns are time invariant (Figure 5). That is, the proportion of the annual demand is the same for a given month during all years of the simulation period.
Failures are detected instantaneously relative to the failure duration (i.e., each failure mode's detectability scale value (DS) equals 1).
Thirty-three failure modes were identified (Table 2). For example, failure mode 1, Zone C interconnection point pump station, corresponds to a mechanical or electrical breakdown in the pump station that conveys water from the centralized system to Zone C. Note that pipe arcs are lumped representations of the transmission system rather than a single pipeline. Partial capacity losses can occur in these links, for example, a 50% reduction in the Zone D reservoir/pump station transmission capacity (failure mode 27).
Failure mode | Component failed |
---|---|
1 | Zone C interconnection point pump station |
2 | HUWTP |
3 | CWR |
4 | CAVSARP |
5 | SAVSARP |
6 | Kolb booster station |
7 | TWRF |
8 | Zone C non-potable pump station |
9 | Zone D non-potable pump station |
10 | Zone E non-potable pump station |
11 | Zone FS non-potable pump station |
12 | Zone FN non-potable pump station |
13 | Zone GS non-potable pump station |
14 | Zone GN non-potable pump station |
15 | SRF |
16 | DBO |
17 | MR |
18 | Zone C reservoir/pump station |
19 | Zone D reservoir/pump station |
20 | Zone E reservoir/pump station |
21 | Zone FS reservoir/pump station |
22 | Zone FN reservoir/pump station |
23 | Zone GS reservoir/pump station |
24 | Zone GN reservoir/pump station |
25 | Zone HS reservoir/pump station |
26 | Zone C reservoir/pump station flow reduction |
27 | Zone D reservoir/pump station flow reduction |
28 | Zone E reservoir/pump station flow reduction |
29 | Zone FS reservoir/pump station flow reduction |
30 | Zone GS reservoir/pump station flow reduction |
31 | Zone GN reservoir/pump station flow reduction |
32 | Zone HS reservoir/pump station flow reduction |
33 | Zone HN reservoir/pump station flow reduction |
To ensure that failures are not anticipated in the operation, the optimal allocations are determined by executing the network flow model without failures. Storage conditions at the end of May 2030 are used as the starting condition for each failure scenario that is evaluated through 2050. Failure severity is then computed as the volume of unmet demand relative to the total system demand after 2030.
To satisfy demands during failure conditions, dummy sources are connected to each demand node. Costs for flow on these arcs are substantially higher than for water delivery through the system so they are only used during shortages. Further, the cost assigned to supply a potable user from the dummy source is ten times greater than links from the dummy source to non-potable demand nodes to ensure that potable uses have higher priority than non-potable demands.
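The effect of the penalty costs can be seen in a deliberately small sketch: one constrained supply, one potable and one non-potable demand, and whole-unit flows enumerated by brute force. The function name `cheapest_allocation` and all numerical values are hypothetical. The study's model is a full network optimization in which potable sources can also serve non-potable demands.

```python
from itertools import product


def cheapest_allocation(supply, potable_demand, nonpotable_demand,
                        delivery_cost=1.0, dummy_np_cost=100.0):
    """Enumerate whole-unit allocations and return the cheapest one.

    Unmet demand is supplied by a dummy source at a penalty cost. The
    potable penalty is ten times the non-potable penalty, as in the text.
    Returns (cost, to_potable, to_nonpotable, short_potable, short_nonpotable).
    """
    dummy_p_cost = 10.0 * dummy_np_cost
    best = None
    for to_p, to_np in product(range(potable_demand + 1),
                               range(nonpotable_demand + 1)):
        if to_p + to_np > supply:
            continue  # infeasible: exceeds the available supply
        short_p = potable_demand - to_p
        short_np = nonpotable_demand - to_np
        cost = (delivery_cost * (to_p + to_np)
                + dummy_p_cost * short_p
                + dummy_np_cost * short_np)
        if best is None or cost < best[0]:
            best = (cost, to_p, to_np, short_p, short_np)
    return best


# Six units available against demands of five (potable) and four (non-potable).
result = cheapest_allocation(supply=6, potable_demand=5, nonpotable_demand=4)
```

In this example the cheapest solution meets the full potable demand and leaves the entire shortfall of three units with the non-potable user. The flows on the dummy arcs are the unmet demand that defines volumetric severity.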
Failure severity (SS) and occurrence scale (OS) values were assigned according to Figure 8 and Table 3 (Stamatis 1997), respectively. The relationship between the volumetric severity and severity scale value in Figure 8 was created from discussions with the Tucson Water officials and experts. These officials also identified the occurrence scale value for each of the failure events. During a failure condition, all flow may not be delivered to users. The flow allocation model computes this shortfall that is known as the volumetric severity. Using Figure 8, volumetric severity is converted to SS for the RPN calculation.
Effect | Scale value | Criteria |
---|---|---|
Almost never | 1 | Failure unlikely. History shows no failure |
Remote | 2 | Rare number of failures likely |
Very slight | 3 | Very few failures likely |
Slight | 4 | Few failures likely |
Low | 5 | Occasional number of failures likely |
Medium | 6 | Medium number of failures likely |
Moderate high | 7 | Moderately high number of failures likely |
High | 8 | High number of failures likely |
Very high | 9 | Very high number of failures likely |
Almost certain | 10 | Failure almost certain |
FMECA results
Table 4 summarizes the RWSS FMECA results for the 33 failure modes. The overall system operation cost under normal conditions is $249.38 million for the 41-year period, while the system costs under the failure modes range between $248.84 million and $249.62 million. The results show that failures can either lower or raise the system cost, depending on the volumetric severity. For example, failure mode 1 has a significant volumetric severity, representing the shortfall in water delivery. For this case, operation under the failure mode reduces the total operation cost by $0.541 million compared to the system with no failures. In other cases, the system cost increases due to failures, including when the volumetric severity is zero (e.g., failure modes 9 to 16 and 30), because water is delivered through more expensive paths.
Failure mode | Volumetric severity (P) | Volumetric severity (NP) | SS (P) | SS (NP) | OS | RPN (P) | RPN (NP) | Cost difference (million $) |
---|---|---|---|---|---|---|---|---|
1 | 0.82 | 0.41 | 8 | 3 | 2 | 16 | 6 | −0.541 |
2 | 0.47 | 0.54 | 4 | 5 | 2 | 8 | 10 | −0.434 |
3 | 0.47 | 0.54 | 4 | 5 | 2 | 8 | 10 | −0.203 |
4 | 0.13 | 0.54 | 2 | 5 | 3 | 6 | 15 | −0.041 |
5 | 0.06 | 0.54 | 1 | 5 | 3 | 3 | 15 | −0.040 |
6 | 0.15 | 0 | 2 | 1 | 2 | 4 | 2 | 0.150 |
7 | 0 | 0.35 | 1 | 3 | 2 | 2 | 6 | 0.062 |
8 | 0 | 0.14 | 1 | 2 | 2 | 2 | 4 | 0.153 |
9 | 0 | 0 | 1 | 1 | 2 | 2 | 2 | 0.216 |
10 | 0 | 0 | 1 | 1 | 2 | 2 | 2 | 0.235 |
11 | 0 | 0 | 1 | 1 | 2 | 2 | 2 | 0.235 |
12 | 0 | 0 | 1 | 1 | 2 | 2 | 2 | 0.236 |
13 | 0 | 0 | 1 | 1 | 2 | 2 | 2 | 0.235 |
14 | 0 | 0 | 1 | 1 | 2 | 2 | 2 | 0.236 |
15 | 0 | 0 | 1 | 1 | 2 | 2 | 2 | 0.239 |
16 | 0 | 0 | 1 | 1 | 2 | 2 | 2 | 0.235 |
17 | 0 | 0.46 | 1 | 4 | 2 | 2 | 8 | 0.033 |
18 | 0.78 | 0.33 | 7 | 3 | 2 | 14 | 6 | −0.271 |
19 | 0.56 | 0.12 | 5 | 2 | 2 | 10 | 4 | −0.136 |
20 | 0.28 | 0 | 2 | 1 | 2 | 4 | 2 | 0.019 |
21 | 0.22 | 0 | 2 | 1 | 2 | 4 | 2 | 0.058 |
22 | 0.09 | 0 | 1 | 1 | 2 | 2 | 2 | 0.173 |
23 | 0.17 | 0 | 2 | 1 | 2 | 4 | 2 | 0.101 |
24 | 0.04 | 0 | 1 | 1 | 2 | 2 | 2 | 0.206 |
25 | 0.13 | 0 | 2 | 1 | 2 | 4 | 2 | 0.133 |
26 | 0.12 | 0.33 | 2 | 3 | 4 | 8 | 12 | −0.273 |
27 | 0.16 | 0.12 | 2 | 2 | 4 | 8 | 8 | −0.184 |
28 | 0.10 | 0 | 2 | 1 | 4 | 8 | 4 | −0.073 |
29 | 0.09 | 0 | 1 | 1 | 4 | 4 | 4 | −0.063 |
30 | 0 | 0 | 1 | 1 | 4 | 4 | 4 | 0.232 |
31 | 0.07 | 0 | 1 | 1 | 4 | 4 | 4 | −0.411 |
32 | 0.02 | 0 | 1 | 1 | 4 | 4 | 4 | −0.008 |
33 | 0.06 | 0 | 1 | 1 | 4 | 4 | 4 | −0.407 |
Note: P and NP denote potable and non-potable users, respectively; SS and OS are severity scale value and occurrence scale value, respectively; cost difference is overall system cost under failure mode minus overall cost under normal condition.
The only path for delivering water to the lower RESIN RWSS area is through the Zone C interconnection point pump station. As a result, this pump station (failure mode 1) is the most critical system element for the potable water supply and has the highest RPN of 16. For non-potable users, CAVSARP (failure mode 4) and SAVSARP (failure mode 5) failures result in the most severe impacts and the largest RPNs.
Potable users must be supplied with potable water, whereas non-potable users have supply redundancy because they can be served by either potable or non-potable water. As a result, potable users' RPNs tend to be higher. Failures of non-potable water infrastructure, such as non-potable pump stations, TWRF, SRF, and DBO, have a lower impact on non-potable users than failures of potable supply components have on potable users.
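The ranking step can be illustrated with a minimal Python sketch. It uses severity scale (SS) and occurrence scale (OS) values copied from the table above for a handful of failure modes; the function names `rpn` and `rank_by_rpn` are hypothetical helpers introduced here for illustration and are not part of the original study.

```python
# Ordinal scale values copied from the table above for selected failure modes:
# (failure mode, SS for potable users, SS for non-potable users, OS)
ROWS = [
    (1, 8, 3, 2),
    (2, 4, 5, 2),
    (4, 2, 5, 3),
    (5, 1, 5, 3),
    (18, 7, 3, 2),
    (19, 5, 2, 2),
    (26, 2, 3, 4),
]


def rpn(severity_scale, occurrence_scale):
    """RPN as the product of the severity and occurrence scale values."""
    return severity_scale * occurrence_scale


def rank_by_rpn(rows, user="P"):
    """Return (failure mode, RPN) pairs sorted from most to least critical."""
    col = 1 if user == "P" else 2  # column holding the SS for this user type
    scored = [(row[0], rpn(row[col], row[3])) for row in rows]
    return sorted(scored, key=lambda item: item[1], reverse=True)


potable_ranking = rank_by_rpn(ROWS, "P")
non_potable_ranking = rank_by_rpn(ROWS, "NP")
print(potable_ranking[0])       # most critical failure mode for potable users
print(non_potable_ranking[:2])  # two most critical for non-potable users
```

With these rows, failure mode 1 ranks first for potable users and failure modes 4 and 5 tie for first among non-potable users, consistent with the discussion above.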
RWSS MODIFICATION
The RPN values can help managers target infrastructure investments that reduce volumetric severity or failure likelihood. To reduce RPN values, two alternative modifications were considered: (1) a decentralized wastewater treatment and reuse system; and (2) a backup potable water supply source from the CWF. FMECAs were completed for the modified systems considering failure modes 1, 2, 3, 18, and 19 in Tables 2 and 4, which had the highest potable users' RPNs in the initial FMECA.
Decentralized wastewater treatment and reuse system
Centralized facilities are the primary wastewater and reuse system structure in the Tucson RWSS. This configuration benefits from the economies of scale of constructing a single WWTP facility. However, as water scarcity becomes an increasing concern, centralized systems may not be the most appropriate structure from a sustainable water resources management perspective (Gikas & Tchobanoglous 2009). Decentralized wastewater treatment and reuse systems are more expensive to construct than centralized systems, but they can have operational benefits. For example, Woods et al. (2012) examined the economic and environmental benefits of decentralized wastewater treatment and reuse. They determined that a decentralized configuration was less costly from a total cost perspective and produced fewer greenhouse gas emissions than a centralized system.
Here, we examine the impact of operating decentralized wastewater treatment and reuse systems on RWSS resilience. To that end, decentralized wastewater treatment and reuse systems were added to the RESIN RWSS. In this configuration, wastewater is treated at satellite WWTPs (SP), as shown in Figure 9 for Zone C. Reclaimed water can be delivered directly to non-potable users or sent to recharge and recovery facilities for eventual indirect potable reuse (IPR) or non-potable use. These waters are introduced into the distribution system in the SP's pressure zone and can be pumped to higher zones; we assume that the system configuration does not permit flow to lower zones.
To examine the benefits of decentralization, 35 failure scenarios (Figure 10) were created based on six decentralized infrastructure systems (SP/IPR in Zones C, D, E, FS, GS, and HS). FMECA results are compared for potable and non-potable users served from those systems and the centralized RESIN RWSS for the five most critical components for potable users from the centralized system noted above.
FMECA results (Table 5) show that decentralized facilities strengthen the RESIN RWSS resilience. Operating an SP/IPR in the lowest pressure zone, Zone C, results in the largest RPN reduction when components in the centralized system fail (scenarios A1, A22, and A29). For example, the potable supply RPN for the centralized system was 16 during a Zone C interconnection pump station failure (scenario A1), whereas a decentralized SP/IPR system in Zone C or Zone D (scenarios A2 and A3) can meet all or nearly all of the desired water demand, with resulting RPNs of 2 and 4, respectively. Note that if no deficit occurs, the severity scale value is set to 1 so the RPN remains positive. When a satellite plant is located in Zone E (scenario A4), the RPN is reduced from 16 to 4 since only demands in Zone E and above are met due to the assumed system configuration.
Scenario | Volumetric severity (P) | Volumetric severity (NP) | SS (P) | SS (NP) | OS | RPN (P) | RPN (NP) | System cost (M$) |
---|---|---|---|---|---|---|---|---|
A1 | 0.82 | 0.41 | 8 | 3 | 2 | 16 | 6 | −0.541 |
A2 | 0 | 0 | 1 | 1 | 2 | 2 | 2 | −8.624 |
A3 | 0.07 | 0 | 2 | 1 | 2 | 4 | 2 | −8.713 |
A4 | 0.28 | 0 | 2 | 1 | 2 | 4 | 2 | −7.495 |
A5 | 0.56 | 0.11 | 5 | 2 | 2 | 10 | 4 | −8.603 |
A6 | 0.62 | 0.18 | 6 | 2 | 2 | 12 | 4 | −11.004 |
A7 | 0.68 | 0.26 | 6 | 2 | 2 | 12 | 4 | −8.969 |
A8 | 0.78 | 0.33 | 7 | 3 | 2 | 14 | 6 | −0.271 |
A9 | 0.78 | 0.29 | 7 | 2 | 2 | 14 | 4 | −9.489 |
A10 | 0 | 0 | 1 | 1 | 2 | 2 | 2 | −8.664 |
A11 | 0.22 | 0 | 2 | 1 | 2 | 4 | 2 | −7.457 |
A12 | 0.50 | 0.05 | 4 | 1 | 2 | 8 | 2 | −8.539 |
A13 | 0.56 | 0.11 | 5 | 2 | 2 | 10 | 4 | −10.957 |
A14 | 0.62 | 0.22 | 6 | 2 | 2 | 12 | 4 | −8.889 |
A15 | 0.56 | 0.12 | 5 | 2 | 2 | 10 | 4 | −0.136 |
A16 | 0.56 | 0 | 5 | 1 | 2 | 10 | 2 | −9.117 |
A17 | 0.56 | 1 | 5 | 2 | 2 | 10 | 4 | −9.584 |
A18 | 0 | 0 | 1 | 1 | 2 | 2 | 2 | −7.298 |
A19 | 0.28 | 0 | 2 | 1 | 2 | 4 | 2 | −8.376 |
A20 | 0.34 | 0 | 3 | 1 | 2 | 6 | 2 | −10.731 |
A21 | 0.4 | 0.02 | 3 | 1 | 2 | 6 | 2 | −8.649 |
A22 | 0.47 | 0.54 | 4 | 5 | 2 | 8 | 10 | −0.434 |
A23 | 0 | 0 | 1 | 1 | 2 | 2 | 2 | −8.626 |
A24 | 0 | 0 | 1 | 1 | 2 | 2 | 2 | −8.663 |
A25 | 0 | 0 | 1 | 1 | 2 | 2 | 2 | −7.278 |
A26 | 0.19 | 0.26 | 2 | 2 | 2 | 4 | 4 | −8.310 |
A27 | 0.25 | 0.32 | 2 | 3 | 2 | 4 | 6 | −10.829 |
A28 | 0.38 | 0.57 | 3 | 5 | 2 | 6 | 10 | −8.881 |
A29 | 0.47 | 0.54 | 4 | 5 | 2 | 8 | 10 | −0.203 |
A30 | 0 | 0 | 1 | 1 | 2 | 2 | 2 | −8.626 |
A31 | 0 | 0 | 1 | 1 | 2 | 2 | 2 | −8.663 |
A32 | 0 | 0 | 1 | 1 | 2 | 2 | 2 | −7.278 |
A33 | 0.19 | 0.26 | 2 | 2 | 2 | 4 | 4 | −8.310 |
A34 | 0.25 | 0.32 | 2 | 3 | 2 | 4 | 6 | −10.829 |
A35 | 0.38 | 0.57 | 3 | 5 | 2 | 6 | 10 | −8.881 |
Note: P and NP mean potable and non-potable users, respectively; SS and OS are severity scale value and occurrence scale value, respectively; cost difference is overall system cost under failure mode minus overall cost under normal condition.
In the RESIN RWSS, decentralized treatment for IPR provides another potable water source that is not isolated during failure events and can deliver water to most users. Similar trends are observed for the non-potable users' RPNs (Table 5). Constructing an SP/IPR facility in Zones E, FS, GS, and HS (scenarios A4, A5, A6, and A7, respectively) reduced the RPNs compared to the centralized system but did not completely satisfy demands under all failure modes. Similar patterns are observed for scenarios A22 to A28 and A29 to A35.
However, if a component fails within a pressure zone, Zone C is not the ideal location for an SP/IPR from a resilience perspective. For example, the potable supply RPN was 14 for both the centralized treatment system and an SP/IPR in Zone C under a Zone C reservoir/pump station failure (scenario A8). In contrast, the RPN is 2 when an SP/IPR is in Zone D (scenario A10). Similarly, the RPNs for the centralized system and for SP/IPR systems in Zone C or Zone D were 10 under a Zone D reservoir/pump station failure (scenario A15), while a decentralized SP/IPR system in Zone E (scenario A18) can satisfy the desired water demand during that failure, giving an RPN of 2.
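The location comparison can be sketched in a few lines of Python. The potable-user RPNs below are copied from Table 5 for three failure events. The ordering of configurations within each group of seven scenarios (centralized baseline, then an SP/IPR in Zones C, D, E, FS, GS, and HS) is an assumption inferred from the scenario references in the text, and `best_location` is a hypothetical helper, not part of the original analysis.

```python
# ASSUMPTION: within each group of seven scenarios, the order is the centralized
# baseline followed by an SP/IPR in Zones C, D, E, FS, GS, and HS.
CONFIGS = ["centralized", "C", "D", "E", "FS", "GS", "HS"]

# Potable-user RPNs copied from Table 5, keyed by the failed component.
POTABLE_RPN = {
    "Zone C interconnection pump station (A1-A7)": [16, 2, 4, 4, 10, 12, 12],
    "Zone C reservoir/pump station (A8-A14)": [14, 14, 2, 4, 8, 10, 12],
    "Zone D reservoir/pump station (A15-A21)": [10, 10, 10, 2, 4, 6, 6],
}


def best_location(rpns):
    """Return the SP/IPR zone with the lowest RPN and the reduction vs. baseline."""
    baseline = rpns[0]
    # Skip index 0 (the centralized baseline); ties go to the first zone listed.
    best_index = min(range(1, len(rpns)), key=lambda i: rpns[i])
    return CONFIGS[best_index], baseline - rpns[best_index]


results = {failure: best_location(rpns) for failure, rpns in POTABLE_RPN.items()}
for failure, (zone, reduction) in results.items():
    print(f"{failure}: best SP/IPR zone {zone}, RPN reduction {reduction}")
```

Under that assumed ordering, the preferred satellite location moves to a higher zone as the failed component moves up the system (Zone C, then D, then E), which matches the pattern described above.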
Alternative backup water source – maintaining the central well field
For many years, Tucson's primary water resource was the aquifer directly below the city. Over-pumping of the so-called CWF resulted in significant aquifer overdraft and subsidence issues. The introduction of CAP water from the Colorado River has allowed the aquifer to recover by largely eliminating routine pumping. However, Tucson Water sees value in maintaining and periodically exercising the 51 CWF wells/pumps as a backup source to meet critical demand conditions. Testing the Tucson RWSS robustness and redundancy components proved necessary in two events, when the pipeline from the CWR (1) ruptured or (2) required maintenance.
To examine the benefit of such a contingency source, we modified the Tucson RWSS configuration to introduce CWF supply to the RWSS RESIN area when a failure occurs. Costs to lift CWF water from the aquifer to the well head ($32.5/ML) were included in the model to reflect the CWF water delivery costs. The CWF can deliver water to either the Zone C interconnection point or Zone FN (Figure 11). Ten failure scenarios (Figure 12) were evaluated assuming that the CWF was available with results listed in Table 6.
Scenario | Volumetric severity (P) | Volumetric severity (NP) | SS (P) | SS (NP) | OS | RPN (P) | RPN (NP) |
---|---|---|---|---|---|---|---|
B1 | 0.82 | 0.41 | 8 | 3 | 2 | 16 | 6 |
B2 | 0.82 | 0.41 | 8 | 3 | 2 | 16 | 6 |
B3 | 0.78 | 0.33 | 7 | 3 | 2 | 14 | 6 |
B4 | 0.78 | 0.33 | 7 | 3 | 2 | 14 | 6 |
B5 | 0.56 | 0.12 | 5 | 2 | 2 | 10 | 4 |
B6 | 0.56 | 0.12 | 5 | 2 | 2 | 10 | 4 |
B7 | 0.47 | 0.54 | 4 | 5 | 2 | 8 | 10 |
B8 | 0 | 0 | 1 | 1 | 2 | 2 | 2 |
B9 | 0.47 | 0.54 | 4 | 5 | 2 | 8 | 10 |
B10 | 0 | 0 | 1 | 1 | 2 | 2 | 2 |
Note: P and NP mean potable and non-potable users, respectively; SS and OS are severity scale value and occurrence scale value, respectively.
Demonstrating the value of investing in the backup system, all RWSS RESIN area demands are satisfied if the CWF is available during HUWTP and CWR failures. However, the CWF provides no benefit during failures of the Zone C interconnection pump station (scenarios B1 and B2), the Zone C reservoir/pump station (scenarios B3 and B4), or the Zone D reservoir/pump station (scenarios B5 and B6). To alleviate this problem, redundant transmission lines would be necessary to transport CWF water directly to higher pressure zones. With this additional infrastructure, if the Zone C reservoir/pump station fails, the CWF can provide the desired flows to higher pressure zones to meet demand.
CONCLUSIONS
FMECA is demonstrated as a useful structured method for conducting risk management during RWSS planning. The use of RPN avoids explicit definition of the likelihoods of rare events that is necessary for standard risk analyses. FMECA is well suited to RWSS assessments since component failures are unlikely to occur simultaneously. Further, it introduces decision-maker judgment with respect to the likelihood and consequences of failures. As demonstrated here, FMECA provides feedback to efficiently understand complex RWSS by comparing alternative designs.
In the Tucson RWSS application, 33 catastrophic and critical failure modes and an occurrence scale were identified. To determine failure consequences, an optimal network flow model was formulated to determine flow allocations that minimize the operational cost. The model was solved for each failure mode and the demand shortfall was converted to the severity scale values using a jointly developed relationship. Analogous to a risk calculation, the RPN is calculated as the product of severity and the occurrence likelihood. RPNs were ranked to identify the most critical RWSS elements for potable and non-potable users.
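The consequence evaluation can be illustrated with a small, self-contained Python sketch. It is not the study's model: the study uses a cost-minimizing network flow formulation, whereas this sketch swaps in a plain maximum-flow computation (Edmonds-Karp) to estimate how much demand can still be delivered when one component is removed. The toy network, its node names, and its capacities are hypothetical, and volumetric severity is assumed here to be the undelivered fraction of demand.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow on a dict-of-dicts capacity graph."""
    # Build residual capacities, adding zero-capacity reverse arcs.
    residual = {source: {}, sink: {}}
    for u, arcs in capacity.items():
        for v, cap in arcs.items():
            residual.setdefault(u, {})
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0) + cap
            residual[v].setdefault(u, 0)
    total = 0
    while True:
        # Breadth-first search for an augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        # Find the bottleneck along the path, then push that much flow.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


def without(capacity, failed):
    """Copy of the network with the failed node and all its arcs removed."""
    return {u: {v: c for v, c in arcs.items() if v != failed}
            for u, arcs in capacity.items() if u != failed}


def volumetric_severity(capacity, demand, failed=None):
    """Fraction of demand not delivered when `failed` is unavailable."""
    network = without(capacity, failed) if failed else capacity
    delivered = max_flow(network, "source", "demand")
    return (demand - delivered) / demand


# Hypothetical toy network; the arc into "demand" caps delivery at the demand.
NETWORK = {
    "source": {"treatment_plant": 100, "backup_wells": 30},
    "treatment_plant": {"pump_station": 100},
    "pump_station": {"users": 100},
    "backup_wells": {"users": 30},
    "users": {"demand": 100},
}

for component in (None, "pump_station", "backup_wells"):
    print(component, volumetric_severity(NETWORK, 100, failed=component))
```

In this toy case, losing the pump station leaves only the backup wells, so most of the demand goes unmet, while losing the backup wells causes no deficit. A severity fraction computed this way would then be converted to an ordinal scale value before the RPN is calculated.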
For the potable supply system, the pump station at the Zone C interconnection point was the most critical component since it conveys the majority of flow to the upper pressure zones. For delivery of non-potable water, the two recharge facilities were the most critical components. To improve the Tucson RWSS resilience, two modifications were examined. First, locating a decentralized wastewater treatment and reuse system within the RESIN RWSS area reduced the RPN for all cases and often eliminated all deficits. Second, serving as a backup potable water supply, the CWF was completely effective in satisfying demands during failure of centralized system components such as water/WWTPs, recharge facilities, and a well field. However, the CWF did not maintain the Tucson RWSS functionality when failures occurred within the RESIN RWSS area since the CWF conveys water only through the Zone C interconnection point and Zone FN. To overcome this weakness, additional transmission lines to the RESIN RWSS area could be connected to the Zone D or Zone E potable water reservoirs.
The resilience benefit of maintaining the CWF is clear, but FMECA does not consider its costs. Future research should embed FMECA in a decision-making framework that exposes tradeoffs between cost and resilience. To identify multiple failures and quantify the results, the proposed FMECA should be used in conjunction with fault-tree analysis. Furthermore, in this demonstration, the failure timing, demand (as a function of population growth and water use), and supplies are assumed to be certain. Monte Carlo analysis that treats those parameters as uncertain is seen as a useful next research step.
ACKNOWLEDGEMENTS
This material is based in part upon work supported by the National Science Foundation (NSF) under grant no. 0835930. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the NSF.