Interactive decision support methodology for near real-time response to failure events in a water distribution network

The present study proposes a new interactive methodology and an interactive tool for the response to water network failure events, facilitating near real-time decision-making. The proposed methodology provides (i) a structured yet flexible approach supporting and guiding the operator throughout the entire response process to water network failure events, while allowing the operator to have the final say; (ii) a novel interaction with the operator in near real time via the proposed tool (e.g. allowing operators to propose different 'what-if' scenarios without being hydraulic experts); (iii) the provision of automatically generated advice (e.g. optimal response solutions and assessed end-impacts), although optimal response solutions are not yet identified in near real time; and (iv) improved impact assessment using realistic impact indicators that cover different aspects of the event and are consistently calculated for every proposed response solution (to facilitate easy comparison between different response solutions). The new methodology was applied to a semi-real case study. The results demonstrated the potential of the new response methodology, applied through the interactive tool, to improve water utilities' current practice. This was accomplished by supporting/guiding operators in the identification of effective response solutions with low end-impact on the consumers and low cost for the utility.


INTRODUCTION
The water industry in the UK and worldwide faces considerable challenges in making effective use of sensor and other data that are collected in water distribution systems (WDSs) in near real time (typically every 15-30 mins). These data are still not used much in a water utility's control room, especially when it comes to identifying a suitable strategy to respond to failure events in near real time, i.e. events such as major pipe bursts, equipment failure or water treatment works (WTW) shutdowns. Relevant academic work has not adequately addressed this challenge, mainly due to the focus on specific stages (i.e. isolation, impact assessment or intervention) rather than the overall response process. Furthermore, for an effective near real-time response, there is still a need to develop: (1) improved impact assessment methods that are based on realistic metrics used in the water industry and that are also used
In this study, a novel response methodology that aims to fulfil the above needs is proposed. The new response methodology is implemented via an interactive decision-support tool entitled the Interactive Response Planning Tool (IRPT). The IRPT is used to guide/support operators in identifying an effective response solution in near real time (i.e. usually required up to 1 h after the event detection/localisation). The main aim of this study is to show the potential of the IRPT to improve utilities' current practice by supporting/guiding operators in the identification of low end-impact (i.e. the total impact after the implementation of the response solution) and low-cost response solutions.
This paper is organised as follows. Firstly, background information relevant for the present study is presented. Subsequently, the new response methodology (including its concept, details of the indicators used in the impact assessment and information regarding the optimisation of the response interventions) is described. Later, results from a semi-real case study are presented and discussed. Finally, conclusions from the application of the new methodology and of the interactive tool are drawn.

BACKGROUND
An 'event' is denoted here as any failure that has a negative impact on a WDS's performance in terms of the water utility's temporary inability to deliver a regular service. An efficient event management process in WDSs can be divided into three principal stages: event detection, event
rezoning, water injection) and that also enables identifying the best time for their implementation in the field in order to restore supply.

Current practice response methodology
Different utilities deal with events in different ways and use more or less structured approaches. This section briefly describes a response methodology mainly based on ad hoc response interventions that can be considered typical for the UK water sector. In this methodology, the response interventions are largely based on the experience and expert judgement of control room operators, despite various systems being used by the operators to support their decisions.
The detection of an event in a water utility is nowadays usually done in two possible ways: (a) through customer calls (i.e. reporting no water/low pressure/discolouration, etc.) and/or (b) through an automated detection system (i.e. alarms generated based on flow and/or pressure data).
Once the detected event is confirmed and approximately localised (e.g. roughly based on customer calls' addresses and/or using other semi-automated means), the utility typically mobilises some available water trucks, called Alternative Supply Vehicles (ASVs). This is done as an immediate restoration measure after an initial impact assessment, usually carried out manually and/or with limited hydraulic model support. Here, an assessment involving the calculation of the water volume required to be supplied per hour (and hence the number of ASVs required per hour), based on the normal water demand of the affected district metered areas (DMAs), may also be carried out. At the same time, in the control room, after further manual (e.g. by checking service reservoirs' levels using online systems) and/or hydraulic model-supported initial impact assessment, operators request isolation of the event. Isolation is then carried out either as soon as possible (e.g. if the service reservoirs' levels are quickly dropping or there is significant third-party damage) or later in the day, depending on the severity/time of the event and other factors. There are also occasions where the repair can be conducted without isolating the failure (i.e. under pressure). If isolation is required, the isolation valves are usually identified manually as the closest operable valves to the event. With some ASVs already on site (or not), the control room operators then attempt to identify the most suitable response solution (e.g. how many more ASVs should be sent to the site, a suitable rezoning plan, overland bypasses, etc.) to be implemented while the repair is being carried out. Online map systems, offline connectivity maps, calculation sheets and hydraulic models can be used by the operators for this purpose.
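The hourly ASV estimate mentioned above can be illustrated with a short sketch. It assumes that the required volume is simply the sum of the normal hourly demands of the affected DMAs and that every ASV delivers one full load per hour; the function name, the DMA demands and the ASV capacity are hypothetical and are not taken from any utility's procedure.

```python
import math


def asvs_required_per_hour(dma_demands_m3_per_h, asv_capacity_m3):
    """Return (hourly volume to supply in m3, number of ASV loads per hour).

    Assumption: each ASV delivers one full load of asv_capacity_m3 per hour.
    """
    if asv_capacity_m3 <= 0:
        raise ValueError("ASV capacity must be positive")
    total_volume = sum(dma_demands_m3_per_h.values())
    return total_volume, math.ceil(total_volume / asv_capacity_m3)


# Hypothetical normal demands (m3/h) of two affected DMAs.
demands = {"DMA A": 42.0, "DMA B": 31.5}
volume, n_asvs = asvs_required_per_hour(demands, asv_capacity_m3=30.0)
print(f"{volume} m3/h to supply, {n_asvs} ASV loads per hour")
```

Rounding up reflects that a partially needed vehicle still has to be mobilised.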
Bearing in mind the above, it is worth stressing that although the use of hydraulic models for some of the aforementioned activities can be considered common practice, hydraulic analysis is not always carried out thoroughly. This is due to limitations in the time that can be dedicated to this activity, the skills required to run hydraulic simulations, the ability to test only a few scenarios and the difficulty of consistently assessing their end-impact.

The concept
The new response methodology proposed in this paper consists of the following main steps: Step (1) initial impact assessment, Step (2) identification of the isolation plan, Step (3) manual identification of a response solution proposed by an operator, Step (4) automatic identification of a response solution generated using optimisation and Step (5) identification of the response solution to be implemented in the field. Note that these five steps do not necessarily need to be carried out in the sequence presented here.
The implementation of the new methodology within the IRPT is conducted through the following three-stage routine in each step: Stage (1) involves obtaining the operators' inputs, Stage (2) involves carrying out hydraulic simulations to assess the end-impact/cost for each solution and Stage (3) involves visualising the calculated end-impact of each solution. The new response methodology's steps are described in more detail in the remainder of this section and are also shown as a flowchart in Figure 1.
Step 1. Following the confirmation (i.e. detection and localisation) of an event, an initial impact assessment is performed assuming the 'do nothing' scenario. At this point, the operators are asked if isolation needs to take place or if it can be carried out.
a. If yes, the operators create an event isolation plan in Step 2.
b. If no, they move to Step 3 to propose a manual solution.
Step 2. For the identification of the isolation plan, the best (i.e. closest to the event) set of valves is automatically provided to the operators by the IRPT, and the operators are then asked if they are satisfied with this set (e.g. if the identified valves can be localised and are operable). If they are not satisfied, they ask the IRPT to automatically provide the next best set of isolation valves. As soon as the best set of isolation valves is selected, the operators input into the IRPT the isolation duration and different potential isolation start times. Then, the IRPT automatically calculates the end-impact of the different isolation start times, and these are presented to the operators. In view of the calculated end-impacts, the operators can then select a desired start time of isolation. Once the isolation plan is finalised, the operators are asked if they consider the resulting end-impact low.
a. If yes, they proceed with the implementation of the 'isolation only' final solution without applying any further intervention. The operators then move to Step 6.
b. If no (or isolation is not possible), they then identify a more comprehensive response solution as follows. The operators are asked if they want the IRPT to automatically generate an optimal solution.
i. If yes, they move to Step 4.
ii. If no, they proceed by proposing a manual solution in Step 3.
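One way to picture the search for the closest set of isolation valves in Step 2 is a breadth-first search that spreads outwards from the failed pipe and stops at every operable valve it meets. The sketch below is an illustration only and is not the IRPT's algorithm; the link table, the `inoperable` argument (used here to mimic asking for the next best set) and all identifiers are assumptions.

```python
from collections import deque


def isolation_valves(links, failed_pipe, inoperable=frozenset()):
    """Return (valves to close, links inside the isolated segment).

    links maps link_id -> (node_a, node_b, kind), kind being 'pipe' or 'valve'.
    Valves listed in `inoperable` are treated as open pipes.
    """
    adjacency = {}
    for link_id, (a, b, kind) in links.items():
        adjacency.setdefault(a, []).append((link_id, b, kind))
        adjacency.setdefault(b, []).append((link_id, a, kind))

    node_a, node_b, _ = links[failed_pipe]
    queue = deque([node_a, node_b])
    seen = {node_a, node_b}
    valves, segment = set(), {failed_pipe}
    while queue:
        node = queue.popleft()
        for link_id, other, kind in adjacency[node]:
            if kind == "valve" and link_id not in inoperable:
                valves.add(link_id)  # segment boundary: close it, do not cross
                continue
            segment.add(link_id)
            if other not in seen:
                seen.add(other)
                queue.append(other)
    return valves, segment


# Hypothetical network: A -P1- B -V1- C -P2- D -V2- E -P3- F
network = {
    "P1": ("A", "B", "pipe"),
    "V1": ("B", "C", "valve"),
    "P2": ("C", "D", "pipe"),
    "V2": ("D", "E", "valve"),
    "P3": ("E", "F", "pipe"),
}
best = isolation_valves(network, "P2")
next_best = isolation_valves(network, "P2", inoperable={"V2"})
print(best, next_best)
```

In this toy network, declaring V2 inoperable enlarges the isolated segment to include the dead-end pipe P3, which is the kind of trade-off an operator would weigh when rejecting the first proposed set.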
Step 3. In this step, the operators are able to propose a manual solution by interacting with the IRPT. Here, the IRPT firstly enables the operators to input their desired and available (e.g. accessible/operable rezoning valves) intervention(s) and the start time of this (these) intervention(s). It then provides decision support to the operators by assessing and visualising the end-impact/cost of the proposed manual solution. Then the operators are asked if they are satisfied with the end-impact/cost of their proposed manual solution.
Step 5. In this step, the operators are first asked if they want to go back all the way to Step 1, with the modified system state used as a starting point. This is done to account for the fact that the situation may have changed in the meantime. If not, they can further modify a solution from Step 3 or Step 4, or propose a new solution. Here, the IRPT enables operators to compare all of the identified solutions consistently (i.e. with consistent impact metrics) and with the support of effective visualisations (e.g. multiple maps in a single window) of the end-impacts and costs. All this enables the operators to select the final solution they wish to implement.
Step 6. Once the system operation is back to normal, the operators identify the lessons learned.
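The branching between the steps can be summarised as a small transition function. This is only a reading of the yes/no questions put to the operators in the text, not code from the IRPT; the function name and the keyword arguments are hypothetical, and moving from Step 5 to Step 6 once a final solution has been selected is an assumption.

```python
def next_step(step, **answers):
    """Return the number of the step that follows `step` given the operator's answers."""
    if step == 1:
        # Isolation needed and possible -> plan it; otherwise go manual.
        return 2 if answers["isolate"] else 3
    if step == 2:
        if answers["impact_low"]:
            return 6  # 'isolation only' is the final solution
        return 4 if answers["want_optimisation"] else 3
    if step == 3:
        if answers["satisfied"]:
            return 6
        return 4 if answers["want_optimisation"] else 5
    if step == 4:
        return 5 if answers["modify_manually"] else 6
    if step == 5:
        # Either restart from the updated system state or implement the chosen solution.
        return 1 if answers["restart"] else 6
    raise ValueError(f"no transition defined for step {step}")


print(next_step(2, impact_low=False, want_optimisation=True))
```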
In the IRPT, the hydraulic simulations are carried out by using EPANET2 (Rossman ) and pressure-driven net-
normal operating conditions but not in the pressure-deficient ones that occur during various system failures (e.g. during a pipe burst or some equipment failure).
Hence, in this study, the original EPANET2 hydraulic model is modified by using the approach proposed by Paez et al. (). This method works by adding suitably chosen dummy elements to the original EPANET2 model. This creates a pressure-driven model that is able to simulate hydraulic conditions in the network under both normal and pressure-deficient conditions. Paez et al.'s pressure-driven method has been selected here as it was thoroughly tested, validated and demonstrated to work effectively on real-sized networks, producing accurate hydraulic results (Paez et al. ). The use of this model imposes an additional computational burden on the calculation of the different impact indicators, but so does any other pressure-driven method; this is simply the price to pay for being able to simulate pressure-driven conditions in the pipe network. Finally, note that the selection of a pressure-driven model is not the focus of this study, i.e. any other reliable and accurate pressure-driven model can be used instead within the response methodology presented in this paper.
The IRPT also links to the Quantum Geographic Information System (QGIS) software to visualise the spatial distribution of end-impact on a suitable map of the analysed water system.
The key novelties of the new response methodology proposed are as follows: (i) a structured yet flexible approach supporting and guiding the operator throughout the entire response process (from detection and localisation of a failure event to the implementation of the identified response solution in the field) while allowing the operator to have the final say, (ii) novel interaction with the operator in near

Impact assessment
The IRPT provides operators with the capability to automatically assess the end-impact of a proposed solution (i.e. in Stage 2 of each step of the methodology) based on realistic metrics. In the IRPT, a consistent framework for end-impact assessment (i.e. the same impact metrics calculated for every proposed response solution) is implemented. This
CML is defined as the mean duration customers are without water supply (i.e. equivalent to pressure ≤ 3 m in the main) in a given reporting year. CML is a real-life indicator used in water utilities nowadays and is calculated for every discrete pressure area (DPA; i.e. discrete areas within a DMA). It is measured in minutes per customer (mins/cust). In this study, CML is found as follows:
$$\mathrm{CML} = \frac{Cust_{SI} \times Dur_{SI}}{Cust} \qquad (1)$$
where $Cust_{SI}$ is the total number of customers in each DPA affected by supply interruption during at least one time-step (i.e. 15 mins) over the impact horizon; $Dur_{SI}$ is the length of time for which properties are without a continuous supply of water in mins (only events with duration ≥ 3 h are taken into account); $Cust$ is the total number of connected customers at year end (a fixed number for each utility).
The AMLP indicator is defined as follows:
$$\mathrm{AMLP} = \frac{Cust_{LP} \times Dur_{LP}}{Cust} \qquad (2)$$
where $Cust_{LP}$ is the total number of customers affected by low pressure (i.e. minimum_pressure < pressure < required_pressure, where minimum_pressure is usually considered in UK utilities as equal to 3 m and required_pressure as equal to 15 m) during at least one time-step (i.e. 15 mins) over the impact horizon; $Dur_{LP}$ is the average low pressure impact duration over the impact horizon in mins; $Cust$ is the total number of connected customers at year end (a fixed number for each utility).
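A minimal sketch of how the two indicators could be evaluated is given below. It assumes that each indicator is the number of affected customers multiplied by the impact duration and divided by the total number of connected customers, which matches the variable definitions above and the mins/cust unit; the function names and the example figures are hypothetical.

```python
MIN_PRESSURE_M = 3.0            # at or below this head: supply interruption
REQUIRED_PRESSURE_M = 15.0      # strictly between the two: low pressure
MIN_EVENT_DURATION_MINS = 180   # only interruptions of at least 3 h count for CML


def classify_pressure(pressure_m):
    """Map a pressure head (m) to the impact type it contributes to."""
    if pressure_m <= MIN_PRESSURE_M:
        return "interruption"
    if pressure_m < REQUIRED_PRESSURE_M:
        return "low pressure"
    return "normal"


def cml(cust_si, dur_si_mins, cust_total):
    """Customer minutes lost (mins/cust); zero if the event is shorter than 3 h."""
    if dur_si_mins < MIN_EVENT_DURATION_MINS:
        return 0.0
    return cust_si * dur_si_mins / cust_total


def amlp(cust_lp, dur_lp_mins, cust_total):
    """Average minutes low pressure (mins/cust)."""
    return cust_lp * dur_lp_mins / cust_total


# Hypothetical figures: 2,000 customers interrupted for 4 h in a utility
# with 3,000,000 connected customers.
print(cml(2_000, 240, 3_000_000))
print(amlp(500, 60, 3_000_000))
```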
The UW indicator is calculated as follows (Bicik et al. ):
where $T$ is the impact horizon (h), $t$ is the simulation time (with an assumed time-step of 15 mins = 900 s), $D_{i,req}(t)$ is the requested demand at node $i$ and time $t$ in l/s, $D_i(t)$ is the delivered demand at node $i$ and time $t$ in l/s, and $Cust_{i,count}$ is the number of customers supplied from demand node $i$.
Note that the requested water demand may be undelivered due to either complete interruption or low pressure (i.e. pressure < required_pressure).
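As a rough illustration of the quantity behind UW, the sketch below accumulates the shortfall between requested and delivered demand at each node over 15-min time-steps and converts it to a volume. It is an assumption-laden simplification: it does not use the customer count per node, whose role in the original expression is not reproduced here, and the function name and node data are hypothetical.

```python
def undelivered_volume_m3(requested_lps, delivered_lps, step_s=900):
    """Sum (requested - delivered) demand over all nodes and time-steps.

    Both arguments map node id -> list of demands in l/s, one value per time-step.
    Returns the undelivered volume in m3.
    """
    total_litres = 0.0
    for node, requested in requested_lps.items():
        delivered = delivered_lps[node]
        for req, dlv in zip(requested, delivered):
            # A node cannot receive more than it requests, so clamp at zero.
            total_litres += max(req - dlv, 0.0) * step_s
    return total_litres / 1000.0


# Hypothetical two-node, two-step example.
requested = {"J1": [2.0, 2.0], "J2": [1.0, 1.0]}
delivered = {"J1": [2.0, 0.5], "J2": [1.0, 0.0]}
uw = undelivered_volume_m3(requested, delivered)
print(uw)
```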
where $N_p$ is the number of pipes in the network and $Disc_{j,norm}$ and $Disc_{j,failure}$ are the total discolouration risk of pipe $j$ under normal and failure conditions, respectively.
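The pipe-counting idea behind the DRI can be sketched as follows, assuming every pipe already has a total discolouration risk score (2 to 6) under normal and failure conditions and that the DRI is the number of pipes whose score rises by at least 1. The function names are hypothetical, and treating a score decrease as 'NO RISK' is an assumption of this sketch.

```python
def risk_increase_category(increase):
    """Rank the score increase of a single pipe."""
    if increase <= 0:
        return "NO RISK"  # assumption: a decrease is treated like no change
    if increase == 1:
        return "LOW INCREASE"
    if increase in (2, 3):
        return "MODERATE INCREASE"
    return "HIGH INCREASE"


def dri(scores_norm, scores_failure):
    """Count pipes with at least a 'LOW INCREASE' in discolouration risk."""
    return sum(
        1
        for pipe, normal_score in scores_norm.items()
        if scores_failure[pipe] - normal_score >= 1
    )


# Hypothetical scores for three pipes.
normal = {"p1": 2, "p2": 3, "p3": 6}
failure = {"p1": 6, "p2": 3, "p3": 5}
print(dri(normal, failure), risk_increase_category(failure["p1"] - normal["p1"]))
```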

Interventions optimisation
The IRPT provides operators with the capability to automatically identify a number of optimal solutions (i.e. in Step 4 of the methodology) by solving a two-objective optimisation problem. The two objectives are the minimisation of the total end-impact (of a response solution) and the minimisation of the total cost associated with this solution.
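With two objectives to minimise, the 'optimal solutions' form a set of non-dominated trade-offs. The sketch below filters such a set from a list of candidates; it only illustrates the dominance test and is not the optimisation algorithm used in the study. The candidate names and values are hypothetical.

```python
def pareto_front(solutions):
    """Return the non-dominated (name, impact, cost) tuples, sorted by cost.

    A solution is dominated if another one is no worse in both objectives
    and strictly better in at least one.
    """
    front = []
    for name, impact, cost in solutions:
        dominated = any(
            other_impact <= impact
            and other_cost <= cost
            and (other_impact < impact or other_cost < cost)
            for _, other_impact, other_cost in solutions
        )
        if not dominated:
            front.append((name, impact, cost))
    return sorted(front, key=lambda s: s[2])


# Hypothetical candidates: (name, total end-impact, cost in pounds).
candidates = [
    ("A", 0.10, 0.0),
    ("B", 0.05, 60.0),
    ("C", 0.04, 800.0),
    ("D", 0.06, 900.0),  # worse than C in both objectives
]
front = pareto_front(candidates)
print([name for name, _, _ in front])
```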
The total (i.e. aggregated) end-impact is estimated by normalising and then adding up the values of the individual impact indicators defined in the previous section. Before aggregating, the normalised indicators are multiplied by specified weights based on the priority/preferences of the operators as follows:
$$\text{Total end-impact} = \sum_{i=1}^{4} w_i f_i \qquad (5)$$
where $i$ is the index of each impact indicator with $i \in [1, 4]$; $f_i$ is the normalised impact indicator $i$ and $w_i$ is the weight of impact indicator $i$ with $\sum w_i = 1$. The impact indicators are normalised in the range [0, 1] as follows:
$$x_{new} = \frac{x - x_{min}}{x_{max} - x_{min}} \qquad (6)$$
where $x_{new}$ is the normalised impact indicator value; $x$ is the non-normalised impact indicator value; $x_{min}$ is the minimum impact indicator value and $x_{max}$ is the maximum impact indicator value.
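The normalisation and weighted aggregation described above can be written in a few lines. The sketch assumes min-max normalisation and a weighted sum with weights adding up to 1, as the text states; the indicator values, bounds and equal weights are hypothetical.

```python
def normalise(x, x_min, x_max):
    """Min-max normalisation of an indicator value into [0, 1]."""
    if x_max == x_min:
        raise ValueError("x_max and x_min must differ")
    return (x - x_min) / (x_max - x_min)


def total_end_impact(values, bounds, weights):
    """Weighted sum of the normalised indicators; the weights must sum to 1."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(
        weights[name] * normalise(value, *bounds[name])
        for name, value in values.items()
    )


# Hypothetical indicator values with their (min, max) bounds and equal weights.
values = {"CML": 10.0, "AMLP": 20.0, "UW": 5_000.0, "DRI": 400.0}
bounds = {"CML": (0.0, 100.0), "AMLP": (0.0, 100.0),
          "UW": (0.0, 50_000.0), "DRI": (0.0, 2_000.0)}
weights = {"CML": 0.25, "AMLP": 0.25, "UW": 0.25, "DRI": 0.25}
impact = total_end_impact(values, bounds, weights)
print(f"{impact:.1%}")
```

Changing the weights is how an operator's priorities (for example, penalising supply interruption more than discolouration risk) would enter the aggregated value.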
The second objective function is the total cost of the selected response solution, calculated as follows:
It is important to highlight here that, apart from the type of interventions, the start time of each intervention is a decision variable too. Likewise, it is important to stress that rezoning is assumed to last until the repair is complete (i.e. as in utilities' general practice) and, hence, its duration is not considered as a decision variable. ASV injection, on the other hand, is carried out until the tank (modelled at each injection point, see above) is empty. This may happen before the repair is complete, depending on the water demand (under normal conditions) of the affected area.
In view of the above, each identified optimal solution takes the form of an action plan, as it was also done in
It is stressed here that optimising for minimum end-impact and cost has multiple benefits for a utility. The most important benefit is reducing the impact on the customers, which can be costly in many ways (financially but also in terms of reputation, etc.). A couple of other examples related to costs include: (1) operational savings in the long term, as many events may occur each year, although the cost of a single response solution may be small (e.g. hundreds of pounds), and (2) less time spent on site for opening valves or injecting water, which could benefit utilities in terms of more efficient scheduling of the technicians' activities.

CASE STUDY
The present case study aims to illustrate the benefit of a response solution identified through interaction with the IRPT (hereafter referred to as the 'New methodology response') by comparing it to a response solution based on utilities' current practice (hereafter referred to as the 'Current practice response'). For this purpose, a semi-real case study (described hereunder) was considered. Then, the IRPT's steps are implemented for the case study's event in order to identify the 'New methodology response'. The 'New methodology response' is ultimately compared with the 'Current practice response', in order to demonstrate the benefit resulting from the operator's interaction with the IRPT.

Description of the semi-real case study
The case study used here is based on the following real system and event. On Saturday 2nd November 2019 at 14:00, a WTW that serves approximately 100,000 customers located in the North West of England shut down following observation of high turbidity levels. This event was due to a burst on a main within the WTW. The shutdown resulted in intermittent supply and low pressure to some customers.
The WTW remained shut until the quality of the water leaving the WTW could be assured to meet the required standards. The utility mobilised ASVs to the area and implemented network changes (i.e. rezoning) in order to minimise customer end-impact. Bottled water was delivered directly to priority services and sensitive customers. The repair was completed 24 h after the shutdown.
In the IRPT, the shutdown is modelled by closing the
Because of all of the above, we refer to the case study under scrutiny as 'semi-real' (i.e. based on a real system and event, but with several simplifications and assumptions). Bearing in mind the typical response strategy described earlier, we refer to the response actions shown in Table 1 as the 'Current practice response', although they only approximate (i.e. in terms of total end-impact, start time of impact, affected areas, etc.) the actual real-life response. This said, it is also important to stress here that many factors may have influenced the actual response actions taken by the utility. These factors have not been accounted for in this study and, hence, the term 'Current practice response' should be construed accordingly.

New methodology response
In this section, the identification of the response solution through the IRPT's steps is presented. It is worth stressing at this point that Step 2 of the IRPT methodology is not applied here because the event is considered to be the shutdown (i.e. not the burst). Hence, in this case, operators do not need to seek the support of the IRPT for the identification of the best isolation start time and of the best isolation valves to close.

Initial impact assessment (methodology Step 1)
The first step of the methodology is to apply the initial impact assessment. Here, operators aim at assessing the initial end-impact over the impact horizon (i.e. until the repair is completed). Therefore, for the purposes of this work, they input into the IRPT the repair completion time as 24 h. Although the completion time can only be roughly estimated before the actual repair commences, 24 h is considered to be a reasonable period over which the repair of a major burst is likely to be carried out. Figure 2(a) shows the location of the considered service
injection is available (i.e. DMAs 003, 004, 010 and 007), presented in Figure 3 for the 'No response' scenario, he/she decides to start injecting into these DMAs 5 h after the shutdown (i.e. when the impact starts in the horizon). This is because he/she wants to allow plenty of time to mobilise the ASVs and also allow injection to start at 19:00 when a peak in demand is expected. He/she finally decides to rezone as soon as possible (here assuming 2 h after the shutdown to allow plenty of time for technicians to get to site), because rezoning for longer periods is expected to significantly reduce end-impact without increasing cost (i.e. rezoning duration does not affect cost, see Equation (7)).
Bearing in mind the above, it is worth stressing that the
For the calculation of the total end-impact, the maximum values of the impact indicators (i.e. for normalisation, see Equation (6)) are calculated as follows.
For the CML and AMLP, the total number of customers in the section of network under scrutiny is equal to 46,545, and the simulation duration is equal to 24 h = 1,440 mins.
The maximum value of UW is equal to the total volume of water required to supply the whole section of network under normal operation (equal to 175,530 m³). The maximum value of DRI is equal to the total number of pipes in the section of network (equal to 8,750 pipes).

Identification of the final response plan (methodology Step 5)
After the optimisation (Step 4 of the methodology) is completed, in Step 5, the fictional operator decides to compare the identified optimal solutions with the 'New response - manual' solution in order to identify the best response plan. Here, for illustration purposes, he/she selects one solution from the Pareto front in order to compare it with the 'New response - manual' solution. The selected optimal solution (denoted hereafter as 'New response - optimal', indicated with a black arrow in Figure 4) is a solution with significantly reduced end-impact for a small cost increase compared with the rest of the optimal solutions with lower cost and higher impact (i.e. solutions found on the left side of the selected one) on the Pareto front. Such a solution is quite likely to be selected by a decision-maker.
The values of the impact indicators of the 'New response - optimal' solution (as well as those of the 'New response -
The 'New response - optimal' solution also suggests only one intervention, i.e. opening one rezoning valve which feeds DMA 005, starting 2 h after shutdown. No injection from ASVs is suggested, which explains the minimised cost (i.e. £55) of this solution. The significantly reduced total end-impact of the 'New response - optimal' solution (i.e. 5%), compared with the 11.1% of 'No response', is a consequence of starting the rezoning very early in the simulation (although only one valve is opened). It is also observed that the total end-impact of the 'New response - manual' solution (i.e. 4.5%) is not significantly lower than the total end-impact of the 'New response - optimal' solution (i.e. 5%). However, the cost of the 'New response - optimal' solution (i.e. £55) is much lower than the cost of the 'New response - manual' solution (i.e. £813).
Furthermore, Figure 5(c) shows that in the 'New response - optimal' solution, the number of affected customers with SI has been reduced when compared with the 'No response'. However, in the 'New response - manual' solution, the affected area is smaller than the affected area in the 'New response - optimal' solution (e.g. DMA 011 is not affected with CML when applying the 'New response - manual' solution, but there is CML impact when applying the 'New response - optimal' solution). In both solutions, the hospital is not affected anymore (see also the pressure
assumed that the fictional operator is more likely to select the 'New response - optimal' solution because of the minimum DRI (and small cost), as well as relatively low values of all the other impact indicators. The 'New response - optimal' solution is therefore considered to be the 'New methodology response' in the remainder of this paper.
This step can be completed in a number of minutes in a water utility control room when a limited number of comparisons takes place. In this case study, operators carried out a single comparison between the 'New response - manual' and the 'New response - optimal' solutions; hence, it is assumed that this step was completed in approximately 5 mins (as an example). From the above, it is shown that the whole response methodology can be implemented within the 1 h required for near real-time response decision-making.
Comparison between 'Current practice response' and 'New methodology response'
In Table 2, CML, AMLP, UW, DRI, cost and total end-impact calculated by the IRPT for the 'Current practice response' described in Table 1 are
In light of the above, it can be concluded that the 'New methodology response' identified through interaction with the IRPT outperforms the 'Current practice response'.

CONCLUSIONS
This paper presents a novel overall response methodology that aims to support/guide water utility operators in
in a consistent manner to facilitate easy comparison between different response solutions (i.e. a response intervention or a set of interventions), (2) more realistic selection of response interventions to be implemented (e.g. based on operational costs, the availability of different types of interventions, etc.) and (3) effective interaction with the control room operators that takes into account their expert judgement, preferences and experience.
localisation and event response (Vamvakeridou-Lyroudia et al. ; Romano et al. ; Jung et al. ; Kapelan et al. ). The first two stages involve detecting and localising the event in the network and raising the relevant alarm. The third stage is associated with the decisions and actions required to reduce and, ultimately, eliminate the negative impact of the event on the consumers (Jeong et al. ; Bicik et al. ; Nayak & Turnquist ). The first two stages have been researched extensively in the literature (Bicik et al. ; Romano et al. ; Casillas Ponce et al. ; Romano et al. ; Jung et al. ; Okeya et al. ; Laucelli et al. ; Romano ; Zhou et al. ). Hence, the focus of this paper is on the event response stage.
The event response stage typically includes two sub-stages, namely isolation and recovery (Vamvakeridou-Lyroudia et al. ; Mahmoud et al. ). The isolation sub-stage aims to minimise the negative initial impact of an event and prepare the affected part of the network for follow-on repairs. This sub-stage has been thoroughly studied in the past by several authors such as Jun & Loganathan () and Giustolisi & Savic (). Hence, it is not the subject of the present work. The recovery sub-stage, on the other hand, involves impact assessment of the event and selection of the best response solution. This sub-stage is the focus of this study.
Several methods for event impact assessment, such as Kapelan et al. (), Kao & Li (), Giustolisi et al. (), Bicik et al. () and Qi et al. (), have been proposed in the literature. These methods are all based on rather theoretical impact assessment metrics. This aspect is, therefore, further investigated in this work by developing and using a wider range of improved impact assessment metrics that are based on real-life indicators used by water utilities.
The selection of the best response solution to implement is strongly dependent on the preceding steps of the event management process. An effective event management process should be based on an integrated methodology that takes into account all the preceding stages. However, in the literature, the recovery sub-stage has been approached mainly through (1) proposing theoretical mitigating methods for the minimisation of the consequences of a physical attack (e.g. Jeong et al. ; Jeong & Abraham ; Turner et al. ) and (2) proposing generic decision-support systems (DSSs). For example, Bicik et al. () proposed a general risk-based DSS methodology for supporting operators in decision-making against a failure. Vamvakeridou-Lyroudia et al. () proposed an integrated Intervention Management Model, in the context of a general DSS methodology for operational WDS management. Finally, Mahmoud et al. () proposed an integrated methodology for the near real-time response to pipe burst events in WDSs. All these methods, however, are rather academic in nature as they do not take into account the complexities of real-world response problems. This aspect is accounted for in this work by making use of the control room operators' extensive knowledge and experience via their interaction with the IRPT.
Paez et al. (), in their summary paper, presented several methods for the response to WDS events after an earthquake disaster (i.e. they considered real-life and complex pipe network failures). Later, Zhang et al. () proposed an optimisation-based framework to maximise resilience of a WDS after a disaster-type event (e.g. earthquake). However, both these studies identified the optimum set of response interventions that includes pipe repair or replacement only (i.e. without proposing different types of response interventions). This limitation is circumvented in this work by developing a methodology that utilises multiple intervention types (e.g.
a. If yes, a final solution has been found and the operators move to Step 6.
b. If no, the operators are allowed to propose alternative manual solutions and compare their end-impact/cost by moving to Step 5 or ask the IRPT to automatically generate an optimal solution in Step 4.
Step 4. Operators input into the IRPT all the desired and available interventions, as well as a time range in which the various interventions could start. Then, the optimisation runs and optimal solutions are automatically generated and assessed by the IRPT. Operators are then able to select one (or more) optimal solution(s) on the Pareto front (depending on whether or not the end-impact/cost is low). Finally, they are asked if they wish to further modify this (these) solution(s) manually.
a. If yes, they move to Step 5.
b. If no, a final solution has been found and they move to Step 6.
work modelling based on the methodology developed by Paez et al. (). The demand-driven analysis conducted by EPANET2 accurately estimates the nodal demands in
real time (i.e. up to 1 h) via the IRPT (e.g. 'what-if' scenarios) without hydraulic expertise requirements, (iii) provision of automatically generated advice (e.g. optimal response interventions and assessed impacts), although the optimal response interventions are not yet provided in near real time due to the long (e.g. several hours) optimisation time currently required (longer than the time typically available in a control room for identifying a response), (iv) improved impact assessment (based on realistic impact indicators) that covers different aspects of the event, which are consistently calculated for every proposed response intervention (to facilitate easy comparison between different response solutions) and (v) more realistic selection of operational interventions (based on operational costs, the availability of different types of interventions, etc.).
facilitates the comparison of different response solutions (in Step 5 of the methodology) and enables more informed decision-making. Furthermore, the IRPT allows the operators to perform this comparison without the need for them to be hydraulic model experts. The impact indicators proposed in this paper have been developed bearing in mind the UK water industry practice as well as previous relevant literature (e.g. Bicik et al. ). Most of these indicators have not been used before in this context (at least in the published literature). The following aspects of end-impact are considered here: water supply interruption, low pressure impact and discolouration risk increase (DRI) impact. More specifically, the following indicators are used: (1) customer minutes lost (CML), (2) average minutes low pressure (AMLP), (3) unaccounted for water (UW) and (4) DRI. AMLP and UW are calculated for different customer types, namely residential, industrial and sensitive (i.e. schools and hospitals). The impact horizon in the new response methodology is the period of time for which the end-impact is assessed. It starts from the detection/localisation time of an event and lasts until the repair is completed (i.e. the time period over which restoration interventions can be implemented).
The DRI is estimated based on a combination of the methods found in Beuken et al. () and Bicik et al. (). Beuken et al. () suggest calculating discolouration risk based on minimum and maximum velocities and maximum flow rates. Here, the minimum and maximum flow velocities in an average demand day are calculated with a hydraulic model for each pipe. The same model is used to estimate the largest flow rate for each pipe under the same demand conditions. Then, a score is assigned to each pipe for each discolouration risk type (i.e. based on velocities and on flow rates): a score of 1 means low, a score of 2 means moderate and a score of 3 means high discolouration risk. The discolouration risk for every pipe is calculated as the sum of the scores based on both velocity and flow rate. The resulting discolouration risk scores are grouped in five severity categories (Beuken et al. ), i.e. 'VERY LOW' with a total score of 2, 'LOW' with a total score of 3, 'MODERATE' with a total score of 4, 'HIGH' with a total score of 5 and 'VERY HIGH' with a total score of 6. Once the discolouration risk score for every pipe has been found, the DRI for every pipe can be calculated as the difference between the discolouration risk score under 'failure' and normal conditions. The 'failure' condition is defined here as the WDS condition after the occurrence of the event and/or the implementation of the intervention(s). The DRI is then ranked based on the total score increase, i.e. 'NO RISK' with a total score increase equal to 0, 'LOW INCREASE' with a total score increase equal to 1, 'MODERATE INCREASE' with a total score increase equal to 2 or 3 and 'HIGH INCREASE' with a total score increase equal to 4. Following the calculation of the DRI for every pipe in the network, the number of pipes with at least 'LOW INCREASE' (i.e. with total score increase equal to 1 or higher) is used to estimate the DRI (based on a modification of the equation in Bicik et al. 
): ) where $c_{rez}$ is the cost (£) per hour of manipulating (i.e. opening or closing) a single rezoning valve; $d_{rez}$ is the time it takes to open and close a single rezoning valve (in h); $N_{rez}$ is the number of rezoning valves to open/close in the specific response solution; $c_{ASV}$ is the cost (£) per hour of ASV injection and $h_{ASV}$ is the total time of ASV injection (i.e. hours of injection from all the ASVs sent to site). The above hourly costs (i.e. $c_{rez}$, $c_{ASV}$) can be calculated from the total employee rechargeable (i.e. equal to the inflated value of the average employee costs) divided by the number of working days per annum. It is stressed that the cost function presented here does not aim to calculate the precise cost of a specific solution. It mainly aims to point out the cost difference between different solutions and to identify those solutions that significantly reduce the end-impact with the least increase in cost, as these solutions are likely to be selected by the decision-makers.

The decision variables of the optimisation problem are (1) the operational interventions used and (2) the start times of their implementation. The operational interventions considered in this methodology are (1) rezoning by valve manipulations (i.e. opening of initially closed boundary valves), (2) water injection at different network locations and (3) a combination of these. Water injection, which is a novel type of intervention considered in this study, is carried out through the ASVs. In this study, an ASV is modelled as a tank linked to the injection point through a pump (to manage the pressure pumped into the network) and a valve (to allow water flow from the tank to the system). Usually, utilities dispatch three ASVs of 30 m³ to every injection point in order to guarantee continuous supply to the affected node/customers. In this study, to simplify the coding required in the IRPT, one artificial ASV with volume equal to 90 m³ (i.e. 3 × 30 m³) is modelled at each injection point.
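The discolouration risk scoring and the cost terms described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the IRPT implementation: all function names are hypothetical, treating a score decrease as 'NO RISK' is an assumption, and the cost form (valve manipulation cost plus ASV injection cost) is inferred from the variable definitions rather than taken from the paper's equation. The sketch stops at the count of pipes with at least 'LOW INCREASE' and does not attempt the final DRI equation.

```python
# Severity category for a pipe's total discolouration risk score
# (velocity score + flow-rate score, each between 1 and 3).
RISK_CATEGORY = {2: "VERY LOW", 3: "LOW", 4: "MODERATE", 5: "HIGH", 6: "VERY HIGH"}


def risk_score(velocity_score, flow_score):
    """Total discolouration risk score of one pipe (2 to 6)."""
    for score in (velocity_score, flow_score):
        if score not in (1, 2, 3):
            raise ValueError("each partial score must be 1, 2 or 3")
    return velocity_score + flow_score


def increase_category(increase):
    """Rank the score increase between normal and 'failure' conditions."""
    if increase <= 0:  # ASSUMPTION: a score decrease is treated like no increase
        return "NO RISK"
    if increase == 1:
        return "LOW INCREASE"
    if increase <= 3:
        return "MODERATE INCREASE"
    return "HIGH INCREASE"


def pipes_with_increase(normal_scores, failure_scores):
    """Count pipes whose score rises by at least 1 ('LOW INCREASE' or worse)."""
    return sum(
        1
        for pipe, normal in normal_scores.items()
        if failure_scores[pipe] - normal >= 1
    )


def response_cost(c_rez, d_rez, n_rez, c_asv, h_asv):
    """ASSUMED cost form (in £): valve manipulation cost plus ASV injection cost."""
    return c_rez * d_rez * n_rez + c_asv * h_asv


# Hypothetical three-pipe example (scores are made up for illustration).
normal = {"P1": risk_score(1, 1), "P2": risk_score(2, 1), "P3": risk_score(3, 3)}
failure = {"P1": risk_score(3, 3), "P2": risk_score(2, 2), "P3": risk_score(3, 3)}
print(pipes_with_increase(normal, failure))
print(increase_category(failure["P1"] - normal["P1"]))
```

Keeping the per-pipe scoring separate from the network-level count mirrors the two stages in the text: scores are first assigned per pipe, and only then aggregated over the network.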

Sophocleous et al. (). The Non-Dominated Sorting Genetic Algorithm II, or NSGA-II (Deb et al. ), is used in this paper to solve the optimisation problem. This method has already proved appropriate for solving a similar optimisation problem (Mahmoud et al. ). The mathematical description of the present multi-objective problem, as well as the optimisation constraints considered in this study, can be found in the Supplementary Material.
pipe downstream of the service reservoir directly fed by the WTW (the WTW feeds this service reservoir only), in order to facilitate the hydraulic simulations. As far as the actual utility's response actions are concerned, a number of simplifications and assumptions were made to simplify the coding required in the IRPT. For example, ASV injection at each point is carried out by using a single artificial ASV (equivalent to the 3 × 30 m³ ASVs usually sent to every injection point). However, in reality ASVs supplied water intermittently at some injection points (i.e. started at different times during the event and did not inject water consecutively) and with more than three ASVs used in some cases. Additionally, the rezoning valves considered in the IRPT's simulations do not necessarily coincide with the rezoning valves actually used by the utility during the event. This is due to the fact that the hydraulic model used did not precisely reflect the real valves' layout. Finally, the actual start times of the interventions have been rounded to the next hour (e.g. if an intervention started at 19:30 in real life, then in the IRPT it is assumed to start at 20:00).
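The rounding of intervention start times can be expressed as a small helper. This is a minimal sketch with a hypothetical function name; it assumes that a start time falling exactly on the hour is left unchanged, which the text does not state explicitly.

```python
from datetime import datetime, timedelta


def round_up_to_hour(start):
    """Round an intervention start time up to the next full hour.
    ASSUMPTION: times already on the hour are returned unchanged."""
    if start.minute == 0 and start.second == 0 and start.microsecond == 0:
        return start
    return start.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)


# The date is arbitrary; only the time of day matters for the example.
print(round_up_to_hour(datetime(2020, 1, 1, 19, 30)))
```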

Figure 2 | (a) Location of the considered service reservoir (fed by the WTW), simulated closed pipe P8703, schools, industrial users, hospital and DMAs; and (b) location of the selected available interventions (i.e. rezoning valves and ASV injection points).
reservoir and the downstream pipe that was closed (i.e. pipe P8703) for modelling the shutdown, as well as the location of the industrial users, schools, hospital and the network model's DMAs (each DMA represented with different colouration). The values of the four impact indicators, for the initial condition of the system (denoted hereafter as 'No response'), are calculated in the IRPT. Considering that the total number of customers registered in the utility is equal to 3,293,080 (value obtained from the utility), $P_{min}$ is equal to 3 m and $P_{req}$ is equal to 15 m (as applied in water utilities' practice), then CML is equal to 4.0 mins/cust, AMLP is equal to 3.6 mins/cust, UW is equal to 3,330 m³ and DRI is equal to 14 pipes (out of the 8,950 pipes in this section of network). This results in a total initial end-impact equal to 11.1%. The location of the affected customers with supply interruption (SI) for more than 3 h is shown in Figure 5(a) (with purple-gradient colouration), in sets of 3 h (i.e. 3-6, 6-9, 9-12, >12 h). The above impact values (computed over the 24-h impact horizon) highlight the significance of this event, which affected a wide area comprising different DMAs. The affected area also includes two schools, one industrial node and the hospital (all purple-gradient coloured depending on the SI duration). However, the risk of discolouration (or DRI) could be considered low (i.e. only 14 pipes are at high risk). Using the IRPT, operators are also able to check the pressure over the impact horizon and, hence, get a view of when the aforementioned affected customers start being impacted. At the bottom of Figure 5(a), the pressure graph for DMA 005 (selected as an example here because the hospital is located in that DMA) is presented. It can be noticed that DMA 005 (and, hence, the hospital too) starts being affected approximately 5 h after the shutdown, if nothing is done. The IRPT also provides the capability to visualise the other aspects of the impact, such 
as the low pressure duration at each node, the volume of undelivered water at each node and the DRI at each pipe, as well as for different DMAs, in a similar way as shown in Figure 5(a). All this is a significant advantage over what is done/available as part of current practice. This step takes approximately 3 mins to complete on the PC used in this study (Intel processor, Core i5-6200U CPU at 2.30 GHz and 64-bit Windows 7), i.e. as long as a single impact evaluation takes for the present complex network.

Manually proposed solution (methodology Step 3)

For the purposes of this work, a fictional operator proposes a realistic (i.e. one that could potentially have been identified in the utility's control room) manual solution (denoted hereafter as 'New response - manual') in Step 3 of the methodology, after having carried out the initial end-impact assessment. The available interventions are shown in Figure 2(b). Looking at the initial end-impact (in Figure 5(a)), the fictional operator decides to inject water into the affected DMAs 003, 004, 010 and 007 and to rezone the affected DMAs 005, 006 and 008 by opening all the available rezoning valves. This is because the fictional operator wants to intervene in all affected DMAs where available interventions exist. Then, looking at the pressure graphs of the DMAs where ASV

Figure 3 | Pressure vs. time of different DMAs for the 'No response' case.

Figure 5 | Customers affected with SI and pressure graph for the (a) 'No response'; (b) 'New response - manual' and (c) 'New response - optimal'.
manual' and 'No response') are shown in Table 2. As can be seen from this table, the values of CML and UW are significantly reduced in the 'New response - optimal' compared with the 'No response' case. This implies that the considered weight factors were effective in this problem. It is also observed that the 'New response - manual' solution obtains smaller impact values for the CML, AMLP and UW compared with the 'New response - optimal' solution. However, cost and DRI are significantly reduced in the 'New response - optimal' solution compared with the 'New response - manual' solution due to the optimisation enforcing cost minimisation. As expected, minimisation of the cost function has also reduced the number of rezoning valves to open and the injection time, resulting in reduced disturbance in pipe flows and, consequently, reduced risk of discolouration.
From Figure 5(b) and 5(c), operators are also informed that DMA 007 still has high SI impact (i.e. almost the whole DMA is affected with SI >12 h, although the total CML is low) in both the 'New response - manual' and 'New response - optimal' solutions. Recall that the only available intervention in DMA 007 is injection from two ASV points (see Figure 2(b)). Hence, the IRPT also serves the purpose of informing the operators that they should look into more available ASV points and/or available rezoning (e.g. from the adjacent unaffected DMA 002) in DMA 007. At this point in time (i.e. in Step 5 of the methodology), based on the information obtained by using the IRPT, the fictional operator has to decide whether to: (1) apply the 'New response - manual' solution due to the reduced CML, AMLP and UW when compared with the 'No response', (2) apply the 'New response - optimal' solution, where CML, AMLP and UW are also reduced when compared with the 'No response' (although higher than those in the 'New response - manual' solution) but with cost and DRI impact much lower than those of the 'New response - manual' solution, (3) test/assess a different manual solution (i.e. 'what-if' scenario) and compare it with the other identified solutions or (4) select a different optimal solution from the Pareto front in Figure 4 and compare it with the other identified solutions. For the purpose of this work, the different 'what-if' scenarios and the different optimal solutions are omitted due to space limitations. Hence, based on the results obtained, it is shown. It can be noticed that even though CML, AMLP and UW are reduced when compared with the 'No response' scenario, the 'New methodology response' (i.e. 
the 'New response - optimal' solution) offers further improvements. Indeed, the 'New methodology response' further reduced all impact indicators (especially DRI and cost), except AMLP, which remained the same. The 'New methodology response' also suggested fewer interventions to implement (i.e. opening of only one rezoning valve compared with opening of 12 valves and injecting from 5 points in the 'Current practice response'), justifying the significant improvement in DRI and cost.
making better informed decisions regarding water network failures. The new response methodology considers (1) improved impact assessment methods (based on realistic metrics used in the water industry), (2) consistent impact assessment (i.e. impact metrics consistently calculated for every response solution to facilitate easy comparison), (3) provision of automatically generated advice (e.g. optimal response interventions and assessed end-impacts), (4) more realistic selection of operational interventions (based on operational costs, the availability of different types of interventions, etc.) and (5) novel near real-time (i.e. identification of a solution within 1 h after event detection/localisation) interaction with the control room operator that takes into account their expert judgement/experience (e.g. proposing 'what-if' scenarios) without hydraulic model expertise requirements. The new methodology is implemented via an interactive decision-support tool aiming to support operators in making better informed decisions. The application of the new methodology on a semi-real case study showed that the tool enabled operators to identify a more effective response solution (i.e. reduced end-impact and cost) compared with the 'Current practice response'. This is because the tool allowed operators to compare alternative response strategies (i.e. manually created by the operator and automatically generated by the IRPT through optimisation). This comparison was facilitated by the consistent impact assessment (i.e. same metrics assessed for each solution) used in the tool, as well as by the comprehensible impact metrics (i.e. well-known metrics in utilities), impact coverage (shown in maps) and cost of different solutions (shown in graphs). Hence, this application showed the potential of the IRPT to be used by utilities to make better and more informed decisions. All the methodology's steps apart from the optimisation step can currently be conducted in near real time (i.e. 
to identify a solution within 1 h after event detection/localisation). Future work on further improvement of the proposed methodology and tool will focus on: (1) the improvement of the DRI index (e.g. to take into account the flow direction in the pipe network, as well as the pipe material), (2) the improvement of ASV injection modelling (i.e. injection at one point at different times in the day should be modelled) and (3) the improvement of the optimisation methodology in order to identify optimum solutions faster (i.e. in near real time). The latter point is of crucial importance to effectively enable optimal solutions to be used in the near real-time response framework presented in this paper.

Table 1 | Semi-real case study's event timeline and 'Current practice response'

Table 2 | Total end-impact and cost of 'No response', 'Current practice response', 'New response - manual' and 'New response - optimal' (or 'New methodology response')