Abstract
Placing fixed water quality monitoring stations in a water distribution system can greatly improve the security of the system via prompt detection of poor water quality. In the event that a harmful substance is injected into a water distribution system, large populations can be put at risk of exposure to the contaminant. Promptly detecting the presence of a contaminant will reduce the number of people put at risk of exposure. However, to protect against a wide variety of possible contaminants, a water quality monitoring station will need to identify contamination via recognition of anomalous changes in a suite of surrogate water quality indicators (chlorine, pH, etc.). This work attempts to place water quality monitoring stations within the water distribution system at locations that best detect contamination events via surrogate water quality signals. Networks of water quality monitoring stations are designed to minimize the population affected prior to contamination event detection, and simultaneously minimize the expected number of false positive detections, under uncertain water quality conditions. Solutions generated in this study are compared to solutions designed via classical detection methods. Results show that sensor networks designed without consideration of detection via surrogate water quality parameters have higher false positive detection rates.
INTRODUCTION
Within a water distribution system (WDS), water quality is monitored and maintained at or near the sources, i.e. water treatment plants, desalination plants, etc. Throughout the WDS, downstream water quality is then assumed to be satisfactory given that the pipe network is closed and secure, and that the inlet water quality is adequate for the hydraulic travel times between the sources and the consumers. In a system that is perfectly understood and perfectly secured, the assumption that water quality management at the inlet is sufficient may be valid. However, in real-world WDSs the rate of water quality deterioration will not be perfectly known, and network intrusions and/or backflows can dramatically reduce water quality within a WDS. An intrusion of a harmful substance can expose downstream populations to potential ingestion, and infection or poisoning by harmful contaminants.
Placing water quality monitoring stations (WQMSs), or sensors, within a water distribution system has been shown to improve a managing authority's ability to detect a contamination, and to reduce the potential size of the population put at risk. Although it is beneficial to monitor a large number of water quality parameters, budget, space, and accessibility constraints will limit the number of parameters a WQMS can measure. Instead of attempting to identify a contamination by detecting the presence of the contaminant itself, a contamination will be identified by observing anomalous readings in a small suite of water quality parameters that respond to the presence of a contaminant.
Typically, the performance of an early warning system (EWS) composed of water quality sensors is highly sensitive to the locations at which sensors are placed. Initial studies in WQMS placement identified the best locations to place WQMSs according to demand-based coverage, and assumed that water upstream from a WQMS was also considered ‘safe’ (Lee & Deininger 1992; Kumar et al. 1997). Later methods explicitly placed sensors to detect contamination intrusion events (Kessler et al. 1998) by defining a minimum allowable volume of contaminated water delivered prior to detection, termed the ‘level of service’. A network of WQMSs was then designed to meet the prescribed level of service. More advanced optimization methods were later applied to identify WQMS locations according to water quality simulation results, including mixed integer programming (MIP), genetic algorithms, and greedy heuristic algorithms. Ostfeld & Salomons (2004) proposed a genetic algorithm to determine the locations to place WQMSs. Performance of each WQMS network was evaluated against a large number of randomly generated contamination events. By evaluating solutions against a large number of random contamination events, the solutions are expected to perform well against any single contamination event. Berry et al. (2006) cast the sensor placement problem as an MIP to minimize the mass consumed following a contamination event. A large number of teams attempted to solve the sensor placement problem during the Battle of the Water Sensor Networks (BWSN) (Ostfeld et al. 2008). Sets of 5 and 20 sensors were placed to minimize (1) the time required to detect a contamination event, (2) the population affected prior to detection of a contamination, and (3) the mass delivered prior to detection of a contamination event, and to maximize the likelihood of detection of a contamination event.
Methods of the BWSN included genetic algorithms, mixed integer programming, a greedy algorithm, a demand-based heuristic, an engineering strawman approach, and population-based heuristic evolutionary algorithms. Specifically, the greedy algorithm proved highly effective, and was further studied (Krause et al. 2008) for sensor placement on large WDSs. Krause et al. (2008) structured the population affected objective function as a submodular function reporting the population ‘protected’, as opposed to the population affected, as a function of the sensor network. Doing so allowed the proposed greedy algorithm to efficiently solve the sensor placement problem to near optimality, even on large water distribution systems. Later studies further evaluated sensor unit performance by assuming ‘imperfect’ detection (Berry et al. 2009), incorporated risk-based objective functions for sensor placement (Weickgenannt et al. 2010), and considered uncertainties in the sensor placement problem (Xu et al. 2010; Comboul & Ghanem 2013).
In parallel, research examined the task of detecting anomalous changes in water quality measurements to identify a contamination in the system. Numerous algorithms have been developed and tested to identify true contamination events from realistic water quality data, including basic incremental outlier detection algorithms and linear filters (Mckenna et al. 2008); artificial neural networks (Perelman et al. 2012; Arad et al. 2013); multi-variate classifiers (Oliker & Ostfeld 2014); and a model-based event detection algorithm incorporating variable contamination event injection concentrations and durations, as well as uncertain consumer demands (Yang & Boccelli 2016b). Event detection studies have also considered a system-wide event detection algorithm (EDA), where water quality signals from different locations are integrated to better identify true positives and reduce false positives (Koch & Mckenna 2011; Oliker et al. 2016; Yang & Boccelli 2017). Integrating the water quality signals across multiple sensors improved the performance of event detection algorithms, and provided the best performing methods for contamination event detection to date. However, no studies have attempted to place water quality sensors to best perform specifically with respect to the water quality parameters observed at those locations.
Budget constraints are expected to limit the number of sensors that can be deployed, and accordingly, water quality data will be sparsely distributed throughout the WDS. As shown above, this issue has catalyzed a large amount of research to determine what methods can be used to best place WQMSs within a WDS. However, a majority of previous work has placed sensors within the WDS based on the presence of a contaminant, not on the response of the water quality parameters that a WQMS would observe due to the presence of a contaminant. This limitation may lead to a large gap between the expected performance and the true performance of a WQMS used for contamination detection, as a location that is best for detecting a contaminant may not be the best for detecting anomalous water quality signals.
This study aims to determine how WQMS locations influence the performance of an early warning system for contamination event detection, specifically when WQMSs observe surrogate water quality signals. This method allows the system design phase to incorporate the effects of background water quality uncertainty into the contamination detection task. Placing sensors under deterministic operating conditions and with conservative contaminants has provided strong baselines for sensor placement; however, these systems may not perform well in a real-world scenario where uncertainties are present. As shown in Figures 1 and 2, following injection the contaminant itself is often propagated throughout the WDS relatively evenly; however, the water quality signals can vary throughout the WDS as a function of operational conditions and WDS system design. By incorporating background water quality uncertainty into the sensor placement task, we expect that sensors will be placed at network locations that provide water quality signals most indicative of a contamination, and least sensitive to changes in the background water quality signals.
METHODOLOGY
Water quality simulations were performed within EPANET-MSX (Shang et al. 2007). EPANET-MSX uses the hydraulic simulation data calculated within EPANET 2 (Rossman 2000) and then solves the multi-species chemical equilibrium and reaction equations throughout the simulated water distribution system for a defined simulation time. Nicotine was used as the contaminant injected into the network for optimizing WQMS locations. To analyze the sensitivity of the WQMS placements to the water quality model used for optimization, a second water quality model was applied in which a different contaminant, Parathion, was injected into the network. Water quality monitoring stations were assumed to monitor three water quality parameters: free chlorine, pH, and alkalinity. Accordingly, the respective reaction and equilibrium equations governing an intrusion of Nicotine and Parathion were applied from the previous studies of Yang & Boccelli (2016a) and Schwartz et al. (2014), respectively, as described in the section below.
Water quality simulations
For both water quality models, chlorine input was defined at a source location to simulate chlorination in the water distribution system. Preliminary simulations showed the chlorine concentrations throughout the network to be cyclical with respect to time and correlated with pumping schedules changing the hydraulic regime of the WDS. To reduce the cyclical nature of the chlorine concentration throughout the simulation, water quality simulations were run for 12 days, allowing background chlorine levels to reach a relatively steady state. To reduce the computational burden of the water quality simulations, the water quality parameters at the end of the 12-day simulation were saved, and a 5-day simulation was initialized with those parameters. In the baseline model, contamination events were defined to take place at any point between the 15th and 16th day of the simulation, at any node in the network, as an instantaneous injection of 5 kg of contaminant. To further test the solutions developed using the baseline Nicotine model, a sensitivity analysis was performed by setting the contamination injection duration to 12 hours. A 15-minute water quality timestep was defined for the water quality simulations. It is important to note that the water quality models for the two contaminants are different; the Nicotine model uses a two-species second-order decay model, and the Parathion model uses a first-order decay model. This difference leads to discrepancies in the background chlorine levels in the two models, and thus Nicotine was considered the main contaminant of interest for this study. The Parathion injection was used in a sensitivity analysis to compare the sensor networks against a contaminant and a water quality model that were not used during the optimization phase.
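The simulation timeline above can be illustrated with a short sketch for drawing one baseline contamination event. It assumes that ‘day 15’ means 15 elapsed days counted from the start of the 12-day warm-up run, and the names `ContaminationEvent` and `sample_event` are hypothetical; neither is taken from the study's code.

```python
import random
from dataclasses import dataclass

STEP_MIN = 15                         # water quality timestep, from the text
STEPS_PER_DAY = 24 * 60 // STEP_MIN   # 96 timesteps per day
WARMUP_DAYS = 12                      # warm-up run whose end state is saved


@dataclass
class ContaminationEvent:
    node: str        # injection junction (hypothetical identifier)
    start_step: int  # timestep index within the 5-day restart run
    mass_kg: float   # total injected mass


def sample_event(nodes, rng, mass_kg=5.0):
    """Draw one instantaneous injection between day 15 and day 16.

    Assumption: days are elapsed days from the start of the warm-up run,
    so day 15 is step (15 - 12) * 96 of the restart run.
    """
    first = (15 - WARMUP_DAYS) * STEPS_PER_DAY
    last = (16 - WARMUP_DAYS) * STEPS_PER_DAY
    return ContaminationEvent(rng.choice(nodes), rng.randrange(first, last), mass_kg)
```

Passing a seeded `random.Random` instance keeps the sampled event suite reproducible between runs.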
Nicotine model
Reaction coefficient | Value |
---|---|
0.0015 | |
0.0045 | |
0.028 | |
0.239 |
Parathion model
Parathion was chosen as a second contaminant, used to test the sensitivity of the sensor network to a contamination and a water quality model ‘unknown’ during the optimization phase. Unlike Nicotine, Parathion injections affect the background pH and alkalinity signals; combined with its effects on the background chlorine signal, Parathion is expected to provide a stronger signal in surrogate water quality parameters indicative of a contamination. Parathion is also a more toxic substance; while Nicotine has an LD50 of roughly 9.7 mg/kg (Mayer 2014), Parathion is highly toxic, with an LD50 of 2 mg/kg (Schwartz et al. 2014). Parathion should present a case in which the EWS will receive a strong, combined signal from all water quality parameters; however, given the low LD50, the system will require especially prompt detection to reduce the population affected. For a single EDA, however, the two contaminants require parameterizing the EDA such that it will recognize either contamination signal, while still maintaining a low false positive detection rate for both the Nicotine and Parathion contaminations.
Parameter name | Value | Description |
---|---|---|
Reaction rate coefficient between PA and HOCl | ||
Hydrolysis rate coefficient for PA | ||
– | ||
– | ||
Hydrolysis rate coefficient for PAO | ||
– | ||
– | ||
Reaction rate coefficient between PA and OCl | ||
Hydrolysis rate coefficient between PAO and OCl | ||
– | ||
First order chlorine decay rate |
Event detection algorithm
SPSA algorithm
Contaminant type | Network condition |
---|---|
Nicotine intrusion events | Deterministic background water quality |
Nicotine intrusion events | Uncertain background water quality |
Nicotine intrusion events | Nicotine signal for detection |
Nicotine and Parathion intrusion events | Uncertain background water quality |
Objective functions
To incorporate uncertainty in the background water quality, the objective function initially proposed by Babayan et al. (2005), to optimize WDS design under demand uncertainty, was amended to incorporate background water quality variability and the population affected following a contamination event. Babayan et al. (2005) transformed a stochastic chance constrained optimization problem into an equivalent deterministic problem by calculating the standard deviation of WDS junction pressures due to junction demand uncertainty. The standard deviation of junction pressures was then used to characterize the effect of demand uncertainty, and as a ‘safety factor’ for system designs. For example: ensuring that a junction's pressure was above a minimum threshold pressure plus two times the calculated standard deviation of that junction's pressure was equivalent to meeting a 95% chance constraint, as two standard deviations describe 95% of a value's variability.
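The ‘safety factor’ idea described above can be sketched in a few lines. This is only an illustration of the deterministic equivalent for the pressure example, not the amended objective used in this study; the function name `robust_margin_ok` and the use of sampled pressures to estimate the standard deviation are assumptions.

```python
import statistics


def robust_margin_ok(pressure_samples, minimum_pressure, k=2.0):
    """Deterministic surrogate for a chance constraint on junction pressure.

    The constraint is taken as satisfied when the mean pressure exceeds the
    minimum threshold by at least k standard deviations (k = 2 in the
    example given in the text).
    """
    mean_pressure = statistics.fmean(pressure_samples)
    std_pressure = statistics.pstdev(pressure_samples)
    return mean_pressure >= minimum_pressure + k * std_pressure
```

For instance, `robust_margin_ok([30, 32, 34], 25)` holds because the mean of 32 exceeds 25 plus two standard deviations (about 3.3), whereas a threshold of 30 would fail the same check.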
Minimizing the population affected (PA) objective (Equation (19)) requires that a sensor network promptly detects contaminations emanating from locations, and at times, that place large populations at risk of exposure. Conversely, contamination events with more benign input locations and times are proportionally less important for the sensor network to detect promptly. With a limited number of WQMSs to place in a WDS, there is an inherent tradeoff between quickly detecting contamination events and detecting all possible contamination events. Minimizing the population affected objective implicitly considers this tradeoff by attempting to quickly detect the contamination events that quickly affect large populations, while allowing longer detection times for contamination events that take longer to affect large populations.
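A minimal sketch of how a mean population affected score might be computed for a candidate sensor set is given below. It is not Equation (19); it assumes each event is described by a cumulative population affected curve indexed by timestep and by a per-sensor first detection step (`None` when a sensor never alarms), and that the network detects an event when its earliest sensor does. All names are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Event = Tuple[Sequence[float], Dict[str, Optional[int]]]


def population_affected(curve, detect_steps, sensors):
    """Population affected up to the earliest detection by any placed sensor."""
    steps = [detect_steps[s] for s in sensors if detect_steps.get(s) is not None]
    if not steps:
        return curve[-1]  # undetected: the full exposure is incurred
    return curve[min(min(steps), len(curve) - 1)]


def mean_population_affected(events: List[Event], sensors: Sequence[str]) -> float:
    """Average population affected across a suite of contamination events."""
    total = sum(population_affected(c, d, sensors) for c, d in events)
    return total / len(events)
```

Under this formulation, events whose exposure curves rise quickly dominate the mean, which is how the tradeoff described above enters the score.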
Optimization
An efficient noisy multi-objective messy genetic algorithm (GA) was employed to identify the set of network junctions that best detected contamination events. The employed GA used adaptive population sizing and generational contamination event suite re-sampling to improve the efficiency of the GA. Messy GA operators were chosen to allow the GA to add or remove sensors according to the best system performance. Unlike previous WQMS placement studies, no defined number of WQMSs, number of allowable WQMSs, or solution cost objectives were defined that actively constrained or minimized the number of WQMSs placed in the network. By minimizing the expected number of false positive detections, optimization removed sensors that did not reduce the population affected and only added false positive detections. Adaptive population sizing, as in Kollat & Reed (2006), improved efficiency in search, and ‘re-stimulated’ search by initializing new populations throughout the GA run and re-sizing the new populations according to the size of the current Pareto optimal set. Generational re-sampling was incorporated to improve the ability of solutions to generalize and perform well against uncertainty in contamination event characteristics. During each generation of the GA, candidate solutions were evaluated against a suite of randomly generated contamination events, and the mean performance across all contamination events was reported. This method is sensitive to the size of the contamination event suite used. By the law of large numbers, the mean performance over a small contamination event suite may be a poor estimate of the true expected performance (i.e. the mean over all possible contamination events). In this case, the generated solutions are likely to perform poorly against contamination events not included in the contamination event suite (overfitting).
Using a large contamination event suite will lead to better solution generalization; however, it will greatly increase the computational burden of the algorithm because every solution will need to be evaluated against a large number of contamination events. Prior to the optimization phase, a suite of 400 contamination events was randomly sampled, and during each generation of the GA, 100 contamination events were randomly sampled from the initial 400 and used for evaluation. This method permitted the use of a relatively small evaluation suite during each GA generation, while the re-sampling operator continuously exposed the GA to a diverse set of contamination events.
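The re-sampling scheme can be sketched as follows. The sketch assumes sampling without replacement at both stages, which the text does not state, and the helper names are hypothetical.

```python
import random


def make_event_pool(candidate_events, rng, pool_size=400):
    """Draw the fixed pool of contamination events before optimization starts."""
    return rng.sample(candidate_events, pool_size)


def generation_suite(pool, rng, suite_size=100):
    """Draw the evaluation suite used for a single GA generation."""
    return rng.sample(pool, suite_size)


def suites_for_run(candidate_events, generations, seed=0):
    """Yield one freshly re-sampled suite per generation from the same pool."""
    rng = random.Random(seed)
    pool = make_event_pool(candidate_events, rng)
    for _ in range(generations):
        yield generation_suite(pool, rng)
```

Each generation then costs 100 event evaluations per candidate solution, while over many generations the candidates are exposed to most of the 400-event pool.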
Messy GAs (Goldberg et al. 1989) are partitioned into two distinct phases: the primordial phase, and the juxtapositional phase. During the primordial phase, all possible combinations of variables are defined (up to some number of variables, k) and each combination is evaluated. The initial population used by the GA is generated by sampling combinations of variables generated during the primordial phase, according to their relative performance. This method initializes the GA with an initial population ‘primed’ with high performing variable combinations (building blocks). In this study k was set to one, such that the primordial phase of the GA would evaluate every single sensor location. During the primordial phase the SPSA algorithm was run for each sensor location before evaluating the performance of a sensor at that location. A single value was required to score the performance of each single sensor location prior to sampling building blocks into the GA's initial population. The heuristic loss function proposed above, Equation (16), was used to score the single sensor location performances.
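With k set to one, the primordial phase amounts to scoring each candidate location and seeding the initial population in proportion to those scores. The sketch below assumes that a lower loss is better and converts losses to sampling weights by subtracting them from the worst loss; this weighting rule is an assumption, as is the `loss` callable, which stands in for Equation (16) evaluated after the SPSA parameterization.

```python
import random


def primordial_population(locations, loss, pop_size, rng):
    """Seed a messy GA population with single-sensor building blocks (k = 1)."""
    scores = {loc: loss(loc) for loc in locations}
    worst = max(scores.values())
    # Lower loss is better, so the best location receives the largest weight.
    # The small constant keeps every weight strictly positive.
    weights = [worst - scores[loc] + 1e-9 for loc in locations]
    picks = rng.choices(locations, weights=weights, k=pop_size)
    return [[loc] for loc in picks]  # each solution holds one sensor location
```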
During the juxtapositional phase the GA operates similarly to a traditional GA: evaluating a population of solutions, sampling solutions according to their relative performance, crossing solutions together and mutating individual solutions, and lastly placing the newly generated solutions into the population of the next generation. However, in the case of a messy GA, crossover is replaced with a ‘cut and splice’ operator. The operator either cuts two solutions at some point within their respective solution strings and exchanges the portions of the strings around the cut points, or splices the two solutions by merging them end to end. For this study, the probability of a solution being cut was 0.9, and the probability of a solution being spliced was 0.1. Mutation was defined to redefine, add, or remove one value of a solution string following cut and splice (i.e. one WQMS within the solution string). The mutation rate was defined according to the size of the current population, such that a single instance of mutation was expected for each generation, and equal probabilities were assigned to adding, removing, or redefining a value of a solution. A dominance-based tournament of size 2 was chosen to select new solutions from the current GA populations. Tournament participants were compared to the current non-dominated set of solutions, and a solution that was non-dominated with respect to the current ‘optimal’ non-dominated set was assigned to ‘win’. In the event of a tie (both solutions ‘win’ or both solutions ‘lose’), a sharing function was used to calculate the density of nearby solutions (the sharing radius was set to 0.25), and the solution with the lowest local solution density was chosen as the tournament winner.
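The cut and splice step can be sketched as below, with solutions represented as lists of junction identifiers. The sketch treats cutting and splicing as mutually exclusive outcomes with probabilities 0.9 and 0.1, removes repeated locations from each child, and discards empty children; these three choices are assumptions rather than details confirmed by the text.

```python
import random


def _dedupe(solution):
    """Drop repeated sensor locations while keeping their first-seen order."""
    return list(dict.fromkeys(solution))


def cut_and_splice(parent_a, parent_b, rng, p_cut=0.9):
    """Return the children produced by one cut-and-splice operation."""
    if rng.random() < p_cut:
        # Cut: independent cut points allow the children to change length.
        i = rng.randint(0, len(parent_a))
        j = rng.randint(0, len(parent_b))
        children = [parent_a[:i] + parent_b[j:], parent_b[:j] + parent_a[i:]]
    else:
        # Splice: merge the two parents end to end.
        children = [parent_a + parent_b]
    return [c for c in (_dedupe(c) for c in children) if c]
```

Because the two cut points are drawn independently, children can hold more or fewer sensors than either parent, which is what lets the GA vary the number of WQMSs without a fixed sensor count.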
Figure 4 provides a schematic overview of the entire methodology.
CASE STUDIES
The methodology proposed above was tested on two distinct WDSs: the EPANET Net3 network (Rossman 2000), composed of 92 junctions, 2 reservoirs, 3 tanks, 2 pumps, and 117 pipes; and the Ky7 network (Jolly et al. 2014), composed of 481 junctions, 1 reservoir, 3 tanks, 1 pump, and 603 pipes. The Net3 network represents a network with primarily one-directional flow from sources towards consumers, while the Ky7 network represents a network with large amounts of mixing and multi-directional flow in the center of the WDS and a more branched layout along the periphery. For each WDS, chlorine was input and varied at the source of the WDS. Figure 5 shows a plot of the respective WDSs, the chlorine input locations, and the daily average network-wide chlorine concentrations following the 12-day simulations. For the Net3 network, the mean chlorine concentration across all network junctions was 2.55 mg/L with a standard deviation of 0.32 mg/L, and for the Ky7 network, the mean chlorine concentration across all network junctions was 2.374 mg/L with a standard deviation of 0.497 mg/L.
For each network, the optimization scenarios defined in Table 4 were conducted. Following optimization, final evaluation was conducted against 1,000 newly generated contamination events. Each contamination event can take place at any junction in the network, at any time of the day, with a total mass of between 0 and 5 kg input instantaneously into the system. Each proposed WQMS network was exposed to the contamination events, and the population affected prior to detection was reported. If the affected population was less than 0.5% of the total population served, the solution was deemed to ‘pass’; otherwise it was deemed to ‘fail’. Solutions were compared based on the percentage of the 1,000 contamination events for which the solution ‘passed’. For the previously developed solutions, the EDA parameters calculated using the SPSA algorithm were assigned to the sensors for evaluation. Supplementary data including all MATLAB files and EPANET files can be found at: https://www.dropbox.com/sh/8iitq658g547t1o/AABH61Ak1LHR3nn_k4GTucpWa?dl=0.
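The pass/fail evaluation reduces to a simple fraction, sketched below under the assumption that the population affected prior to detection has already been computed for each evaluation event. The function name `pass_rate` is hypothetical.

```python
def pass_rate(affected_per_event, total_population, limit_fraction=0.005):
    """Fraction of events in which fewer than 0.5% of consumers are affected."""
    limit = limit_fraction * total_population
    passed = sum(1 for affected in affected_per_event if affected < limit)
    return passed / len(affected_per_event)
```

For a served population of 10,000 the limit is 50 people, so affected populations of 10, 49, 51, and 500 across four events give a pass rate of 0.5.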
Parameter | Value |
---|---|
20 | |
0.602 | |
0.101 | |
1 | |
2.5 | |
4 |
RESULTS
The proposed optimization method (proposed GA) proved successful in developing a network of WQMSs for contamination event detection using surrogate water quality parameters. Water quality monitoring stations were placed at locations which reduced the expected population affected, while simultaneously reducing the expected number of false positive detections. These two objectives also proved effective in governing the number of sensors placed in the network. Traditionally, sensor placement has been performed under the paradigm of a network coverage problem, where more sensors equate to better event detection performance. Given that sensors also provide false positive detections, more sensors may not equate directly to better system performance. Although more sensors are likely to improve the detection capabilities of a contamination early warning system, more sensors are also susceptible to more false positive detections, due to non-contamination event anomalies in the water quality signals, and thus reduce the dependability of the contamination signal. Increased false positive detections may be overcome through proper EDA parameterization; however, they are an important factor to consider in the design of an early warning system. In general, the solutions that provided the fewest false detections had the fewest sensors in the network.
Final evaluation of the solutions developed herein and from previous studies showed discrepancies in performance between the solutions designed as a part of this study and classical solutions developed via detection of a conservative contaminant (TEVA-SPOT and the Ohar et al. (2015) solutions). In many cases, the classical solutions provided the best performance with respect to the fraction of contamination events that were detected before an allowable population was affected; however, these solutions led to unreliable false positive detection rates, often around 0.2. When evaluated against Nicotine (the same contaminant used during optimization), the solutions developed without considering background chlorine uncertainty provided performance roughly equal to the classical solutions, as shown in Figure 6(a), indicating that the variability in background water quality leads to false positive detections. In this work, only solutions developed with consideration given to background water quality variability were able to achieve reliable false positive detection rates (Figure 6(a)–(c)). When considering a second contaminant (Parathion), the systems generally performed worse (Figure 6(b)). As expected, incorporating both contaminants into the optimization phase improved the solution quality for both Nicotine and Parathion contamination intrusions. However, caution should be taken in considering the generalizability of the sensor networks, given that the two contaminants used different water quality models. The increase in false positive detection rates observed in the Parathion evaluations is caused by different background water quality levels, which are expected to have arisen from the water quality model's characteristics, as opposed to background water quality. Sensor layouts are presented in Figure 7, with respect to the performance of each sensor layout.
The SPSA algorithm proved highly efficient for determining a good parameterization of the del threshold value and window size. As seen in Figure 8, the solution ROC (receiver operating characteristic) plots were developed by varying the del threshold from 1% of the original del value to 100% more than the original del value assigned by the SPSA algorithm (at 10% increments, i.e. 10%, 20%, 30% … 190%, 200%). The ratio of true positive detections (y-axis) and the ratio of false positive detections (x-axis) were then plotted for each instance of the del threshold. The ROC plots show the difficulty of contamination event detection in WDSs. Ideally, a ROC plot will have an area beneath the curve of 1, with the optimal points located in the upper left portion of the ROC curve (a true positive detection ratio of 1, with a false positive detection ratio of 0). In many cases, the SPSA parameterization corresponds to a point far from what would be expected to be optimal; however, it is a ‘near-optimal’ point for the respective ROC plot.
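The threshold sweep behind such ROC plots can be sketched as follows. The sketch follows the enumerated 10% to 200% steps and assumes a simplified detector that raises an alarm whenever a scalar anomaly score exceeds the scaled threshold; the study's EDA also uses a window size, which is omitted here, and all names are hypothetical.

```python
def roc_points(event_scores, background_scores, base_threshold, factors=None):
    """Sweep a detection threshold and return (factor, FPR, TPR) triples.

    event_scores: anomaly scores observed during true contamination events.
    background_scores: anomaly scores observed with no contamination present.
    """
    if factors is None:
        factors = [k / 10 for k in range(1, 21)]  # 0.1, 0.2, ..., 2.0
    points = []
    for factor in factors:
        threshold = factor * base_threshold
        tpr = sum(s > threshold for s in event_scores) / len(event_scores)
        fpr = sum(s > threshold for s in background_scores) / len(background_scores)
        points.append((factor, fpr, tpr))
    return points
```

Plotting the second element of each triple against the third reproduces the ROC layout described above, with the point at factor 1.0 marking the threshold assigned by the parameter search.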
Similar results were observed for the Ky7 network (Figure 9), where sensors were again placed to best detect a Nicotine contamination according to the response in chlorine, pH, and alkalinity. Again, the solutions developed to incorporate water quality uncertainty performed best when evaluated against random contamination events with potentially uncertain water quality (Figure 9). The solutions developed to detect only Nicotine performed worst, with solutions developed according to deterministic surrogate water quality parameters performing better than the Nicotine detection case, but worse than the case incorporating background water quality uncertainty. Unlike the Net3 case study, in the Ky7 network the deterministic optimization case led to a larger number of potential solutions than the uncertain optimization case. Many of the solutions developed in the deterministic case performed quite well during evaluation with respect to the uncertain solutions. Interestingly, only during the deterministic optimization phase were solutions developed which allowed high false positive detection rates, >0.5. This is indicative of the deterministic optimization case being able to develop sensor networks that result in lower population affected than the uncertain case, however, the solutions are highly sensitive to potential background water quality uncertainty. Figure 10 provides the sensor layouts calculated for the Ky7 network alongside the respective layout's performance.
Receiver operator curves were generated for the Ky7 solution, similarly to the solutions evaluated from the Net3 network. Opposed to the Net3 network, the analysis showed very difficult ROC curves for the solutions generated for the Ky7 network, and the EDA parameters chosen using the SPSA algorithm were often located at sub-optimal points along the curve. Interestingly, the ROC curve shows that the parameters assigned using the SPSA algorithm are often located at the worst possible parameterizations, points that provide the worst performance with respect to the likelihood of false positive detection with zero benefit to the likelihood of contamination event detection (Figure 11). As a secondary analysis, the EDA parameters were varied and evaluated with respect to the affected population, shown in Figure 12. This analysis showed the effectiveness of the SPSA method, as the SPSA parameterizations were best able to span the entire solution space. All other assigned parameterizations cluster solutions within specific reigns of the solutions space, for example, a del value of ½ the SPSA value leads to worse detection performance, and a del value of ¼ the SPSA value leads to a large number of false positive detections regardless of the sensor layout. Increasing the SPSA del value by a factor of 1.5 and 2 leads to worse event detection performance, with relatively no benefit in the false positive detection rate. The discrepancy in performance, according to Figures 11 and 12, is believed to be due to the non-trivial relationship between ROC evaluation measures, event detection likelihoods, and the evaluation measures used in sensor network design. Care should be taken in evaluating event detection algorithms with standard ROC curves, as optimal performance with respect to a ROC curve may not lead to optimal performance with respect to contamination event detection metrics.
A number of sensitivity analyses were performed for the Net3 solutions to determine how the solutions developed within this study under naïve conditions would perform in more realistic scenarios. Accordingly, all solutions were evaluated under three conditions: a continuous 12-hour contaminant injection with a total mass input equal to the baseline instantaneous contamination input; consumer demands randomly sampled within 15% of the deterministic model's demand level for each instantaneous contamination event; and an EDA observing only chlorine and pH water quality data. The results of these evaluations are shown in Figure 13.
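The first two sensitivity conditions can be sketched as follows. This is a minimal illustration under stated assumptions, not the simulation code used in the study: the text specifies only that demands were sampled within 15% of the deterministic level, so the uniform distribution and the independent per-node multipliers are assumptions, and the mass and demand values are hypothetical.

```python
import numpy as np

def continuous_injection_rate(total_mass, duration_hours):
    """Constant mass rate that delivers total_mass over duration_hours."""
    if duration_hours <= 0:
        raise ValueError("duration_hours must be positive")
    return total_mass / duration_hours

def sample_demands(base_demands, max_fraction=0.15, rng=None):
    """Perturb each nodal demand by a factor in [1 - max_fraction, 1 + max_fraction].

    Assumption: uniform, independent multipliers per node; the source does not
    state the sampling distribution.
    """
    rng = np.random.default_rng() if rng is None else rng
    base = np.asarray(base_demands, dtype=float)
    multipliers = rng.uniform(1.0 - max_fraction, 1.0 + max_fraction, size=base.shape)
    return base * multipliers

# Hypothetical values: total injected mass and three nodal base demands.
rate = continuous_injection_rate(total_mass=1200.0, duration_hours=12.0)
base_demands = [10.0, 25.0, 40.0]
sampled = sample_demands(base_demands, max_fraction=0.15, rng=np.random.default_rng(0))
print(rate, sampled)
```

Holding the total mass fixed while spreading the injection over 12 hours lowers the input rate, which is why this condition tests detection of longer, weaker contamination signals.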
The results of these sensitivity analyses provide valuable insight into the proposed systems. First, the detection performance of the sensor networks generally decreases when considering longer contamination events with lower mass input rates and uncertain demands. Specifically, uncertain demand conditions are shown to greatly increase the number of false positive detections for all solutions, and are important to consider when using surrogate water quality parameters for event detection. Using only the chlorine and pH water quality signals greatly reduces the false positive detection rates of all solutions. Simplifying the EDA to use as few water quality data streams as possible can reduce system complexity and improve performance; however, it may limit the detection capabilities to fewer reactive contaminants.
CONCLUSIONS
The simulations herein have provided an initial investigation into the influence of uncertain water quality on the placement of water quality monitoring stations. Using surrogate water quality parameters to identify a contamination event combines the signal-processing task of detecting the presence of a contaminant with the sensor placement task of best exposing sensors to potential contaminations. The sensor networks developed herein have been optimized to best detect a contamination event under conditions of background water quality uncertainty.
The method proposed herein successfully placed sensors to best detect contamination events. Minimizing the affected population in excess of an allowable affected population, while also minimizing the expected number of false positive detections for a contamination event, provided efficient sensor solutions for contamination event detection. Because no constraint was imposed on the number of sensors within a sensor network, the proposed GA only placed a sensor if it reduced the affected population with a minimal increase in the potential for false positive detections. This framing of the sensor placement problem provides a unique method to best ensure that only the most efficient sensors are placed in the water distribution system.
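The trade-off between the two objectives can be illustrated with a simple non-dominated filter. This sketch is not the GA used in the study; it shows only the dominance comparison that underlies a two-objective minimization, and the candidate values (affected population, expected false positive detections) are hypothetical.

```python
def non_dominated(solutions):
    """Keep solutions that no other solution beats in both objectives.

    Each solution is a tuple (affected_population, expected_false_positives);
    both objectives are minimized.
    """
    front = []
    for i, a in enumerate(solutions):
        dominated = any(
            b[0] <= a[0] and b[1] <= a[1] and (b[0] < a[0] or b[1] < a[1])
            for j, b in enumerate(solutions)
            if j != i
        )
        if not dominated:
            front.append(a)
    return front

# Hypothetical candidate sensor layouts.
candidates = [(5000, 0.1), (3000, 0.3), (3000, 0.6), (6000, 0.2), (1000, 0.9)]
front = non_dominated(candidates)
print(front)
```

Under this comparison, a layout with an extra sensor survives only if that sensor lowers the affected population enough to offset any added false positive detections relative to the other candidates.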
Recent developments in event detection in water distribution systems have shown the desire to reduce the rate of false positive detections produced by an event detection algorithm. Identifying the best locations to place water quality monitoring stations by explicitly considering the water quality signal observed at each location in the WDS has been shown to provide lower false positive detection rates and strong performance with respect to the population affected by the contamination, even when employing a basic local EDA. However, using surrogate water quality signals to place sensors in a WDS depends heavily on the fidelity of the water quality model used. In this study, the solutions developed using the Nicotine water quality model were shown to outperform previous benchmark solutions, deterministic solution models, and contaminant-based sensor placements; although the solution performance degrades when evaluated against a second water quality model, the same trends are observed.
As a preliminary study in EDA-based sensor placement, this work has numerous limitations and suggests several directions for future work. To begin with, of the three water quality parameters chosen, pH and alkalinity are highly correlated, and this correlation can be observed in Figure 3. In the future it would be beneficial to incorporate either the pH or the alkalinity and employ a third measure that is not correlated with the free chlorine or the pH, or to simplify the model and use only free chlorine and pH or alkalinity, which has been shown to provide strong performance for Nicotine detection (Figure 13). Secondly, the EDA used herein is quite naïve; state-of-the-art integrated system-wide EDAs, or more intelligent algorithms which distinguish water quality anomalies caused by normal operation from ‘unknown’ anomalies (Hart & McKenna 2009; Romano et al. 2014), should be incorporated into the proposed framework. Given the modularity of the proposed method, any event detection algorithm can be incorporated for the event detection task. An integrated system-wide EDA used during the sensor placement phase would be expected to place sensors at locations that detect contamination even better, with reduced false positive rates given the water quality signal observed. Such an EDA can facilitate intelligent recognition of false positive events caused by normal operation to reduce the false positive rate. In this study, sensor networks were designed and EDA parameters were prescribed based only on instantaneous contamination event simulations. It would be valuable to consider variable contamination event characteristics in future work, specifically for prescribing EDA parameters.
Considering longer duration contaminations may improve the system-wide total event detection performance when contaminations do not travel through the WDS as a clear contaminant ‘pulse.’ Additionally, uncertainty in consumer demands should be integrated into the contamination event simulations and the system design phase to ensure that the sensor networks are robust to variable network hydraulics, and thus variable contaminant transport times. Demand uncertainty can be easily integrated into the proposed methodology using the proposed objective function (Equation (19)) by simultaneously simulating uncertain demand realizations alongside uncertain chlorine input realizations. Lastly, new objectives can be incorporated into the optimization framework for additional design criteria. For example, an additional objective could be incorporated to best locate a contamination event following detection, integrating the contamination event detection and localization problems into the sensor network design phase. Given the widespread availability of high-powered computing, fully integrating event detection and event localization tasks within the sensor placement task, even with complex water quality simulations and incorporation of uncertainties, has become more feasible.
ACKNOWLEDGEMENTS
This study was supported by the United States-Israel Binational Science Foundation (BSF), by the Technion Funds for Security Research, by the joint Israeli Office of the Chief Scientist (OCS) Ministry of Science, Technology and Space (MOST), and by the German Federal Ministry of Education and Research (BMBF), under project no. 02WA1298.