Intentional or unintentional chemical contamination of water distribution systems (WDSs) could have severe health and socio-economic consequences. High-potency chemicals constituting, in essence, “super poisons” have the potential to be used in such intrusion scenarios. Some of these contaminants are capable of killing the victim in less than 1 h. Due to their high toxicity levels and short time from exposure to onset of symptoms, 911 call centers are likely the first point of contact for victims or their families with the authorities. Information such as 911 calls could be used to identify the ongoing event and potential intrusion locations. In this way, such emergency calls could function as an intrusion warning system. This study employs network hydraulic modelling to synthesize the 911 call patterns in the aftermath of such events. It then defines the scenarios as a multi-label pattern recognition problem. The synthesized data were then used to train a Convolutional Neural Network (CNN). The trained AI was applied to a real-world WDS with approximately 4,000 km of pipe and 26,000 demand nodes. The results indicated that the CNN is capable of accurately recognizing the pattern and pinpointing the originating location of the intrusion with an accuracy greater than 93%.

  • Intentional chemical intrusion in water distribution systems (WDSs) is a significant system vulnerability.

  • No sensory system is currently available that can detect synthetic opioids in WDSs online.

  • 911 call logs were used to define the detection of chemical intrusion as a pattern recognition problem.

  • Deep learning AI, trained by water quality simulation outcomes, can detect these events and locate the point of intrusion.

In the aftermath of the 9/11 terrorist attacks, the possibility of non-state actors using a water distribution system (WDS) as a means to kill and terrorize a civilian population received increased attention and resulted in the development of various modeling and analytical tools such as CANARY (Hart & McKenna 2012). Although WDSs were recognized as potentially vulnerable, most professionals perceived the risk as low because, to be effective, a chemical or biological agent must be (a) produced in sufficient quantities, (b) appropriate for water dissemination, (c) infectious/virulent/toxic, and (d) effective over time and treatment (if introduced into the system upstream of a water treatment plant) (Gleick 2006). These four barriers may have been lowered by the introduction of ‘super poisons’ such as synthetic opioids which could be seen as a surrogate for all substances that have a very high impact relative to mass ratio.

Some of these synthetic opioids are thousands of times more potent than morphine (UNODC 2017). In the absence of concrete observations, the theoretical dose lethal to humans can be in the order of nanograms per person. If absorbed through the digestive tract (from drinking) and/or through ocular or dermal tissues (e.g., while taking a shower), overdose symptoms (e.g., immobility, unresponsiveness, and finally respiratory arrest) could commence in less than an hour. Currently, not much is known about the practicality of disseminating these opioids through the water supply, their effect over time, or how they survive water treatment. It is clear, however, that with such a small lethal dose, the logistical barriers may no longer hold. Two more elements act to augment this threat, namely the abundance of these substances and their precursors (The Canadian Press 2017) and the relative ease with which they can be introduced into a WDS. Their high abundance is part of the reason why synthetic opioids are increasingly replacing less potent drugs such as heroin and morphine in the opioid market (Misailidi et al. 2018). Recently, the release of synthetic opioids for malicious purposes has been explored by Cibulsky et al. (2023). As for the introduction of poison into a WDS, this proposition has a long history (Gleick 2006) and is not a difficult task if the poison is very potent and lethal in small amounts. Even moderate-size WDSs encompass hundreds, if not thousands, of kilometers of pipes as well as thousands of potential entry points.

If poison is introduced into the WDS, it is imperative to understand the nature of the substance as well as identify the location(s) at which it was introduced. Previous research was largely focused on back tracking (through inverse calculations) to locate the contamination source(s). These techniques are premised on the assumption that several sensors are actively collecting water quality data across the network and are able to identify a chemical substance as poison or a specific degradation by-product. As soon as any contamination is detected by these sensors, a computer model will generate many contamination scenarios numerically and randomly, and search for the closest scenario that produces water quality data similar to the sensor(s) data pattern. Often an optimization algorithm would guide the search to minimize the differences between the simulated water quality indicators and the indicators obtained by sensory signals. Among these, Zierolf et al. (1998) developed an input/output model for chlorine transport. This was further extended by Feng et al. (2002) and De Sanctis et al. (2010) into a particle backtracking algorithm. This problem was also addressed through the use of nonlinear programming (Laird et al. 2005; Pal & Kant 2015; Oliveira et al. 2018), mixed-integer quadratic programming (Laird et al. 2006; Ostfeld et al. 2008; Mann et al. 2012; Palleti et al. 2016; Adedoja et al. 2020), Genetic Algorithm (Preis & Ostfeld 2007; Preis & Ostfeld 2008; Preis & Ostfeld 2008; Hu et al. 2015; Ohar et al. 2015; Rathi & Gupta 2017; Vrachimis et al. 2020; Yan et al. 2021), other evolutionary optimization techniques (Cristo & Leopardi 2008; Liu et al. 2008; Liu et al. 2011; Kumar et al. 2012; Bazargan-Lari 2014; Yan et al. 2016; Yan et al. 2017; Yan et al. 2020), and statistical and probabilistic approaches (De Sanctis et al. 2008; Hilbe 2009; Preis & Ostfeld 2011; Perelman & Ostfeld 2013; Yang & Boccelli 2014; Seth et al. 2016). 
Back tracking techniques suffer from uncertainty in the data used for calculation, particularly in water flow and sensor report time (Marlim & Kang 2020). However, the most significant issue with these methods is the existence of the sensors, or rather lack thereof. Despite significant efforts since 9/11 to develop a sensory system, and the aforementioned studies, no sensory system currently exists that can detect the variety of chemical (possibly weaponized) contaminants inside WDSs. Some biological toxicity sensors have been developed, based on the premise that toxins that kill microorganisms (or fish) are potentially toxic to humans as well. However, these sensors cannot operate in water containing disinfectant residuals (that are by design toxic to microorganisms) and are therefore not suited for WDSs.

Anomaly detection methods were also used to detect contaminants that could affect general water quality parameters such as chlorine, pH, total organic carbon, temperature, electric conductivity, alkalinity, and turbidity (Kessler et al. 1998; Green et al. 2003; Ostfeld & Salomons 2004; Ostfeld et al. 2004; Tao et al. 2012; Perelman et al. 2012; Arad et al. 2013; Oliker & Ostfeld 2013; Schwartz et al. 2013; Lambrou et al. 2014; Oliker & Ostfeld 2014; Eliades et al. 2014; Oliker & Ostfeld 2014, p. 12; Housh & Ostfeld 2015; Mohammed et al. 2018; Tinelli & Juran 2019; Barros et al. 2022). These methods rely on the data gathered by actual sensors across the network. A baseline would be created using the recorded data and any deviation from the baseline would be evaluated against a specific criterion to determine if the variation is an anomaly. Various machine learning techniques, or an ensemble of them, have been used in this approach. A variation of this methodology forecasts water quality parameters in the next time step (Maiolo & Pantusa 2018; Ramotsoela et al. 2019; Sun et al. 2019; Grbčić et al. 2020; Fasaee et al. 2021; Grbčić et al. 2021; Lučin et al. 2021; Li et al. 2022) and if the predicted value differs from the actual value it is classified as an anomaly. These techniques could be used for contaminants that are affecting general water quality parameters. However, it is reasonable to assume that low-dosage (PPB) extremely potent substances are not very likely to affect these general water quality parameters in a significant manner. It is worth noting that more research is needed to understand how super-potent substances react in water and impact water quality parameters. Additionally, anomaly detection methods rely on previously detected events to train their algorithm or form their baseline. Therefore, their effectiveness could be compromised if a new contaminant, not previously known, is used.

Both anomaly detection and back tracking techniques require proper or optimized sensor placement across the network. This has been addressed by many researchers (including Berry et al. 2005; Dorini et al. 2006; Krause et al. 2006; Wu & Walski 2006; Isovitsch & VanBriesen 2008; Shen & McBean 2011; Wu & Roshani 2014; Adedoja et al. 2019; Rodriguez et al. 2021; Giudicianni et al. 2022). Various algorithms such as heuristic models, Monte Carlo simulations, non-dominated sorting genetic algorithm, and hyper cube are used to achieve the best sensor arrangement in these studies. The main objective of these approaches is to maximize the coverage or efficiency of each sensor to minimize the number of required sensors. Data obtained from currently available water quality sensors such as chlorine concentration, pH, temperature, DO, conductivity are insufficient to detect a potential chemical intrusion.

Needless to say, the most effective means to locate and identify chemical intrusions is sensors (i.e., essentially lab-on-a-chip) capable of real-time identification of the substance, which are placed in multiple locations throughout a WDS. Currently, however, in the absence of such sensors, ‘softer’ methods can be used in the interim. This research is focused on one such interim method. The proposed approach is premised on the assumption that a relevant breach will be followed by a wave of 911 calls. The call data logs (i.e., the location and time stamp associated with each call) can be analyzed in real time and used to identify the intrusion and discern useful knowledge about locations, rate of spread, etc. The proposed approach requires several simplifying assumptions, the veracity of which would require verification (or adjustments) in future research. This research proposes a methodology that uses artificial intelligence (AI) to detect call patterns that are similar to the dispersion of chemicals inside WDSs.

A substance introduced into a WDS will be carried by the water flow and its spread will be governed by the flow pattern in the network. If the substance is a toxin, water users' exposure to it will therefore be largely governed by the flow pattern (which is time-dependent) and by their location relative to the topology of the distribution network. It is assumed that exposure to a high potency toxin (through ingestion, dermal contact, or mucous membrane exposure) would lead to rapid potential adverse health effects, and symptoms of overdose followed by calls to emergency management units such as 911 centers in North America. These call centers automatically log the location and time stamp of each call. Given enough calls, these call logs could be used to train a pattern recognition algorithm. This algorithm could be designed to identify a waterborne event, such as an intrusion, and then locate its source(s).

For simplicity, we initially ignore the realistic possibility that not all affected cases will trigger an immediate 911 call from the location of exposure (some people may go home first, go to a clinic, to emergency, etc.). However, these cases may be considered in future research, whereby information from clinics and emergency wards is fused with information obtained from 911 calls.

Real-world call patterns related to chemical intrusions are not available in 911 records because no event of this sort has occurred since such emergency telecommunications schemes have been in place. However, reasonable call patterns could be synthesized using hydraulics and water quality models such as EPANET (Rossman 1994).

The ‘bird's eye view’ of the proposed approach is as follows. For a given water distribution network we generate, using EPANET, a large number (millions) of scenarios of contamination events, with varying combinations of contaminant injection timing, duration, location, quantity injected, etc., to produce a large library (or database) of contamination patterns. Subsequently we simulate a contamination event with its associated pattern of 911 calls. This 911-based pattern is then compared to the database of scenarios to find a matching hydraulics-based pattern. The best match points to the scenario most likely to have generated the simulated contamination event.

The assumption underlying the synthesized patterns is that the spatial and temporal patterns of the calls follow, with a time lag, the contamination patterns. This lag time delays the arrival of the information and for simplicity is assumed to be constant and therefore would not change the associated patterns (and thus could be subtracted from the timing data or ignored).

Not much is publicly known about synthetic opioids' diffusivity in water or their reactions with chemicals found in municipal water and with the various materials found in the network components. For simplicity, it is assumed that these substances have a relative diffusivity of one (i.e., they diffuse in water similarly to chlorine). Under a chemical intrusion scenario, many variables related to the intrusion execution (termed here ‘intrusion vector’) are unknown, such as the source location(s) in the network, injection start time and duration, the amount of chemical used, injection patterns, etc. However, certain logical assumptions could reasonably be made to account for the uncertainties associated with these variables. These include the following:

  • 1- The intrusion could happen at any demand node in the network. However, only one intrusion source location at any time is considered in this study. The intrusion location is selected randomly and all potential points of insertion are considered equally likely to be selected for intrusion. This is an initial simplifying assumption (in reality some locations would be preferred over others), discussed in more detail in the discussion section.

  • 2- If these synthetic substances are ingested, it is assumed to take approximately 1 h for the symptoms to commence (Cibulsky et al. 2023).

  • 3- The minimum intrusion duration was arbitrarily assumed to be 15 min.

  • 4- The maximum intrusion duration was arbitrarily (and conservatively) assumed to be 6 h. This value is conservative because, given such an intrusion, the 911 center will likely be inundated with calls within very few (<< 6) h and, once inundation happens, incoming data are no longer useful.

  • 5- In the scenario generation process, an intrusion duration is assumed to be uniformly distributed between the minimum and maximum values.

  • 6- The time step used in our study for generating simulated scenarios was 15 min. An intrusion could commence at any time step during a 24-h period (i.e., 96 different time points during a day).

  • 7- The process of generating a scenario is as follows: (a) hydraulic simulation begins at 12 AM; (b) contaminant injection occurs randomly at one of the 96 possible points in time during the next 24 h; (c) after injection, hydraulic simulation continues with a chemical component added at the point of injection; (d) hydraulic simulation (using EPANET's water quality computational engine) with added chemicals continues for the (random) duration of the injection; and (e) hydraulic simulation ends at the end of the duration of injection, at which point all the nodes in the network with a contaminant concentration greater than the lethal dose of contaminant are recorded as an image and stored as a single scenario in the database. This process is used to generate the entire database of scenarios. The shortest scenario is thus 15 min (occurring at 12 AM with a 15 min injection duration) and the longest is 30 h (occurring at 12 AM the next night with a 6-h injection duration).

  • 8- The amount of contaminant injected in any one scenario is assumed to be uniformly distributed and ranges between 1 and 40 kg. This amount is assumed to be injected subject to some pattern (i.e., the injection rate can vary every 15 min during the injection period). The rates in this pattern are generated randomly.

  • 9- It is assumed that the victims stay in the same place where they were exposed to the contaminant. This is done to simplify the problem. Considering the short period of time from exposure to symptoms expected with a super-poison, this assumption is deemed reasonable for an initial effort but should be verified (and possibly adjusted) in future research.
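The sampling assumptions listed above can be illustrated with a minimal Python sketch. It is not the authors' generator: the names `IntrusionScenario` and `sample_scenario` are hypothetical, the duration is assumed to be drawn on the 15-min grid, start indices are assumed to run from 0 (12 AM) to 95, and the random injection pattern is assumed to be a set of random weights rescaled to the sampled total mass. The EPANET simulation itself is not shown.

```python
import random
from dataclasses import dataclass

STEP_MIN = 15                               # simulation time step (minutes)
STEPS_PER_DAY = 24 * 60 // STEP_MIN         # 96 candidate start times
MIN_DURATION_STEPS = 1                      # 15 min
MAX_DURATION_STEPS = 6 * 60 // STEP_MIN     # 6 h = 24 steps
MIN_MASS_KG, MAX_MASS_KG = 1.0, 40.0        # injected mass range


@dataclass
class IntrusionScenario:                    # hypothetical container
    node_id: str
    start_step: int                         # 0 corresponds to 12 AM (assumption)
    duration_steps: int
    total_mass_kg: float
    rate_kg_per_step: list                  # mass injected in each 15-min step


def sample_scenario(demand_nodes, rng):
    """Draw one intrusion vector under the stated simplifying assumptions."""
    node_id = rng.choice(demand_nodes)                  # all nodes equally likely
    start_step = rng.randrange(STEPS_PER_DAY)           # uniform over 96 steps
    duration_steps = rng.randint(MIN_DURATION_STEPS, MAX_DURATION_STEPS)
    total_mass = rng.uniform(MIN_MASS_KG, MAX_MASS_KG)
    # Assumed pattern model: random positive weights rescaled to the total mass.
    weights = [rng.random() + 1e-9 for _ in range(duration_steps)]
    scale = total_mass / sum(weights)
    rates = [w * scale for w in weights]
    return IntrusionScenario(node_id, start_step, duration_steps, total_mass, rates)
```

Calling `sample_scenario(["J1", "J2", "J3"], random.Random(0))` returns one such vector; in a full pipeline each vector would be passed to a water quality simulation to obtain the node arrival times.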

Millions of intrusion scenarios with various combinations of input parameters are generated randomly and simulated using EPANET. The arrival time of the contaminant at each node is recorded in a matrix. The number of generated scenarios is based on the requirement of the AI algorithm used for the pattern recognition and is discussed in more detail in the provided example. The water quality matrix is then converted into an image, as illustrated in Figure 1. A 600 × 600 pixel image provides sufficient resolution to represent most large water networks, including the case study system for this research with 4,000 km of pipes. The image size was selected to accommodate the computational limitation of the AI algorithm (Howard et al. 2019; Sandler et al. 2019). A larger image would provide more information but would also (significantly and unnecessarily) increase the computational burden.
Figure 1

Image created based on the water quality matrix and representing a contamination plume (red dots, for clarity) in the network (gray nodes represent demand nodes).

Figure 1 illustrates a typical snapshot image of a contaminated network. The contaminated nodes are represented by gray levels in each image. The shade of gray denotes the time step at which the node was first contaminated within the 30-h window. In a grayscale image, each pixel can take up to 256 values. However, for 30 h of simulation with a 15-min time step, only 120 values (i.e., 30 × 4 time steps) are needed. Therefore, the time step values are scaled up to achieve a better contrast between the various recorded gray levels. The pixel encoding can be summarized as follows: zero indicates that a pixel is not a node, unaffected nodes (contaminant concentration less than the lethal dose) are recorded as 55, and the time steps are represented by values from 56 to 255.
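The pixel encoding described above can be sketched in a few lines of Python. The exact scaling used by the authors is not given, so a linear map from the 120 time steps onto the gray levels 56 to 255 is assumed here; the function names and the `node_pixels` / `first_step` dictionaries are hypothetical.

```python
NOT_A_NODE = 0                      # pixel does not contain a node
UNAFFECTED = 55                     # node below the lethal concentration
FIRST_LEVEL, LAST_LEVEL = 56, 255   # gray levels reserved for time steps
N_STEPS = 120                       # 30 h x 4 steps per hour


def step_to_gray(step):
    """Map a first-contamination time step (1..120) to a gray level (56..255).

    Linear scaling is an assumption; the source only states the value range.
    """
    if not 1 <= step <= N_STEPS:
        raise ValueError("time step outside the 30-h window")
    span = LAST_LEVEL - FIRST_LEVEL
    return FIRST_LEVEL + round((step - 1) * span / (N_STEPS - 1))


def encode_image(node_pixels, first_step, size=600):
    """Build a size x size grayscale image as a list of rows.

    node_pixels: node id -> (row, col) pixel position (hypothetical input)
    first_step:  node id -> time step of first contamination (absent if clean)
    """
    image = [[NOT_A_NODE] * size for _ in range(size)]
    for node, (row, col) in node_pixels.items():
        step = first_step.get(node)
        image[row][col] = UNAFFECTED if step is None else step_to_gray(step)
    return image
```

Under this assumed scaling the 120 time steps map to 120 distinct gray levels, with step 1 stored as 56 and step 120 stored as 255.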

It is worth noting that a water distribution network is a graph/vector dataset, and converting it to a 360,000-pixel image (i.e., a scalar dataset) may result in losing some data. This is more obvious in a large and detailed network such as the one used as the case study here. Nodes that are physically close to each other, but not necessarily connected hydraulically, could merge in the image (i.e., more than one node might fall in one pixel). To prevent overlapping nodes, a geometrical algorithm was developed that shifts the node coordinates in the image while preserving their relative geographical locations. For brevity, this algorithm is not discussed here.

Call centers receive thousands of calls on a daily basis. These are typically related to location- and time-specific events such as fire, crime, traffic accidents, and follow certain spatial and temporal patterns that are generally different from a waterborne contamination event. These unrelated calls are not expected to stop during a waterborne intrusion and therefore would continue even while a contamination episode unfolds. Moreover, the proposed methodology only uses call location and time stamp to maintain the caller's privacy, therefore the call content cannot be used to filter out unrelated calls. As a consequence, in the simulation of the synthesized intrusion scenarios, these calls are considered (and simulated) as ‘noise’ (that is, a base level of background call activity unrelated to the contamination event). The spatial and temporal distribution of these unrelated daily calls are discerned from the 911 call logs and confirmed through consultation with dispatch operators. These pre-existing patterns are then added to the images representing the waterborne intrusion (Figure 2). Due to security and privacy restrictions, more details regarding the spatial and temporal distribution of these (noise) calls could not be provided here.
Figure 2

Noise and contamination plume. Green dots represent unrelated calls (noise).

Since the maximum intrusion duration is 6 h, only the background call pattern in the 6 h immediately before the end of an intrusion scenario is combined with the patterns generated from water quality modeling to create the final image (Figure 2). Additionally, thousands of images were generated based on noise patterns without any contamination plume data (water quality modeling outputs) and added to the images database. This is done to train the pattern recognition algorithm (discussed next) for scenarios where there is no chemical intrusion.

Identifying a chemical intrusion and locating its source(s) is defined as a pattern recognition problem through multi-label classification in this research. Convolutional Neural Networks (CNNs) are feed-forward neural networks that are primarily used to solve pattern recognition tasks (O'Shea & Nash 2015). These algorithms were initially developed for character recognition. CNNs have been used in various applications in WDS analysis. For instance, Kim et al. (2022) used them to detect anomalies associated with pipe bursts, and Sun et al. (2019) combined a CNN with customer complaints data to locate a contamination source in a network.

CNN is a supervised learning technique. It requires training data pairs that include the pattern (i.e., image) and the node(s) associated with that pattern. Each contamination scenario is naturally associated with a point of injection. However, in creating CNN-training pairs, each scenario was associated with several nodes, all adjacent to the actual injection node and connected to it by a physical pipe. The rationale for this approach is that nodes that are physically close to each other could be easily and quickly investigated in a single investigation session to find the actual culprit. Thus, identifying a cluster of neighboring nodes rather than a single one significantly shortens the computational time. This balancing act between the speed of computation and the speed of field action could be further investigated and fine-tuned (in future research) to yield a minimum overall response time. In the algorithm, each node and its directly linked neighboring nodes (Figure 3) are defined as one class that has an associated contamination image (i.e., training data pair). This approach is called multi-label classification, and has previously been used for detecting adverse drug reactions, building recommendation systems, and sentiment recognition.
Figure 3

Original intrusion node and its directly linked neighbors.
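As an illustration of how such training labels might be assembled, the following Python sketch derives, for each node, the set containing the node and its directly linked neighbors, and converts that set into a multi-hot target vector. The function names and the pipe-list input format are hypothetical; the authors' actual data structures are not described.

```python
def build_label_sets(pipes):
    """Return {node: {node and every node joined to it by one pipe}}.

    pipes: iterable of (node_a, node_b) tuples, one per physical pipe.
    """
    labels = {}
    for a, b in pipes:
        labels.setdefault(a, {a}).add(b)
        labels.setdefault(b, {b}).add(a)
    return labels


def multi_hot(label_set, node_index):
    """Encode a label set as a 0/1 vector over all nodes."""
    vector = [0] * len(node_index)
    for node in label_set:
        vector[node_index[node]] = 1
    return vector
```

For a chain of pipes A-B, B-C, C-D, an intrusion at B would be labeled with {A, B, C}, so a prediction of any of these three nodes directs the field crew to the same neighborhood.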

In the classic multi-label approach, the goal is to maximize true positive labels while minimizing false positive labels. This is achieved through penalty functions for every wrong identification. When the penalty for a false negative is equal to the penalty for a false positive, it is implied that both errors are equivalent in importance and independent of each other. In the proposed approach, avoiding a false positive is far more important than avoiding a false negative (due to the subsequent field work that is required following identification of the intrusion point). Therefore, the algorithm penalizes only false positives.

Equations (1) through (8) encapsulate this notion in a binary matrix format. X is a binary matrix with members xij in which each row is a classified sample vector. Y is the binary matrix, with members yij, of the true label vectors. The operator ∘ denotes the Hadamard product, which is an element-wise multiplication of two matrices of identical size. SGN() is the sign function, n is the number of samples in the dataset, and CA is the classification accuracy. TPij denotes the true positives matrix. FPij denotes the false positives matrix. SiTP denotes a vector whose entries are the sums of true positives per row of the true positives matrix; SiFP denotes the same vector for the false positives matrix. SCi denotes a vector in which each entry is 1 if the sample has been correctly classified (at least one true positive detection and no false positives) and 0 otherwise. The n value in Equations (5), (6) and (8) denotes the number of samples.
(1)
(2)
(3)
(4)
(5)
(6)
(7)
(8)
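The quantities defined above can be computed with a short Python sketch. This is a reconstruction from the verbal definitions, not a transcription of Equations (1)–(8): in particular, taking the false positives as the element-wise product of X with (1 − Y) and combining the row sums as SGN(SiTP) × (1 − SGN(SiFP)) are assumptions that reproduce the stated behavior of SCi.

```python
def sgn(value):
    """Sign function: -1, 0 or 1."""
    return (value > 0) - (value < 0)


def classification_accuracy(X, Y):
    """Fraction of samples with at least one true positive and no false positive.

    X: predicted labels, list of 0/1 rows (one row per sample)
    Y: true labels, same shape as X
    """
    n = len(X)
    correct = 0
    for x_row, y_row in zip(X, Y):
        tp_row = [x * y for x, y in zip(x_row, y_row)]          # Hadamard product
        fp_row = [x * (1 - y) for x, y in zip(x_row, y_row)]    # assumed FP form
        s_tp = sum(tp_row)                                      # row sum of TP
        s_fp = sum(fp_row)                                      # row sum of FP
        correct += sgn(s_tp) * (1 - sgn(s_fp))                  # SC_i in {0, 1}
    return correct / n
```

With X = [[1,0,0],[1,1,0],[0,0,1],[0,0,0]] and Y = [[1,1,0],[1,0,0],[0,0,1],[0,1,0]], the first and third samples count as correct, the second is rejected because of one false positive, and the fourth has no true positive, giving CA = 0.5. Missing a true label (first sample) is not penalized, in line with the false-positive-only penalty.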

Like any other artificial neural network, CNN uses various weight and bias values to learn the data features and correlate the input data with the output classes. These weights and bias values were optimized during the training phase to maximize the CA value using the MobileNetV3 algorithm (Howard et al. 2019).

As mentioned earlier, routine 911 calls are not expected to cease during a chemical intrusion. It is therefore important for the proposed algorithm to distinguish between these routine calls and the calls related to the intrusion. Equation (8) estimates the classification accuracy without accounting for these unrelated calls (i.e., noise). These noise calls are added as a new class of data. A weighted form of Equation (8) is used to account for this new class of data representing the noise (i.e., the no-intrusion class). This weighted formulation penalizes false positive identification in no-intrusion scenarios (i.e., noise calls).

This formulation is defined in Equations (9)–(14). NOISECLASSMask denotes a vector of length c + 1, where c is the number of classes in the WDS data. In this work, classes correspond to origin nodes of intrusions. The vector contains a single 1, at the last entry, which denotes the no-intrusion scenarios; because the count starts at 0, this is the entry at the cth position. This mask is used to produce the NOISECLASS vector, which indicates whether a given sample is a no-intrusion class sample, by calculating the dot product with the true labels matrix. FP_Noise_Classij denotes the false positives matrix that has been masked by the NOISECLASS vector, whereby the false positive information is retained only if the sample represented by a given row is of the no-intrusion noise class. SiFP_Noise_Class denotes a vector whose members are the sums of false positives per row of the false positives matrix masked by the NOISECLASS vector. SC_FP_Noise_Classi is a binary vector that indicates whether a given sample is a noise class sample and contains false positive detections. The weighted accuracy metric is defined by CAWeighted. As before, the n value denotes the number of samples.

In this metric if a sample is a no-intrusion sample (i.e., only noise) and contains false positives it is worth the weight equivalent of W times wrongly classified samples. Therefore, a no-intrusion sample that is false positive quickly reduces the accuracy of the model by adding a weight of W in the denominator.
(9)
(10)
(11)
(12)
(13)
(14)
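A minimal Python sketch of the weighted metric follows. It is a reconstruction from the verbal description rather than a transcription of Equations (9)–(14): the final ratio, the sum of SCi divided by n plus W times the number of no-intrusion samples with false positives, is an assumption based on the statement that each such sample adds a weight of W to the denominator. The function name is hypothetical.

```python
def sgn(value):
    """Sign function: -1, 0 or 1."""
    return (value > 0) - (value < 0)


def weighted_classification_accuracy(X, Y, W):
    """Accuracy with an extra penalty W for false alarms on no-intrusion samples.

    X, Y: predicted and true 0/1 label rows of length c + 1; the last column
          (index c) is the no-intrusion (noise) class.
    """
    n = len(X)
    c = len(Y[0]) - 1
    noise_mask = [0] * c + [1]                 # NOISECLASSMask
    sum_sc = 0
    sum_noise_fp = 0
    for x_row, y_row in zip(X, Y):
        s_tp = sum(x * y for x, y in zip(x_row, y_row))
        s_fp = sum(x * (1 - y) for x, y in zip(x_row, y_row))
        sum_sc += sgn(s_tp) * (1 - sgn(s_fp))  # correctly classified sample
        # NOISECLASS_i: dot product of the true label row with the mask
        noise_class = sum(y * m for y, m in zip(y_row, noise_mask))
        # 1 only if the sample is a noise sample AND has false positives
        sum_noise_fp += sgn(noise_class * s_fp)
    return sum_sc / (n + W * sum_noise_fp)     # assumed form of CAWeighted
```

For four samples of which three are correct and one is a no-intrusion sample flagged with a spurious intrusion node, W = 2 gives 3 / (4 + 2) = 0.5, compared with 0.75 when W = 0.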

There are various CNN architecture blueprints, such as ResNets (He et al. 2016), EfficientNet (Tan & Le 2019), and MobileNets (Howard et al. 2017; Howard et al. 2019), to classify objects in images. These blueprints were examined as a starting point to design a CNN architecture for this research, and MobileNets was selected. MobileNets was selected mainly for its ability to handle 600 × 600 pixel images, which are significantly larger than the images used in much current research in the computer vision field, and for its economical memory usage.

MobileNets was initially developed by Google for object detection and semantic segmentation. This architecture has been built on various older concepts such as depth-wise separable convolutions, linear bottleneck and inverted residual structure, lightweight attention modules, and modified swish nonlinearity (Howard et al. 2019). Three main variables control the architecture of MobileNets, namely t, c, and n. The variable t is the expansion factor, which governs the internal expansion of the inverted residual block. The output channel size is governed by c, and n controls the number of times that a given residual block is repeated (Sandler et al. 2019). The full version of the MobileNetV2 architecture optimizes these variables to classify various objects in different settings and backgrounds. The images used in contamination source identification are very similar to one another, and therefore the problem is much simpler than the object classification problem; however, it uses a large image size (i.e., 600 × 600 pixels). Therefore, the t, c, and n values were re-optimized for MobileNetV2 through enumeration to minimize GPU RAM usage. The optimized MobileNetV2 architecture is shown in Table 1. MobileNetV2 could be used for smaller WDSs with up to 10,000 demand nodes. Larger networks necessitated the use of MobileNetV3.

Table 1

Optimized MobileNets V2 architecture for small to medium datasets

Optimized MobileNetV2 for smaller intrusion detection datasets

Operator | t | c | n (number of sequential instances as defined by the authors)
--- | --- | --- | ---
conv2d | – | 32 |
bottleneck | | 16 |
bottleneck | | 24 |
bottleneck | | 32 |
bottleneck | | 64 |
bottleneck | | 96 |
bottleneck | | 160 |
bottleneck | | 320 |
conv2d 1 × 1 | – | 1,280 |
avgpool | – | – |
conv2d 1 × 1 | – | K (total number of classes) | –

MobileNetV3 is an improvement over MobileNetV2, which uses Squeeze and Excitation Networks (Senets) to add a weight factor to the output feature maps of its inverted residual blocks. This serves as an attention mechanism that allows the network to choose the most significant image features to use for further convolutions, boosting its classification power. Due to its large classification power, its modules had to be constrained in order to avoid overfitting. The maximum convolution size was reduced to 3 × 3, down from its original 5 × 5 with most of the Senets removed from the higher layers. A summary of all of the optimizations is shown in Table 2. Refer to Table 1 in Howard et al. (2019) for the original MobileNetV3-Large configuration.

Table 2

Optimized MobileNetV3-large model for chemical intrusion detections

Input Operator Exp. size #out SE NL s
600² × 3 conv2d – 16 – HS 
300² × 16 bneck, 3 × 3 16 16 – RE 
300² × 16 bneck, 3 × 3 64 24 – RE 
150² × 24 bneck, 3 × 3 72 24 – RE 
150² × 24 bneck, 3 × 3 72 40 – RE 
75² × 40 bneck, 3 × 3 120 40 – RE 
75² × 40 bneck, 3 × 3 120 40 – RE 
75² × 40 bneck, 3 × 3 240 80 – HS 
37² × 80 bneck, 3 × 3 200 80 – HS 
37² × 80 bneck, 3 × 3 184 80 – HS 
37² × 80 bneck, 3 × 3 184 80 – HS 
37² × 80 bneck, 3 × 3 480 112 – HS 
37² × 112 bneck, 3 × 3 672 112 – HS 
37² × 112 bneck, 3 × 3 672 160 HS 
18² × 160 bneck, 3 × 3 960 160 HS 
18² × 160 bneck, 3 × 3 960 160 HS 
18² × 160 conv2d, 1 × 1 – 960 – HS 
18² × 960 pool, 7 × 7 – – – – 
1² × 960 conv2d 1 × 1, NBN – 1,280 – HS 
1² × 1,280 conv2d 1 × 1, NBN – – – 

In MobileNetV3, the t operator has been replaced by the expanded channel size (Exp. size). The s variable indicates the step size (stride), where a value of 2 corresponds to a 50% reduction in each image dimension. SE indicates whether a particular block uses a Squeeze-and-Excitation module. NL indicates which activation function was used: hard swish (HS) or ReLU6 (RE). For details on these functions and the model, consult Howard et al. (2019). The proposed methodology was coded using a combination of C++ and Python, and the TensorFlow library was used extensively to create the CNN engine.
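
For readers unfamiliar with the two nonlinearities named above, the following scalar sketch follows the definitions given in Howard et al. (2019), where hard swish is x · ReLU6(x + 3) / 6. The function names are chosen here for illustration only.

```python
def relu6(x):
    """ReLU clipped at 6: min(max(x, 0), 6)."""
    return min(max(x, 0.0), 6.0)


def hard_swish(x):
    """Piecewise-linear approximation of swish: x * ReLU6(x + 3) / 6."""
    return x * relu6(x + 3.0) / 6.0


for value in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(value, relu6(value), hard_swish(value))
```

Both functions avoid exponentials, which is one reason they are attractive in lightweight architectures.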

The model was applied to an anonymous water distribution network supplying a population of about 1M (Figure 4). The network includes 4,000 km of pipes, seven pressure zones, and 32,000 nodes, of which 26,000 are nodes with demand that are assumed to be the potential sites of toxin injection.
Figure 4

Case study of the water distribution system.


Using the proposed methodology, 500 random contamination scenarios were generated per potential injection point (i.e., per node with demand), for 12.5M samples in total. The samples were divided into a combined training and validation set (80%) and a testing set (20%). The training set was used to train the CNN and optimize its weights and bias values using Equation (9). The testing set was not exposed to the model during training. The CNN model was trained on a server with four NVIDIA Tesla V100 GPUs, each with 16 GB of RAM.
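
The 80/20 division can be sketched as below. Splitting within each class (i.e., each injection node), the seed, and the function name are assumptions made for illustration; the text does not state exactly how the samples were partitioned.

```python
import random


def split_per_class(samples_per_class, train_fraction=0.8, seed=0):
    """Return (train_ids, test_ids) for ONE class, shuffled reproducibly."""
    ids = list(range(samples_per_class))
    random.Random(seed).shuffle(ids)
    cut = int(round(samples_per_class * train_fraction))
    return ids[:cut], ids[cut:]


# 500 scenarios for one injection node: 400 for training/validation, 100 held out.
train_ids, test_ids = split_per_class(500)
print(len(train_ids), len(test_ids))
```

Splitting per class keeps every injection node represented in both sets, which matters when there are thousands of classes and only a few hundred samples each.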

The number of potential intrusion points in the case study network was initially limited to 400 and then increased gradually to include all of the nodes with demand (i.e., 26k). This was done to evaluate the performance of the proposed methodology on various network sizes. The results indicated that the CNN algorithm was able to detect intrusion points with an accuracy above 90% in the network with 26k demand nodes. Accuracy is higher in smaller systems with fewer nodes (i.e., classes). The number of samples per class was also varied from 310 to 2,000 for the smallest network (400 nodes) and was limited to a maximum of 500 for the rest of the systems.

Results indicate that increasing the number of samples per class increases the accuracy. For instance, increasing the number of samples per class from 310 to 500 in the network with 26k demand nodes raised the accuracy from 91 to 93%. This is an expected and known behavior of CNN algorithms, as illustrated in Figure 5. It is also worth noting that increasing the number of samples per class has a greater impact on accuracy in larger systems than in smaller ones. Increasing the number of samples could potentially compensate for the drop in accuracy caused by a higher number of classes. However, this also increases the computational cost.
Figure 5

Acquired accuracy for various numbers of classes and training samples (labels denote the number of samples used in training).


The following classification error analysis was performed on the network with ∼3,400 demand nodes. We intentionally reduced the number of classes to better simulate medium-sized systems, to simplify the statistical analysis, and to reduce the time required to perform the analysis. For each class, we used 250 samples to train the model and 60 samples to test it. The analysis is limited to the test dataset, which was not exposed to the AI model during the training phase. For this particular system, the model achieved 97% accuracy in locating the intrusion source. The misclassified samples were analyzed to evaluate the model performance relative to various intrusion vectors (e.g., start and end time, duration) and network parameters, such as flow velocity and diameter of the watermains connected to the node of intrusion.

The results indicated that the model performs similarly regardless of the amount of potent substance (1–40 kg) used in an intrusion scenario. This is shown in Figure 6, which plots the misclassified samples in each bin as a percentage of the total number of samples in that bin. The misclassified percentage varies between 1.5 and 2.0%, and no significant trend could be observed. This is mainly because the lethal dose of these synthetic opioids is in the nanogram-to-microgram range. Therefore, even one kilogram of this material would suffice to produce a pattern detectable by the algorithm.
Figure 6

Percent of misclassified samples vs. the weight of potent substances used in the intrusion (number of misclassified samples in each bin/number of samples in each bin).
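
The per-bin statistic used in this and the following figures (misclassified samples in a bin divided by all samples in that bin) can be computed as in the sketch below. The function name, bin edges, and toy data are invented for illustration and do not reproduce the study's results.

```python
def misclassified_percent_by_bin(values, is_misclassified, bin_edges):
    """Percent of misclassified samples per bin; None for empty bins.

    Bins are half-open [edge_i, edge_i+1), except the last, which is closed.
    """
    counts = [0] * (len(bin_edges) - 1)
    errors = [0] * (len(bin_edges) - 1)
    for v, wrong in zip(values, is_misclassified):
        for i in range(len(counts)):
            last = i == len(counts) - 1
            if bin_edges[i] <= v < bin_edges[i + 1] or (last and v == bin_edges[-1]):
                counts[i] += 1
                errors[i] += int(wrong)
                break
    return [100.0 * e / c if c else None for e, c in zip(errors, counts)]


# Toy data: substance mass (kg) per test sample and whether it was misclassified.
mass_kg = [1, 5, 12, 18, 25, 33, 40, 9]
wrong = [False, False, True, False, False, False, True, False]
result = misclassified_percent_by_bin(mass_kg, wrong, [0, 10, 20, 30, 40])
print(result)  # [0.0, 50.0, 0.0, 50.0]
```

Normalizing by the bin size rather than by the total number of samples prevents sparsely populated bins from appearing artificially accurate.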

In contrast to the amount of poison released, the number of unrelated calls (noise) could impact model performance. The percentage of misclassified samples increases with the number of 911 noise calls, from approximately 1% to 2% per bin (Figure 7). This effect can also be observed in Figures 8 and 9, where the percentage of misclassified samples is shown relative to the intrusion start and end times, respectively. Misclassified samples increase during the peak hours of incoming 911 calls and decrease during the off-peak hours of the day (Figure 8). The same trend appears with a delay of approximately 3 h when the percentage of misclassified samples is plotted against the intrusion end time (Figure 9). The 3-h delay is equal to the average intrusion scenario length. This trend also corresponds quite well to typical diurnal water demand, where accuracy is higher when water demand is higher.
Figure 7

Percent of misclassified samples vs. number of unrelated calls in the 6-h window (number of misclassified samples in each bin/number of samples in each bin).

Figure 8

Percent of misclassified samples vs. the starting time of the intrusion (number of misclassified samples in each bin/number of samples in each bin).

Figure 9

Percent of misclassified samples vs. the ending time of the intrusion (number of misclassified samples in each bin/number of samples in each bin).

Intrusion duration could also have some impact on the proposed model's performance (Figure 10). The results suggest that the model performs best when the intrusion length is approximately 2–3 h. Shorter intrusions (e.g., 15 min duration) could be misclassified as noise, while longer intrusions (e.g., greater than 4 h) could result in misclassification of the intrusion point. The impact of shorter intrusions on model performance can also be seen in the peak during the initial hours of the day (01:00) in both Figures 8 and 9.
Figure 10

Percent of misclassified samples vs. the duration of the intrusion (number of misclassified samples in each bin/number of samples in each bin).

The intrusion point location in the network could affect model performance. This is correlated with the average flow velocity at the intrusion point during the contaminant introduction. Figure 11 illustrates the average velocity in all watermains connected to the intrusion point during the intrusion period. The results clearly indicate that intrusion points with lower average velocity (e.g., less than 0.05 m/s) could result in more misclassifications. This could be explained by the fact that, at lower flow velocity, more time is required for the potent material to disperse in the network. This keeps the contaminated area very small, causing the model to misclassify calls as noise. This might also explain the aforementioned observation that higher water demand periods are associated with higher model accuracy.
Figure 11

Percent of misclassified samples vs. the average velocity at the intrusion node (number of misclassified samples in each bin/number of samples in each bin), average velocity was estimated based on the flow in each connecting pipe during the intrusion.
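
One plausible way to obtain such an average velocity from hydraulic model output is sketched below, using v = |Q| / (πD²/4) for each connecting pipe at each time step. The unweighted averaging over pipes and time steps, the function names, and the sample numbers are assumptions; the source does not specify the exact averaging scheme.

```python
import math


def pipe_velocity(flow_m3s, diameter_m):
    """Mean velocity (m/s) in a full circular pipe: |Q| / (pi * D^2 / 4)."""
    area = math.pi * diameter_m ** 2 / 4.0
    return abs(flow_m3s) / area


def average_node_velocity(connected_pipes):
    """Unweighted mean velocity over all connected pipes and time steps.

    connected_pipes: list of (diameter_m, [flows in m3/s during the intrusion]).
    """
    velocities = [pipe_velocity(q, d) for d, flows in connected_pipes for q in flows]
    return sum(velocities) / len(velocities)


# Hypothetical node with a 150 mm and a 300 mm watermain, two time steps each.
pipes = [(0.15, [0.0005, 0.0007]), (0.30, [-0.002, -0.004])]
avg = average_node_velocity(pipes)
print(round(avg, 4), "m/s", "(low velocity)" if avg < 0.05 else "")
```

The absolute value is taken because flow direction can reverse during the day, while only the magnitude matters for how quickly the plume leaves the node.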

The proposed model performed better on scenarios where the intrusion is initiated from larger watermains. Figure 12 shows the distribution of misclassified samples versus the average diameter of watermains connected to the intrusion points. Larger watermains (i.e., greater than 250 mm) carry more flow with often higher velocity; therefore, the contaminated area and related calls could be detected with higher accuracy. Furthermore, major watermains might have fewer demand nodes compared to smaller watermains and therefore be easier to differentiate.
Figure 12

Percent of misclassified samples vs. the average diameter of the connected watermains at the intrusion node (number of misclassified samples in each bin/number of samples in each bin).


The proposed approach is, in effect, a stop-gap measure to identify and protect from an intentional or unintentional intrusion into a water distribution network until such time that suitable and reliable sensors are available to detect and locate a contamination event in real time. The approach transforms what is essentially a real-time identification of an intrusion into a rapid image classification problem, where the vast majority of the heavy computing involved (i.e., scenario generation) is carried out in advance, before an intrusion takes place.

The absence (to date) of such events, coupled with the vast dimensionality inherent in the problem formulation, introduces a high level of uncertainty, establishing a need to constrain several boundary conditions as well as to make some simplifying assumptions, some of which are discussed herein.

Points of injection: For simplicity, all demand nodes were considered equally likely to be selected as injection points. In reality, some points might be preferred over others. This issue could be explored with public safety experts, and variable weights could be used to incorporate the likelihood of each injection point. Furthermore, for simplicity, this initial research considered a single point of injection but, clearly, multiple points are possible, either simultaneously or staggered through time. Considering such scenarios would make the dimensionality of the problem so vast (due to the sheer number of possible scenarios) as to render it unmanageable. More research is required to explore this issue. Such research can go beyond the domain of pure engineering/mathematics and include public safety and security experts as well.

Intrusion duration (from 1 to 6 h) and pattern were taken as rational initial guesses that should be explored further. This, combined with the point of injection issue, will no doubt further amplify the dimensionality challenge.

Time lag: the assumption that the 911 call pattern will follow the contamination plume after a constant time lag could be made more realistic by treating the time lag as a random variable with an assumed probability distribution. This issue could be explored in future research.

Perfect sensor: as mentioned earlier, for simplicity we have assumed that all demand nodes that are contaminated would result in a call to the 911 center. This assumption is naïve and is equivalent to assuming a perfect sensory system. In reality, many uncertainties are expected: people might be unable to call because they are incapacitated or alone, might call after longer delays, or might call from a different location (home, emergency ward), etc. The fact that each node in the network represents the demand of multiple consumers lends credence to the assumption that at least a few of these consumers would initiate a 911 call immediately; however, this issue requires further verification in future research.
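
As a rough illustration of how the time-lag and perfect-sensor assumptions might be relaxed, the sketch below gives each consumer at a contaminated node an independent probability of calling and a lognormally distributed lag. The independence assumption, the lognormal choice, all parameter values, and the function names are hypothetical and are not taken from the study.

```python
import math
import random


def prob_at_least_one_call(p_call, consumers):
    """P(at least one of `consumers` independent consumers calls 911)."""
    return 1.0 - (1.0 - p_call) ** consumers


def simulate_call_time(arrival_h, consumers, p_call, mean_lag_h, sigma, rng):
    """Hour of the first 911 call from a contaminated node, or None if nobody calls."""
    # Choose mu so that the lognormal lag has mean `mean_lag_h`.
    mu = math.log(mean_lag_h) - sigma ** 2 / 2.0
    lags = [rng.lognormvariate(mu, sigma)
            for _ in range(consumers) if rng.random() < p_call]
    return arrival_h + min(lags) if lags else None


# Hypothetical node: 10 consumers, each with a 20% chance of calling.
print(round(prob_at_least_one_call(0.2, 10), 4))  # 0.8926
rng = random.Random(0)
print(simulate_call_time(arrival_h=2.0, consumers=10, p_call=0.2,
                         mean_lag_h=1.0, sigma=0.5, rng=rng))
```

Even a modest per-consumer call probability yields a high chance of at least one call per node once a node aggregates several consumers, which is the intuition behind the argument above.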

Finally, the analysis shows that model accuracy increases when the point of injection is on a central (large-diameter) pipe in the network and when the flow velocity at the point of injection (and possibly throughout the network) is higher. This may be because, under these conditions, the poison plume moves faster and spreads wider, producing an image with many more affected nodes. Such an image is easier to classify, and it also represents a greater problem requiring a faster response. It follows that when injection is carried out through a minor pipe in low-flow conditions, the plume will move more slowly, the rate of exposure will be lower, and the rate of incoming 911 calls will be lower. This suggests the possibility that 911 call centers will not be inundated within 2–3 h but rather much later, encouraging consideration of longer-duration scenarios. Theoretically, if the scenarios were explored for longer periods, the low-intensity plumes might develop into more easily classifiable images. It is possible that some measure of the rate at which the plume advances in the network may provide more insight into the nature of the event and help expedite the classification process as well as improve its accuracy. The provisional advantage of this algorithm is that more rapidly moving plumes, which pose greater risk, are more easily identified, while smaller plumes that move slowly (but impact fewer victims) are harder to find.

WDSs are inherently vulnerable to chemical intrusions, and it is not practically feasible to completely prevent such intrusions solely by physical means. Currently, no sensory system exists to detect these intrusions; however, data from 911 call centers, such as call timestamps and GIS locations, could be utilized to detect a waterborne intrusion and pinpoint its originating location. The proposed approach effectively converts the problem of sensing potent chemicals in WDSs into a pattern recognition problem and uses a deep learning algorithm (CNN) to accurately detect and locate these intrusions. This reduces the detection time significantly when compared to the existing ad-hoc methods used to guess the occurrence of an intrusion.

As the proposed approach is essentially reactive (it uses 911 calls as canaries in the coal mine), casualties will be inevitable before detection. However, this passive monitoring will help to reduce the detection time and, combined with containment measures, limit the potential casualties. Our future research will focus on quantifying the impact of the proposed approach on the risk associated with these intrusions while accounting for various parameters that could impact the performance of the proposed methodology, as detailed in the Discussion.

Many variables related to the interaction of potent chemicals with the water infrastructure environment are still unknown. For instance, how potent chemicals would interact with various WDS components (e.g., pipe walls, reservoirs), how they react with additives like chlorine or fluoride, how to clean and recover the contaminated systems, and how to manage the aftermath are all questions that as yet remain unanswered. Future research could address these issues. Additionally, in this study we focused on only one substance as a proxy due to its relative potency and abundance, albeit without examining its impact on the above considerations. However, there are many other existing and/or novel chemical, biological, and radioactive agents with dangerous properties. They also need to be studied in the same context. Because the proposed approach is not intended to be a replacement for a sensory system capable of detecting harmful chemical or biological agents in WDSs in real time, sufficient investment should be made by public safety authorities in developing these systems and better understanding potential intrusion scenarios.

The authors wish to thank the National Research Council of Canada for funding this research. They also extend thanks to the Canadian Safety and Security Program, Ottawa Police Department, Health Canada's Toxicovigilance Network, the US Department of Homeland Security, and the various individuals and municipalities who wished to remain anonymous for their support of the project.

Data cannot be made publicly available; readers should contact the corresponding author for details.

The authors declare there is no conflict.

Adedoja O. S., Hamam Y., Khalaf B. & Sadiku R. 2019 Sensor placement strategies for contamination identification in water distribution networks: A review. WIT Transactions on Ecology and the Environment 229, 79–90.
Arad J., Housh M., Perelman L. & Ostfeld A. 2013 A dynamic thresholds scheme for contaminant event detection in water distribution systems. Water Research 47 (5), 1899–1908.
Barros D., Cardoso S., Oliveira E., Brentan B. & Ribeiro L. 2022 Using data mining techniques to isolate chemical intrusion in water distribution systems. Environmental Monitoring and Assessment 194 (3), 1–15.
Berry J., Fleischer L., Hart W. E., Phillips C. A. & Watson J. P. 2005 Sensor placement in municipal water networks. Journal of Water Resources Planning and Management 131, 237–243.
Cibulsky S. M., Wille T., Funk R., Sokolowski D., Gagnon C., Lafontaine M., Brevett C., Jabbour R., Cox J., Russell D. R., Jett D. A., Thomas J. D. & Nelson L. S. 2023 Public health and medical preparedness for mass casualties from the deliberate release of synthetic opioids. Frontiers in Public Health 11, 1158479.
Cristo C. & Leopardi A. 2008 Pollution source identification of accidental contamination in water distribution networks. Journal of Water Resources Planning and Management 134 (2), 197.
De Sanctis A., Boccelli D., Shang F. & Uber J. 2008 Probabilistic approach to characterize contamination sources with imperfect sensors. In: World Environmental and Water Resources Congress (Babcock, R. W. & Walton, R., eds.). Ahupua'A, HI, USA, pp. 1–10.
De Sanctis A., Shang F. & Uber J. 2010 Real-time identification of possible contamination sources using network backtracking methods. Journal of Water Resources Planning and Management 136 (4), 444–453.
Dorini G., Jonkergouw P., Kapelan Z., Pierro F. d., Khu S. & Savic D. 2006 An efficient algorithm for sensor placement in water distribution systems. In: 8th Annual Water Distribution Systems Analysis Symposium. ASCE, Cincinnati, Ohio, USA.
Eliades D., Lambrou T., Panayiotou C. & Polycarpou M. 2014 Contamination event detection in water distribution systems using a model-based approach. Procedia Engineering 89, 1089–1096.
Fasaee M., Monghasemi S., Nikoo M., Shafiee M., Berglund E. & Bakhtiari P. 2021 A K-Sensor correlation-based evolutionary optimization algorithm to cluster contamination events and place sensors in water distribution systems. Journal of Cleaner Production 319, 128763.
Feng S., Uber J. & Polycarpou M. 2002 Particle backtracking algorithm for water distribution system analysis. Journal of Environmental Engineering 128 (5), 441–450.
Giudicianni C., Herrera M., Di Nardo A., Creaco E. & Greco R. 2022 A Novel Approach for a Suitable Water Quality Sensor Placement in Water Distribution Systems. In: International Conference EWaS5, Naples, Italy.
Gleick P. H. 2006 Water and terrorism. Water Policy 8, 482–503.
Grbčić L., Lučin I., Kranjčević L. & Družeta S. 2020 A machine learning-based algorithm for water network contamination source localization. Sensors, MDPI 20 (9), 2613.
Green U., Kremer J., Zillmer M. & Moldaenke C. 2003 Detection of chemical threat agents in drinking water by an early warning real-time biomonitor. Environmental Toxicology: An International Journal 18 (6), 368–374.
Hart D. & McKenna S. 2012 User's Manual for CANARY. US EPA, Cincinnati.
He K., Zhang X., Ren S. & Sun J. 2016 Deep residual learning for image recognition. In: Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., pp. 770–778.
Hilbe M. 2009 Logistic Regression Models. Chapman & Hall/CRC, Boca Raton.
Housh M. & Ostfeld A. 2015 An integrated logit model for contamination event detection in water distribution systems. Water Research. doi:10.1016/j.watres.2015.02.016.
Howard A. G., Zhu M., Chen B., Kalenichenko D., Wang W., Weyand T., Andreetto M. & Adam H. 2017 Mobilenets: Efficient convolutional neural networks for mobile vision applications. Computer Vision and Pattern Recognition. arXiv:1704.04861.
Howard A., Sandler M., Chu G., Chen L.-C., Chen B., Tan M., Wang W., Zhu Y., Pang R., Vasudevan V. & Adam H. 2019 Searching for MobileNetV3. In: Proc. IEEE Int. Conf. Comput. Vis., pp. 1314–1324. doi:10.1109/ICCV.2019.00140.
Isovitsch S. L. & VanBriesen J. M. 2008 Sensor placement and optimization criteria dependencies in a water distribution system. Journal of Water Resources Planning and Management 134, 186–196.
Kessler A., Ostfeld A. & Sinai G. 1998 Detecting accidental contaminations in municipal water networks. Journal of Water Resources Planning and Management 124 (4), 192–198.
Kim S., Jun S. & Jung D. 2022 Ensemble CNN model for effective pipe burst detection in water distribution systems. Water Resour Manage 36, 5049–5061.
Krause A., Leskovec J., Isovitsch S., Xu J., Guestrin C., VanBriesen J., Small M. & Fischbeck P. 2006 Optimizing Sensor Placements in Water Distribution Systems Using Submodular Function Maximization. In: The 8th Annual Water Distribution Systems Analysis Symposium, Cincinnati, Ohio, USA.
Kumar J., Brill E., Mahinthakumar G. & Ranjithan S. 2012 Contaminant source characterization in water distribution systems using binary signals. Journal of Hydroinformatics 14 (3), 585–602.
Laird C. D., Biegler L. T., van Bloemen Waanders B. G. & Bartlett R. A. 2005 Contamination source determination for water networks. Journal of Water Resources Planning and Management 131 (2), 125–134.
Laird C. D., Biegler L. & van Bloemen Waanders B. 2006 Mixed-integer approach for obtaining unique solutions in source inversion of water networks. Journal of Water Resources Planning and Management 132 (4), 242–251.
Lambrou T., Anastasiou C., Panayiotou C. & Polycarpou M. 2014 A low-cost sensor network for real-time monitoring and contamination detection in drinking water distribution systems. IEEE Sensors Journal 14 (8), 2765–2772.
Li Z., Zhang C., Liu H., Zhang C., Zhao M., Gong Q. & Fu G. 2022 Developing stacking ensemble models for multivariate contamination detection in water distribution systems. Science of The Total Environment 828, 154284.
Liu L., Zechman E., Brill J. E., Mahinthakumar G., Ranjithan S. & Uber J. 2008 Adaptive contamination source identification in water distribution systems using an evolutionary algorithm-based dynamic optimization procedure. In: Water Distribution Systems Analysis Symposium 2006 (Buchberger, S. G., Clark, R. M., Grayman, W. M. & Uber, J. G., eds.). ASCE, Reston, pp. 1–9.
Liu L., Ranjithan S. & Mahinthakumar G. 2011 Contamination source identification in water distribution systems using an adaptive dynamic optimization procedure. Journal of Water Resources Planning and Management 137 (2), 183–192.
Mann A., McKenna S., Hart W. & Laird C. 2012 Real-time inversion in large-scale water networks using discrete measurements. Computers & Chemical Engineering. doi:10.1016/j.compchemeng.2011.08.001.
Misailidi N., Papoutsis I., Nikolaou P., Dona A., Spiliopoulou C. & Athanaselis S. 2018 Fentanyls continue to replace heroin in the drug arena: The cases of ocfentanil and carfentanil. Forensic Toxicology 36, 12–13.
Mohammed H., Hameed I. & Seidu R. 2018 Machine learning: Based detection of water contamination in water distribution systems. In: Proceedings of the Genetic and Evolutionary Computation Conference Companion, pp. 1664–1671.
Oliker N. & Ostfeld A. 2013 Classification-optimization model for contamination event detection in water distribution systems. In: World Environmental and Water Resources Congress 2013: Showcasing the Future (Patterson, C. L., Struck, S. D. & Murray, D. J., eds.). ASCE, Reston, pp. 626–636.
Oliveira E., Brentan B., Dantas R. F., dos Santos Macedo L., Junior E. L. & Ribeiro L. C. 2018 Detection of chemical intrusion compounds in water distribution networks by quality sensors data mining. In: WDSA/CCWI Joint Conference Proceedings, 1.
O'Shea K. & Nash R. 2015 An Introduction to Convolutional Neural Networks.
Ostfeld A. & Salomons E. 2004 Optimal layout of early warning detection stations for water distribution systems security. Journal of Water Resources Planning and Management 130 (5), 377–385.
Ostfeld A., Kessler A. & Goldberg I. 2004 A contaminant detection system for early warning in water distribution networks. Engineering Optimization 36 (5), 525–538.
Ostfeld A., Uber J., Salomons E., Berry J., Hart W., Phillips C., Watson J., Dorini G., Jonkergouw P., Kapelan Z. & Di Pierro F. 2008 The battle of the water sensor networks (BWSN): A design challenge for engineers and algorithms. Journal of Water Resources Planning and Management 134 (6), 556–568.
Pal A. & Kant K. 2015 Collaborative heterogeneous sensing: An application to contamination detection in water distribution networks. In: 24th International Conference on Computer Communication and Networks (ICCCN). IEEE, pp. 1–8.
Palleti V., Narasimhan S., Rengaswamy R., Teja R. & Bhallamudi S. 2016 Sensor network design for contaminant detection and identification in water distribution networks. Computers & Chemical Engineering 87, 246–256.
Perelman L. & Ostfeld A. 2013 Bayesian networks for source intrusion detection. Journal of Water Resources Planning and Management 139 (4), 426–432.
Perelman L., Arad J., Housh M. & Ostfeld A. 2012 Event detection in water distribution systems from multivariate water quality time series. Environmental Science & Technology 46 (15), 8212–8219.
Preis A. & Ostfeld A. 2008 Genetic algorithm for contaminant source characterization using imperfect sensors. Civil Engineering and Environmental Systems 25, 29–39.
Ramotsoela D., Hancke G. & Abu-Mahfouz A. 2019 Attack detection in water distribution systems using machine learning. Human-centric Computing and Information Sciences 9 (1), 1–22.
Rodriguez J., Bynum M., Laird C., Hart D., Klise K., Burkhardt J. & Haxton T. 2021 Optimal sampling locations to reduce uncertainty in contamination extent in water distribution systems. Journal of Infrastructure Systems 27 (3), 10.1061.
Rossman L. A. 1994 EPANET User's Manual. Risk Reduction Engrg. Lab., Off. of Res. and Devel., U.S. Environmental Protection Agency, Cincinnati.
Sandler M., Howard A., Zhu M., Zhmoginov A. & Chen L.-C. 2019 MobileNetV2: Inverted Residuals and Linear Bottlenecks.
Schwartz R., Oliker N. & Ostfeld A. 2013 Water Distribution Systems Complex Contamination Simulations for Event Detection Model Calibration and Verification. In: World Environmental and Water Resources Congress 2013: Showcasing the Future (Patterson, C. L., Struck, S. D. & Murray, D. J., eds.). ASCE, Reston, pp. 1016–1021.
Seth A., Klise K., Siirola J., Haxton T. & Laird C. 2016 Testing contamination source identification methods for water distribution networks. Journal of Water Resources Planning and Management 142, 1–38.
Shen H. & McBean E. 2011 Pareto optimality for sensor placements in a water distribution system. Journal of Water Resources Planning and Management 137 (3), 243–248.
Sun L., Yan H., Xin K. & Tao T. 2019 Contamination source identification in water distribution networks using convolutional neural network. Environmental Science and Pollution Research 26 (36), 36786–36797.
Tan M. & Le Q. V. 2019 EfficientNet: Rethinking model scaling for convolutional neural networks. In: 36th Int. Conf. Mach. Learn. ICML 2019, pp. 10691–10700.
Tao T., Lu Y., Fu X. & Xin K. 2012 Identification of sources of pollution and contamination in water distribution networks based on pattern recognition. Journal of Zhejiang University SCIENCE A 13 (7), 559–570.
The Canadian Press 2017 Police Make Massive Seizure of Carfentanil in Home East of Toronto.
UNODC 2017 Recommended Methods for the Identification and Analysis of Fentanyl and its Analogues in Biological Specimens. UN Office of Drugs and Crime, Vienna.
Vrachimis S., Lifshitz R., Eliades D., Polycarpou M. & Ostfeld A. 2020 Active contamination detection in water-distribution systems. Journal of Water Resources Planning and Management 146, 04020014.
Wu Z. Y. & Roshani E. 2014 Sensor Placement Optimization for Water Quality Model Calibration. In: World Environmental and Water Resources Congress (Hubet, W. C., ed.). ASCE, Portland, Oregon, pp. 535–545.
Wu Z. & Walski T. M. 2006 Multi objective optimization of sensor placement. In: The 8th Annual Water Distribution Systems Analysis Symposium, Cincinnati, Ohio, USA.
Yan X., Zhao J., Hu C. & Wu Q. 2016 Contaminant source identification in water distribution network based on hybrid encoding. Journal of Computational Methods in Sciences and Engineering 16 (2), 379–390.
Yan X., Gong W. & Wu Q. 2017 Contaminant source identification of water distribution networks using cultural algorithm. Concurrency and Computation: Practice and Experience 29 (24), e4230.
Yan X., Hu C. & Sheng V. 2020 Data-driven pollution source location algorithm in water quality monitoring sensor networks. International Journal of Bio-Inspired Computation 15 (3), 171–180.
Yan X., Gong J. & Wu Q. 2021 Pollution source intelligent location algorithm in water quality sensor networks. Neural Computing and Applications 33 (1), 209–222.
Yang X. & Boccelli D. 2014 Bayesian approach for real-time probabilistic contamination source identification. Journal of Water Resources Planning and Management 140 (8), 401–419.
Zierolf M. L., Polycarpou M. M. & Uber J. G. 1998 Development and autocalibration of an input-output model of chlorine transport in drinking water distribution systems. IEEE Transactions on Control Systems Technology 6 (4), 543–553.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).