Detection of water quality failure events at treatment works using a hybrid two-stage method with CUSUM and random forest algorithms

Near real-time event detection is crucial for water utilities to be able to detect failure events in their water treatment works (WTW) quickly and efficiently. This paper presents a new method for automated, near real-time recognition of failure events at WTWs by the application of combined statistical process control and machine learning techniques. The resulting novel hybrid CUSUM event recognition system (HC-ERS) uses two distinct detection methodologies: one for fault detection at the level of individual water quality signals and the second for the recognition of faulty processes at the WTW level. HC-ERS was tested and validated on historical failure events at a real-life UK WTW. The new methodology proved to be effective in the detection of failure events, achieving a high true detection rate of 82% combined with a low false alarm rate (average 0.3 false alarms per week), reaching a peak F1 score of 84% as a measure of accuracy. The new method also demonstrated higher accuracy compared to the CANARY detection methodology. When applied to real-world data, the HC-ERS method showed the capability to detect faulty processes at WTWs automatically and reliably, and hence potential for practical application in the water industry.


INTRODUCTION
Water utilities around the world face considerable challenges in ensuring that their WTWs produce water of the required quality and quantity. To operate at the lowest expenditure, WTWs are already heavily monitored and automated using online sensors deployed at the different treatment stages.
Near real-time detection of faulty sensors and/or WTW processes is essential for efficient and effective plant operation.
However, due to varying water demand, changing influent conditions, dynamics in water treatment processes and imperfect, missing or incorrect sensor data, this is a difficult task to achieve. In the UK, most WTWs use event recognition systems (ERS), which apply thresholds to generate alarms and detect abnormal behaviour in observed signals. Unfortunately, these threshold-based systems have the major drawback that they result in low true detection and high false positive rates (Riss et al. ). [...] Security Technologies ). However, this first generation of software packages still suffers from a number of shortcomings, such as insufficient real detection capability or too many false alarms (Bernard et al. ). To overcome the above shortcomings, new and more efficient technologies need to be developed focusing on innovative, cost-effective and, wherever possible, predictive near real-time event detection systems.
In this paper, we investigate the application of the novel hybrid CUSUM event recognition system (HC-ERS) for the detection of failure events at WTWs and demonstrate the improvements achieved by evaluating the detection performance of the HC-ERS on real sensor data and historical events. In addition, we compare the performance of the HC-ERS to that of (i) the threshold-based WTW event detection system currently used by one of the largest water companies in the UK and (ii) the well-known CANARY event detection algorithms.

BACKGROUND
Online monitoring of water quality to control the treatment processes of WTWs has made considerable progress in recent years (Storey et al. ). A broad range of fault detection techniques have already been developed (Maiti & Banerjee ). For complex systems such as treatment processes at WTWs, where the generation of analytical models is too difficult or not possible, the application of data-driven event detection methods based on statistical analyses of process data is preferred (Verron et al. ).
Most common data-driven approaches apply conventional statistical techniques such as statistical process control (SPC) [...] Hotelling's T² charts for the fault detection at a multistage WTW. When applied to the time series data of 23 parameters collected from sensors deployed at a real-life WTW over a 14-day period, the method showed feasibility in detecting abnormal process conditions and was able to identify specific parameters which contributed to disturbances in the process. Although the model seems to perform well over a short period of time, its validation over a long-term period with changing process conditions was not demonstrated in this study. Inspired by the monEAU vision (Rieger & [...] approach for learning the optimal control parameters using an SVM algorithm to predict WWTWs' process behaviour in terms of future plant states, estimation of optimal chemicals dosage and identification of the most influential parameters. The study carried out by Dogo et al. () provides an overview of work done in anomaly detection on drinking-water quality data focussing on recent AI and ML approaches applied to water distribution systems, but also presents a specific approach for detecting anomalies [...] condition, the respective time step is labelled with '0'. In this way, a vector containing ones or zeros at each observed data point was generated for each of the observed Y water quality signals as an output of the applied CUSUM fault detection methodology.
Even though CUSUM charts are largely automated, some parameters can be fine-tuned for an optimal adaptation to the specific fault detection application. In particular, CUSUM control charts require a precise definition of the reference value K, which is often chosen as halfway between the target value and the 'out of control' value of the mean. By changing the reference value, the sensitivity of the CUSUM method can be adjusted. The higher the K value, the less sensitive the CUSUM charting method becomes. Therefore, the system was fine-tuned by adjusting the CUSUM parameters for each of the Y water quality signals individually to investigate the optimal combination of control limits and K value, with the aim of finding the best possible CUSUM output to serve as input for the subsequent RF event detection method. To achieve this, a sensitivity analysis was performed by gradually changing the K values (from 1σ to 9σ in 0.5σ increments) for different control limits (1σ, 3σ, 6σ and 12σ) and time windows (1 day
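The role of the reference value K and the control limit can be illustrated with a minimal sketch of a standard two-sided tabular CUSUM. This is not the exact implementation used in HC-ERS: the function name `cusum_flags`, the parameters `k_sigma` and `h_sigma`, and the absence of any reset or time-window logic are assumptions made for illustration only. The output is the binary vector described above (1 = out of control, 0 = normal).

```python
def cusum_flags(values, target, sigma, k_sigma=1.0, h_sigma=3.0):
    """Two-sided tabular CUSUM (illustrative sketch, hypothetical parameter names).

    k_sigma: reference value K expressed in multiples of sigma.
    h_sigma: control limit expressed in multiples of sigma.
    Returns one 0/1 flag per observation.
    """
    k = k_sigma * sigma
    h = h_sigma * sigma
    upper = 0.0  # accumulated deviation above the target
    lower = 0.0  # accumulated deviation below the target
    flags = []
    for x in values:
        upper = max(0.0, upper + (x - target) - k)
        lower = max(0.0, lower - (x - target) - k)
        flags.append(1 if (upper > h or lower > h) else 0)
    return flags


# A sustained shift of 3 sigma in a hypothetical standardised signal.
signal = [0, 0, 0, 3, 3, 3, 0]
sensitive = cusum_flags(signal, target=0.0, sigma=1.0, k_sigma=1.0)
tolerant = cusum_flags(signal, target=0.0, sigma=1.0, k_sigma=2.0)
```

With the same control limit, the run with the larger K accumulates deviation more slowly and raises no flag on this toy signal, which is the loss of sensitivity described above.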

Random forest event detection method
The objective of the event detection methodology is to investigate possible improvements to the CUSUM fault detection performance by moving away from applying detection rules to individual water quality/other sensor signals only.
Indeed, it is expected that moving away from treating individual signals independently (i.e. using a univariate detection method) towards a more sophisticated multivariate event recognition system will increase the true detection rate and, in particular, reduce the false alarm rate. Once the binary output for each signal is generated as a result of the CUSUM fault detection process, a prediction about the likelihood of a WTW failure event occurring is made by a trained RF classifier (Breiman ). It has often been shown that the RF method outperforms other one-class classifier methods by a significant margin (Hempstalk et al. ), hence it was selected here.
The RF method applied in this study works by using a set of input variables (CUSUM method outputs), which are then passed to each of the decision trees in the forest. RF classifiers introduce randomness into the modelling process by selecting the splitting variable at each node of a decision tree from a randomly selected sample of the independent input variables. Each tree gives a prediction and the mean of these values is the prediction of the RF. In the event detection method used here, the RF classifier estimates the probability of the presence of a failure event at the WTW. Similar to CUSUM fault detection, the RF classification method is data-driven and learns relevant relations from the dataset of the observed Y water quality signals, which contains pre-labelled events, aiming to classify the condition of the WTW processes as normal or faulty, that is, to predict the presence of a failure event. For reliable predictions of process conditions, suitable relations between the candidate signals, i.e. across the Y water quality signals, needed to be analysed by the classifier. To achieve this, the fine-tuned CUSUM's binary output of the Y [...] Kohavi & John () was used to identify and reject the signals that were considered insignificant or counterproductive for the model's performance. This optimisation process resulted in a final model that was assumed to perform best, i.e. to demonstrate the best ratio between TP and false positives (FP) (see Figure 3) by using only a subset of the original Y water quality signals.
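The mechanics described above (bootstrap samples, a random subset of candidate variables at each node, and averaging of the tree predictions) can be sketched in a few lines of plain Python. This is a deliberately small illustration and not the classifier used in the study: all function names, the depth limit, the number of trees and the toy labelling rule (an event whenever at least two of four signals are flagged) are hypothetical.

```python
import random
from itertools import product


def gini(labels):
    """Gini impurity of a non-empty list of 0/1 labels."""
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def build_tree(rows, labels, n_try, rng, depth=0, max_depth=4):
    """Grow one tree on binary inputs; a leaf stores the fraction of event labels."""
    leaf = sum(labels) / len(labels)
    if depth >= max_depth or leaf in (0.0, 1.0):
        return leaf
    best = None
    # Only a random sample of the input variables is considered at each node.
    for f in rng.sample(range(len(rows[0])), n_try):
        left = [y for r, y in zip(rows, labels) if r[f] == 0]
        right = [y for r, y in zip(rows, labels) if r[f] == 1]
        if not left or not right:
            continue
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[0]:
            best = (score, f)
    if best is None:
        return leaf
    f = best[1]
    lo = [(r, y) for r, y in zip(rows, labels) if r[f] == 0]
    hi = [(r, y) for r, y in zip(rows, labels) if r[f] == 1]
    return (f,
            build_tree([r for r, _ in lo], [y for _, y in lo], n_try, rng, depth + 1, max_depth),
            build_tree([r for r, _ in hi], [y for _, y in hi], n_try, rng, depth + 1, max_depth))


def tree_predict(node, row):
    """Follow the splits until a leaf (a float) is reached."""
    while isinstance(node, tuple):
        f, left, right = node
        node = right if row[f] else left
    return node


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample of the labelled time steps."""
    rng = random.Random(seed)
    n_try = max(1, int(len(rows[0]) ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx], n_try, rng))
    return forest


def event_probability(forest, row):
    """The forest prediction is the mean of the individual tree predictions."""
    return sum(tree_predict(tree, row) for tree in forest) / len(forest)


# Toy data: four binary CUSUM outputs per time step, hypothetical labelling rule.
rows = [list(bits) for bits in product([0, 1], repeat=4)] * 5
labels = [1 if sum(r) >= 2 else 0 for r in rows]
forest = fit_forest(rows, labels)
p_quiet = event_probability(forest, [0, 0, 0, 0])
p_fault = event_probability(forest, [1, 1, 1, 1])
```

In practice, an established library implementation would normally be preferred over a hand-written forest; the sketch only makes the averaging and random variable selection explicit.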

Detection performance assessment
The performance assessment of all detection methods (i.e. ERSs) used here was conducted by simulating the ERSs using historical time series data (5 min intervals) of Y water quality signals with X pre-labelled events contained in the datasets. All ERS methods were first calibrated using the data from the calibration time period. The performance of the calibrated ERS methods was then assessed using unseen data from the validation time period. This was done by creating two-by-two confusion matrices with true/false positives/negatives, showing the distribution of possible outcomes for the Y water quality signals (see Figure 3).
Performance statistics were then calculated for each of the Y water quality signals as shown in Figure 4. The detection performance of the overall ERS is evaluated by averaging the detection rates and summing the FP over all Y observed water quality signals.
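As a rough illustration of this assessment, the sketch below computes a true detection rate, a false discovery rate and an F1 score from confusion matrix counts, and then aggregates them over signals by averaging the detection rates and summing the false positives. The formulas are the commonly used definitions and are assumed here; the exact statistics of Figure 4 may differ. The function names and the example counts are hypothetical.

```python
def detection_stats(tp, fp, fn):
    """Per-signal statistics from confusion matrix counts (standard definitions, assumed)."""
    tdr = tp / (tp + fn) if (tp + fn) else 0.0  # true detection rate (recall)
    fdr = fp / (tp + fp) if (tp + fp) else 0.0  # false discovery rate
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return {"TDR": tdr, "FDR": fdr, "F1": f1}


def overall_stats(per_signal):
    """Aggregate over signals: mean detection rate and total false positives.

    per_signal is a list of (tp, fp, fn) tuples, one per water quality signal.
    """
    stats = [detection_stats(tp, fp, fn) for tp, fp, fn in per_signal]
    return {
        "mean_TDR": sum(s["TDR"] for s in stats) / len(stats),
        "total_FP": sum(fp for _, fp, _ in per_signal),
    }


# Hypothetical counts for two signals.
example = overall_stats([(8, 2, 2), (3, 1, 1)])
```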
The derived performance statistics (see Figure 4) [...] As part of the first analysis, the data of individual signals were examined to identify large amounts of missing data over a significantly long time period (one month, used here). If data was missing for more than one month [...] ∼30% of total time period).

WTW minor and major events
As part of this study, historical events were manually identified and classified into major and minor events to enable the computation of performance measures.
A total of 5 major events were reported by the water company. These are events that resulted in unplanned shutdowns (full or partial) of the WTW. Figure 8 shows [...] Once the events were identified as per the above, major and minor events within the final dataset were labelled accordingly. [...] Table 1). Therefore, an alarm is only raised if the pH value goes below 5.8 or above 7.5 for longer than 10 minutes.
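A threshold rule with a persistence condition, such as the pH rule above, can be sketched as follows. The limits (5.8 and 7.5) and the 10-minute persistence come from the text, and the 5-minute sampling interval from the dataset description. The function name and the interpretation that n consecutive out-of-range samples cover n × 5 minutes are assumptions; the operational system may measure the duration differently.

```python
def threshold_alarms(values, low=5.8, high=7.5, step_min=5, min_duration_min=10):
    """Flag time steps where the signal has been out of range for longer than the limit.

    Assumes equally spaced samples, each representing step_min minutes.
    Returns one 0/1 alarm flag per observation.
    """
    alarms = []
    run = 0  # number of consecutive out-of-range samples so far
    for v in values:
        run = run + 1 if (v < low or v > high) else 0
        alarms.append(1 if run * step_min > min_duration_min else 0)
    return alarms


# Hypothetical pH readings: a short dip (no alarm) and a longer excursion (alarm).
ph = [7.0, 5.5, 5.5, 7.0, 8.0, 8.0, 8.0, 8.0, 7.0]
alarms = threshold_alarms(ph)
```

Under these assumptions the two-sample dip is ignored, while the four-sample excursion raises an alarm from its third sample onwards.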

RESULTS AND DISCUSSION
The aims of the analysis conducted are (a) to evaluate the performance of the developed HC-ERS method in terms of its detection capabilities and (b) to compare the performance of the HC-ERS method with the performance of the E-ERS and the CANARY methods.

E-ERS performance assessment
The performance of the E-ERS is evaluated here by using the [...] Table 1. [...] As can be seen from Table 4, the E-ERS is able to detect only 22% of total events, 64% of major and 21% of minor events, respectively. The significantly higher true detection rate for major events was expected since these events are easier to detect than the minor ones. The E-ERS also generates a considerable number of false alarms, as demonstrated by the FDR of 38% and the high number of

HC-ERS performance assessment
The performance of the HC-ERS is evaluated here in the same way as for the E-ERS. After stepwise elimination of the redundant signals, the performance of the HC-ERS was evaluated using the 16 signals shown in Table 2.
The HC-ERS's detection statistics for the validation dataset are presented in

CANARY performance assessment
The performance of the well-known CANARY method is [...] time steps (1 hr) was selected for the BED window because, similar to above, shorter BED sizes increase the number of alarms, while events of short duration (shorter than the BED) will not be detected with larger BED window sizes.
The NRO used for the analyses were calculated as [...] In addition to the above, HC-ERS is also computationally efficient. Indeed, HC-ERS is capable of processing approximately 300 observations per second, including the sensor data validation and pre-processing procedure, while CANARY processes around 100 observations per second.
These results were obtained on a laptop with a 2.2 GHz i5 processor and 12 GB of RAM.

CONCLUSION AND FUTURE WORK
The work presented in this paper introduces a new methodology for near real-time detection of failure events at WTWs. [...] This, of course, may not be true for other case studies and the selection of sensors to use needs to be identified on a case-by-case basis, via suitable preliminary analysis. Assuming good quality data, the selection of sensors will depend largely on the characteristics of the events being detected and whether and how these events manifest themselves in different water quality signals. In this regard, water quality signals containing complementary information (i.e. sensors of different types) are especially useful as this helps with the detection.
Having said this, using redundant sensor information (i.e. multiple sensors of the same type) can be useful too, as it enables events to be detected with higher true detection rates and lower false alarm rates. Finally, when using the HC-ERS on data from other WTWs, it is important to ensure that a sufficiently large number of real failure events is collected and used for the training of RF classifiers. Again, the exact number of events and their characteristics need to be decided on a case-by-case basis, depending on the nature and characteristics of the events being detected.
The use of enhanced sensors that can provide the 'health status' of assets should also be investigated, to examine possible options for integrating this additional metadata (asset condition) into the detection process. Providing additional information could be beneficial for more reliable detection results and would likely improve the system's overall detection performance.