Effective coagulation is essential to achieving drinking water treatment objectives when treating surface water. To minimize settled water turbidity, artificial neural networks (ANNs) have been adopted to predict optimum alum and carbon dioxide dosages at the Elgin Area Water Treatment Plant. ANNs were applied to predict both optimum carbon dioxide and alum dosages with correlation (*R*^{2}) values of 0.68 and 0.90, respectively. ANNs were also used to develop response surface plots to ease selection of the optimum dosage. Trained ANNs were used to predict turbidity outcomes for a range of alum and carbon dioxide dosages and these were compared to historical data. Point-wise confidence intervals were obtained based on error and squared error values during the training process. The probability of the true value falling within the predicted interval ranged from 0.25 to 0.81 and the average interval width ranged from 0.15 to 0.62 NTU. Training an ANN using the squared error produced a larger average interval width, but a higher probability that the predicted interval contained the true value.

## INTRODUCTION

Effective coagulation is integral to maintaining a multi-barrier approach in conventional drinking water treatment trains (GLUMRB 2012). As such, new approaches are required to optimize water treatment processes and prepare for future challenges. The potential application of artificial neural networks (ANNs) has received much interest due to the fact that ANNs can accurately represent non-linear phenomena such as those found during drinking water treatment (Zhang & Stanley 1999; Baxter *et al.* 2002; Shariff *et al.* 2004; Wu & Lo 2008). The Elgin Area Water Treatment Plant (EAWTP), Port Stanley, ON, Canada adjusts raw water pH using carbon dioxide (CO_{2}) as part of treatment that includes coagulation with aluminum sulfate (alum), and the addition of powdered activated carbon (PAC) for taste and odor control followed by sedimentation and filtration. This study presents results of implementing an ANN trained to optimize alum and CO_{2} dosages based on settled water turbidity (SWT) to illustrate how point-wise confidence intervals and dosage-response surfaces may be used to expand the reliability of ANNs for process optimization.

The ANNs utilized a multi-layer perceptron (MLP) architecture, the most common type to be applied for function approximation in the water treatment industry (Baxter *et al.* 2001). This architecture includes three layers of neurons (input, hidden, and output (Supplementary Material, Figure S1, available online at http://www.iwaponline.com/ws/015/066.pdf)). The input layer scales online process parameter data and feeds it into the network. A hidden layer, comprising several transfer functions analogous to biological neurons, is connected to each input of the first layer by a series of weighted connections called axons. While the hidden layer may contain several layers of neurons, the basic MLP type typically includes one, in which the number of neurons is optimized (Baxter *et al.* 2001; Krishnaiah *et al.* 2007; Griffiths & Andrews 2011a, b). The output layer includes one or more functions that provide predictions for a process (Dayhoff & DeLeo 1999). Although ANNs can successfully be applied to model non-linear processes, they are perceived as ‘black boxes’, which may hinder their adoption and full incorporation into process control strategies (Shariff *et al.* 2004). Many approaches have been suggested to provide better context for ANN outputs such that acceptance barriers can be lowered (Andrews *et al.* 1995; Rafiq *et al.* 2001; Quan *et al.* 2014). In this study, focus has been placed on estimating the error in ANN predictions to improve their reliability by using ANNs to produce customized confidence intervals for each multivariate condition. The goal is to produce confidence intervals that provide greater accuracy than normally distributed *t*-test based confidence intervals and provide guidance regarding regions of poor predictive capacity.

## METHODOLOGY

### Input selection and data pre-processing

Historical water quality and operating data from the primary treatment trains of the EAWTP (Figure 1) were collected for a 1-year period (March 2013 to February 2014). Raw water inputs included temperature, conductivity, pH, and turbidity. In addition, operational parameters included alum dosage, CO_{2} dosage, raw water flow rate, pH following coagulation, SWT, cationic polymer and PAC dosages. These data were used to develop ANNs to predict optimum alum and CO_{2} dosages. To address seasonal variability, the data set was divided into Spring (03/01–06/02), Summer/Fall (06/02–11/16), and Winter (11/17–02/28) periods, using a similar approach to that described by Baxter *et al.* (2001) and Griffiths & Andrews (2011a, b) (Supplementary Material, Table S1, available online at http://www.iwaponline.com/ws/015/066.pdf).

Historical data retrieved from the EAWTP were averaged into 1-hour exemplars, defined as one complete record of inlet and outlet parameters. To improve training of the neural networks, data transformations were applied to modify raw water turbidity values such that they followed a normal distribution (Stein 1993; Rafiq *et al.* 2001) (Supplementary Material, Figure S2, online at http://www.iwaponline.com/ws/015/066.pdf).
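The effect of such a transformation can be illustrated with a minimal sketch, assuming the natural logarithm listed in Table 1 is the transformation applied to raw water turbidity. The turbidity readings and the helper name `skewness` below are hypothetical and are not taken from the plant data.

```python
import math
from statistics import mean, pstdev

def skewness(values):
    """Population skewness: the mean cubed z-score (0 for a symmetric sample)."""
    m, s = mean(values), pstdev(values)
    return mean(((v - m) / s) ** 3 for v in values)

# Hypothetical raw water turbidity readings (NTU), right-skewed by storm events.
raw_ntu = [2.1, 2.8, 3.5, 4.0, 5.2, 6.9, 9.5, 14.0, 40.0, 120.0]

# Natural-log transform, as suggested by the "Ln (raw water turbidity)" inputs.
ln_ntu = [math.log(v) for v in raw_ntu]

print("skewness before:", round(skewness(raw_ntu), 2))
print("skewness after: ", round(skewness(ln_ntu), 2))
```

A skewness closer to zero after the transform indicates a more symmetric input distribution, which is the property sought before training.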

Owing to the hydraulic residence time associated with flow through the unit processes, raw water changes and control actions influence downstream results following a time delay, referred to as a lag or dead time. Thus, the online parameters can be aligned by a procedure known as data lagging, i.e. adjusting input parameters by a specified number of time steps based on the delay between invoking a change and observing a response. Data lags were calculated using hydraulic retention times based on a typical process flow rate (Table 1). Hourly exemplars were averaged once appropriate time delays were implemented.
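The lagging procedure can be sketched in Python as follows. This is an illustration under stated assumptions rather than the procedure used at the plant: the column names, the synthetic series, and the helper functions `lag_series`, `align`, and `hourly_means` are hypothetical, while the 10-min lag step and the lag counts follow Table 1.

```python
def lag_series(values, lag_steps):
    """Delay a 10-min series by lag_steps: index t then holds the value
    recorded lag_steps earlier. Positions with no history are None."""
    shifted = [None] * lag_steps + list(values)
    return shifted[:len(values)]

def align(columns, lags):
    """Build one row per time step, keeping only steps where every
    lagged input is available."""
    lagged = {name: lag_series(vals, lags[name]) for name, vals in columns.items()}
    n_steps = min(len(series) for series in lagged.values())
    rows = []
    for t in range(n_steps):
        row = {name: series[t] for name, series in lagged.items()}
        if all(v is not None for v in row.values()):
            rows.append(row)
    return rows

def hourly_means(rows, steps_per_hour=6):
    """Average complete blocks of six 10-min rows into hourly exemplars."""
    hours = []
    for start in range(0, len(rows) - steps_per_hour + 1, steps_per_hour):
        block = rows[start:start + steps_per_hour]
        hours.append({k: sum(r[k] for r in block) / steps_per_hour for k in block[0]})
    return hours

# Hypothetical 10-min records in which each value equals its own time index,
# so the effect of the shift is easy to read off.
columns = {
    "alum_mg_L": list(range(30)),
    "flow_L_s": list(range(30)),
    "swt_ntu": list(range(30)),
}
lags = {"alum_mg_L": 19, "flow_L_s": 6, "swt_ntu": 0}  # lag steps from Table 1

rows = align(columns, lags)
hourly = hourly_means(rows)
print(rows[0])
print(hourly)
```

In the first aligned row, the settled water turbidity at step 19 is paired with the alum dosage recorded 19 steps (3 h 10 min) earlier and the flow rate recorded 6 steps (1 h) earlier.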

Selected online parameter | Lag^{a} | Time delay | Function in SWT ANN | Function in alum and CO_{2} dose ANN | Function in error-based confidence interval ANN | Function in squared error-based confidence interval ANN |
---|---|---|---|---|---|---|
Ln (raw water turbidity) – Lag 3 (NTU) | 41 | 6 h 50 min | I | I | I | I |
Ln (raw water turbidity) – Lag 2 (NTU) | 31 | 5 h 10 min | I | I | I | I |
Ln (raw water turbidity) – Lag 1 (NTU) | 22 | 3 h 40 min | I | I | I | I |
Raw water temperature (°C) | 22 | 3 h 40 min | I | I | I | I |
Raw water pH | 22 | 3 h 40 min | I | I | I | I |
Raw water conductivity (μS/cm) | 22 | 3 h 40 min | I | I | I | I |
CO_{2} dosage (mg/L) | 19 | 3 h 10 min | I | O | I | I |
Alum dosage (mg/L) | 19 | 3 h 10 min | I | O | I | I |
Polymer dosage (mg/L) | 19 | 3 h 10 min | I | I | I | I |
PAC (mg/L) | 19 | 3 h 10 min | I | I | I | I |
Flow rate (L/s) | 6 | 1 h | I | I | I | I |
SWT (NTU) | 0 | 0 min | O | I | – | – |
Predicted SWT (NTU) | 0 | 0 min | – | – | I | I |
Predicted SWT error (NTU) | 0 | 0 min | – | – | O | – |
Predicted SWT squared error (NTU^{2}) | 0 | 0 min | – | – | – | O |


^{a}1 lag = 10 min.

I = input parameter, O = output parameter.

Inconsistent data (attributed to malfunctioning equipment, recalibration, or shutdown scenarios) were identified using histograms and removed to provide a representative data set from which to train the ANNs.

### ANN training, validation and performance

#### Neural network design

The ANNs were developed using NeuroSolutions^{®} v.5.07, a commercially available ANN software (NeuroDimension Inc 2008). All data were randomized then separated as follows: 60% for training, 20% for cross-validation, and 20% for testing. The training set was used to adjust the weights of the synapses in the ANN and ultimately reduce the mean squared error (MSE), while the cross-validation set was used to determine if the ANN was ‘learning’ a given process or merely ‘memorizing’ (over-fitting) the exemplars. Once training was completed, the testing data were applied to evaluate ANN performance.
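The randomized 60/20/20 partition described above can be sketched as follows. This is a generic illustration, not the NeuroSolutions implementation; the function name `split_exemplars` and the fixed random seed are assumptions made for reproducibility.

```python
import random

def split_exemplars(exemplars, train=0.6, cross_val=0.2, seed=0):
    """Shuffle the exemplars and split them into training, cross-validation,
    and testing subsets. The testing share is whatever remains."""
    shuffled = list(exemplars)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_train = int(round(train * len(shuffled)))
    n_cv = int(round(cross_val * len(shuffled)))
    training = shuffled[:n_train]
    cross_validation = shuffled[n_train:n_train + n_cv]
    testing = shuffled[n_train + n_cv:]
    return training, cross_validation, testing

# Hypothetical data set of 100 hourly exemplars, identified by index.
training, cross_validation, testing = split_exemplars(range(100))
print(len(training), len(cross_validation), len(testing))
```

Because the split is random rather than chronological, each subset samples the whole season, which is consistent with training separate ANNs per season.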

The momentum learning rule was used to develop ANNs in which future steps in the error reduction algorithm were controlled by a momentum coefficient (μ) and a learning rate (*γ*). The momentum coefficient controlled the influence of previous updates in the axon weight matrix on the next step. The learning rate determined the size of the next weight update, balancing forward progress in error reduction against stability in the back-propagation process. The initial values (μ = 0.7 and *γ* = 0.5) were adjusted where necessary to decrease cross-validation error and maintain the speed of the training process. The hidden layer neurons used the hyperbolic tangent (tanh) transfer function, the type recommended for regression problems. Inputs for this type of neuron were automatically scaled from −0.9 to +0.9 (NeuroDimension Inc 2008).
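The roles of the momentum coefficient and the learning rate can be illustrated with the textbook form of gradient descent with momentum, applied here to a single weight and a toy quadratic error. This is a sketch only: the internal update used by NeuroSolutions is not described in this paper, and the function names `momentum_step` and `scale_input` and the toy error function are hypothetical.

```python
def momentum_step(weight, velocity, gradient, learning_rate=0.5, momentum=0.7):
    """One update: the new step combines a fraction (momentum) of the
    previous step with a step down the current error gradient."""
    velocity = momentum * velocity - learning_rate * gradient
    return weight + velocity, velocity

def scale_input(x, x_min, x_max, low=-0.9, high=0.9):
    """Linearly rescale an input from [x_min, x_max] to [low, high]."""
    return low + (high - low) * (x - x_min) / (x_max - x_min)

# Toy error surface E(w) = 0.5 * (w - 3)^2, minimized at w = 3.
weight, velocity = 0.0, 0.0
for _ in range(200):
    gradient = weight - 3.0  # dE/dw for the toy error
    weight, velocity = momentum_step(weight, velocity, gradient)

print(weight)
print(scale_input(30.0, 30.0, 60.0), scale_input(60.0, 30.0, 60.0))
```

With μ = 0.7 and *γ* = 0.5 the weight oscillates around the minimum with decaying amplitude, which illustrates why the two values must be balanced to keep training both fast and stable.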

An MSE was calculated for the training data and for a cross-validation data set to identify any possible over-fitting of the ANN. When the cross-validation error reached a minimum, the training process was terminated (Basheer & Hajmeer 2000). The ANNs were trained in replicates using a range of hidden layer sizes to compare cross-validation errors between the ANNs as described by Chai (1998).
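Stopping at the cross-validation minimum can be sketched as below. The `patience` parameter (the number of epochs without improvement that are tolerated before stopping) is an assumption introduced for illustration, as are the function name and the error values; the source states only that training ended when the cross-validation error reached a minimum.

```python
def early_stopping_epoch(cv_mse_per_epoch, patience=3):
    """Return (epoch, error) of the lowest cross-validation MSE seen before
    `patience` consecutive epochs pass without improvement."""
    best_epoch, best_error = 0, float("inf")
    for epoch, error in enumerate(cv_mse_per_epoch):
        if error < best_error:
            best_epoch, best_error = epoch, error
        elif epoch - best_epoch >= patience:
            break  # cross-validation error has stopped improving
    return best_epoch, best_error

# Hypothetical cross-validation MSE history: it falls, then rises as the
# network starts to memorize the training exemplars.
cv_history = [0.90, 0.50, 0.30, 0.25, 0.27, 0.30, 0.34, 0.20]
print(early_stopping_epoch(cv_history))
```

Training stops before the final value in the history is reached, so the weights kept are those from the epoch with the lowest cross-validation error observed up to the stopping point.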

#### Alum and carbon dioxide dose optimization

Once an ANN had been trained using a set of data, it was then tested with new (previously unseen) data to determine the effectiveness of the learning procedure. For this study, an ANN was used to help understand the effect of dosage changes on process performance, enabling optimization efforts to minimize turbidity. Dosage-response surfaces were developed by systematically introducing a possible range of CO_{2} and alum dosages and observing predicted trends in SWT. When the response surface was smooth and continuous the ANN exhibited stability for a given dosage region. Trends in turbidity reduction with chemical dosages may also be identified based on the surface's minimum or maximum.

Each response surface analyzed in this study utilized conditions from a single exemplar (i.e. one set of historical conditions) to produce a new data set. Multiple new exemplars were created by copying all process conditions, except the CO_{2} and alum dosages which were replaced by a matrix of inputs varying between the 5th and 95th percentiles of historical values (alum (30–60 mg/L) and carbon dioxide (5–25 mg/L)). This data set was tested and an output array plotted as a dosage-response surface. These surfaces are a graphical representation of the predicted outcomes of a series of hypothetical chemical dosages while holding the remaining variables constant. In practice, one of these plots could be produced every time an operator was required to make a dosage decision.
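Construction of the synthetic exemplars for one response surface can be sketched as follows. The dosage ranges follow the 5th to 95th percentile values quoted above; the field names, the grid resolution, the base exemplar values, and in particular `toy_swt_model` (a placeholder standing in for a trained SWT ANN) are hypothetical.

```python
def linspace(low, high, n):
    """n evenly spaced values from low to high inclusive."""
    return [low + (high - low) * i / (n - 1) for i in range(n)]

def dosage_grid(base_exemplar, alum_range=(30.0, 60.0), co2_range=(5.0, 25.0), steps=7):
    """Copy one historical exemplar for every alum/CO2 dosage combination,
    leaving all other process conditions unchanged."""
    grid = []
    for alum in linspace(alum_range[0], alum_range[1], steps):
        for co2 in linspace(co2_range[0], co2_range[1], steps):
            exemplar = dict(base_exemplar)
            exemplar["alum_mg_L"] = alum
            exemplar["co2_mg_L"] = co2
            grid.append(exemplar)
    return grid

def toy_swt_model(exemplar):
    """Placeholder for a trained SWT ANN; NOT the model from this study."""
    return 0.5 + 0.01 * exemplar["alum_mg_L"] + 0.005 * exemplar["co2_mg_L"]

# Hypothetical historical exemplar (only a few fields shown).
base = {"raw_turbidity_ntu": 40.0, "raw_ph": 8.1, "alum_mg_L": 45.0, "co2_mg_L": 12.0}

grid = dosage_grid(base)
surface = [(e["alum_mg_L"], e["co2_mg_L"], toy_swt_model(e)) for e in grid]
best = min(surface, key=lambda point: point[2])  # lowest predicted SWT
print(len(surface), best)
```

Plotting the third element of each tuple against the first two gives the dosage-response surface, and the minimum identifies the candidate dosage pair within the range explored.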

#### Error prediction training

Error estimation based on point-wise confidence intervals was used, as described by others (Lowe & Zapart 1999; Khosravi *et al.* 2011). Error was calculated based on the difference between ANN predictions and measured values. Errors were then appended to the associated exemplars in the original input data set. New ANNs were then trained using this data set to estimate the error associated with each prediction. The absolute values of these predictions were used as confidence intervals. This was repeated with a second set of error values that were squared to impose a greater penalty on poor predictions and provide a higher contrast with respect to performance. Confidence intervals were produced by taking the square root of the ANN predictions. These intervals were customized for each set of input conditions to provide a more flexible and real-time estimation of the ANN's performance in predicting process conditions.
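The two kinds of training target and the conversion of ANN outputs into interval half-widths can be sketched as below. The sign convention (measured minus predicted), the clamping of negative squared-error predictions at zero, and all names and values are assumptions made for illustration.

```python
import math

def error_targets(measured, predicted):
    """Training targets for the two confidence interval ANNs:
    signed errors and squared errors of the SWT predictions."""
    errors = [m - p for m, p in zip(measured, predicted)]
    squared_errors = [e * e for e in errors]
    return errors, squared_errors

def half_width_from_error_ann(predicted_error):
    """Error ANN: the absolute value of its output is the interval half-width."""
    return abs(predicted_error)

def half_width_from_squared_error_ann(predicted_squared_error):
    """Squared error ANN: the square root of its output is the half-width.
    Negative outputs are clamped to zero (an assumption, to keep sqrt defined)."""
    return math.sqrt(max(predicted_squared_error, 0.0))

# Hypothetical measured and predicted settled water turbidity (NTU).
measured = [1.0, 0.8, 1.4]
predicted = [0.7, 1.0, 1.4]
errors, squared_errors = error_targets(measured, predicted)
print(errors, squared_errors)
print(half_width_from_error_ann(-0.2), half_width_from_squared_error_ann(0.09))
```

Squaring makes a large miss count disproportionately during training, which is the mechanism by which the second ANN is penalized more heavily for poor predictions.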

#### Optimization and performance measures

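The performance of the point-wise confidence intervals was assessed using the prediction interval coverage probability (PICP) (Equation (1)) (Khosravi *et al.* 2011; Quan *et al.* 2014):

$$\mathrm{PICP} = \frac{1}{n}\sum_{i=1}^{n} c_i \qquad (1)$$

where *c*_{i} = coverage index, *n* = number of exemplars in the data set, and PICP = probability that any given confidence interval includes the true value. When the point-wise confidence interval (predicted SWT ± error estimate) includes the true value, *c*_{i} = 1; otherwise *c*_{i} = 0. Since a PICP score of 1.0 could be achieved by an unrealistically wide set of prediction ranges, the prediction interval average width (PIAW) was also used. The PIAW was modified from that described by Quan *et al.* (2014) to account for the single error estimate produced by the ANNs rather than upper and lower bounds (Equation (2)). The absolute value of the error estimate was doubled to calculate the span of the interval:

$$\mathrm{PIAW} = \frac{1}{n}\sum_{i=1}^{n} 2\,\lvert \hat{e}_i \rvert \qquad (2)$$

where $\hat{e}_i$ = error estimate produced by the confidence interval ANN for exemplar *i*.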
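The two measures described in this section, the PICP as the mean coverage index and the PIAW as the mean of twice the absolute error estimate, can be computed with a short Python sketch. The function names and the sample values are hypothetical; treating a true value that lies exactly on the interval boundary as covered is an assumption.

```python
def picp(actual, predicted, error_estimates):
    """Fraction of exemplars whose true value lies within
    predicted +/- |error estimate| (coverage index c_i = 1), else c_i = 0."""
    coverage = [
        1 if abs(a - p) <= abs(e) else 0
        for a, p, e in zip(actual, predicted, error_estimates)
    ]
    return sum(coverage) / len(coverage)

def piaw(error_estimates):
    """Average interval width: twice the absolute error estimate, averaged."""
    return sum(2.0 * abs(e) for e in error_estimates) / len(error_estimates)

# Hypothetical settled water turbidity values (NTU) and ANN error estimates.
actual = [1.0, 1.2, 0.8, 1.5]
predicted = [1.1, 1.0, 0.8, 1.0]
error_estimates = [0.15, -0.10, 0.05, 0.60]

print(picp(actual, predicted, error_estimates))
print(piaw(error_estimates))
```

In this example three of the four intervals contain the true value. Widening every interval would raise the PICP towards 1.0 but would also raise the PIAW, which is why the two measures are read together.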

## RESULTS AND DISCUSSION

### Optimum alum dosage and CO_{2} dosage ANNs

Results of the alum dosage and CO_{2} dosage ANNs are shown in Table 2. Predictions from one ANN for both alum dosage and CO_{2} dosage were compared to historical dosages (Figure 2(a) and 2(c)). The CO_{2} dosage predictions (mean absolute error (MAE) = 2.83 mg/L, *R*^{2} = 0.68 and the 95% confidence interval = ±7.62 mg/L (Figure 2(b))) provided a weaker correlation and greater variability than the alum predictions discussed below. The ANN under-predicted CO_{2} dosages above 10 mg/L and over-predicted below 10 mg/L. The true value falls within the 95% confidence interval of the trend line from 2 to 35 mg/L. It should be noted that two major alum dosage peaks in April (Figure 2(c)) did not have corresponding spikes in CO_{2} dosages, which peaked in early May (Figure 2(a)). This illustrates that challenge conditions did not necessarily result in an operational decision to increase the dosages of both chemicals. During the period of data collection, the carbon dioxide dosages had been chosen for pH reduction, rather than for the minimization of SWT. The different dosage objectives may have resulted in a wider variation of possible dosage conditions to satisfy the training outputs.

Chemical | Season | ANN geometry (Input → Hidden → Output) | Range (mg/L) | MAE (mg/L) | *R*^{2} |
---|---|---|---|---|---|
Carbon dioxide | Spring | 10 → 5 → 2 | 2.29–48.2 | 2.83 | 0.68 |
Carbon dioxide | Summer/Fall | 10 → 9 → 2 | 9.25–78.0 | 6.24 | 0.53 |
Carbon dioxide | Winter | 10 → 6 → 2 | 0.0–95.7 | 7.48 | 0.66 |
Alum | Spring | 10 → 5 → 2 | 28.4–107 | 2.74 | 0.90 |
Alum | Summer/Fall | 10 → 9 → 2 | 16.5–104 | 3.02 | 0.82 |
Alum | Winter | 10 → 6 → 2 | 8.92–87.2 | 2.55 | 0.89 |


A similar trend plot for predicted and actual alum dosages is shown in Figure 2(c). The ANN tended to under-predict alum dosage; however, the slope of the trend line (0.92) was very close to the ideal slope (1.0) and the true alum dosages fell within 95% confidence intervals for the entire data set (Figure 2(d)). These results compare favorably with those reported by Yu *et al.* (2000): the *R*^{2} values exceed those (range 0.49–0.80) of the ANNs that did not use previous coagulant dosages as training inputs, and are slightly poorer than those of the ANNs that did (range 0.90–0.97). They also compare favorably to results reported by Maier *et al.* (2004), even with a wider range of raw water turbidity, and exceed the *R*^{2} range reported by Wu & Lo (2008) for ANNs.
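The performance measures quoted for the dosage ANNs (MAE, *R*^{2}, and the slope of the predicted versus actual trend line) can be computed as in the following sketch. It assumes that *R*^{2} is the squared Pearson correlation between actual and predicted dosages and that the trend line is an ordinary least-squares fit of predicted on actual values; the dosage values are hypothetical.

```python
from statistics import mean

def mae(actual, predicted):
    """Mean absolute error between actual and predicted dosages."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))

def _sums(actual, predicted):
    """Centered sums of squares and cross-products."""
    ma, mp = mean(actual), mean(predicted)
    sxx = sum((a - ma) ** 2 for a in actual)
    syy = sum((p - mp) ** 2 for p in predicted)
    sxy = sum((a - ma) * (p - mp) for a, p in zip(actual, predicted))
    return sxx, syy, sxy

def r_squared(actual, predicted):
    """Squared Pearson correlation (assumed definition of R^2)."""
    sxx, syy, sxy = _sums(actual, predicted)
    return sxy ** 2 / (sxx * syy)

def trend_slope(actual, predicted):
    """Least-squares slope of predicted versus actual (ideal value 1.0)."""
    sxx, _, sxy = _sums(actual, predicted)
    return sxy / sxx

# Hypothetical alum dosages (mg/L).
actual = [30.0, 40.0, 50.0, 60.0]
predicted = [32.0, 39.0, 50.0, 57.0]
print(mae(actual, predicted), r_squared(actual, predicted), trend_slope(actual, predicted))
```

A slope below 1.0 combined with a high *R*^{2} corresponds to the pattern described above, in which predictions track the historical dosages closely but are compressed towards the mean.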

### Settled water turbidity and response surface analysis

A response surface was produced using conditions from the plant on 3 May 2013 that included average raw water turbidity for that season (40 NTU) (Figure 3). Similar surfaces for the Summer/Fall and Winter seasons were also produced (see examples: Supplementary Material, Figures S3 and S4, available online at http://www.iwaponline.com/ws/015/066.pdf). Dosages for this surface ranged from the 5th to 95th percentiles for both chemicals (CO_{2} from 5 to 25 mg/L, alum from 30 to 60 mg/L). The ANN produced a surface with the maximum SWT predicted at the highest alum and CO_{2} dosages. A monotonic decrease in SWT was observed with respect to both chemical dosages, finally reaching a minimum SWT at 5 mg/L CO_{2} and 30 mg/L alum. Further experimentation would be required to investigate the effect of chemical dosages beyond this range to minimize SWT.

Since the prediction of higher SWT with higher chemical dosages is counter to the expected result, further investigation was required. A histogram was used to compare the actual average SWT for all raw water turbidity conditions as a point of reference for the ANN response surface (Supplementary Material, Figure S5, online at http://www.iwaponline.com/ws/015/066.pdf). The averages show increasing SWT as the alum dosage increased, with the highest SWT reported for the highest alum dosages. The increase in SWT as alum dosage increases was consistent with both the operational decision to increase alum dosage as raw water turbidity increased and the correlation between raw water turbidity and SWT (Supplementary Material, Figure S6, online at http://www.iwaponline.com/ws/015/066.pdf). These correlations suggest that some degree of confounding can be learned by the ANN, as represented in the SWT dosage-response predictions in Figure 3. Plotting response surfaces such as these has not been done as a diagnostic for ANN training in online coagulation in previous research and has brought a novel and valuable perspective to the use of historical data for ANN training. This effect was only detected after production of the dosage-response surfaces and appears to be symptomatic of the historical data used to train the ANN. However, its impact is small, since the standard performance metrics remained acceptable. To improve the reliability of these predictions, further research was undertaken to incorporate error estimates into the model and thus allow the ANN to identify the confidence level of a given prediction. This mitigates the confounding issue by providing a second check on each prediction and is discussed in more detail in the following section.

### ANN predicted confidence intervals

A summary of the error ANN and squared error ANN results can be found in Table 3. Absolute values of the error estimates were used as negative and positive confidence intervals (vertical whiskers) for the predictions of the SWT ANN (Figure 4(a)). The PICP is represented by the proportion of the whiskers that bracket the ideal line shown on the plot (actual = predicted). For the ANN trained using the error terms, PICP = 0.25, indicating that the whiskers bracket the true value 25% of the time. The PIAW was 0.15 NTU, which was less than the standard deviation for Spring SWT (0.76 NTU). The ANN trained using the squared error produced wider intervals with higher coverage (PICP = 0.81 and PIAW = 0.62 NTU) (Figure 4(b)). An increased PICP value was desired, but came at the expense of an increased PIAW. However, the PIAW for the squared error ANN was less than the 95% confidence interval range (1.2 NTU) for the SWT, and less than the 80% confidence interval range (0.79 NTU) to which it should correspond. The point-wise confidence intervals provide a measure of context for operators to assess the dosage predictions of a given ANN. The PIAW and the PICP can be used to determine which ANN possesses the characteristics most desirable for use in a process control context, typically one with a balance between maximal probability of correct prediction and minimal interval size. As previously reported by Quan *et al.* (2014) when comparing ANN results for dry bulb temperature of industrial driers, a compromise was observed between increasing the PICP value of the ANN and decreasing the PIAW. A typical optimum ANN in that work provided a PICP of 0.90–0.93, while the squared error ANN PICP of this study ranged from 0.78 to 0.84 due to a difference in ANN optimization methods. These customized ANN-produced confidence intervals represent an original contribution to online dosage selection for coagulation processes by allowing operations staff to determine in real time their confidence in any given ANN prediction, thus reducing one of the key barriers to adoption by the industry: uncertainty about the accuracy of model predictions (Shariff *et al.* 2004).
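The comparison between an ANN-produced PIAW and a distribution-based interval of the same probability can be sketched as follows. This assumes that the prediction errors are approximately normal and that the sample is large enough for the *t* distribution to be approximated by the standard normal; how the *t*-test ranges in Table 3 were actually computed is not detailed here. The function name and the error standard deviation of 0.30 NTU are hypothetical.

```python
from statistics import NormalDist

def normal_interval_range(error_sd, coverage):
    """Full width of a symmetric interval that contains a fraction `coverage`
    of normally distributed errors with standard deviation error_sd."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2.0)  # two-sided critical value
    return 2.0 * z * error_sd

# PICP and PIAW for the Spring squared error ANN, as reported in the text.
picp_value, piaw_value = 0.81, 0.62

# Hypothetical standard deviation of SWT prediction errors (NTU); illustrative only.
error_sd = 0.30

reference_range = normal_interval_range(error_sd, picp_value)
print(round(reference_range, 2), piaw_value < reference_range)
```

When the PIAW is smaller than the reference range at the same coverage, the customized intervals are on average tighter than a single fixed-width interval would need to be to capture the true value equally often.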

Confidence interval ANN | Season | ANN geometry (Input → Hidden → Output) | PICP | PIAW (NTU) | *t*-test range at PICP probability (NTU) |
---|---|---|---|---|---|
Error ANN | Spring | 12 → 12 → 1 | 0.25 | 0.15 | 0.20 |
Error ANN | Summer/Fall | 13 → 15 → 1 | 0.09 | 0.02 | 0.06 |
Error ANN | Winter | 13 → 7 → 1 | 0.29 | 0.06 | 0.24 |
Squared error ANN | Spring | 12 → 7 → 1 | 0.81 | 0.62 | 0.79 |
Squared error ANN | Summer/Fall | 13 → 14 → 1 | 0.78 | 0.21 | 0.61 |
Squared error ANN | Winter | 13 → 5 → 1 | 0.84 | 0.27 | 0.90 |


## CONCLUSIONS

The study investigated the use of ANNs to improve coagulation performance at the EAWTP. Three optimization approaches were investigated: the prediction of alum and CO_{2} dosages using historical data, the use of SWT response surfaces based on hypothetical inputs, and a comparison between the use of errors and squared errors as ANN outputs for point-wise confidence intervals.

Alum dosages were successfully predicted for one calendar year of data (Spring, Summer/Fall and Winter), while the greater variation of the CO_{2} dosages in the ANN predictions may have been due to dosage decisions being based on pH, rather than being adjusted for SWT. A trained SWT ANN was used to produce response surfaces that predicted the effects of the chemical additions when searching for an optimum dosage-response to minimize the SWT or chemical inputs. The error ANN produced smaller average confidence intervals than the squared error ANN, but the probability of capturing the true value in the predicted interval was also reduced. The squared error ANN demonstrated a smaller average interval range than the range of the *t*-test statistic of the same (PICP) probability. Both confidence interval estimates and response surfaces show promise as parts of a real-time decision-making process for operators, or as offline tools used in conjunction with traditional optimization exercises such as jar tests. Further research should be undertaken to validate the response surfaces with pilot plant and/or laboratory scale data to eliminate concerns about confounded parameters and to confirm the best means of fully integrating artificial intelligence with the existing SCADA and operations personnel of the EAWTP.

## ACKNOWLEDGEMENTS

This work was funded in part by the Natural Sciences and Engineering Research Council of Canada (NSERC) Industrial Research Chair at the University of Toronto and the Ontario Graduate Scholarship (OGS) program. The authors would like to acknowledge Carolyn de Groot from the Elgin and Huron Primary Water Supply Systems for her assistance with process data collection and support.