Artificial neural network modeling approach for the prediction of five-day biological oxygen demand and wastewater treatment plant performance

The measurement of the wastewater BOD5 level requires five days, so a prediction model that estimates BOD5 saves time and enables the adoption of an online control system. This study investigates the application of artificial neural networks (ANNs) in predicting the influent BOD5 concentration and the performance of wastewater treatment plants (WWTPs). The WWTP performance was defined in terms of the COD, BOD, and TSS concentrations in the effluent. Sensitivity analysis was performed to identify the best-performing ANN structure and configuration. The results showed that the ANN model developed to predict the BOD concentration performed the best among the three outputs. The top-performing ANN models yielded R values of 0.752, 0.612, and 0.631 for the prediction of the BOD, COD, and TSS concentrations, respectively. The best-performing models used a three-input, one-output configuration, and the influent temperature and conductivity appeared as inputs in all of them, which indicates that these two parameters greatly affect the WWTP performance. The prediction model developed for the influent BOD5 concentration attained a high accuracy (R = 0.754), which implies that the model is viable as a soft sensor for online control and management systems for WWTPs. Overall, the ANN model provides a simple approach for the prediction of the complex processes of WWTPs.


INTRODUCTION
Rapid urban development leads to an increase in wastewater flow rates, which demands rapid advancement in treatment methods. Wastewater treatment plant (WWTP) effluent quality is a key factor in environmental and health concerns (Hamed et al.). Thus, it is important to be able to model and simulate WWTPs to identify optimal conditions and to avoid future operation failures. However, the variables involved in WWTPs are numerous and exhibit a high level of complexity, which makes them difficult to model with linear regression (Hamed et al.). Moreover, physical and chemical simu- [...] (FFNN), which was developed using MATLAB software. In all single-input networks, the network structure consists of three layers (input, hidden, and output layers). In multi-input networks, the network structure consists of four layers (two hidden layers). The optimal network structure was found to be 1-40-1 with the COD as the input variable (R² = 0.987) [...] pH. The results indicated that the optimal network structure was 8-7-1, which led to a high EQI prediction efficiency (R = 0.96 and MSE = 0.1).
Moreover, Abba & Elkiran predicted the COD of the effluent of wastewater treatment plants using a feedforward neural network (FFNN). The COD of the effluent is considered a major parameter to assess the performance of WWTPs. An FFNN was applied to predict the COD, BOD, pH, T-P, T-N, TSS, and conductivity of the effluent at WWTP inlets. Several structures and input combinations were considered; however, the FFNN model with an 8-8-1 network structure, utilizing all four input parameters, yielded good performance and the highest accuracy in effluent COD prediction (R² = 0.7 and RMSE = 0.0108).
The use of neural network models to simulate wastewater plants provides a framework for plant operation monitoring (Mjalli et al.). This monitoring framework helps to minimize the cost of operation and to assess environmental stability. However, previous studies did not present the method used for determining the ANN configuration. Thus, this study provides a simple approach to identify the optimal ANN configuration for WWTP performance prediction.
Moreover, several regression models and artificial neural networks were developed to predict the five-day biological oxygen demand (BOD5) [...] Although a few studies have developed soft sensors for BOD monitoring, the effect of the input parameters on the prediction of the inlet BOD concentration has not been analyzed; this analysis is performed in this study.
Most of the literature utilized TSS, COD, and BOD to predict the performance of WWTPs using ANN-based models. These parameters are considered the controlling parameters of the effluent quality from WWTPs (Abba & Elkiran). Furthermore, WWTPs commonly consist of a series of complex processes that cannot be modeled using simple regression techniques due to the large number of input parameters and data points required to capture the plant performance. Thus, the capacity of ANN modeling makes it a reliable solution in modeling the performance of wastewater treatment plants.
The objective of this paper is to investigate the application of ANNs in WWTP modeling and to provide a simple algorithm to determine the optimal ANN configuration capturing the complex behavior of treatment plants. In contrast to most of the previous literature, this study aims to provide a simple determination method for the optimal configuration, which is defined in this study as a simple and general configuration that can be extrapolated, in future studies, to multiple treatment plants with similar processes. Additionally, the generated model could be utilized in the design of control systems and the monitoring of plant performance and water quality parameters. This study also aims to develop a viable ANN to act as a soft sensor of the influent BOD5 in WWTPs, as this specific approach is lacking in the previous literature. The soft sensor is expected to reduce the five-day period of the BOD5 measurement to several hours, which will allow the use of online control systems.

METHODOLOGY AND MODEL DEVELOPMENT

Data collection and preparation
The data analyzed in this study were collected from the Kabd WWTP, which is located in Kuwait, over a seven-year period (2013 to 2019). A total of six wastewater parameters were adopted as model inputs: the influent temperature, pH, conductivity, and TSS, COD, and BOD concentrations. These parameters were employed to predict the WWTP performance, which is represented by the effluent BOD, COD, and TSS concentrations. The influent temperature, pH, conductivity, TSS, and COD were used to predict the influent BOD5 concentration. The data set contained 2,397 data points, some of which had missing values. Thus, IBM SPSS Statistics 26 was applied to conduct missing value analysis along with descriptive analysis on the data set, as indicated in Table 1. The BOD5 concentration had the most missing values of all the parameters, which could be explained by worker negligence in performing the tedious five-day experimental procedure to estimate the BOD5 value. Thus, a BOD5 prediction model is a valuable approach to estimate these missing values. Listwise deletion of missing data points was conducted using a purpose-written MATLAB code, which located the missing values of any parameter and deleted the whole data row. After removing the rows with a missing value in any parameter, 1,032 data points remained.
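The listwise deletion step described above can be illustrated with a minimal sketch. It is written in Python rather than the MATLAB used in the study, the function name `listwise_delete` is hypothetical, and the records are made-up values, not plant data.

```python
import math


def listwise_delete(rows):
    """Drop every row that contains at least one missing value (None or NaN)."""
    def is_missing(value):
        return value is None or (isinstance(value, float) and math.isnan(value))

    return [row for row in rows if not any(is_missing(v) for v in row)]


# Made-up records: (temperature, pH, conductivity, TSS, COD, BOD5)
records = [
    (24.1, 7.2, 1510.0, 180.0, 420.0, 210.0),
    (25.3, 7.1, 1480.0, 175.0, 410.0, None),            # BOD5 not measured
    (23.8, float("nan"), 1525.0, 190.0, 435.0, 220.0),  # pH missing
    (24.6, 7.3, 1495.0, 185.0, 428.0, 215.0),
]

complete = listwise_delete(records)
print(len(records), "->", len(complete))
```

Because a single missing entry removes the whole row, the parameter with the most gaps (BOD5 in this data set) largely determines how many rows survive.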
Moreover, data normalization was performed according to Equation (1), where $Y$ represents the values of each parameter studied and $Y_{min}$ and $Y_{max}$ are the minimum and maximum values, respectively, of the variable to be normalized. Furthermore, a correlation matrix was developed to check the independence of the input parameters. Data preprocessing and preparation were performed using MATLAB 2017b.
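As an illustration of normalization based on the minimum and maximum of each variable, the following Python sketch applies the common min-max form. The target range is an assumption: the default below scales to [0, 1], while other ranges such as [-1, 1] are also widely used, and the text does not state which was adopted. The function name and the sample values are hypothetical.

```python
def min_max_normalize(values, lower=0.0, upper=1.0):
    """Rescale values linearly so that min(values) -> lower and max(values) -> upper."""
    y_min, y_max = min(values), max(values)
    if y_max == y_min:
        raise ValueError("cannot normalize a constant column")
    span = y_max - y_min
    return [lower + (upper - lower) * (y - y_min) / span for y in values]


# Made-up influent COD values (mg/L), not plant data
cod = [410.0, 420.0, 428.0, 435.0]
cod_scaled = min_max_normalize(cod)
print(cod_scaled)
```

Each parameter is normalized separately, so that variables with large magnitudes (such as conductivity) do not dominate those with small magnitudes (such as pH) during training.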

Artificial neural network
The idea of strong computing methods, known as neural networks, as an equivalent to the human brain, was developed [...] proportional to the complexity of the system to be modeled (Dreyfus). The working principle of ANNs is essentially the same across network types. The basic processing element, the neuron, receives input signals, processes them through an activation function, and provides an output signal. The weights of each neuron and the transfer functions are responsible for passing signals from one layer to the next. The mathematical expression of the neural network working principle is given in Equation (2), where $Y_i$ is the value of predicted output $i$, $f$ is the activation function, $W_{ij}$ is the weight assigned to each input $j$, $M$ is the total number of inputs, and $b_i$ is the bias for each output.
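Consistent with the symbol definitions given for Equation (2), the standard weighted-sum form of a neuron output can be written as follows. The input symbol $X_j$ is an assumption introduced here for illustration, since the text defines the weights, bias, and activation function but does not name the input signal.

```latex
Y_i = f\left( \sum_{j=1}^{M} W_{ij} X_j + b_i \right)
```

Each output is therefore the activation function applied to a bias plus the weighted sum of the $M$ inputs.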
The most common type of activation function is the sigmoid function (Haykin). Examples of the sigmoid function are the tan-sigmoid function and the log-sigmoid function, expressed by Equations (3) and (4), respectively, where $x$ is the input of the activation function. In this study, the MATLAB 2017b neural network tool is applied to develop and train the feedforward neural network with the tan-sigmoid transfer function (Equation (3)) for all layers.
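The two sigmoid functions and the single-neuron computation can be sketched in Python as below. The formulas are the standard definitions of the tan-sigmoid, 2/(1 + exp(-2x)) - 1, and the log-sigmoid, 1/(1 + exp(-x)). The helper `neuron_output`, the weights, and the inputs are hypothetical illustration values, not the trained model.

```python
import math


def tansig(x):
    """Tan-sigmoid: maps any real x into (-1, 1); algebraically equal to tanh(x)."""
    return 2.0 / (1.0 + math.exp(-2.0 * x)) - 1.0


def logsig(x):
    """Log-sigmoid: maps any real x into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(inputs, weights, bias, activation=tansig):
    """Apply the activation function to the bias plus the weighted sum of the inputs."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(weighted_sum)


# Made-up normalized inputs (e.g. temperature, conductivity, pH) and weights
y = neuron_output([0.2, -0.5, 0.9], [0.4, 0.1, -0.3], bias=0.05)
print(y)
```

The tan-sigmoid output is centered on zero, whereas the log-sigmoid output is always positive; this is the main practical difference between the two.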
The ANN is trained using one input matrix containing up to six inputs for each of the three outputs. Table 2

Correlation matrix
The developed correlation matrix, summarized in Table 3, shows no appreciable linear relationship among the parameters. This does not rule out other types of relationships; rather, the small correlation coefficients indicate that conventional linear regression methods are unsuitable for predicting such a complex system. Moreover, the input parameters are not correlated with one another, which is required to train a reliable ANN, because correlated input parameters bias the ANN towards the effect of these parameters.
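A correlation matrix of this kind can be computed as in the following Python sketch, which assumes that the coefficients are Pearson linear correlation coefficients. The function names and the sample columns are hypothetical and do not reproduce Table 3.

```python
import math


def pearson_r(x, y):
    """Pearson linear correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Return a nested dict with the pairwise correlation of every pair of columns."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}


# Made-up influent measurements, not plant data
data = {
    "temperature": [24.1, 25.3, 23.8, 24.6, 26.0],
    "pH": [7.2, 7.1, 7.4, 7.3, 7.0],
    "conductivity": [1510.0, 1480.0, 1525.0, 1495.0, 1502.0],
}
matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(name, {k: round(v, 3) for k, v in row.items()})
```

Off-diagonal entries close to zero indicate weak linear dependence only; a nonlinear relationship between two parameters can still exist.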

ANN modeling and prediction
The development of the ANN model was realized using various structures to obtain the ANN with the optimal performance. First, each of the influent parameters was applied as input to predict each of the effluent parameters (one input to one output). Then, the number of inputs was gradually increased until all the inputs (six inputs) were applied to predict each output. The number of hidden [...] The key performance indicators considered in this study are the coefficient of determination (R²) (Equation (5)) and the index of agreement (d) (Equation (6)) between the predicted values and the measured values for each ANN developed, where N is the total number of points, OBS is the measured value of the variable, P is the corresponding predicted value of the variable, and the bar indicates the mean value of the variable.
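The two performance indicators can be computed as in the Python sketch below. Both forms are assumptions: R² is taken here as the squared Pearson correlation between the measured and predicted values, and d as Willmott's index of agreement, d = 1 - Σ(P - OBS)² / Σ(|P - mean(OBS)| + |OBS - mean(OBS)|)². The sample values are made up.

```python
def r_squared(obs, pred):
    """Squared Pearson correlation between measured and predicted values."""
    n = len(obs)
    mean_obs, mean_pred = sum(obs) / n, sum(pred) / n
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(obs, pred))
    ss_obs = sum((o - mean_obs) ** 2 for o in obs)
    ss_pred = sum((p - mean_pred) ** 2 for p in pred)
    return cov * cov / (ss_obs * ss_pred)


def index_of_agreement(obs, pred):
    """Willmott's index of agreement: 1 means perfect agreement."""
    mean_obs = sum(obs) / len(obs)
    squared_error = sum((p - o) ** 2 for o, p in zip(obs, pred))
    potential_error = sum(
        (abs(p - mean_obs) + abs(o - mean_obs)) ** 2 for o, p in zip(obs, pred)
    )
    return 1.0 - squared_error / potential_error


# Made-up effluent concentrations (mg/L), not plant data
measured = [10.0, 12.0, 9.0, 15.0, 11.0]
predicted = [11.0, 12.0, 10.0, 13.0, 11.0]
print(r_squared(measured, predicted), index_of_agreement(measured, predicted))
```

The two indicators are complementary: R² measures how well the predictions track the variation of the measurements, while d also penalizes systematic offsets between the two series.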
The results of developing a one input to one output [...] The best input performances in predicting the effluent BOD were obtained for the influent BOD, pH, and temperature. The top three COD predictors were the temperature, pH, and conductivity. The effluent TSS was best predicted by the temperature, influent BOD, and influent TSS. The effect of the initial pH on the BOD rate and wastewater treatment process has been examined in previous literature (Ahsan et al.).
The next step was to train three-input, one-output ANN models. It should be noted that every possible combination of three input parameters was considered with different numbers of hidden layers.
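Enumerating every three-input combination of the six influent parameters can be sketched in Python as follows. The parameter labels are shorthand, and the ordering of the combinations is that produced by `itertools.combinations`, which need not match the numbering used in the figures.

```python
from itertools import combinations

# The six influent parameters adopted as model inputs
INPUTS = ["temperature", "pH", "conductivity", "TSS", "COD", "BOD"]

three_input_sets = list(combinations(INPUTS, 3))
print(len(three_input_sets))  # 20 combinations: 6! / (3! * 3!)

# Combinations that contain both temperature and conductivity
with_temp_and_cond = [
    combo for combo in three_input_sets
    if "temperature" in combo and "conductivity" in combo
]
print(with_temp_and_cond)
```

Training each of the 20 combinations with several hidden-layer counts multiplies the number of networks to be trained, which is why the search is restricted to a small set of structures.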
Effluent BOD prediction
[...] neural network is designed to predict the whole treatment process.

Effluent COD prediction
The accuracy results for each input combination are shown in Figure 6 using an ANN with different numbers of hidden layers. The input-26 combination (temperature-conductivity-pH) outperformed the other input combinations.
It was also observed that the combinations containing temperature + conductivity or temperature + pH (inputs 20 to 26) resulted in a higher prediction performance than that of the other combinations. The results also revealed performance spikes for the other combinations; however, these were not considered due to the lack of consistency.
The temperature-conductivity-pH combination was trained for various network structures, and the optimal ANN structure (R² = 0.6115 and d = 0.877) contained three hidden layers with 13 neurons for each layer (1-13-13-13-1), as shown in Figure 7. Figures 8 and 9 show the performance of the three-input ANN model as a parity plot between the measured and predicted values and as a comparison of the predicted and measured values along the data sequence, respectively.

Effluent TSS prediction
For the TSS ANN model, the input-20 combination (temperature-conductivity-BOD) yielded better results than the other combinations, as shown in Figure 10. Moreover, the input combinations containing the conductivity attained a higher prediction accuracy than the other combinations.
In contrast to BOD and COD, optimum TSS modeling occurred with four hidden layers rather than with three hidden layers. This suggests that TSS modeling is more complex than BOD and COD modeling.
The temperature-conductivity-BOD combination was trained for various network structures, and the optimal ANN structure (R² = 0.6308 and d = 0.884) consisted of four hidden layers with 11 neurons for each layer (1-11-11-11-11-1), as shown in Figure 11. Figures 12 and 13 show the performance of the three-input ANN model as a parity plot between the measured and predicted values and as a comparison of the predicted and measured values along the data sequence, respectively. TSS was found to be the most challenging parameter to model. However, the developed ANN managed to predict TSS with sufficient accuracy relative to the measured values. As with [...]
To study the cause of the low value of the performance indicator for the models developed in this paper, residual plots were generated, which depict the goodness of fit of the created ANN models (Guo et al.). Figure 14 shows the residual plots for each model. The weak relationship between the residuals and the predicted values indicates that the low model performance is mainly due to data noise.
Hence, the ANN model would be expected to yield a higher prediction accuracy with smoother data. Moreover, ANN models provide sufficient accuracy for the prediction of the effluent concentration.

Influent BOD 5 prediction
Several ANN configurations (i.e., various input combinations, layers, and neurons) were developed and tested for the prediction of the BOD5 concentration in the WWTP influent. Figure 15 shows the best prediction results of the test data set for each ANN configuration with respect to the coefficient of determination (R²). The difference between underfitting and overfitting is clearly demonstrated in Figure 15, where the R² values are lower using either a small (one layer) or large (five layers) number of layers, and the maximum R² value occurs between these extremes.
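The selection of the number of hidden layers between the underfitting and overfitting extremes can be sketched as a stepwise search that stops once the test score no longer improves. This is a minimal Python illustration, not the authors' procedure: `select_depth` is a hypothetical helper, `evaluate` stands for any routine that trains a network with the given number of hidden layers and returns its test-set R², and the scores below are made up.

```python
def select_depth(evaluate, max_layers=5, min_gain=0.0):
    """Increase the number of hidden layers until the score stops improving."""
    best_layers, best_score = None, float("-inf")
    for layers in range(1, max_layers + 1):
        score = evaluate(layers)
        if score - best_score <= min_gain:
            break  # no further improvement: stop growing the network
        best_layers, best_score = layers, score
    return best_layers, best_score


# Made-up scores standing in for the test-set R² at each depth
mock_scores = {1: 0.41, 2: 0.55, 3: 0.61, 4: 0.58, 5: 0.50}
best_layers, best_score = select_depth(mock_scores.get)
print(best_layers, best_score)
```

Stopping at the first decline keeps the search cheap, but it can miss a later peak; evaluating every depth up to `max_layers` and taking the maximum is the safer choice when training time allows.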
Moreover, sensitivity analysis of the input parameters was conducted utilizing the ANN prediction accuracy under the various input combinations. [...] Table 4, ANN-V was adopted to predict BOD5, and the results are shown in Figure 16. The results demonstrate the high accuracy of the ANN in predicting BOD5 with the exception of a few data points. The spikes deviating from the measured values occur due to noise in the training data set, as with the previously developed ANNs.
The R² values from the literature and previous models designed to predict BOD5 in WWTPs are listed in Table 6.

CONCLUSION
The developed ANN models suitably predicted the WWTP performance, which was defined based on the effluent TSS, BOD, and COD concentrations in the treatment plant, with a high degree of reliability. Where the model performance was low, this was mainly due to noise in the data used to develop the ANN.
The results demonstrated that preliminary data analysis and preparation are essential for ANN training. It is recommended to gradually increase the complexity (number of inputs, hidden layers, and neurons) of the network configuration until no further improvement is recorded. In this study, a three-input, one-output configuration was sufficient, and a further increase in inputs resulted in overfitting of the system; increasing the number of inputs was therefore not always beneficial. Another finding from the developed models is the importance of the input parameters in the wastewater treatment process. The parameter significance could be interpreted from the model input combination that yielded the highest prediction accuracy. For instance, the influent temperature and conductivity greatly affected the WWTP performance, as they were used as inputs in all models. However, the influent BOD concentration was important in the treatment process regarding the effluent [...]

FUNDING
This research did not receive a specific grant from any funding agencies in the public, commercial, or not-for-profit sectors.

DATA AVAILABILITY STATEMENT
Data cannot be made publicly available; readers should contact the corresponding author for details.