Abstract
Wastewater treatment plants (WWTPs) are complex systems that must maintain high levels of performance to achieve adequate effluent quality to protect the environment and public health. Artificial intelligence and machine learning methods have gained attention in recent years for modeling complex problems, such as wastewater treatment. Although artificial neural networks (ANNs) have been identified as the most common of these methods, no study has investigated the development and configuration of these models. We conducted a systematic literature review on the use of ANNs to predict the effluent quality and removal efficiencies of full-scale WWTPs. Three databases were searched, and 44 records of the 667 identified were selected based on the eligibility criteria. The data extracted from the papers showed that the majority of studies used the feedforward neural network model with a backpropagation training algorithm to predict the effluent quality of plants, particularly in terms of organic matter indicators. The findings of this research may help in the search for an optimum design modeling process for future studies of similar prediction problems.
HIGHLIGHTS
Machine learning approaches are effective for modeling wastewater treatment plants (WWTPs).
Artificial neural networks (ANNs) are the most employed machine learning method in the wastewater treatment sector.
The various ANN structures used in the sector have not been adequately studied.
The systematic review focused on the use of ANN for performance prediction of WWTPs.
The findings are beneficial for future studies with similar prediction problems.
ABBREVIATIONS
- ANFIS: Adaptive neuro-fuzzy inference system
- ANFIS-GA: Adaptive neuro-fuzzy inference system coupled with genetic algorithm
- NH4-N: Ammonia nitrogen
- BOD: Biochemical oxygen demand
- CBOD: Carbonaceous biochemical oxygen demand
- COD: Chemical oxygen demand
- R²: Coefficient of determination
- R: Correlation coefficient
- DCB: Deep cascade-forward backpropagation networks
- DFFNN: Deep feedforward neural network
- DSAE-NN-GA: Deep learning which combines stacked autoencoders with neural network and genetic algorithm
- EC: Electrical conductivity
- ELM: Extreme learning machine
- FFNN: Feedforward neural network
- Q: Flow rate
- GA: Genetic algorithm
- GRNN: Generalized regression neural networks
- HELM: Hierarchical extreme learning machine
- LSTM: Long short-term memory
- LSTM-AM: Long short-term memory based on attention mechanism
- MSE: Mean square error
- MLP: Multilayer perceptron network
- MLP-GA: Multilayer perceptron network coupled with genetic algorithm
- NO3-N: Nitrate nitrogen
- NO2-N: Nitrite nitrogen
- NARX: Nonlinear autoregressive with exogenous neural network
- PO4: Phosphate/orthophosphate
- RBF: Radial basis function neural network
- RBF-GA: Radial basis function neural network coupled with genetic algorithm
- RVFL: Random vector functional link networks
- RHONN: Recurrent high-order neural network
- RMSE: Root mean square error
- SO-RBF: Self-organizing radial basis function neural network
- SWNN: Small-world neural network
- T: Temperature
- TKN: Total Kjeldahl nitrogen
- TN: Total nitrogen
- TP: Total phosphorus
- TSS: Total suspended solids
- VSS: Volatile suspended solids
INTRODUCTION
Recent concerns regarding environmental issues have induced specialists to focus their attention on the efficient operation and control of wastewater treatment plants (WWTPs) (Mjalli et al. 2007; Pham et al. 2020). WWTPs are highly complex and dynamic systems that require consistent high performance despite hourly, daily, and seasonal fluctuations (Corominas et al. 2018).
The treatment of wastewater is affected by several chemical, physical, and microbiological factors. The complexity of wastewater treatment technology results in uncertainty and variation in the treatment system, leading to fluctuations in effluent quality and environmental risks to the receiving water (Zhao et al. 2020; Zhang et al. 2023). Hence, proper operation and control are essential for safeguarding public health and protecting the environment (Nourani et al. 2018).
Safe operation and control of WWTPs can be achieved through the development of a robust and appropriate mathematical model for predicting plant performance based on past observations of key quality parameters (Hamed et al. 2004; Singh et al. 2010; Nasr et al. 2012). Modeling is widely used to assess the performance of WWTPs (Hamed et al. 2004; Mjalli et al. 2007; Singh et al. 2010); however, the complexity and dynamics of treatment systems make it difficult to perform predictions and simulations using traditional linear methods (Nourani et al. 2018).
Artificial intelligence (AI) has become a powerful tool for minimizing the complexities in wastewater treatment (Zhao et al. 2020; Malviya & Jaspal 2021; Zhang et al. 2023). Zhao et al. (2020) conducted a bibliometric analysis of the trends in AI technology as applied to wastewater treatment. Those authors found that the number of published articles utilizing AI in wastewater treatment research was 19 times greater in 2019 than that in 1995. Most AI techniques have been modeled using experimental data to simulate, predict, confirm, and optimize contaminant removal in wastewater treatment processes (Zhao et al. 2020).
Machine learning is a central subfield of AI. Machine learning algorithms are increasingly used and play a fundamental role in the operation of WWTPs (de Canete et al. 2021). Machine learning approaches have become powerful tools for dealing with the complexities of uncertain and dynamic problems. Therefore, these techniques are becoming common for modeling complex environmental problems, such as wastewater treatment and its optimization (Guo et al. 2015; Ye et al. 2020; Zhao et al. 2020). These approaches maximize the knowledge obtained from data and operational experience and help strengthen the management and control of WWTPs, thereby improving the performance of these facilities (Zhao et al. 2020).
Machine learning methods can be supervised or unsupervised. Supervised methods are used to build predictive models that characterize the link between explanatory and response variables. These models predict the response variable of interest (output) using the explanatory variables (inputs) of the dataset (Lantz 2013; Corominas et al. 2018; Newhart et al. 2019; Newhart et al. 2022). Supervised machine learning includes models such as naïve Bayes, regression trees, artificial neural networks (ANNs), and support vector machines (Lantz 2013). Unsupervised methods are used to build descriptive models. They are applied when the goal is to identify patterns in the data without any advanced knowledge of the possible relationships involved (Newhart et al. 2019).
Previous literature reviews have identified ANNs as the most employed method in the wastewater treatment sector. Hadjimichael et al. (2016) conducted a literature review on the application of AI methods (mainly machine learning) to the urban water sector. Those authors found 1,394 papers on wastewater published between 1935 and 2016, and ANNs were found to be the most common method used in various sectors of water-related research, including that of wastewater treatment (Hadjimichael et al. 2016). ANNs have emerged as an attractive option for predicting and classifying water systems as well as for modeling and optimizing performance (Hadjimichael et al. 2016).
Corominas et al. (2018) performed a literature review of computer-based techniques for data analysis to improve the operation of WWTPs. Those authors described various methods that enable the transformation of data into pertinent information. According to Corominas et al. (2018), the European Union is the leading region in this field with the largest number of studies (61%), followed by Asia-Oceania (34%) and North America (12%). A minority of studies (less than 4%) have been conducted by South American or African research groups. Among the 340 selected papers (published up to 2015), ANN was the most commonly used technique, particularly for predicting process performance, soft sensing, and control (Corominas et al. 2018).
Zhao et al. (2020) conducted a bibliometric analysis covering 1995–2019 of trends in applying AI technology to wastewater treatment. According to those authors, research has mainly focused on AI technology in relation to pollutant removal. The majority of studies utilized ANN models to simulate and predict the performance of biological WWTP, and there has been an increase in the number of publications using this technique in recent years (Zhao et al. 2020).
Soft measurement estimates variables that are difficult to measure by correlating them with available variables that are more readily measured (Osman & Li 2020). Ching et al. (2021) conducted a literature review covering 102 studies on the development of soft sensors for wastewater treatment. Those authors showed that neural networks were the most common modeling approach. These methods have remained the dominant methodologies for soft sensor development since the early 2000s, and it appears that ANNs will continue to predominate in the coming years (Ching et al. 2021).
Bahramian et al. (2023) conducted a comprehensive literature review on the state-of-the-art in the application of data-driven models in WWTPs. They searched publications from 2000 to 2021 and selected 281 studies for qualitative assessment. The ANNs were identified as the most popular model among the studies and were commonly used as a prediction model focusing on the removal of pollutants (Bahramian et al. 2023).
Zhang et al. (2023) provided a summary of the status and trends in AI research as applied to wastewater treatment, based on published papers and patents from 2000 to 2022. According to the authors, ANN is the most common and widely used model for AI in wastewater treatment (Zhang et al. 2023).
The parameters of wastewater treatment monitoring data tend to share nonlinear and complex chemical relationships (Ching et al. 2021). Owing to their nonlinear nature, ANNs can accurately predict pollutant removal in WWTPs (Ye et al. 2020). The wide usage of ANNs in water-related research relates to their ability to learn (through the training process) complex nonlinear and multi-input/output relationships between process parameters using historical data (Madić & Radovanović 2011). ANNs can also be applied when there is insufficient knowledge of the process to construct a mechanistic model of the wastewater treatment system, which relies on fundamental material and energy balances and empirical correlations that are often inaccurate (Mjalli et al. 2007). Many simplifications and assumptions are required to ensure that mechanistic models are tractable and computable, and accordingly, they have many limitations (Wang et al. 2021).
ANN models consist of predefined mathematical functions that effectively capture the nonlinear relationships between variables in complex systems (Civelekoglu et al. 2009). ANNs require historical data during training, after which they should have the ability to extrapolate correlations to new data (Palani et al. 2008). The ANN learns from the training data and captures the relationships between data points, which can be used for simulation, prediction, and optimization (Zhao et al. 2020).
The concept of an ANN was based on the biological human brain and its learning processes. ANNs are numerical structures comprising nodes (neurons) and connections (weights) (Mjalli et al. 2007; Nezhad et al. 2016). The ANN architecture is the overall structure and manner in which information flows from one layer to another (Chen et al. 2020). The architecture consists mainly of the number of neurons and the manner in which they are interconnected (Mjalli et al. 2007). An ANN includes a variety of hyperparameters that must be tuned during model development, including the number of hidden layers, number of neurons in each hidden layer, and activation functions that are applied (Ching et al. 2021).
The main task in designing a robust neural network is to determine the appropriate model architecture to minimize the overall model error (Madić & Radovanović 2011; Nezhad et al. 2016). Selecting a network structure (e.g., a feedforward neural network (FFNN) with one hidden layer and five neurons in the hidden layer that are connected by a sigmoid activation function, or a deep neural network with multiple hidden layers and multiple parameters) is a crucial step in the design of ANNs. The structure must be optimized for reducing computer processing, achieving adequate performance, and avoiding overfitting (Mjalli et al. 2007).
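The example structure mentioned above (one hidden layer, five neurons, sigmoid activation) can be sketched as a single forward pass. This is only an illustration: the function names, the linear output neuron, the two inputs, and the untrained random weights are assumptions made for the example and are not taken from any of the reviewed studies.

```python
import math
import random


def sigmoid(z):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))


def ffnn_forward(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass: inputs -> sigmoid hidden layer -> one linear output neuron."""
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return sum(v * h for v, h in zip(output_weights, hidden)) + output_bias


# Hypothetical, untrained weights: 2 scaled inputs and 5 hidden neurons.
rng = random.Random(0)
n_inputs, n_hidden = 2, 5
hidden_weights = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
hidden_biases = [rng.uniform(-1, 1) for _ in range(n_hidden)]
output_weights = [rng.uniform(-1, 1) for _ in range(n_hidden)]

# Inputs stand in for two normalized influent indicators (illustrative values).
prediction = ffnn_forward([0.4, 0.7], hidden_weights, hidden_biases, output_weights, 0.0)
```

Training would then adjust the weights and biases (for example, by backpropagation) to reduce the error between predictions and observations. The hyperparameters being tuned are the number of layers, the number of neurons per layer, and the activation function.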
There is a limited theoretical and practical background to assist in the systematic selection of ANN hyperparameters through model development and training processes (Madić & Radovanović 2011). Therefore, most studies choose the appropriate ANN model structure using a trial-and-error approach (Mjalli et al. 2007; Palani et al. 2008; Madić & Radovanović 2011; Chen et al. 2020), whereby several networks are trained and compared (Mjalli et al. 2007; Madić & Radovanović 2011), which is challenging and time-consuming (Lee et al. 2011). Choosing the ANN architecture and selecting the training algorithm (which is used to minimize the error between the observed and predicted output) and related parameters is primarily related to the experience of the designer (Madić & Radovanović 2011).
Although previous literature reviews have identified the ANN as the data-driven technique and machine learning model most applied in the wastewater sector (Hadjimichael et al. 2016; Corominas et al. 2018; Zhao et al. 2020; Malviya & Jaspal 2021; Bahramian et al. 2023), no studies have identified the model structures adopted in this research. No specific literature review has been found on the use of ANN in the wastewater treatment sector. Therefore, the current investigation may improve the configuration of models based on studies in this field. Understanding the hyperparameter tuning process from datasets of WWTPs might improve the efficiency of determining the optimum setting and the performance of future models.
METHODS
Review objective and research question
With the increased use of neural network methods for predictions, it is important to study their role in predicting WWTP performance. The various ANN structures and hyperparameters used in the wastewater treatment sector have not been adequately studied. Therefore, a systematic review was conducted to develop an understanding of WWTP performance predictions using an ANN.
A systematic review is a literature review based on clearly formulated questions. It identifies relevant studies and summarizes evidence using an explicit methodology (Khan et al. 2003). A systematic review differs from a traditional general review, as it adopts a replicable, scientific, and transparent process (Qazi et al. 2015). The current study followed the guidelines and protocols for systematic reviews (Khan et al. 2003; Pullin & Stewart 2006; Page et al. 2021).
The first step in a systematic review is to formulate a specific question. The following research question was the basis of this review: ‘What are the main architectures and hyperparameters of ANN models used to predict the performance of different types of full-scale WWTPs?’
Search strategy
The next step is to identify relevant studies by formulating a formal search strategy. The systematic review design reported here was initiated in August 2021. After several refinements and improvements, the publication search began in February 2022. The ScienceDirect, Scopus, and Web of Science databases were searched, and the results were restricted to English-language, peer-reviewed journal articles published from 2011 through 2021. Pilot searches were performed to refine the keywords, and the following final search strategy was used, based on document titles, abstracts, and author-specified keywords: (‘wastewater treatment plant’ OR ‘sewage treatment plant’ OR WWTP) AND (‘neural network’ OR ANN).
Selection criteria
The study selection criteria flow directly from the review questions and should be specified in advance. The reasons for inclusion and exclusion were recorded (Khan et al. 2003). The eligibility criteria were designed to focus exclusively on the use of ANNs for predicting the performance of WWTPs in terms of effluent quality or removal efficiencies. The goal was to gather a comprehensive set of studies specifically focused on the application of ANNs in this context. The selection process was structured as follows:
Inclusion Criteria:
- a. Studies using ANNs: Only studies that employed ANNs as the modeling tool for predicting the effluent quality or removal efficiencies of WWTPs were considered for inclusion. Other machine learning algorithms and modeling techniques were excluded to maintain a specific focus on ANNs.
- b. Full-scale WWTPs: Only studies involving full-scale WWTPs were included in the review. Pilot- and bench-scale plants were excluded to ensure relevance to real operational conditions.
- c. Domestic effluent treatment: The review was limited to studies that focused on WWTPs specifically designed to treat domestic effluent. Industrial plants were excluded.
Exclusion Criteria:
- a. Studies using ANNs for other purposes: Studies utilizing ANNs for purposes other than predicting WWTP performance in terms of effluent quality or removal efficiencies (e.g., energy consumption control, process optimization) were excluded, as they deviated from the primary research focus.
- b. Non-journal publications: Publications such as book chapters, conference papers, and lecture notes were excluded from the review as journal publications were the focus.
Language and data criteria:
- a. Language: Articles published in languages other than English were excluded.
- b. Data availability: During the full-text screening, an additional criterion was applied to assess whether the selected papers contained the necessary data and information to effectively answer the main research question (Espinosa et al. 2020). Studies lacking relevant data were excluded.
After selecting documents based on the search strategy, duplicates were removed using Mendeley software. Then, non-journal publications were excluded. Subsequently, articles were screened for exclusion criteria based on their titles. The abstracts were then evaluated for the inclusion and exclusion criteria, and the remaining articles were subsequently screened based on their full text for the eligibility criteria.
Data extraction and analysis
The next step was to extract data from the final selected papers by identifying relevant information related to the research question (Qazi et al. 2015). A detailed investigation was conducted, and data from papers were extracted and presented in a table with the following fields: (i) reference (author(s), year, journal, and paper title); (ii) country of the study; (iii) wastewater treatment technology and inflow rate/design flow of the facility; (iv) monitoring frequency and period; (v) number of samples; (vi) data division into training/validation/testing datasets (%); (vii) input and output variables; (viii) data preprocessing methods; (ix) neural network architectures and hyperparameters (ANN methods, training algorithms, number of hidden layers, number of neurons in each hidden layer, and activation functions); and (x) metrics of model performance.
RESULTS AND DISCUSSION
Search results
The complete spreadsheet with detailed information extracted from the 44 papers is shown in Supplementary Table S1. The following sections discuss the main results presented in Supplementary Table S1.
WWTP characteristics
One study (Ge et al. 2020) assessed two WWTPs, while the remaining 43 studies evaluated only one treatment facility. Three papers (7%) did not provide information on the treatment technology adopted in the WWTP under investigation. The conventional activated sludge process was the most common and was found in 18 papers (41%), followed by anaerobic/anoxic/oxic processes in five articles (11%). The remaining 41% of the studies included WWTPs that employed different activated sludge configurations (anoxic/oxic processes, activated sludge with coagulation/flocculation, extended aeration activated sludge, aerated lagoon followed by activated sludge, step-feed activated sludge processes, sequential batch reactors, intermittent cycle extended aeration-sequential batch reactors), two-stage trickling filters, constructed wetlands, and membrane bioreactors. The activated sludge processes had a higher prevalence in the selected studies because this is the most employed wastewater treatment technology globally (Sin & Al 2021).
Sixteen (36%) of the selected papers did not mention the size of the WWTP under study. The remaining 28 papers (64%) reported the inflow rate, design flow of the WWTPs, or both. The sizes of the WWTPs were variable, ranging from 52.1 to 11,574 L/s. However, most studies assessed large WWTPs. Fourteen WWTPs had inflow rates or design flows above 1,000 L/s. The inclusion of large facilities in the studies may be because large systems have better monitoring schemes with more data to train the ANN models. Larger WWTPs also have improved operational control, which encourages the development of models for predicting system performance.
Dataset characteristics
The WWTP data were collected at various time intervals, from continuous online sensor measurements to quarterly laboratory results (Newhart et al. 2019). Of the selected publications, 11 (25%) did not report the monitoring frequency, while three reported more than one frequency, ranging from daily to monthly.
Four papers (9%) had samples collected at a frequent temporal resolution, such as every 10 min, every hour, or three times a day. The most common data collection period was daily (20 papers, 45%). Other studies collected samples every 2 or 3 days, or 3 days a week (three papers, 7%); weekly (four papers, 9%); monthly (five papers, 11%); and biweekly, once or twice a month, or every 2 weeks (two papers, 5%).
Six (14%) studies did not provide the period in which the data were collected. The remaining 38 studies had distinct time frames, from 3 months (Ge et al. 2020) to more than 15 years (Hejabi et al. 2021). Half of the studies (50%) assessed datasets covering 1–2 years.
Data preprocessing methods
Other methods of preprocessing used were the removal of outliers, abnormal data, noise, or errors in the data (Jami et al. 2011, 2012; Zhao et al. 2012; Kusiak & Wei 2013; Han et al. 2014; Qiao et al. 2016; Yaqub et al. 2020; Alsulaili & Refaie 2021); the estimation, interpolation, or imputation of missing points (Zhao et al. 2012; Han et al. 2014; Aldaghi & Javanmard 2021; Liu et al. 2021); and the use of multivariate statistical analyses, such as clustering methods and principal component analyses (Qiao et al. 2016; Zhao et al. 2016; Yasmin et al. 2017; Han et al. 2018; Sharghi et al. 2019; Abba et al. 2021b), mainly for the selection of input variables of the models.
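Two of the preprocessing steps listed above, imputation of missing points and removal of outliers, can be sketched as follows. The reviewed studies do not share a single procedure, so the choices here are assumptions made for illustration: linear interpolation for interior gaps, and the modified z-score based on the median absolute deviation with a threshold of 3.5 for flagging outliers. The function names and sample values are hypothetical.

```python
import statistics


def interpolate_missing(values):
    """Fill interior gaps (None) by linear interpolation between known neighbors.

    Leading and trailing gaps are left as None because they have only one neighbor.
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        gap = right - left
        for i in range(left + 1, right):
            frac = (i - left) / gap
            filled[i] = filled[left] + frac * (filled[right] - filled[left])
    return filled


def flag_outliers(values, threshold=3.5):
    """Flag points whose modified z-score (median/MAD based) exceeds the threshold."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return [False] * len(values)  # no spread: nothing can be flagged
    return [abs(0.6745 * (v - med) / mad) > threshold for v in values]


# Hypothetical daily influent COD series (mg/L) with one gap and one spike.
cod = [410.0, 395.0, None, 402.0, 388.0, 1950.0, 415.0]
cod_filled = interpolate_missing(cod)
mask = flag_outliers(cod_filled)
cod_clean = [v for v, is_outlier in zip(cod_filled, mask) if not is_outlier]
```

Whether a flagged value is a sensor error or a genuine shock load is a judgment that depends on plant knowledge, so automatic removal should be treated with caution.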
Model development
Data division
Data division is an important step in modeling (Chen et al. 2020). Most studies (30 papers, 68%) divided the dataset into training and testing subsets. The training dataset is used to develop the model, that is, to accomplish network learning and fit the network weights. The testing dataset is used to evaluate how well the model generalizes to unseen data, that is, how accurately the network predicts targets for inputs that are not in the training set (Mjalli et al. 2007; Lantz 2013; Zhao et al. 2020). Of these 30 papers, 26 mentioned the proportion of data division. The most common allocation (used in eight studies) was 75% for training and 25% for testing.
A different approach was adopted in 12 (27%) articles that divided the dataset into training, validation, and testing subsets, and the validation dataset was used to optimize the model (Zhao et al. 2020) by adjusting the hyperparameters (Chen et al. 2020). Most of these (seven studies) divided the dataset into 70% for training, 15% for validation, and 15% for testing.
Data division was accomplished using different approaches. Nine papers divided the data in chronological order, in which the first data points were used for training and the remainder for validation and testing. Another nine papers divided the dataset randomly.
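The two division strategies can be sketched in a few lines. This is a minimal illustration, assuming a 70/15/15 allocation by default and a fixed random seed for reproducibility; the function names are hypothetical and do not correspond to any tool used in the reviewed papers.

```python
import random


def split_sizes(n, train_frac=0.70, val_frac=0.15):
    """Return the number of training, validation, and testing samples."""
    n_train = round(n * train_frac)
    n_val = round(n * val_frac)
    return n_train, n_val, n - n_train - n_val


def chronological_split(samples, train_frac=0.70, val_frac=0.15):
    """Earliest samples train the model; later ones validate and test it."""
    n_train, n_val, _ = split_sizes(len(samples), train_frac, val_frac)
    train = samples[:n_train]
    val = samples[n_train:n_train + n_val]
    test = samples[n_train + n_val:]
    return train, val, test


def random_split(samples, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle a copy of the samples, then cut it in the same proportions."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    return chronological_split(shuffled, train_frac, val_frac)


# Hypothetical dataset: 100 daily records identified by their index.
records = list(range(100))
train, val, test = chronological_split(records)
# A training/testing division without validation, e.g. 75/25:
train_75, _, test_25 = chronological_split(records, train_frac=0.75, val_frac=0.0)
```

A chronological split preserves temporal order, so the test period is never seen during training. A random split mixes periods, which can make test performance look better when consecutive samples are strongly autocorrelated.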
For larger datasets, a greater percentage was expected to be allocated to training the model. However, there was no significant correlation between the number of samples and the percentage used for training (p = 0.27 and Pearson correlation coefficient = 0.22). This is consistent with the observation that there are no uniform rules for dividing the dataset, and most researchers divided the data either by domain knowledge or arbitrarily (Chen et al. 2020).
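For readers who wish to repeat this kind of check on their own extraction table, the Pearson coefficient and its t-statistic can be computed as sketched below. The data pairs are invented for illustration and are not the values extracted in this review; the function names are hypothetical. Converting the t-statistic to a p-value requires the Student t distribution with n - 2 degrees of freedom, which a statistics package can supply (for example, `scipy.stats.pearsonr` returns both the coefficient and the p-value).

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def t_statistic(r, n):
    """t-statistic for testing r = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r ** 2))


# Hypothetical (number of samples, % used for training) pairs.
n_samples = [120, 365, 730, 1460, 200, 90, 2500, 540]
train_pct = [70, 75, 80, 70, 75, 66, 80, 70]

r = pearson_r(n_samples, train_pct)
t = t_statistic(r, len(n_samples))
```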
Input and output parameters
Forty papers (91%) used effluent quality indicators as the target parameters, and four (9%) had removal efficiencies as the targets. The majority (28 papers, 64%) of the studies had more than one output parameter in single-output models (20 papers), multi-output models (seven papers), or both (one paper).
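The two kinds of target are closely related: a removal efficiency is derived from the influent and effluent values of the same indicator. A minimal sketch follows, assuming the usual concentration-based definition; individual studies may define it differently (for example, on a load basis), and the function name and values are hypothetical.

```python
def removal_efficiency(influent_conc, effluent_conc):
    """Percentage of a pollutant removed between influent and effluent."""
    if influent_conc <= 0:
        raise ValueError("influent concentration must be positive")
    return 100.0 * (influent_conc - effluent_conc) / influent_conc


# Hypothetical BOD concentrations in mg/L.
bod_removal = removal_efficiency(200.0, 20.0)
```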
Table 1 shows that biochemical oxygen demand (BOD) and chemical oxygen demand (COD) effluent concentrations were the outputs in most papers. Other target parameters commonly used in the models were effluent concentrations of solids (total suspended solids, TSS) and effluent concentrations of nutrients (ammonia nitrogen, NH4-N; total nitrogen, TN; and total phosphorus, TP). The three most used output variables appeared in the word cloud generated from the 44 selected papers of the systematic review (Figure 3). Among these three, the largest term in the word cloud was BOD, followed by COD and TSS, which is consistent with Table 1. According to Alsulaili & Refaie (2021), most studies have utilized BOD, COD, and TSS to predict the performance of WWTPs using ANN-based models.
Target variable | Number of publications |
---|---|
Effluent BOD | 25 |
Effluent COD | 21 |
Effluent TSS | 19 |
Effluent NH4-N | 10 |
Effluent TN | 7 |
Effluent TP | 7 |
Effluent pH | 4 |
Effluent quality index | 2 |
Removal efficiency of NH4-N | 2 |
Effluent CBOD | 1 |
Effluent biodegradable dissolved organic nitrogen | 1 |
Effluent total coliform | 1 |
Effluent fecal streptococci | 1 |
Effluent TKN | 1 |
Effluent PO4 | 1 |
Effluent NO2 | 1 |
Effluent NO3 | 1 |
Effluent T | 1 |
Effluent EC | 1 |
Removal efficiency of fecal coliform | 1 |
Removal efficiency of total coliform | 1 |
Removal efficiency of arsenic | 1 |
Removal efficiency of TN | 1 |
Removal efficiency of TP | 1 |
Removal efficiency of TSS | 1 |
Removal efficiency of COD | 1 |
Removal efficiency of BOD | 1 |
Removal efficiency of sulfide | 1 |
Key variables in wastewater treatment must be evaluated to control pollution (Osman & Li 2020), and their use as targets in the models confirms that they are important for assessing the performance of a WWTP. BOD and COD reflect organic water pollution and are considered the most important parameters for effluent quality control (Nourani et al. 2021). BOD is difficult to measure online, and laboratory measurements are time-consuming, as they involve a 5-day off-line delay (Osman & Li 2020; Rahmati et al. 2021), which reinforces the importance of developing predictive models for this parameter. TSS is another important variable, as excess TSS depletes dissolved oxygen in effluent water (Verma et al. 2013). There has been a continuous increase in the number of studies concerning nutrient removal (Ching et al. 2021), driven by effluent controls intended to prevent eutrophication of water bodies. According to Ching et al. (2021), the various parameters involved in the nitrogen removal process are consistent areas of interest in soft sensor development. In comparison, there are fewer sensor studies on phosphorus removal processes. The significance of phosphorus as a wastewater parameter depends on the local abundance or shortage of this nutrient (Ching et al. 2021).
The quality of the treated effluent depends on the influent quality and process parameters of the WWTP (Khatri et al. 2020). The explanatory variables (inputs) of the models varied widely among the studies, as many factors affect WWTP performance. Most papers (52%) had influent wastewater quality and quantity indicators as input variables. This means that the majority of studies used influent characteristics to predict effluent wastewater quality, demonstrating the value of using ANNs to represent the complex and nonlinear relationship between raw influent and treated effluent water quality measurements (Saleh 2021). For example, Bekkari & Zeddouri (2019) used the influent variables pH, temperature (T), TSS, total Kjeldahl nitrogen (TKN), BOD, and COD as inputs. The purpose of that study was to predict the performance of an activated sludge WWTP in Algeria in terms of effluent COD. In evaluating WWTP soft sensors, Ching et al. (2021) also found that influent quality parameters were used in most cases as input variables for modeling effluent quality.
Other approaches included using treated effluent quality indicators as input variables to predict a different effluent indicator as the output, wastewater quality indicators sampled at different locations in the treatment train, and combinations of influent quality indicators and operational variables (such as returned sludge flow rate, sludge volume index, food/microorganism ratio, sludge retention time, and energy and chemical products consumption). For example, to predict the effluent concentrations of TP, BOD, COD, TSS, and NH4-N in a WWTP (Harbin, China), Zhao et al. (2016) developed an ANN model using raw wastewater quality data (influent concentrations of TP, BOD, COD, TSS, NH4-N, and influent pH) and energy consumption parameters (electricity consumption, coagulant, and flocculants) as the input variables.
Table 2 shows the most common input variables, all of which were included in more than 20% of the papers, highlighting their importance as predictors of WWTP performance in the ANN models. The majority of studies included indicators of organic matter, BOD and COD, as both input (influent concentrations, Table 2) and output (effluent concentrations, Table 1) variables. According to Ching et al. (2021), COD is one of the strongest estimators for BOD; hence, most studies use COD concentrations as inputs for BOD models.
Input variable | Number of publications |
---|---|
Influent COD | 31 |
Influent TSS | 28 |
Influent BOD | 25 |
Influent pH | 20 |
Influent NH4-N | 17 |
Influent TN | 11 |
Influent Q | 10 |
Influent TP | 9 |
Other important input parameters in the models were influent TSS concentration, pH, nutrients concentration (NH4-N, TN, and TP), and flow (Q). The choice of these variables may be related to their ease of measurement (such as pH and Q) or the ability to develop models to predict some indicators in the treated effluent using the same indicator measured in the influent as one of the explanatory variables.
ANN methods
Bahramian et al. (2023) and Corominas et al. (2018) also found that FFNNs were the most popular architecture. These networks serve as universal approximators and can effectively learn complex patterns, making them suitable for solving a wide range of problems. However, it is essential to be cautious of potential overfitting issues, and careful hyperparameter tuning is often required to achieve optimal performance.
The other commonly used neural network types are described next. A multilayer perceptron network (MLP) is a type of FFNN (Bagheri et al. 2015) and was used in seven (16%) studies. According to Newhart et al. (2022), a neural network that uses sigmoid functions in the hidden layer and a linear function in the output layer is more commonly referred to as an MLP.
A radial basis function neural network (RBF) is another type of FFNN (Bagheri et al. 2015) that uses radial basis activation functions in the hidden layer (Chen et al. 2020). Although Newhart et al. (2022) mentioned that RBF is increasingly used, it was adopted in only three (7%) papers in this systematic review.
An extreme learning machine (ELM) was used in four studies (9%). An ELM is a single-hidden-layer FFNN (Abba et al. 2021b) in which the weights between the input and hidden layers are randomly selected and the weights between the hidden and output layers are determined analytically (Pham et al. 2020). Because an ELM only needs to learn the output weights, it can reduce the computational burden, as the weights between the input and hidden layers do not require adjustment (Chen et al. 2020).
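The ELM training scheme described above can be illustrated with a minimal sketch. It assumes NumPy, a hyperbolic tangent hidden layer, 20 hidden neurons, and synthetic data; the function names `fit_elm` and `predict_elm` are hypothetical, and none of these choices is taken from the reviewed studies.

```python
import numpy as np

def fit_elm(X, y, n_hidden=20, seed=0):
    """Fit a single-hidden-layer ELM (illustrative sketch only)."""
    rng = np.random.default_rng(seed)
    # Input-to-hidden weights and biases are drawn at random and never updated.
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)  # hidden-layer outputs
    # Hidden-to-output weights are obtained analytically by least squares.
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)
    return W, b, beta

def predict_elm(model, X):
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta

# Synthetic stand-in data (assumption; not from any reviewed WWTP).
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(200, 3))
y = np.sin(2.0 * X[:, 0]) + 0.5 * X[:, 1] - 0.3 * X[:, 2]

model = fit_elm(X[:150], y[:150])
test_mse = float(np.mean((predict_elm(model, X[150:]) - y[150:]) ** 2))
print(f"test MSE: {test_mse:.6f}")
```

Only the output weights `beta` are fitted, and this is done in a single linear solve rather than by iterative weight updates, which is the source of the reduced computational effort noted above.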
Deep learning refers to the use of multiple hidden layers in a network (Corominas et al. 2018) and is suitable for modern applications with highly complex processes (Osman & Li 2020). Deep learning methods were used in three studies (7%). One of these (Osman & Li 2020) was published in 2020, and the other two (El-Rawy et al. 2021; Wang et al. 2021) in 2021. This result indicates that deep learning has only recently been adopted in this field. Consistent with this, Corominas et al. (2018) did not identify deep learning methods for wastewater treatment applications in papers published up to 2015.
Recurrent neural networks were used in three papers (7%), two of them utilizing long short-term memory (LSTM) methods. Recurrent neural networks are distinguished by their internal memory features, which allow observations to be considered in an ordered sequence (Newhart et al. 2022). Recurrent neural networks allow signals to travel in both directions using loops to learn highly complex patterns (Lantz 2013). LSTM is capable of learning sequences of events over a period of time and can capture long-term dependencies in the data. Therefore, LSTM is frequently used to deal with time-series tasks, including those of wastewater data (Liu et al. 2021).
An adaptive neuro-fuzzy inference system (ANFIS) is a hybrid learning method that combines neural and fuzzy methods. It integrates the learning capacities of the ANN with fuzzy logic reasoning abilities to map the input–output relationships (Ye et al. 2020; Onu et al. 2021). ANFIS uses a hybrid of backpropagation and least-squares algorithms to train the parameters and automatically generate ‘If/Then’ rules (Zhao et al. 2020). ANFIS was used in seven papers (16%).
Network structure
As shown in Figure 6, each layer of a neural network contains a certain number of neurons, also known as nodes. The numbers of input and output nodes equal the number of features in the input data and the number of output variables to be modeled, respectively. The number of hidden layers and the number of neurons in each are configured by the user before training the model and depend on the difficulty of the problem (Saleh 2021). An insufficient number of hidden neurons may reduce prediction accuracy, causing underfitting. However, an excessive number of neurons may lead to overfitting, whereby the error on the training set is driven to a small value but the error on the test data is large. In that case, the generalization ability of the neural network is impaired (Gaya et al. 2014; Chen et al. 2020; Ye et al. 2020).
In most studies (27 papers, 61%), the authors tuned the network structure using a trial-and-error approach, whereby ranges of values for the number of hidden layers and hidden neurons were tested to search for the optimum architecture. In some cases, other configurations were also tested by trial-and-error, such as the proportions of samples allocated to the training, validation, and testing subsets, and the training algorithms and activation functions to be used. In this trial-and-error approach, several ANNs are developed and compared to select the best result. For example, Sharghi et al. (2019) developed FFNN models to predict effluent BOD concentrations in an activated sludge WWTP. The authors adopted one hidden layer, and the optimal hidden layer size was determined by varying the number of nodes from 1 to 10. They observed the best results in a model with five neurons in the hidden layer.
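A trial-and-error search of this kind can be sketched as follows. The example is a minimal illustration under stated assumptions: a one-hidden-layer FFNN with hyperbolic tangent hidden neurons and a linear output, trained by stochastic gradient descent on synthetic data, with the hidden layer size varied from 1 to 10 and selected by validation error. The helper names (`train_ffnn`, `predict`, `mse`), the learning rate, and the number of epochs are hypothetical and do not reproduce the procedure of any reviewed study.

```python
import math
import random

def train_ffnn(data, n_hidden, epochs=100, lr=0.05, seed=0):
    """Train a one-hidden-layer network (tanh hidden, linear output) by SGD."""
    rng = random.Random(seed)
    n_in = len(data[0][0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0
    for _ in range(epochs):
        for x, y in data:
            h = [math.tanh(sum(w * xi for w, xi in zip(w1[j], x)) + b1[j])
                 for j in range(n_hidden)]
            err = sum(w2[j] * h[j] for j in range(n_hidden)) + b2 - y
            for j in range(n_hidden):
                # Error signal propagated back to hidden neuron j.
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                for i in range(n_in):
                    w1[j][i] -= lr * grad_h * x[i]
                b1[j] -= lr * grad_h
            b2 -= lr * err
    return w1, b1, w2, b2

def predict(model, x):
    w1, b1, w2, b2 = model
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(w1, b1)]
    return sum(w * hj for w, hj in zip(w2, h)) + b2

def mse(model, data):
    return sum((predict(model, x) - y) ** 2 for x, y in data) / len(data)

# Synthetic stand-in data (assumption): two inputs, one output.
rng = random.Random(42)
samples = []
for _ in range(60):
    x = [rng.random(), rng.random()]
    samples.append((x, math.sin(3.0 * x[0]) + 0.5 * x[1]))
train, valid = samples[:40], samples[40:]

# Trial and error: train one network per candidate size, compare on validation data.
scores = {n: mse(train_ffnn(train, n), valid) for n in range(1, 11)}
best_n = min(scores, key=scores.get)
print(best_n, scores[best_n])
```

Selecting the size on a held-out subset rather than on the training error is one way to guard against the overfitting described above; which size wins depends on the data and the random initialization.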
Five of the 27 papers that adopted the trial-and-error approach established the range of hidden neurons and/or hidden layers to be tested using equations from the literature. To some extent, the use of equations may contribute to determining the model structure as they guide researchers based on previous studies (Chen et al. 2020).
Another approach to determine the best network structure was adopted in five (11%) papers that used hybrid learning and combined various neural network methods (MLP, ANFIS, ELM, RBF, or deep learning) with a genetic algorithm (GA). A GA is an efficient search algorithm that can be applied to identify the combination of hyperparameters that will result in the best model performance (Ching et al. 2021). These hybrid models use a GA to iteratively optimize the parameters in the neural network to increase the problem-solving ability (Zhao et al. 2020).
Table 3 shows the final and complete network structures of the papers that presented this information. The structure column indicates the number of neurons in the input layer, each hidden layer, and the output layer. For example, Jami et al. (2011) developed a model using the influent BOD concentration, NH4-N concentration, pH, and Q as explanatory variables (four input neurons), with 15 neurons in the single hidden layer of the FFNN, to predict the effluent concentrations of NH4-N (one output neuron) in a sequential batch reactor WWTP in Malaysia.
| Reference | Output parameter(s) | Structure |
|---|---|---|
| Jami et al. (2011) | Effluent NH4-N | 4-15-1ᵃ |
| Lee et al. (2011) | Effluent BOD | 8-19-14-1ᵇ |
| | Effluent COD | 8-27-1ᵇ |
| | Effluent SS | 8-3-6-1ᵇ |
| | Effluent TN | 8-17-23-1ᵇ |
| Qiao et al. (2011) | Effluent COD, BOD, SS, and NH4-N (multi-output model) | 8-4-8-4ᶜ |
| Zhang & Hu (2012) | Effluent BOD | 5-2-3-8-1ᵈ |
| Chen & Lo (2012) | Effluent Q, BOD, COD, and SS (multi-output model) | 4-16-4ᵉ |
| Jami et al. (2012) | Effluent BOD, SS, COD (single-output models) | 1-20-1ᵃ or 3-30-1ᵃ |
| Kusiak & Wei (2013) | Effluent CBOD | 5-3-1ᵉ |
| | Effluent TSS | 5-10-1ᵉ |
| Liu et al. (2013) | Effluent COD | 9-54-6-6-1ᶠ |
| Han et al. (2014) | Effluent BOD | 5-150-1ᵍ and 5-180-1ᵍ |
| Gaya et al. (2014) | Effluent COD, SS, NH4-N (single-output models) | 5-10-1ᵃ |
| Bagheri et al. (2015) | Effluent COD, TN, TSS (single-output models) | 5-10-1ᵇ; 5-5-1ʰ |
| Simsek (2016) | Effluent biodegradable dissolved organic nitrogen | 4-10-1ᵉ |
| Zhao et al. (2016) | Effluent TP, BOD, COD, SS, and NH4-N (multi-output model) | 9-19-5ᵃ, 9-19-5ᵃ, 9-16-5ᵃ, 9-14-5ᵃ, and 9-15-5ᵃ |
| Nezhad et al. (2016) | Effluent quality index | 8-7-1ᵃ |
| Hazali et al. (2017) | Effluent TN, TP, NH4-N (single-output models) | 6-6-1ⁱ |
| Nourani et al. (2018) | Effluent BOD, COD, TN (single-output models) | 5-3-1ᵃ |
| Elfanssi et al. (2018) | Effluent TSS, BOD, COD, total coliform, and fecal streptococci (multi-output model) | 5-7-8-7-5ᵃ |
| Sharghi et al. (2019) | Effluent BOD | 3-5-1ᵃ |
| Khatri et al. (2019) | Effluent TSS | 7-4-1ᵃ |
| | Effluent pH, COD, TKN (single-output models) | 7-5-1ᵃ |
| | Effluent BOD, NH4-N, TP (single-output models) | 7-6-1ᵃ |
| Bekkari & Zeddouri (2019) | Effluent COD | 6-50-1ᵃ |
| Khatri et al. (2020) | Removal efficiency of fecal coliform | 10-6-1ᵃ |
| | Removal efficiency of total coliform | 10-8-1ᵃ |
| Ge et al. (2020) | Removal efficiency of arsenic | 4-3-1ᵃ |
| Al-Obaidi (2020) | Effluent quality index | 5-3-1ᵃ |
| Osman & Li (2020) | Effluent BOD | 19-13-13-13-1ʲ |
| El-Rawy et al. (2021) | Removal efficiency of TSS, COD, BOD, NH4-N, sulfide (single-output models) | 5-8-1ᵃ; 5-10-10-10-10-1ᵏ, 5-10-10-10-10-1ˡ |
| Wang et al. (2021) | Effluent TSS | 32-128-256-128-1ᵏ |
| | Effluent PO4 | 32-256-128-128-1ᵏ |
| Nourani et al. (2021) | Effluent BOD, COD (single-output models) | 5-3-1ᵃ |
| Alsulaili & Refaie (2021) | Effluent BOD | 3-17-17-17-1ᵃ |
| | Effluent COD | 3-13-13-13-1ᵃ |
| | Effluent TSS | 3-11-11-11-11-1ᵃ |
| Aldaghi & Javanmard (2021) | Effluent Q, BOD, COD, TSS, pH, T, TP, NO3, TN, NO2, NH4-N, and EC (multiple-output model) | 12-25-12ᵉ |
| Saleh (2021) | Effluent COD | 9-6-6-1ᵃ |
| | Effluent BOD | 9-6-6-6-1ᵃ |
| | Effluent TSS | 9-6-6-6-1ᵃ |
| | Effluent COD, BOD, and TSS (multiple-output model) | 7-6-6-3ᵃ |
| Abba et al. (2021b) | Effluent BOD | 6-6-1ᵉ |
| | Effluent COD, TN, TP (single-output models) | 9-10-1ᵉ |
Note: The letter appended to each structure indicates the neural network method: a, FFNN; b, MLP-GA; c, RHONN; d, SWNN; e, MLP; f, ANFIS-GA; g, HELM; h, RBF-GA; i, SO-RBF; j, DSAE-NN-GA; k, DFFNN; l, DCB.
Although some recent studies have used deep learning, most developed shallow neural networks with a single hidden layer (Table 3). Other review papers have identified that most ANN models use a single hidden layer (Corominas et al. 2018; Ye et al. 2020) as this is usually sufficient to investigate many problems (Saleh 2021). There was a wide range in the number of neurons in the hidden layer(s) of the studies, from 2 to 256.
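To make the structure notation of Table 3 concrete, the following sketch splits a layer-size string into its input, hidden, and output parts and counts the trainable parameters. It assumes a fully connected feedforward network with one bias per hidden and output neuron, so the counts are only indicative for methods such as RBF or ANFIS. The function names are hypothetical, and the footnote letter that follows each structure in Table 3 must be removed before parsing.

```python
def parse_structure(structure):
    """Split a layer-size string such as '8-19-14-1' into its parts."""
    sizes = [int(s) for s in structure.split("-")]
    return {"inputs": sizes[0], "hidden": sizes[1:-1], "outputs": sizes[-1]}

def count_parameters(structure):
    """Count weights and biases, assuming full connectivity between
    consecutive layers and one bias per non-input neuron."""
    sizes = [int(s) for s in structure.split("-")]
    weights = sum(a * b for a, b in zip(sizes, sizes[1:]))
    biases = sum(sizes[1:])
    return weights + biases

print(parse_structure("8-19-14-1"))
print(count_parameters("4-15-1"))            # smallest example discussed in the text
print(count_parameters("32-128-256-128-1"))  # a deep structure from Table 3
```

Under these assumptions, the 4-15-1 network has 91 trainable parameters, whereas the 32-128-256-128-1 network has 70,273, which illustrates how quickly model capacity grows with the number and width of hidden layers.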
Considering the studies that developed single-output models for both BOD and COD effluent concentrations (the two most common target variables in the studies, Table 1), the same network structure for the two variables was adopted in three papers (Jami et al. 2012; Nourani et al. 2018, 2021). In four other papers, more complex structures were used to model BOD effluent concentrations, with greater numbers of hidden layers (Lee et al. 2011; Saleh 2021) or hidden neurons (Khatri et al. 2019; Alsulaili & Refaie 2021). Only one study (Abba et al. 2021b) had a larger number of hidden neurons for the COD model. This result suggests that modeling BOD concentrations may be more complex than modeling COD concentrations, with more intricate network structures required to map the relationship between the inputs and the output.
Activation functions
In a neural network, each artificial neuron in the hidden and output layers calculates the weighted sum of its inputs and produces an output value using predefined activation functions, also known as transfer functions (Mjalli et al. 2007; Elfanssi et al. 2018). Therefore, the activation function is applied to a certain layer to obtain the output of that layer, which is then used as the input for the next layer (Sharma et al. 2020).
Activation functions introduce nonlinearity into the neural network. The choice of the activation function is important because it affects the prediction performance of the neural network (Sharma et al. 2020).
From the papers in the systematic review that included this information, the most common activation functions in the hidden layer were the logistic sigmoid (nine studies) and hyperbolic tangent (12 studies) functions. This result is in accordance with Corominas et al. (2018), who mentioned hyperbolic tangent and sigmoid functions among the typically applied activation functions in ANN models for nonlinear classification and regression problems in wastewater treatment research. Newhart et al. (2022) stated that the most widely used ANN activation function in environmental engineering is the logistic sigmoid function.
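The computation performed by a single artificial neuron with the two most common hidden-layer activation functions can be sketched as follows. The input values, weights, and bias are arbitrary illustrative numbers, and the function names are hypothetical.

```python
import math

def logistic(z):
    """Logistic sigmoid: maps any real value into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus bias, passed through an activation function."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

# Arbitrary illustrative values (assumption).
x = [0.5, -1.0, 2.0]
w = [0.4, 0.3, -0.2]
b = 0.1

out_logistic = neuron_output(x, w, b, logistic)   # bounded in (0, 1)
out_tanh = neuron_output(x, w, b, math.tanh)      # bounded in (-1, 1)
print(out_logistic, out_tanh)
```

The two functions differ mainly in their output range, which is one reason the choice interacts with how the input and output data are scaled.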
Training algorithms
The training of a neural network is performed by adjusting the neurons' weights to minimize the error between the observed data and network output (Mjalli et al. 2007; Nasr et al. 2012). The most common learning algorithm used for this purpose is backpropagation, which involves working backward layer by layer from the output to adjust the weights accordingly and reduce the average error across all layers (Mjalli et al. 2007; Nezhad et al. 2016; Newhart et al. 2019). The backpropagation algorithm was used in 27 papers in this systematic review (61%).
Backpropagation is the most widely used ANN training algorithm (Zhao et al. 2016; Chen et al. 2020; Ye et al. 2020; Newhart et al. 2022), and is commonly applied in the field of environmental pollution control (Ye et al. 2020). The majority of applications of neural networks in engineering or wastewater treatment problems use the FFNN architecture with a backpropagation training algorithm because of its accuracy and capability (Al-Ghazawi & Alawneh 2021).
The standard backpropagation algorithm uses the gradient descent optimization method to perform calculations (Chen & Lo 2012; Zhao et al. 2016). In this method, the network weights are moved in the direction of the negative gradient of the performance function. Hence, the weight and bias values are repeatedly updated to minimize the performance function (Chen & Lo 2012).
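The gradient descent update can be illustrated on the simplest possible case: a single linear neuron with one weight and one bias, with the MSE as the performance function. The data, learning rate, and number of iterations below are illustrative assumptions, and the function names are hypothetical. In a multilayer network, backpropagation supplies the corresponding gradients for every layer.

```python
def gradient_descent_step(w, b, data, lr):
    """One update of w and b along the negative gradient of the MSE."""
    n = len(data)
    grad_w = sum(2.0 * (w * x + b - y) * x for x, y in data) / n
    grad_b = sum(2.0 * (w * x + b - y) for x, y in data) / n
    return w - lr * grad_w, b - lr * grad_b

def mean_squared_error(w, b, data):
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

# Noise-free synthetic data generated from y = 2x + 1 (assumption).
data = [(i / 10.0, 2.0 * (i / 10.0) + 1.0) for i in range(11)]

w, b = 0.0, 0.0
history = [mean_squared_error(w, b, data)]
for _ in range(500):
    w, b = gradient_descent_step(w, b, data, lr=0.1)
    history.append(mean_squared_error(w, b, data))

print(w, b, history[-1])
```

With these settings the parameters approach the generating values (a weight of 2 and a bias of 1) and the error decreases over the iterations; too large a learning rate would instead cause the updates to diverge.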
Software tools
Sixteen studies (36%) did not specify which software tools were used for model development. Among the papers that provided this information, the most frequently used tool was MATLAB, which was used in 21 publications (48%). Other tools included R (two studies, 5%), SPSS (two studies, 5%), Python (one study, 2%), MATLAB integrated with C++ (one study, 2%), and MATLAB integrated with C# (one study, 2%).
MATLAB was also found by Corominas et al. (2018), Ye et al. (2020), and Bahramian et al. (2023) to be the most popular software platform in the literature for modeling WWTPs with AI techniques. According to the authors, the wide usage of this software platform is due to its packages and toolboxes, which are user-friendly and convenient for users with minimal knowledge of data science (Ye et al. 2020; Bahramian et al. 2023).
Model performance
The model performance indicates the results of a comparison of the experimental data with the predicted data (Zhao et al. 2020). The performance of the models in the studies was calculated using various statistical metrics, including error (mainly mean square error (MSE) and root mean square error (RMSE)) and goodness-of-fit (mainly correlation coefficient (R) and coefficient of determination (R²)). The MSE and RMSE indicators identify the errors between the experimental values and model output, with smaller results signifying higher accuracy. The metrics R and R² indicate the degree of correlation between the observed and predicted values, with higher R or R² values indicating better prediction performance (Ye et al. 2020).
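The four metrics can be computed as in the following sketch. The observed and predicted values are made-up illustrative numbers, and the function names are hypothetical. R² is computed here as one minus the ratio of the residual to the total sum of squares; this is an assumption, as some studies instead report the square of the correlation coefficient, and the two definitions agree only in special cases.

```python
import math

def mse(obs, pred):
    return sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs)

def rmse(obs, pred):
    return math.sqrt(mse(obs, pred))

def pearson_r(obs, pred):
    """Correlation coefficient between observed and predicted values."""
    mo = sum(obs) / len(obs)
    mp = sum(pred) / len(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred))
    return cov / (so * sp)

def r_squared(obs, pred):
    """Coefficient of determination as 1 - SS_res / SS_tot (assumed definition)."""
    mo = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mo) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

# Made-up illustrative values (assumption), e.g. an effluent concentration in mg/L.
observed = [10.0, 12.0, 15.0, 11.0, 14.0]
predicted = [11.0, 11.5, 14.0, 12.0, 13.5]

print(mse(observed, predicted), rmse(observed, predicted))
print(pearson_r(observed, predicted), r_squared(observed, predicted))
```

MSE and RMSE carry the (squared) units of the target variable, whereas R and R² are dimensionless, which makes the latter easier to compare across studies that model different indicators.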
Some papers presented the metrics separately for the training and testing subsets, each target variable being modeled, and each type of model used. For this reason, there are many results for model performance, which can be found in Supplementary Table S1.
The performance results of the models are discussed next. Because the error metrics (RMSE and MSE) depend on the unit of the variable and on whether the data were normalized, only the R and R² results are presented in Table 4. These data highlight the large variability in the results, with R ranging from −0.018 to 0.998 and R² from 0.260 to 0.998.
| Reference | Output parameter(s) | ANN methods | Model performance |
|---|---|---|---|
| Jami et al. (2011) | Effluent NH4-N | FFNN | R = 0.7980 |
| Zhao et al. (2012) | Effluent BOD | Selective ensemble ELM-GA | R² = 0.7576 |
| | Effluent COD | | R² = 0.7729 |
| | Effluent SS | | R² = 0.5957 |
| | Effluent NH4-N | | R² = 0.8273 |
| Chen & Lo (2012) | Effluent Q | MLP | R = 0.9781 |
| | Effluent BOD | | R = 0.6963 |
| | Effluent COD | | R = −0.0178 |
| | Effluent SS | | R = 0.1031 |
| Jami et al. (2012) | Effluent BOD | FFNN | R = 0.346948 |
| | Effluent COD | | R = 0.052622 |
| | Effluent SS | | R = 0.158717 |
| Liu et al. (2013) | Effluent COD | ANFIS-GA | R² = 0.800 |
| | Effluent TN | | R² = 0.577 |
| | Effluent TP | | R² = 0.284 |
| Gaya et al. (2014) | Effluent COD | FFNN | R = 0.647 |
| | Effluent SS | | R = 0.512 |
| | Effluent NH4-N | | R = 0.425 |
| | Effluent COD | ANFIS | R = 0.847 |
| | Effluent SS | | R = 0.995 |
| | Effluent NH4-N | | R = 0.948 |
| Bagheri et al. (2015) | Effluent COD | MLP-GA | R² = 0.98044 |
| | Effluent TN | | R² = 0.98479 |
| | Effluent TSS | | R² = 0.95484 |
| | Effluent COD | RBF-GA | R² = 0.97232 |
| | Effluent TN | | R² = 0.98325 |
| | Effluent TSS | | R² = 0.95217 |
| Simsek (2016) | Effluent biodegradable dissolved organic nitrogen | ANFIS | R² = 0.94 |
| | | MLP | R² = 0.78 |
| | | RBF | R² = 0.66 |
| | | GRNN | R² = 0.97 |
| Heddam et al. (2016) | Effluent BOD | GRNN | R = 0.922 |
| Nezhad et al. (2016) | Effluent quality index | FFNN | R = 0.96 |
| Hazali et al. (2017) | Effluent TP | SO-RBF | R² = 0.8442 |
| | Effluent TN | | R² = 0.7282 |
| | Effluent NH4-N | | R² = 0.2833 |
| Yasmin et al. (2017) | Effluent pH | FFNN | R = 0.39698 |
| | | ANFIS | R = 0.70868 |
| Nourani et al. (2018) | Effluent BOD | FFNN | R² = 0.6600 |
| | Effluent COD | | R² = 0.9363 |
| | Effluent TN | | R² = 0.9022 |
| | Effluent BOD | ANFIS | R² = 0.7640 |
| | Effluent COD | | R² = 0.9260 |
| | Effluent TN | | R² = 0.9410 |
| Sharghi et al. (2019) | Effluent BOD | FFNN | R² = 0.67 |
| Khatri et al. (2019) | Effluent pH | FFNN | R = 0.816 |
| | Effluent BOD | | R = 0.649 |
| | Effluent COD | | R = 0.656 |
| | Effluent TSS | | R = 0.457 |
| | Effluent TKN | | R = 0.670 |
| | Effluent NH4-N | | R = 0.493 |
| | Effluent TP | | R = 0.748 |
| Bekkari & Zeddouri (2019) | Effluent COD | FFNN | R = 0.8781 |
| Khatri et al. (2020) | Removal efficiency of fecal coliform | FFNN | R = 0.986 |
| | Removal efficiency of total coliform | | R = 0.977 |
| Ge et al. (2020) | Removal efficiency of arsenic | FFNN | R² = 0.851 |
| Al-Obaidi (2020) | Effluent quality index | FFNN | R² = 0.998 |
| Osman & Li (2020) | Effluent BOD | DSAE-NN-GA | R² = 0.987 |
| El-Rawy et al. (2021) | Removal efficiency of BOD | FFNN | R = 0.55564 |
| | Removal efficiency of COD | | R = 0.90859 |
| | Removal efficiency of TSS | | R = 0.52105 |
| | Removal efficiency of NH4-N | | R = 0.95459 |
| | Removal efficiency of sulfide | | R = 0.9866 |
| | Removal efficiency of BOD | DFFNN | R = 0.76327 |
| | Removal efficiency of COD | | R = 0.66487 |
| | Removal efficiency of TSS | | R = 0.70718 |
| | Removal efficiency of NH4-N | | R = 0.99427 |
| | Removal efficiency of sulfide | | R = 0.92402 |
| | Removal efficiency of BOD | DCB | R = 0.77167 |
| | Removal efficiency of COD | | R = 0.94572 |
| | Removal efficiency of TSS | | R = 0.80847 |
| | Removal efficiency of NH4-N | | R = 0.97696 |
| | Removal efficiency of sulfide | | R = 0.98585 |
| Al-Ghazawi & Alawneh (2021) | Effluent BOD | FFNN | R² = 0.48 |
| | Effluent COD | | R² = 0.45 |
| | Effluent SS | | R² = 0.44 |
| | Effluent NH4-N | | R² = 0.26 |
| Wang et al. (2021) | Effluent TSS | DFFNN | R² = 0.920 |
| | Effluent PO4 | | R² = 0.872 |
| Nourani et al. (2021) | Effluent BOD | FFNN | R² = 0.7182 |
| | Effluent COD | | R² = 0.7178 |
| | Effluent BOD | ANFIS | R² = 0.7203 |
| | Effluent COD | | R² = 0.7148 |
| Elmaadawy et al. (2021) | Effluent BOD | RVFL | R² = 0.924 |
| | Effluent TSS | | R² = 0.917 |
| Alsulaili & Refaie (2021) | Effluent BOD | FFNN | R² = 0.752 |
| | Effluent COD | | R² = 0.6115 |
| | Effluent TSS | | R² = 0.6308 |
| Abba et al. (2021a) | Effluent TSS | NARX | R² = 0.9846 |
| | Effluent pH | | R² = 0.6293 |
| Hejabi et al. (2021) | Effluent BOD | FFNN | R² = 0.760 |
| | Effluent COD | | R² = 0.715 |
| | Effluent TSS | | R² = 0.632 |
| Liu et al. (2021) | Effluent COD | LSTM-AM | R² = 0.869 |
| Aldaghi & Javanmard (2021) | Effluent Q, BOD, COD, TSS, pH, T, TP, NO3-N, TN, NO2-N, NH4-N, and EC | MLP | R = 0.5804 |
| Saleh (2021) | Effluent BOD | FFNN | R = 0.99782 |
| | Effluent COD | | R = 0.77301 |
| | Effluent TSS | | R = 0.8317 |
| Abba et al. (2021b) | Effluent BOD | ELM | R² = 0.6341 |
| | Effluent COD | | R² = 0.9742 |
| | Effluent TN | | R² = 0.9656 |
| | Effluent TP | | R² = 0.8807 |
| | Effluent BOD | MLP | R² = 0.5776 |
| | Effluent COD | | R² = 0.9555 |
| | Effluent TN | | R² = 0.86662 |
| | Effluent TP | | R² = 0.72544 |
| Rahmati et al. (2021) | Effluent BOD | FFNN | R = 0.897 |
| | | ANFIS | R = 0.930 |
It is infeasible to determine the reasons for the differences between the studies because the context of each application is different, with distinct methods, target parameters, and datasets (Ching et al. 2021). Even when a single study developed different types of neural network methods for the same target variable, various situations were observed. For example, Yasmin et al. (2017) observed a better prediction accuracy of the ANFIS model compared with the FFNN method when modeling the effluent pH. In contrast, Nourani et al. (2021) achieved similar results with ANFIS and FFNN when modeling the same output parameters (effluent BOD and COD concentrations). This highlights that the advantage of one method over another may be due to the context of the application, the differences in the dataset used, and the configuration settings in the model of each study.
Limitations of the review and future perspectives
The ever-evolving nature of machine learning techniques leads to numerous possibilities for applications in the wastewater treatment sector. This systematic review focused specifically on the use of ANNs for predicting the performance of WWTPs in terms of effluent quality and removal efficiencies. This more focused approach was necessary due to the rigorous methods employed in a systematic review, allowing for thorough selection, screening, and analysis of publications, facilitating a deeper understanding of the main architectures, hyperparameter configurations of the models, and assessment of the studies. It is important to note that the implementation of the models in real-world WWTPs was not the primary focus of this work. However, it is worth mentioning that one of the main challenges in implementing these models remains the availability of high-quality data (Corominas et al. 2018; Faisal et al. 2023; Ray et al. 2023).
Other systematic reviews should be conducted for other specific applications of neural networks and even other machine learning algorithms in the wastewater treatment sector. For instance, neural networks and different machine learning approaches have been utilized for the optimization of WWTPs, including operational cost and energy consumption optimization, automation, control of operational conditions, real-time monitoring, forecasting of membrane fouling or operational failure (Ray et al. 2023), fault detection, and multi-objective control strategies that aim to maintain effluent quality while reducing energy consumption (Faisal et al. 2023). Each of these applications could serve as the focus of new systematic reviews.
Given the constantly evolving nature of machine learning and its applications, Zhang et al. (2023) expect that future AI research applied to wastewater treatment will continue to focus on the removal of phosphorus, organic pollutants, and emerging contaminants. Promising directions for research include exploring microbial community dynamics, achieving multi-objective optimization, improving the performance of WWTPs to remove various pollutants, and predicting water quality under specific conditions (Zhang et al. 2023).
CONCLUSIONS
The systematic review of the use of ANN models to predict the performance of full-scale WWTPs, based on 44 relevant papers that were extracted and assessed, indicated the main trends and applications in the field. Most studies modeled large activated sludge facilities because these have better monitoring and control schemes. The datasets usually covered a monitoring period of 1–2 years, with daily samplings, resulting in relatively small datasets (median = 361.5). Prior to training the models, the most common preprocessing method was min–max normalization in the range [0, 1], and the data were mainly divided either into 75% for training and 25% for testing the model, or into 70% for training, 15% for validation, and 15% for testing.
The publications used influent quality indicators as the input variables for neural network models to predict WWTP effluent quality, mainly in terms of organic matter concentrations. Although other methods were utilized, such as MLP, RBF, hybrid learning, and in recent years, deep learning, the FFNN architecture with a backpropagation training algorithm was the most common. In general, shallow networks with single hidden layers were used, and good performance was achieved.
Not all models must be tuned in the same manner, as they vary according to the dataset characteristics and study objectives. However, the findings of this research may act as a starting point and provide highly beneficial information to industry and research practitioners in the search for an optimum design modeling process in future studies with similar prediction problems.
ACKNOWLEDGMENTS
The authors would like to thank Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) and Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) for their financial support during the course of the research.
FUNDING
This work was supported by Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) and Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq). The funding sources had no involvement in study design; in the collection, analysis, and interpretation of data; in the writing of the report; or in the decision to submit the article for publication.
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.
CONFLICT OF INTEREST
The authors declare there is no conflict.