Abstract
A predictive model to estimate hydrogen sulfide (H2S) emission from sewers would offer engineers and asset managers the ability to evaluate the possible odor/corrosion problems during the design and operation of sewers to avoid in-sewer complications. This study aimed to model and forecast H2S emission from a gravity sewer, as a function of temperature and hydraulic conditions, without requiring prior knowledge of H2S emission mechanism. Two different adaptive neuro-fuzzy inference system (ANFIS) models using grid partitioning (GP) and subtractive clustering (SC) approaches were developed, validated, and tested. The ANFIS-GP model was constructed with two Gaussian membership functions for each input. For the development of the ANFIS-SC model, the MATLAB default values for clustering parameters were selected. Results clearly indicated that both the best ANFIS-GP and ANFIS-SC models produced smaller error compared with the multiple regression models and demonstrated a superior predictive performance on forecasting H2S emission with an excellent R2 value of >0.99. However, the ANFIS-GP model possessed fewer rules and parameters than the ANFIS-SC model. These findings validate the ANFIS-GP model as a potent tool for predicting H2S emission from gravity sewers.
HIGHLIGHTS
An adaptive neuro-fuzzy inference system (ANFIS) model was proposed to predict H2S emission from gravity sewers.
The ANFIS model was constructed based on grid partitioning (GP) and subtractive clustering (SC).
The ANFIS-GP/ANFIS-SC model indicated an excellent prediction accuracy (>99%) for H2S emission.
The ANFIS-GP model was found to be computationally simpler than the ANFIS-SC model due to the creation of fewer fuzzy rules.
INTRODUCTION
Domestic wastewater is transported to water resource recovery facilities via underground pipelines (or tunnels) called sanitary sewers. During transportation, microorganisms attach to and grow on the submerged surfaces of the sewer, which leads to the formation of a biological slime layer, more often named biofilm. The biofilm's deeper regions (closer to the sewer walls) are mostly inhabited by anaerobic microorganisms because oxygen penetration is limited by the outer region of the biofilm. Microbial activity of sulfate-reducing organisms – obligate anaerobes – within the biofilm's deeper regions reduces sulfate (SO42−) to hydrogen sulfide (H2S) (Gutierrez et al. 2016; Li et al. 2019). The formation of H2S depends on the presence of readily biodegradable chemical oxygen demand and sulfur compounds in the sewers (Park et al. 2014). According to Carrera et al. (2016), domestic wastewater usually contains SO42− in a concentration range of 40–200 mg SO42−/L (equal to 13.3–66.7 mg SO42−-S/L).
The H2S produced in the biofilm can diffuse to the bulk liquid phase and – when the sewer runs partially full – can then be released to the overlying gas phase if supporting factors such as high-turbulence flow, low pH, and high temperature of the liquid phase prevail (Yongsiri et al. 2004a; Jiang et al. 2017; Wu et al. 2018; Zuo et al. 2019). If the sewer is sufficiently ventilated, H2S can be released to the air; this gas has an extremely unpleasant smell (rotten egg odor) with a very low odor detection threshold (0.0001–0.03 ppm) and has a detrimental impact on human health, ranging from eye irritation and headaches at lower levels (1–10 ppm) to sudden death at levels >1,000 ppm (Salehi & Chaiprapat 2019). If the sewer is insufficiently ventilated, H2S accumulates in the sewer atmosphere (the confined air space above the flow surface) and is subsequently adsorbed on the moist (unsubmerged) sewer walls, where it can be oxidized by the action of Thiobacilli, generating sulfuric acid (H2SO4) (Huber et al. 2016; Jiang et al. 2017). The H2SO4 produced is the major cause of the corrosion of cement and metal materials in sewers (Jiang et al. 2015, 2016; Wu et al. 2018; Fytianos et al. 2020). The corrosion of sewer networks is a growing worldwide concern because it has the potential to cost the wastewater industry billions of dollars to maintain and rehabilitate the damaged sewers. As an example, the Sanitation Districts of Los Angeles County estimated an annual total cost of US$13.75 billion to perform rehabilitation and the maintenance of the corroded sanitary sewers in the United States (García et al. 2020).
Several predictive models to estimate H2S emission from gravity sewers have been developed since the early 1970s. They can be categorized into the following two main groups: (1) modified sewer re-aeration models in which the overall mass transfer coefficient of H2S replaces the overall mass transfer coefficient of oxygen and (2) models estimating H2S emission rate as a function of hydraulic conditions (Carrera et al. 2016). Some examples of the first group are those developed by Parkhurst & Pomeroy (1972), United States Environmental Protection Agency (USEPA) (1974), Taghizadeh-Nasser (1986), Jensen (1995), and Yongsiri et al. (2003, 2004a, 2004b, 2005). Examples of the second group of models are the ones proposed by Lahav et al. (2004, 2006). The critical parameter in the second group of models is the velocity gradient – adapted from the mixing theory – which is linked to the head loss in the sewer.
Yet, despite all these developments and efforts made to model H2S emission from gravity sewers, to the best of the authors' knowledge, an adaptive neuro-fuzzy inference system (ANFIS) has never been applied to this problem. ANFIS models have advantages over theoretical/empirical models for complex nonlinear systems because they are constructed based only on a dataset of the input and output variables of the system under consideration, without requiring prior knowledge of the interrelationship between the variables. In addition, they have proven to be robust modeling tools for nonlinear complex systems with high generalization power. Therefore, the present study aimed – for the first time – to fill this research gap. ANFIS models based on grid partitioning (GP) and subtractive clustering (SC) of the input domain (hereafter referred to as ANFIS-GP and ANFIS-SC, respectively) were developed to predict H2S emission from a gravity sewer using the experimental data reported in Lahav et al. (2006). The predictive abilities of the ANFIS-GP and ANFIS-SC models were compared with each other and with that of the conventional multiple regression by means of two descriptive statistics, namely the coefficient of determination (R2) and root-mean-squared error (RMSE). In addition, the two ANFIS models were compared with each other in terms of complexity (i.e., number of rules and parameters).
This paper will present the ANFIS model structure followed by the methodology describing the data processing and modeling approach for the ANFIS-GP, ANFIS-SC, and multiple regression models. Then, results of the three models are compared and discussed, and finally, the paper ends with the conclusions.
ANFIS MODELING STRUCTURE
ANFIS, originally introduced by Jang (1993), is a knowledge-based system that deals with a number of conditional statements (commonly known as IF-THEN rules) and a set of input–output data of the system under consideration. It is used to describe the input–output behavior of complex nonlinear and uncertain problems that are difficult to model using conventional mathematical approaches. The ANFIS model combines two soft computing techniques – artificial neural network (ANN) and fuzzy inference system (FIS) – in a single framework. In this way, the ANFIS model can capture the strengths of both techniques: the adaptive and learning ability of ANN and the ability of FIS in knowledge representation and interpretation with the help of fuzzy IF-THEN rules (Jang 1993; Jang et al. 1997). The mathematical background of the ANFIS model is described in detail below.
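For simplicity, let us suppose that the FIS under consideration has two inputs (each with two fuzzy sets) and one output. For the first-order Takagi–Sugeno FIS (Takagi & Sugeno 1985), a typical rule set with two fuzzy IF-THEN rules can be expressed in the following form:

$$\text{Rule 1: IF } x_1 \text{ is } A_1 \text{ AND } x_2 \text{ is } B_1, \text{ THEN } y_1 = \alpha_1 x_1 + \beta_1 x_2 + \gamma_1 \tag{1}$$

$$\text{Rule 2: IF } x_1 \text{ is } A_2 \text{ AND } x_2 \text{ is } B_2, \text{ THEN } y_2 = \alpha_2 x_1 + \beta_2 x_2 + \gamma_2 \tag{2}$$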
As seen from expressions (1) and (2), each fuzzy IF-THEN rule is composed of two parts, namely the IF part, which is known as antecedent (or premise) part, and the THEN part, which is called the consequent (or conclusion) part. Hence, the parameters pertaining to the fuzzy sets Ai and Bi are referred to as antecedent parameters, and the set {αi, βi, γi} is referred to as the consequent parameters.
Figure 1 illustrates the architecture of a typical ANFIS model, which is functionally equivalent to a two-input, single-output first-order Takagi–Sugeno FIS with two fuzzy rules.
Figure 1. Schematic representation of ANFIS based on the first-order Takagi–Sugeno FIS with a two-dimensional input space, one output and two fuzzy IF-THEN rules. Notes: x1 and x2 are the input variables; μ denotes the membership function of the jth fuzzy set (A or B) associated with the ith input xi for i = 1 and 2; the nodes labeled π perform a product operation on the incoming signals, and the nodes labeled N normalize the firing strength wi (i = 1 and 2); the node labeled Σ applies a summation function on its incoming signals; yi (i = 1 and 2) represents the output of the ith fuzzy rule; a rectangular block indicates an adaptive node whose parameters are adjustable, whereas a diamond block indicates a fixed node that does not have adjustable parameters; the directional links ‘→’ only represent the flow of information, in other words, no weight is assigned to the links.
As seen in Figure 1, the ANFIS model is a feed-forward neural network consisting of six distinct functional blocks (more often called layers), in which each layer is composed of a set of processing elements named nodes (neurons or elements). The nodes are connected together by links that indicate the signal flow direction from one node to another. Each node applies a specific function to its incoming signals and sends out the result, which is an input for the next layer. Owing to space limitation, the detailed description of each layer is provided in Supplementary Material.
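The signal flow through the layers can be illustrated with a minimal Python sketch of a two-input, two-rule first-order Takagi–Sugeno inference. It assumes Gaussian membership functions and rule outputs of the form yi = αi·x1 + βi·x2 + γi; the function names and all numerical values are hypothetical and chosen only for illustration, not taken from the models developed in this study.

```python
import math


def gaussmf(x, center, sigma):
    """Gaussian membership grade of x for a fuzzy set (center, sigma)."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def anfis_forward(x1, x2, antecedents, consequents):
    """Forward pass of a two-input first-order Takagi-Sugeno FIS.

    antecedents: one ((c_A, s_A), (c_B, s_B)) pair per rule (illustrative layout)
    consequents: one (alpha, beta, gamma) triple per rule
    """
    # Fuzzification followed by the product nodes: firing strength w_i
    w = [gaussmf(x1, *a) * gaussmf(x2, *b) for a, b in antecedents]
    # Normalization nodes: each firing strength divided by the sum of all
    total = sum(w)
    w_norm = [wi / total for wi in w]
    # Rule outputs y_i = alpha_i * x1 + beta_i * x2 + gamma_i
    y_rules = [al * x1 + be * x2 + ga for al, be, ga in consequents]
    # Summation node: weighted sum of the rule outputs
    return sum(wn * yi for wn, yi in zip(w_norm, y_rules))


# Hypothetical parameters for two rules
antecedents = [((0.0, 1.0), (0.0, 1.0)), ((1.0, 1.0), (1.0, 1.0))]
consequents = [(1.0, 1.0, 0.0), (2.0, -1.0, 1.5)]
output = anfis_forward(0.5, 0.5, antecedents, consequents)
```

At the midpoint (0.5, 0.5) both rules fire equally in this example, so the output is simply the average of the two rule outputs.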
ANFIS structure identification
One of the most crucial steps in the establishment of an ANFIS model is the structure identification, which involves the selection of an appropriate membership function (MF), the assignment of an optimal number of MFs to each input variable, and the determination of the number of fuzzy rules. To achieve this aim, two different techniques have been developed with a focus on partitioning of the input space: (i) GP and (ii) SC. A brief description of the theoretical background of these two techniques is presented in the following subsections.
Grid partitioning
The GP technique divides a multi-dimensional input space into rectangular subspaces using an axis-parallel partition according to the number and the type of MFs in each dimension, which are set by the user. Each rectangular subspace, called the fuzzy region, is specified by a fuzzy rule. The total number of fuzzy rules is equal to the number of all possible combinations of the MFs of all inputs (Benmouiza & Cheknane 2019; Babanezhad et al. 2020). For instance, applying GP on a data space comprised of N inputs (x1, x2,…, xN) generates M1×M2×…×MN fuzzy rules, where Mi denotes the number of MFs for the ith input xi for i = 1 to N. Therefore, for a given input space, increasing the number of MFs per input sharply increases the number of fuzzy rules and, consequently, the number of trainable parameters, so that the computational load of the model rises.
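The growth in rules and parameters can be checked with a short Python sketch. It assumes two-parameter (Gaussian) MFs and first-order Takagi–Sugeno consequents, so that each rule carries one coefficient per input plus a constant; the function name `gp_complexity` is hypothetical.

```python
from math import prod


def gp_complexity(mfs_per_input, params_per_mf=2):
    """Return (number of rules, total number of parameters) for grid partitioning.

    mfs_per_input: number of MFs assigned to each input, e.g. [2, 2, 2]
    params_per_mf: 2 for a Gaussian MF (center and width), an assumption here
    """
    n_inputs = len(mfs_per_input)
    n_rules = prod(mfs_per_input)                   # M1 x M2 x ... x MN
    n_antecedent = sum(mfs_per_input) * params_per_mf
    n_consequent = n_rules * (n_inputs + 1)         # first-order polynomial per rule
    return n_rules, n_antecedent + n_consequent


# Two MFs per input, for 2 to 7 inputs
for n in range(2, 8):
    print(n, gp_complexity([2] * n))
```

With two MFs per input, the sketch gives 4 rules and 20 parameters for two inputs and 32 rules and 212 parameters for five inputs, and the rule count doubles with every additional input.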
Subtractive clustering
SC, originally proposed by Chiu (1994), is an effective and rapid one-pass technique by which a large dataset is organized and categorized. In other words, the dataset is split into a number of homogeneous groups, named clusters, such that similar data points belong to the same cluster. The number of generated clusters is the same as the number of fuzzy rules (Rahnema et al. 2019). Applying the SC technique on a data space creates a number of clusters whose centers are used to identify the MFs, in the form of ‘gaussmf’, for the inputs (Asadi et al. 2020). This technique has gained remarkable attention because the fuzzy rules are created automatically, whereas the GP technique requires the number of input MFs to be selected manually in advance in order to generate the fuzzy rules. In addition, the SC technique prevents the problem of combinatorial explosion of the fuzzy rules in the case of a high-dimensional dataset. In other words, it reduces the rule-base complexity of the ANFIS models (Kumar et al. 2017). A brief description of the SC technique is given below (Heddam et al. 2019).
After the density measure of each data point is determined, the data point with the highest density measure is selected to be the first cluster center.
This process – searching new cluster centers – is repeated with respect to the three criteria given in Table 1. Once the clustering process is complete, a fuzzy IF-THEN rule is assigned to each cluster; the antecedents of the fuzzy rules are created by projecting the clusters onto each dimension in the input space (Chiu 1994, 1997).
Table 1. Criteria used in the SC technique to accept or reject a data point as a cluster center (Chiu 1994)
Criterion | Description | Sub-condition | Remarks |
---|---|---|---|
I | Dk* > AR × D1* | | xk* is accepted as a cluster center, and the process is continued to search new cluster centers |
II | Dk* < RR × D1* | | xk* is rejected as a cluster center, and the process is terminated |
III | RR × D1* ≤ Dk* ≤ AR × D1* | L/r + Dk*/D1* ≥ 1 | xk* is accepted as a cluster center, and the process is continued |
III | RR × D1* ≤ Dk* ≤ AR × D1* | L/r + Dk*/D1* < 1 | xk* is rejected as a cluster center, its density measure is set to 0, and the process is continued |
SC, subtractive clustering.
D1* and Dk* are the density measures corresponding to x1* and xk*, respectively, where x1* is the first cluster center and xk* is the cluster center candidate at step K.
AR, a threshold for the density measure above which the data point is accepted as a cluster center. This parameter ranges between 0 and 1 with an optimal value of 0.5 (note: the larger AR, the fewer cluster centers (Chiu 1994)).
RR, a threshold for the density measure below which the data point is rejected as a cluster center. This parameter ranges between 0 and 1 with an optimal value of 0.15 (note: the smaller RR, the more cluster centers (Chiu 1994)).
‘L’, minimal distance between xk* and all previously defined cluster centers (x1*, x2*,…, xk−1*).
‘r’, cluster radius.
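The clustering procedure and the acceptance criteria of Table 1 can be sketched in Python as follows. The density and density-revision formulas follow the usual formulation of Chiu (1994) and are an assumption here, since the corresponding equations are not reproduced in this section; the function and variable names are illustrative, and the inputs are assumed to be already scaled to the unit hypercube.

```python
import math


def subtractive_clustering(points, r=0.5, squash=1.25, ar=0.5, rr=0.15):
    """Return cluster centers found by a subtractive-clustering sketch.

    points: list of equal-length tuples, assumed scaled to [0, 1]
    r: cluster radius; squash: factor widening the radius used for revision
    ar / rr: accept and reject ratios (AR and RR in Table 1)
    """
    ra, rb = r, squash * r

    def sqdist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    # Density measure of every point (assumed Gaussian-kernel form)
    density = [sum(math.exp(-sqdist(p, q) / (ra / 2) ** 2) for q in points)
               for p in points]
    centers, first_density = [], None
    while True:
        k = max(range(len(points)), key=density.__getitem__)
        dk = density[k]
        if first_density is None:            # first cluster center
            first_density, accept = dk, True
        elif dk > ar * first_density:        # criterion I
            accept = True
        elif dk < rr * first_density:        # criterion II: terminate
            break
        else:                                # criterion III
            d_min = min(math.sqrt(sqdist(points[k], c)) for c in centers)
            accept = d_min / ra + dk / first_density >= 1
        if accept:
            ck = points[k]
            centers.append(ck)
            # Revise densities so that points near the new center are penalized
            density = [d - dk * math.exp(-sqdist(p, ck) / (rb / 2) ** 2)
                       for d, p in zip(density, points)]
        else:
            density[k] = 0.0                 # rejected candidate
    return centers


# Two well-separated groups of hypothetical points
pts = [(0.10, 0.10), (0.12, 0.10), (0.10, 0.12),
       (0.90, 0.90), (0.88, 0.90), (0.90, 0.88)]
found = subtractive_clustering(pts)
```

For the two well-separated groups above, the sketch returns one center per group, which corresponds to two fuzzy rules.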
ANFIS parameter identification
Numerous studies have shown that the hybrid algorithm, which integrates the error backpropagation and the least squares estimation (LSE) methods, is highly efficient in optimizing ANFIS parameters (Chen et al. 2018; Kashyap et al. 2019). Each iteration (epoch) of the hybrid algorithm is composed of two passes: a forward pass and a backward pass. In the forward pass, the functional signals (the node outputs) go forward until the defuzzification layer, wherein the LSE method is used to determine the consequent parameters under the condition that the antecedent parameters are held fixed. Then, the error between target and ANFIS-predicted values is computed. If this error is greater than the pre-specified threshold, the backward pass starts. In the backward pass, the error signals are propagated from the output layer backward to the input layer and gradient descent is used to tune the antecedent parameters, while the consequent parameters remain fixed. The output of the ANFIS is computed by employing the consequent parameters determined in the forward pass. The details and mathematical background of this algorithm can be found in Jang (1993) and Jang et al. (1997).
METHODOLOGY
Dataset
The data used in this study were obtained from the study of Lahav et al. (2006). Briefly, Lahav et al. (2006) operated an artificial gravity-flow sewer (a 27-m-long PVC pipe with an internal diameter of 0.16 m) using sulfide-containing water with an initial sulfide concentration of 20.4–27.2 mg S/L. The authors established an empirical equation describing dissolved sulfide in the aqueous phase of the sewer pipe as a function of temperature and hydraulic conditions (see Table 2 for the range of temperature and hydraulic conditions used in the experimental runs).
Table 2. Temperature and hydraulic conditions used in the experimental gravity sewer runs (Lahav et al. 2006)
Parameters | Symbol | Operating range | Units |
---|---|---|---|
Temperature | x1 | 16.8–31.3 | °C |
Pipe's slope | x2 | 1–3 | % m/m |
Flow rate | x3 | 0.0014–0.0078 | m3/s |
Hydraulic depth^a | x4 | 0.014–0.089 | m2/m |
Mean flow velocity | x5 | 0.65–1.55 | m/s |
Liquid volume fraction in the pipe^b | x6 | 0.024–0.183 | m3/m3 |
Time | x7 | 0–410 | min |
^a Hydraulic depth represents the cross-sectional area of the flow in the pipe divided by the flow surface width.
^b Volume of liquid in the pipe divided by the total volume of liquid in the system; the system includes the pipe, an upstream container, and a downstream container (for more information, see section 2 (Materials and Methods) in Lahav et al. 2006).
To construct an ANFIS or multiple regression model in this study, x1–x7 were considered as the input variables, while the total sulfide in the aqueous phase of the sewer pipe (y) was the only output variable (see Table 2 for the notation xi for i = 1–7). The dataset consists of a total of 596 input–output data pairs as given in Supplementary Material, Table S1. The input–output data pairs are referred to hereafter as patterns; the jth pattern contains a collection of eight data points as {x1j, x2j,…, x7j, yj} for j = 1–596.
Data pre-processing
Before the original (raw) dataset was used to construct ANFIS models, a two-step data pre-processing approach was applied. The first step involved normalizing all data points, while the second step dealt with splitting the normalized dataset. A detailed description of each step is given below.
Step 1: data normalization
Step 2: data splitting
The normalized dataset was randomized and subsequently divided into three disjoint subsets: training, validation, and testing. In this study, 416 patterns corresponding to about 70% of the dataset were assigned to the training subset, while the remaining 30% of the dataset (i.e., 180 patterns) were split into two equal halves, termed validation and testing subsets. The training subset was used to develop the ANFIS models. The validation subset was used in conjunction with the training subset to prevent the ANFIS models from overfitting the training data. The testing subset was used to assess the accuracy and effectiveness of the trained (developed) ANFIS models for predicting the output.
The training, validation, and testing subsets were stored in the workspace of MATLAB® (version 8.3.0.532, R2014a) (The MathWorks Inc., Natick, MA, USA) in the form of arrays, in which each row consists of a pattern with the last column (from left to right) indicating the output value and the remaining columns representing inputs' values.
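The randomization and splitting step described above can be sketched in Python as follows. The subset sizes (416, 90, and 90 patterns) are those reported in the text; the function name `split_patterns` and the fixed random seed are assumptions made only so that the example is reproducible, and integers stand in for the actual patterns.

```python
import random


def split_patterns(patterns, n_train=416, n_val=90, seed=0):
    """Shuffle the patterns and split them into three disjoint subsets."""
    shuffled = list(patterns)
    random.Random(seed).shuffle(shuffled)      # randomize the pattern order
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]          # whatever remains
    return train, val, test


# 596 placeholder patterns, identified here only by their index
train, val, test = split_patterns(range(596))
print(len(train), len(val), len(test))
```

Each pattern would in practice be a row of seven inputs followed by the output value, matching the array layout described above.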
Modeling approaches
This section presents three different modeling approaches, including ANFIS-GP, ANFIS-SC, and multiple regression models, applied for predicting H2S emission from the gravity sewer.
ANFIS-GP model
To develop the ANFIS-GP model, the ANFIS Editor GUI (graphical user interface) of the Fuzzy Logic Toolbox in the framework of MATLAB® (version 8.3.0.532, R2014a) was used. The command ‘anfisedit’ was typed in the MATLAB command window to display the ANFIS Editor GUI, which includes four distinct panels, namely (i) load data, (ii) generate FIS, (iii) train FIS, and (iv) test FIS. Initially, the training, validation, and testing subsets were loaded from the MATLAB workspace into the ANFIS Editor GUI, and then, an initial FIS was created by choosing the GP technique. For the system under consideration in this study, various ANFIS-GP models were tried. The following points were taken into account when choosing the best model, defined as the one with minimum error:
All possible combinations of the input variables x1–x6, each taken together with time (x7), were tested, giving 63 candidate models. Table 3 presents the combinations of input variables used for the model's development.
The ‘gaussmf’, one of the most commonly used MFs in the literature, was tested for each input (note that only two MFs were assigned to each input in order to make the models as simple as possible).
Takagi–Sugeno FIS in the form of a first-order polynomial function was used.
Table 3. Combinations of input variables used for model development
Model | Input combination | Model | Input combination | Model | Input combination |
---|---|---|---|---|---|
M1 | [X1, X7] | M22 | [X1, X2, X3, X7] | M43 | [X1, X2, X3, X5, X7] |
M2 | [X2, X7] | M23 | [X1, X2, X4, X7] | M44 | [X1, X2, X3, X6, X7] |
M3 | [X3, X7] | M24 | [X1, X2, X5, X7] | M45 | [X1, X2, X4, X5, X7] |
M4 | [X4, X7] | M25 | [X1, X2, X6, X7] | M46 | [X1, X2, X4, X6, X7] |
M5 | [X5, X7] | M26 | [X1, X3, X4, X7] | M47 | [X1, X2, X5, X6, X7] |
M6 | [X6, X7] | M27 | [X1, X3, X5, X7] | M48 | [X1, X3, X4, X5, X7] |
M7 | [X1, X2, X7] | M28 | [X1, X3, X6, X7] | M49 | [X1, X3, X4, X6, X7] |
M8 | [X1, X3, X7] | M29 | [X1, X4, X5, X7] | M50 | [X1, X3, X5, X6, X7] |
M9 | [X1, X4, X7] | M30 | [X1, X4, X6, X7] | M51 | [X1, X4, X5, X6, X7] |
M10 | [X1, X5, X7] | M31 | [X1, X5, X6, X7] | M52 | [X2, X3, X4, X5, X7] |
M11 | [X1, X6, X7] | M32 | [X2, X3, X4, X7] | M53 | [X2, X3, X4, X6, X7] |
M12 | [X2, X3, X7] | M33 | [X2, X3, X5, X7] | M54 | [X2, X3, X5, X6, X7] |
M13 | [X2, X4, X7] | M34 | [X2, X3, X6, X7] | M55 | [X2, X4, X5, X6, X7] |
M14 | [X2, X5, X7] | M35 | [X2, X4, X5, X7] | M56 | [X3, X4, X5, X6, X7] |
M15 | [X2, X6, X7] | M36 | [X2, X4, X6, X7] | M57 | [X1, X2, X3, X4, X5, X7] |
M16 | [X3, X4, X7] | M37 | [X2, X5, X6, X7] | M58 | [X1, X2, X3, X4, X6, X7] |
M17 | [X3, X5, X7] | M38 | [X3, X4, X5, X7] | M59 | [X1, X2, X3, X5, X6, X7] |
M18 | [X3, X6, X7] | M39 | [X3, X4, X6, X7] | M60 | [X1, X2, X4, X5, X6, X7] |
M19 | [X4, X5, X7] | M40 | [X3, X5, X6, X7] | M61 | [X1, X3, X4, X5, X6, X7] |
M20 | [X4, X6, X7] | M41 | [X4, X5, X6, X7] | M62 | [X2, X3, X4, X5, X6, X7] |
M21 | [X5, X6, X7] | M42 | [X1, X2, X3, X4, X7] | M63 | [X1, X2, X3, X4, X5, X6, X7] |
Note: see Table 2 for the notations X1, X2, …, X7.
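The 63 input combinations of Table 3 can be enumerated programmatically. The Python sketch below assumes, as the table shows, that time (X7) is always included and that every non-empty subset of X1–X6 is added to it; the function name `input_combinations` is hypothetical.

```python
from itertools import combinations


def input_combinations(optional=("X1", "X2", "X3", "X4", "X5", "X6"),
                       always="X7"):
    """Map model labels M1, M2, ... to their lists of input variables."""
    models = {}
    index = 1
    # Subsets are generated by increasing size, then in lexicographic order
    for size in range(1, len(optional) + 1):
        for subset in combinations(optional, size):
            models[f"M{index}"] = list(subset) + [always]
            index += 1
    return models


models = input_combinations()
print(len(models), models["M1"], models["M63"])
```

The number of models equals 2^6 − 1 = 63, and the ordering produced by `itertools.combinations` reproduces the model labels used in Table 3.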
Each ANFIS-GP model was trained with the training subset using the hybrid algorithm, and the model error (called training error) was determined; the training error goal was set to zero. At each epoch, the model validation error was also calculated. An epoch at which the validation error started to rise while the training error continued to decrease was taken as the point to stop training, because it signals the onset of overfitting. If both the training and validation errors continued to decrease as the number of epochs increased, the learning algorithm stopped as soon as the training error goal was achieved; otherwise, training continued up to the epoch beyond which the training error remained constant. When none of these three stopping criteria was met, the learning algorithm continued up to a pre-defined number of epochs (100 epochs).
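One possible reading of these stopping rules is sketched below in Python. It is an interpretation rather than the procedure implemented in the ANFIS Editor GUI: the function name, the returned labels, and the convention that list index i corresponds to epoch i + 1 are all assumptions, and the error values in the example are invented.

```python
def stopping_epoch(train_err, val_err, goal=0.0, max_epochs=100):
    """Return (number of epochs to keep, reason) from per-epoch error lists."""
    n = min(len(train_err), len(val_err), max_epochs)
    for i in range(n):
        if train_err[i] <= goal:
            return i + 1, "training error goal reached"
        if i == 0:
            continue
        # Validation error rises while training error still falls: overfitting
        if val_err[i] > val_err[i - 1] and train_err[i] < train_err[i - 1]:
            return i, "validation error started to rise"
        # Training error no longer changes from one epoch to the next
        if train_err[i] == train_err[i - 1]:
            return i, "training error stopped changing"
    return n, "epoch limit reached"


# Hypothetical error curves: validation error turns upward at the 4th epoch
epochs, reason = stopping_epoch([5.0, 4.0, 3.0, 2.5, 2.2],
                                [6.0, 5.0, 4.5, 4.7, 5.0])
```

In this example the sketch keeps three epochs, the last epoch before the validation error begins to increase.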


From Equations (9) and (10), it can be seen that the closer the RMSE to zero and R2 to unity, the smaller the difference between the observed and model-predicted values. In other words, the model perfectly fits the data when the R2 value is equal to unity and the RMSE value is equal to zero.
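The two indices can be computed as in the Python sketch below. The RMSE follows its standard definition; for R2, the common form 1 − SSres/SStot is assumed, which may differ from the exact expression used in this study. The function names and the numbers in the example are illustrative only.

```python
import math


def rmse(observed, predicted):
    """Root-mean-squared error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here as 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical values: every prediction is off by 0.5
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.5, 2.5, 3.5, 4.5]
print(rmse(obs, pred), r_squared(obs, pred))
```

A perfect prediction gives an RMSE of zero and an R2 of one, in line with the interpretation given above.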
Once the training process was complete, the trained (developed) model was subjected to the testing phase, so that the testing subset (unseen data during the training/validation process) was fed to the model in order to assess the predictive ability of the model by means of the two aforementioned statistical indices (R2 and RMSE; see Equations (9) and (10)).
Note that when the model training and testing phases were complete, the normalized output values of the model were converted back (anti-normalized) to their original scale by reversing the transformation of Equation (7).
ANFIS-SC model
The ANFIS-SC model was constructed in the same manner as the ANFIS-GP model, except that the number of MFs per input was determined automatically rather than specified by the user. In the FIS generation panel of the ANFIS Editor GUI, the SC technique was selected and the MATLAB default values for the clustering parameters were used (r = 0.50, λ = 1.25, AR = 0.50, RR = 0.15; MATLAB® version 8.3.0.532, R2014a). Note that the training phase parameters, including the number of training epochs, training error goal, the type of output MF, training algorithm, and training stopping criteria, were the same as those used for the ANFIS-GP model. In addition, the model testing phase was performed in the same way as for the ANFIS-GP model.
Multiple regression-based approach
The multiple regression-based analysis was also performed for predicting H2S emission from the gravity sewer. The experimental data were evaluated by means of DataFit software (trial version 9.1.32, Copyright© 1995–2014, Oakdale Engineering, PA, USA), which contains 242 types of regression models. As the regression models were solved, they were automatically ranked based on the goodness of fit. In addition, t-ratios and p-values were estimated to assess the significance of the model coefficients (p < 0.05 was considered statistically significant). The training subset was used to estimate the regression model coefficients, while the testing subset was used to evaluate the models’ prediction accuracy in terms of R2 and RMSE given by Equations (9) and (10).
RESULTS AND DISCUSSIONS
In this study, various modeling approaches were applied to predict H2S emission from a gravity-flow sewer as a function of temperature and hydraulic conditions.
First, two different ANFIS models, namely ANFIS-GP and ANFIS-SC, were constructed based on the first-order Takagi–Sugeno FIS. The hybrid-learning algorithm was employed to train the models. The inputs to the models were temperature, pipe's slope, flow rate, hydraulic depth, mean flow velocity, liquid volume fraction in the pipe, and time, while models’ output was H2S emission from the aqueous phase of the sewer. The data used were taken from the experimental study of Lahav et al. (2006). Models’ performances were assessed with two descriptive statistical indicators, namely R2 and RMSE.
Second, multiple regression models were established whose results were compared with those of the ANFIS-GP and ANFIS-SC models.
ANFIS-GP model
To estimate H2S emission from the gravity-flow sewer pipe, 63 ANFIS-GP models were developed whose training, validation, and testing results in terms of R2 and RMSE are presented in Table 4. In addition, the number of fuzzy rules and the total number of parameters (linear and nonlinear) for each model are given.
Table 4. Performance of the ANFIS-GP models developed in this study
Model | Nr | Ntp | Training R2 | Training RMSE | Validation R2 | Validation RMSE | Testing R2 | Testing RMSE |
---|---|---|---|---|---|---|---|---|
M1 | 4 | 20 | 0.9692 | 5.8842 | 0.9714 | 6.1405 | 0.9715 | 5.4812 |
M2 | 4 | 20 | 0.9606 | 6.6505 | 0.9585 | 7.4065 | 0.9516 | 7.1492 |
M3 | 4 | 20 | 0.9685 | 5.9450 | 0.9647 | 6.8247 | 0.9597 | 6.5223 |
M4 | 4 | 20 | 0.9549 | 7.1114 | 0.9514 | 8.0133 | 0.9457 | 7.5703 |
M5 | 4 | 20 | 0.9681 | 5.9855 | 0.9682 | 6.4825 | 0.9605 | 6.4558 |
M6 | 4 | 20 | 0.9542 | 7.1701 | 0.9516 | 7.9945 | 0.9442 | 7.6705 |
M7 | 8 | 44 | 0.9824 | 4.4410 | 0.9860 | 4.2984 | 0.9810 | 4.4800 |
M8 | 8 | 44 | 0.9878 | 3.6949 | 0.9887 | 3.8674 | 0.9868 | 3.7291 |
M9 | 8 | 44 | 0.9838 | 4.2709 | 0.9824 | 4.8159 | 0.9798 | 4.6155 |
M10 | 8 | 44 | 0.9884 | 3.6029 | 0.9920 | 3.2604 | 0.9888 | 3.4450 |
M11 | 8 | 44 | 0.9836 | 4.2853 | 0.9824 | 4.8207 | 0.9798 | 4.6151 |
M12 | 8 | 44 | 0.9855 | 4.0379 | 0.9815 | 4.9393 | 0.9775 | 4.8711 |
M13 | 8 | 44 | 0.9823 | 4.4597 | 0.9787 | 5.3068 | 0.9714 | 5.4918 |
M14 | 8 | 44 | 0.9860 | 3.9614 | 0.9817 | 4.9285 | 0.9810 | 4.4883 |
M15 | 8 | 44 | 0.9825 | 4.4293 | 0.9785 | 5.3297 | 0.9716 | 5.4740 |
M16 | 8 | 44 | 0.9835 | 4.3069 | 0.9861 | 4.2828 | 0.9780 | 4.8210 |
M17 | 8 | 44 | 0.9876 | 3.7322 | 0.9889 | 3.8281 | 0.9843 | 4.0680 |
M18 | 8 | 44 | 0.9837 | 4.2730 | 0.9860 | 4.2935 | 0.9782 | 4.7954 |
M19 | 8 | 44 | 0.9825 | 4.4277 | 0.9816 | 4.9270 | 0.9776 | 4.8661 |
M20 | 8 | 44 | 0.9698 | 5.8235 | 0.9741 | 5.8510 | 0.9638 | 6.1810 |
M21 | 8 | 44 | 0.9828 | 4.4001 | 0.9822 | 4.8446 | 0.9776 | 4.8566 |
M22 | 16 | 96 | 0.9953 | 2.2945 | 0.9956 | 2.4106 | 0.9940 | 2.5171 |
M23 | 16 | 96 | 0.9953 | 2.2992 | 0.9957 | 2.3785 | 0.9939 | 2.5363 |
M24 | 16 | 96 | 0.9953 | 2.2883 | 0.9955 | 2.4467 | 0.9943 | 2.4593 |
M25 | 16 | 96 | 0.9953 | 2.2989 | 0.9957 | 2.3871 | 0.9939 | 2.5353 |
M26 | 16 | 96 | 0.9953 | 2.3026 | 0.9949 | 2.5971 | 0.9930 | 2.7272 |
M27 | 16 | 96 | 0.9953 | 2.2953 | 0.9953 | 2.4959 | 0.9940 | 2.5183 |
M28 | 16 | 96 | 0.9954 | 2.2810 | 0.9950 | 2.5813 | 0.9931 | 2.7065 |
M29 | 16 | 96 | 0.9954 | 2.2659 | 0.9954 | 2.4595 | 0.9939 | 2.5334 |
M30 | 16 | 96 | 0.9929 | 2.8208 | 0.9916 | 3.3308 | 0.9897 | 3.2941 |
M31 | 16 | 96 | 0.9954 | 2.2640 | 0.9953 | 2.4783 | 0.9939 | 2.5381 |
M32 | 16 | 96 | 0.9954 | 2.2640 | 0.9954 | 2.4783 | 0.9939 | 2.5381 |
M33 | 16 | 96 | 0.9950 | 2.3711 | 0.9956 | 2.4070 | 0.9939 | 2.5289 |
M34 | 16 | 96 | 0.9948 | 2.4067 | 0.9956 | 2.4083 | 0.9933 | 2.6596 |
M35 | 16 | 96 | 0.9946 | 2.4538 | 0.9949 | 2.5868 | 0.9933 | 2.6673 |
M36 | 16 | 96 | 0.9943 | 2.5198 | 0.9952 | 2.5154 | 0.9921 | 2.8802 |
M37 | 16 | 96 | 0.9948 | 2.4097 | 0.9955 | 2.4339 | 0.9936 | 2.6090 |
M38 | 16 | 96 | 0.9939 | 2.6197 | 0.9948 | 2.6299 | 0.9915 | 2.9984 |
M39 | 16 | 96 | 0.9934 | 2.7167 | 0.9937 | 2.8775 | 0.9884 | 3.4951 |
M40 | 16 | 96 | 0.9945 | 2.4938 | 0.9946 | 2.6683 | 0.9925 | 2.8217 |
M41 | 16 | 96 | 0.9941 | 2.5667 | 0.9937 | 2.8764 | 0.9904 | 3.1762 |
M42 | 32 | 212 | 0.9956 | 2.2328 | 0.9951 | 2.5358 | 0.9939 | 2.5465 |
M43 | 32 | 212 | 0.9958 | 2.1752 | 0.9954 | 2.4536 | 0.9939 | 2.5311 |
M44 | 32 | 212 | 0.9956 | 2.2249 | 0.9951 | 2.5518 | 0.9938 | 2.5509 |
M45 | 32 | 212 | 0.9956 | 2.2227 | 0.9954 | 2.4580 | 0.9940 | 2.5198 |
M46 | 32 | 212 | 0.9956 | 2.2317 | 0.9952 | 2.5211 | 0.9838 | 2.5484 |
M47 | 32 | 212 | 0.9956 | 2.2159 | 0.9953 | 2.4816 | 0.9939 | 2.5463 |
M48 | 32 | 212 | 0.9957 | 2.2003 | 0.9952 | 2.5153 | 0.9936 | 2.6085 |
M49 | 32 | 212 | 0.9956 | 2.2294 | 0.9953 | 2.4952 | 0.9930 | 2.7269 |
M50 | 32 | 212 | 0.9957 | 2.1974 | 0.9952 | 2.5273 | 0.9936 | 2.6060 |
M51 | 32 | 212 | 0.9957 | 2.2091 | 0.9953 | 2.4998 | 0.9933 | 2.6591 |
M52 | 32 | 212 | 0.9955 | 2.2479 | 0.9953 | 2.4950 | 0.9936 | 2.6081 |
M53 | 32 | 212 | 0.9952 | 2.3314 | 0.9954 | 2.4741 | 0.9934 | 2.6466 |
M54 | 32 | 212 | 0.9955 | 2.2395 | 0.9953 | 2.4876 | 0.9935 | 2.6228 |
M55 | 32 | 212 | 0.9953 | 2.3068 | 0.9953 | 2.5042 | 0.9930 | 2.7091 |
M56 | 32 | 212 | 0.9953 | 2.3026 | 0.9960 | 2.3061 | 0.9938 | 2.5606 |
M57* | 64 | 472 | – | – | – | – | – | – |
M58* | 64 | 472 | – | – | – | – | – | – |
M59* | 64 | 472 | – | – | – | – | – | – |
M60* | 64 | 472 | – | – | – | – | – | – |
M61* | 64 | 472 | – | – | – | – | – | – |
M62* | 64 | 472 | – | – | – | – | – | – |
M63* | 128 | 1,052 | – | – | – | – | – | – |
For the models marked with an asterisk (M*), the number of parameters was greater than the number of training patterns, and hence these models were not considered in this study.
See Table 3 for the model inputs.
Nr, number of fuzzy rules; Ntp, number of the total parameters; R2, coefficient of determination; RMSE, root-mean-squared error.
It can be seen from Table 4 that the ANFIS-GP models M1, M2,…,M56 offer good performance, with R2 values in the range of 0.9514–0.9960 and 0.9442–0.9943, and RMSE values in the range of 2.306–8.013 and 2.459–7.671, for the validation and testing phases, respectively. Models M57 to M63 were found to be inappropriate because models M57 to M62 and model M63 had a total of 472 and 1,052 parameters, respectively, which exceeded the size of the training subset (416 input–output data pairs). From the results in Table 4, the ANFIS-GP models M1, M2,…,M6, each of which uses only one of the input variables X1, X2,…,X6 in combination with the input variable X7, can efficiently estimate H2S emission. Among these models, model M1 is ranked as the best, giving R2 and RMSE values of 0.9714 and 6.141, respectively, for the validation phase, and 0.9715 and 5.481 for the testing phase. This suggests that the input variable X1 (temperature) is the most effective single variable for predicting H2S emission.
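The rule and parameter counts in Table 4 can be reproduced with a short sketch. It assumes that each input has two Gaussian membership functions (two premise parameters each) and that each first-order Takagi–Sugeno rule carries one linear coefficient per input plus a constant. The function name `anfis_gp_size` is hypothetical, and the formula is inferred from the tabulated values rather than quoted from the study.

```python
def anfis_gp_size(n_inputs, n_mfs=2, mf_params=2):
    """Return (rules, total parameters) for a grid-partitioned first-order Sugeno ANFIS.

    Assumption: every input has n_mfs membership functions with mf_params
    parameters each (2 for a Gaussian: centre and width).
    """
    n_rules = n_mfs ** n_inputs                 # one rule per combination of membership functions
    nonlinear = n_inputs * n_mfs * mf_params    # premise (membership function) parameters
    linear = n_rules * (n_inputs + 1)           # consequent coefficients of each rule
    return n_rules, nonlinear + linear


N_TRAIN = 416  # size of the training subset reported in the text

for n in range(2, 8):
    rules, params = anfis_gp_size(n)
    status = "usable" if params <= N_TRAIN else "excluded"
    print(f"{n} inputs: {rules} rules, {params} parameters -> {status}")
```

Under these assumptions, the rule count doubles with each added input, which is why the six- and seven-input models exceed the 416 available training patterns.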
Regarding the ANFIS-GP models that use more than two input variables (models M7, M8,…,M56), model M24 (shown in bold italic in Table 4) whose input is a set of X1, X2, X5, and X7 variables yields the smallest RMSE of 2.459 for the testing phase (R2 = 0.9943).
It can be concluded that the prediction performance of the ANFIS-GP model M1 whose input variables are X1 and X7 is enhanced by the addition of variables X2 and X5 to the input data.
A scatter diagram of the measured and the predicted values of H2S emission for testing subset using the ANFIS-GP model M24 is shown in Figure 2.
Figure 2. (a) Scatter plot and (b) comparison of the measured and predicted H2S emission from the gravity sewer using the ANFIS-GP model M24 for the testing subset.
It is evident from Figure 2(a) that the data points lie close to the 45° line (often called the 1:1 line or the 100% correlation line), with an R2 value of 0.994. This indicates that only 0.6% of the total variability in the response could not be explained by the model. In addition, the measured data and the model predictions are plotted against the pattern number in Figure 2(b). Figure 2(b) shows only a small discrepancy between the measured data and the predictions, which confirms the high predictive ability of the ANFIS-GP model M24.
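As a minimal illustration of the two indicators, the sketch below computes RMSE and R2 and converts R2 into the unexplained share of variability. It assumes R2 is the coefficient of determination, 1 − SS_res/SS_tot; the study may instead use the squared correlation coefficient. The data values are hypothetical and are not taken from the study.

```python
import math


def rmse(measured, predicted):
    """Root-mean-squared error between measured and predicted values."""
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(measured, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_m = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


# Hypothetical values for illustration only (not data from the study)
measured = [10.0, 20.0, 30.0, 40.0, 50.0]
predicted = [11.0, 19.0, 31.0, 39.0, 50.0]

r2 = r_squared(measured, predicted)
print(f"RMSE = {rmse(measured, predicted):.4f}")
print(f"R2 = {r2:.4f}, unexplained variability = {100 * (1 - r2):.1f}%")
```

With this definition, an R2 of 0.994 corresponds to 1 − 0.994 = 0.6% of unexplained variability, as stated above.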
ANFIS-SC model
A total of 63 ANFIS-SC models were developed to predict H2S emission. The performance of each model through training, validation, and testing phases in terms of R2 and RMSE is presented in Table 5. The number of fuzzy rules and the total number of parameters (linear and nonlinear) for each model are also provided.
Table 5. Performance of the ANFIS-SC models developed in this study
Model | Nr | Ntp | Training R2 | Training RMSE | Validation R2 | Validation RMSE | Testing R2 | Testing RMSE |
---|---|---|---|---|---|---|---|---|
M1 | 4 | 28 | 0.9711 | 5.7001 | 0.9749 | 5.7599 | 0.9748 | 5.1555 |
M2 | 7 | 49 | 0.9610 | 6.6173 | 0.9582 | 7.4337 | 0.9522 | 7.1025 |
M3 | 7 | 49 | 0.9794 | 4.8054 | 0.9792 | 5.2359 | 0.9737 | 5.2725 |
M4 | 2 | 14 | 0.9489 | 7.5731 | 0.9456 | 8.4789 | 0.9382 | 8.0740 |
M5 | 6 | 42 | 0.9771 | 5.0685 | 0.9697 | 6.3225 | 0.9685 | 5.7641 |
M6 | 2 | 14 | 0.9489 | 7.5706 | 0.9455 | 8.4804 | 0.9382 | 8.0732 |
M7 | 10 | 100 | 0.9841 | 4.2305 | 0.9874 | 4.0871 | 0.9822 | 4.3337 |
M8 | 10 | 100 | 0.9940 | 2.5938 | 0.9949 | 2.6027 | 0.9908 | 3.1224 |
M9 | 6 | 60 | 0.9720 | 5.6025 | 0.9683 | 6.4665 | 0.9802 | 4.5749 |
M10 | 9 | 90 | 0.9891 | 3.4987 | 0.9905 | 3.5509 | 0.9872 | 3.6789 |
M11 | 6 | 60 | 0.9711 | 5.6982 | 0.9666 | 6.6435 | 0.9781 | 4.8076 |
M12 | 14 | 140 | 0.9945 | 2.4838 | 0.9944 | 2.7249 | 0.9918 | 2.9414 |
M13 | 9 | 90 | 0.9647 | 6.2961 | 0.9594 | 7.3238 | 0.9626 | 6.2823 |
M14 | 12 | 120 | 0.9916 | 3.0655 | 0.9890 | 3.8067 | 0.9860 | 3.8426 |
M15 | 9 | 90 | 0.9648 | 6.2863 | 0.9596 | 7.3040 | 0.9621 | 6.3273 |
M16 | 8 | 80 | 0.9779 | 4.9797 | 0.9689 | 6.4049 | 0.9771 | 4.9141 |
M17 | 9 | 90 | 0.9910 | 3.1738 | 0.9896 | 3.7003 | 0.9875 | 3.6315 |
M18 | 8 | 80 | 0.9777 | 5.0027 | 0.9685 | 6.4472 | 0.9763 | 4.9964 |
M19 | 7 | 70 | 0.9737 | 5.4351 | 0.9604 | 7.2357 | 0.9691 | 5.7105 |
M20 | 2 | 20 | 0.9506 | 7.4443 | 0.9483 | 8.2636 | 0.9403 | 7.9356 |
M21 | 7 | 70 | 0.9758 | 5.2093 | 0.9633 | 6.9584 | 0.9741 | 5.2313 |
M22 | 24 | 312 | 0.9963 | 2.0429 | 0.9949 | 2.5829 | 0.9934 | 2.6326 |
M23 | 12 | 156 | 0.9876 | 3.7265 | 0.9805 | 5.0804 | 0.9922 | 2.8630 |
M24 | 17 | 221 | 0.9961 | 2.1057 | 0.9954 | 2.4745 | 0.9939 | 2.5304 |
M25 | 12 | 156 | 0.9851 | 4.0927 | 0.9780 | 5.3924 | 0.9781 | 4.8037 |
M26 | 11 | 143 | 0.9955 | 2.2369 | 0.9948 | 2.6148 | 0.9936 | 2.6006 |
M27 | 18 | 234 | 0.9964 | 2.0211 | 0.9953 | 2.4923 | 0.9927 | 2.7692 |
M28 | 15 | 195 | 0.9957 | 2.1946 | 0.9952 | 2.5257 | 0.9933 | 2.6522 |
M29 | 10 | 130 | 0.9877 | 3.7178 | 0.9815 | 4.9442 | 0.9927 | 2.7807 |
M30 | 6 | 78 | 0.9757 | 5.2227 | 0.9714 | 6.1470 | 0.9834 | 4.1897 |
M31 | 11 | 143 | 0.9874 | 3.7556 | 0.9811 | 4.9916 | 0.9926 | 2.7942 |
M32 | 15 | 195 | 0.9955 | 2.2508 | 0.9952 | 2.5263 | 0.9937 | 2.5777 |
M33 | 17 | 221 | 0.9958 | 2.1818 | 0.9944 | 2.7129 | 0.9936 | 2.6009 |
M34 | 16 | 208 | 0.9954 | 2.2802 | 0.9953 | 2.4796 | 0.9936 | 2.6005 |
M35 | 12 | 156 | 0.9873 | 3.7714 | 0.9806 | 5.0616 | 0.9917 | 2.9682 |
M36 | 9 | 117 | 0.9864 | 3.9120 | 0.9792 | 5.2411 | 0.9888 | 3.4428 |
M37 | 12 | 156 | 0.9872 | 3.7968 | 0.9795 | 5.2018 | 0.9901 | 3.2367 |
M38 | 11 | 143 | 0.9951 | 2.3478 | 0.9944 | 2.7154 | 0.9930 | 2.7272 |
M39 | 8 | 104 | 0.9848 | 4.1349 | 0.9780 | 5.3900 | 0.9889 | 3.4282 |
M40 | 11 | 143 | 0.9945 | 2.4850 | 0.9939 | 2.8298 | 0.9921 | 2.8812 |
M41 | 9 | 117 | 0.9850 | 4.1054 | 0.9784 | 5.3444 | 0.9849 | 3.9936 |
M42 | 26 | 416 | 0.9963 | 2.0509 | 0.9952 | 2.5301 | 0.9934 | 2.6396 |
M43* | 27 | 432 | – | – | – | – | – | – |
M44 | 26 | 416 | 0.9964 | 2.0040 | 0.9940 | 2.8086 | 0.9928 | 2.7604 |
M45 | 17 | 272 | 0.9961 | 2.1048 | 0.9954 | 2.4640 | 0.9930 | 2.7202 |
M46 | 14 | 224 | 0.9884 | 3.6116 | 0.9812 | 4.9854 | 0.9891 | 3.3883 |
M47 | 18 | 288 | 0.9962 | 2.055 | 0.9950 | 2.5649 | 0.9938 | 2.5487 |
M48 | 20 | 320 | 0.9961 | 2.0895 | 0.9956 | 2.4049 | 0.9940 | 2.5217 |
M49 | 15 | 240 | 0.9884 | 3.6021 | 0.9808 | 5.0392 | 0.9923 | 2.8493 |
M50 | 20 | 320 | 0.9961 | 2.0949 | 0.9955 | 2.4288 | 0.9940 | 2.5143 |
M51 | 12 | 192 | 0.9876 | 3.7328 | 0.9806 | 5.0594 | 0.9925 | 2.8186 |
M52 | 18 | 288 | 0.9956 | 2.2199 | 0.9943 | 2.7480 | 0.9936 | 2.5970 |
M53 | 16 | 256 | 0.9956 | 2.2251 | 0.9953 | 2.4842 | 0.9940 | 2.5219 |
M54 | 18 | 288 | 0.9949 | 2.3940 | 0.9948 | 2.6291 | 0.9923 | 2.8553 |
M55 | 13 | 208 | 0.9944 | 2.4961 | 0.9934 | 2.9428 | 0.9898 | 3.2747 |
M56 | 12 | 192 | 0.9881 | 3.6537 | 0.9808 | 5.0343 | 0.9920 | 2.8982 |
M57* | 29 | 551 | – | – | – | – | – | – |
M58* | 26 | 494 | – | – | – | – | – | – |
M59* | 29 | 551 | – | – | – | – | – | – |
M60 | 19 | 361 | 0.9963 | 2.0513 | 0.9950 | 2.5784 | 0.9931 | 2.6957 |
M61 | 19 | 361 | 0.9961 | 2.0971 | 0.9954 | 2.4667 | 0.9936 | 2.5952 |
M62 | 19 | 361 | 0.9951 | 2.3574 | 0.9947 | 2.6547 | 0.9931 | 2.6969 |
M63* | 29 | 638 | – | – | – | – | – | – |
For the models marked with an asterisk (M*), the number of parameters was greater than the number of training patterns, and hence these models were not considered in this study.
See Table 3 for the model inputs.
Nr, number of fuzzy rules; Ntp, number of the total parameters; R2, coefficient of determination; RMSE, root-mean-squared error.
As seen in Table 5, the ANFIS-SC models M1, M2,…,M6, each of which uses only one of the input variables X1, X2,…,X6 in combination with the input variable X7, display good prediction accuracy for estimating H2S emission.
Among these models, model M1, which uses {X1, X7} as its input set, was found to be the best, yielding R2 and RMSE values of 0.9749 and 5.760, respectively, for the validation phase, and 0.9748 and 5.156 for the testing phase. This finding agrees well with the result obtained using the ANFIS-GP model M1, suggesting that the input variable X1 (temperature) is the major variable for predicting H2S emission for the system under consideration.
Concerning the ANFIS-SC models that have more than two input variables (models M7, M8,…,M63), model M50 (shown in bold italic in Table 5), whose inputs are X1, X3, X5, X6, and X7, produces the smallest RMSE value of 2.514 for the testing phase (R2 = 0.9940). This indicates that {X1, X3, X5, X6, X7} is the optimum set of input variables for estimating H2S emission from the gravity-flow sewer pipe under the conditions considered in this study.
The prediction performance of the trained model (ANFIS-SC model M50) against the testing subset is visualized in Figure 3. This figure indicates an excellent agreement between the measured data and the model-predicted values with the R2 value of 0.9940.
Figure 3. (a) Scatter plot and (b) comparison of the measured and predicted H2S emission from the gravity sewer using the ANFIS-SC model M50 for the testing subset.
Multiple regression-based model
In the regression analysis, among 242 regression models examined for predicting H2S emission from the gravity sewer, an exponential model (called Model 1), whose mathematical definition is given in Table 6, offered the best performance. A summary of the regression analysis of Model 1, including the standard error of the estimate, residual sum of squares (RSS), and R2, is given in Table 6. Table 6 also presents a summary of the regression analysis of two first-order polynomial models (called Models 2 and 3). As seen in this table, Model 1 clearly outperformed Models 2 and 3, with R2 and RSS values of 0.966 and 1.235, respectively, against an R2 of 0.742 and an RSS of 9.458 for Model 2, and an R2 of 0.547 and an RSS of 16.601 for Model 3 (the smaller the RSS, the better the model performs, and vice versa for R2). The estimated regression coefficients, together with their standard errors, t-ratios, and p-values for the best-fit model (Model 1), are summarized in Table 7. The larger the absolute t-ratio, the more significant the parameter in the regression model. Likewise, the parameter with the smallest p-value is considered the most significant parameter affecting the model response. As seen in Table 7, parameters X1, X3, X4, X6, and X7 showed a significant influence (p < 0.05) on the model response, among which X1 (temperature) and X7 (time) were more important than the other parameters, whereas parameters X2 and X5 were found to be statistically non-significant (p > 0.05).
Table 6. Multiple regression results for the prediction of H2S emission from the gravity sewer under consideration in this study
Rank . | Model definition . | SEE . | RSS . | R2 . |
---|---|---|---|---|
1 | ![]() | 0.0550 | 1.2352 | 0.9663 |
2 | ![]() | 0.1523 | 9.4583 | 0.7420 |
3 | ![]() | 0.2015 | 16.601 | 0.5471 |
See Table 7 for the values of the coefficients a0–a7.
The row shown in bold represents the best-fit model.
The training subset was used to construct the regression models.
SEE, standard error of the estimate; RSS, residual sum of squares; R2, coefficient of multiple determination.
Y represents H2S emission from the sewer, and Xi (i= 1, 2, …, 7) are the values of the input variables (see Table 2 for the notations).
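The standard error of the estimate reported for Model 1 can be cross-checked against its RSS. The sketch below assumes the usual definition SEE = sqrt(RSS / (n − p)), with n = 416 training patterns and p = 8 fitted coefficients (a0–a7); these values of n and p are inferred from the text, not stated alongside the table.

```python
import math

n_train = 416    # training patterns reported in the text (assumed to be n)
n_coeffs = 8     # coefficients a0..a7 of Model 1 (assumed to be p)
rss = 1.2352     # residual sum of squares of Model 1 from the table

# Standard error of the estimate under the assumed definition
see = math.sqrt(rss / (n_train - n_coeffs))
print(f"SEE = {see:.4f}")
```

Under these assumptions the result rounds to the tabulated SEE of 0.0550 for Model 1.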
Table 7. Estimated regression coefficient values for the best-fit model
Model | Coefficient | SEE | t-ratio | p-value |
---|---|---|---|---|
1 | a0 = −0.0650 | 0.0449 | −1.4465 | 0.1488 |
1 | a1 = −0.2943 | 0.0337 | −8.7284 | 0.0000 |
1 | a2 = 0.1698 | 0.0870 | 1.9520 | 0.0516 |
1 | a3 = −0.4884 | 0.1645 | −2.9698 | 0.0032 |
1 | a4 = −4.1215 | 1.0987 | −3.7513 | 0.0002 |
1 | a5 = −0.0582 | 0.1613 | −0.3610 | 0.7183 |
1 | a6 = 4.7302 | 1.2447 | 3.8002 | 0.0002 |
1 | a7 = −7.9190 | 0.1315 | −60.243 | 0.0000 |
Values of p < 0.05 were considered statistically significant.
SEE, standard error of the estimate.
See Table 6 for the model equations.
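The t-ratios in Table 7 are each coefficient divided by its standard error, and the significance decisions can be approximated from them. The sketch below uses the estimates and standard errors from the table. The p-values are computed with a standard normal approximation, which is an assumption made for simplicity. The study presumably used a t-distribution, so the approximate p-values differ slightly from the tabulated ones.

```python
import math

# (estimate, standard error) pairs copied from Table 7
COEFFICIENTS = {
    "a0": (-0.0650, 0.0449),
    "a1": (-0.2943, 0.0337),
    "a2": (0.1698, 0.0870),
    "a3": (-0.4884, 0.1645),
    "a4": (-4.1215, 1.0987),
    "a5": (-0.0582, 0.1613),
    "a6": (4.7302, 1.2447),
    "a7": (-7.9190, 0.1315),
}


def t_ratio(estimate, std_error):
    """t-ratio of a regression coefficient."""
    return estimate / std_error


def two_sided_p_normal(t):
    """Two-sided p-value under a standard normal approximation (assumption)."""
    return math.erfc(abs(t) / math.sqrt(2.0))


significant = []
for name, (estimate, std_error) in COEFFICIENTS.items():
    t = t_ratio(estimate, std_error)
    p = two_sided_p_normal(t)
    if p < 0.05:
        significant.append(name)
    print(f"{name}: t = {t:8.3f}, approx. p = {p:.4f}")

print("significant at p < 0.05:", significant)
```

Small differences from the tabulated t-ratios are expected because the published estimates and standard errors are rounded.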
The scatter diagram of the measured data and the values predicted by Model 1 (see Tables 6 and 7 for the model equation and the estimated coefficient values) for the testing subset is depicted in Figure 4. As seen in Figure 4(a), a good linear correlation (R2 = 0.964 and RMSE = 6.14) was obtained between the measured data and the model-predicted values. This indicates that only about 3.6% of the variability in the response could not be explained by the model.
Figure 4. (a) Scatter plot and (b) comparison of the measured and predicted H2S emission from the gravity sewer using Model 1 for the testing subset.
In addition, a graphical comparison of the measured data and the model-predicted values is displayed in Figure 4(b). Figure 4(b) shows only a small difference between the measured and predicted values. It can be deduced that Model 1 is reasonably accurate in predicting H2S emission from the gravity sewer under consideration in this study.
Comparison of the models
It can be seen from Tables 4 and 5 that both the best ANFIS-GP model (M24) and the best ANFIS-SC model (M50), evaluated on the testing subset, produced R2 and RMSE values of 0.99 and approximately 2.5, respectively, whereas the best nonlinear regression model (Model 1), tested against the same dataset, provided an R2 of 0.96 and an RMSE of 6.14; the smaller the RMSE, the better the model performs, and vice versa for R2. Therefore, the proposed ANFIS-GP model M24 and ANFIS-SC model M50 performed better than Model 1 in predicting H2S emission from the sewer. When comparing the results obtained from the ANFIS-GP model M24 and the ANFIS-SC model M50 (cf. Tables 4 and 5), it is clear that the R2 and RMSE values of the two models are similar. However, the ANFIS-GP model M24 possessed 16 fuzzy rules and 96 parameters, which were fewer than those of the ANFIS-SC model M50 (number of fuzzy rules = 20; number of parameters = 320). This implies that the structure of the ANFIS-GP model M24 was simpler than that of the ANFIS-SC model M50. Therefore, between these two models, the ANFIS-GP model M24 was the better choice for the prediction of H2S emission.
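The parameter count of the ANFIS-SC model can be reproduced from its rule count with a short sketch. It assumes that each rule (cluster) holds one Gaussian membership function per input (two premise parameters each) plus a first-order consequent with one coefficient per input and a constant. The function name `anfis_sc_params` is hypothetical, and the formula is inferred from the values in Table 5 rather than quoted from the study.

```python
def anfis_sc_params(n_inputs, n_rules, mf_params=2):
    """Total parameters of a first-order Sugeno ANFIS built by subtractive clustering.

    Assumption: each rule owns one membership function per input, so the
    premise parameters scale with the number of rules.
    """
    premise = n_rules * n_inputs * mf_params   # Gaussian centre and width per input and rule
    consequent = n_rules * (n_inputs + 1)      # linear coefficients plus constant per rule
    return premise + consequent


# ANFIS-SC model M50: five inputs, 20 rules (Table 5)
sc_m50 = anfis_sc_params(n_inputs=5, n_rules=20)

# ANFIS-GP model M24: 16 rules and 96 parameters, as reported in Table 4
gp_m24_params = 96

print(f"ANFIS-SC M50 parameters: {sc_m50}")
print(f"Ratio SC/GP: {sc_m50 / gp_m24_params:.2f}")
```

Under these assumptions, the ANFIS-SC model M50 has more than three times as many parameters as the ANFIS-GP model M24, although it has only four more rules, because every SC rule carries its own membership functions.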
Implication of the models
For a gravity-flow sewer system such as the one considered in this study, the relationship between the input variables and the output (H2S emission) is described by complex nonlinear mathematical formulas, which are often expensive to solve. In addition to this complexity, the overall mass transfer coefficient of H2S is difficult to determine accurately and usually requires laboratory and pilot trials. Hence, some simplifying assumptions are incorporated, which may result in an underestimation of the H2S emission rate. The ANFIS models proposed here, which showed strong generalization and prediction ability, were established using only a measured set of input variables and the corresponding output, without any information on the relationship between the input variables and the H2S emission rate. This implies that these easy-to-use black-box models could be an attractive and useful tool for predicting H2S emission from gravity sewers. These models could offer great benefits because engineers and asset managers can use them to evaluate possible odor and corrosion problems during the design and operation of sewers. In addition, the models enable them to formulate appropriate strategies to control and mitigate H2S emission to the air, or its buildup in the sewer atmosphere, in order to reduce health risks and minimize sewer corrosion.
CONCLUSIONS
This study demonstrated the construction of competing models to predict H2S emission from a gravity sewer without the need for in-depth knowledge of the H2S emission mechanism. For the first time, the ability of the ANFIS-GP and ANFIS-SC approaches was shown for this application. The ANFIS-GP and ANFIS-SC models developed were compared with linear and nonlinear regression models. The ANFIS-GP and ANFIS-SC models performed better than the regression models, with a prediction accuracy of >99%. However, the ANFIS-GP model was found to be much simpler because it creates fewer fuzzy rules. This validates the ANFIS-GP model as a valuable computational tool for predicting H2S emission from gravity sewers.
ACKNOWLEDGEMENTS
This research was funded by Prince of Songkla University and the Ministry of Higher Education, Science, Research and Innovation, Thailand, under the Reinventing University Project (Grant Number REV64061). The authors acknowledge the support of the Department of Civil and Environmental Engineering and the Research and Development Office, Prince of Songkla University, Thailand. We also acknowledge the support of the Biogas and Biorefinery Laboratory at the Faculty of Engineering and the PSU Energy Systems Research Institute, Prince of Songkla University, Thailand.
AUTHOR CONTRIBUTIONS
R.S. (Ph.D., a postdoctoral fellow) developed the models, analyzed and interpreted the results, and wrote the manuscript. S.C. (Professor) reviewed and edited the manuscript.
CONFLICTS OF INTEREST
The authors declare no conflicts of interest.
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.