## Abstract

A predictive model to estimate hydrogen sulfide (H_{2}S) emission from sewers would offer engineers and asset managers the ability to evaluate the possible odor/corrosion problems during the design and operation of sewers to avoid in-sewer complications. This study aimed to model and forecast H_{2}S emission from a gravity sewer, as a function of temperature and hydraulic conditions, without requiring prior knowledge of H_{2}S emission mechanism. Two different adaptive neuro-fuzzy inference system (ANFIS) models using grid partitioning (GP) and subtractive clustering (SC) approaches were developed, validated, and tested. The ANFIS-GP model was constructed with two Gaussian membership functions for each input. For the development of the ANFIS-SC model, the MATLAB default values for clustering parameters were selected. Results clearly indicated that both the best ANFIS-GP and ANFIS-SC models produced smaller error compared with the multiple regression models and demonstrated a superior predictive performance on forecasting H_{2}S emission with an excellent *R*^{2} value of >0.99. However, the ANFIS-GP model possessed fewer rules and parameters than the ANFIS-SC model. These findings validate the ANFIS-GP model as a potent tool for predicting H_{2}S emission from gravity sewers.

## HIGHLIGHTS

- An adaptive neuro-fuzzy inference system (ANFIS) model was proposed to predict H_{2}S emission from gravity sewers.
- The ANFIS model was constructed based on grid partitioning (GP) and subtractive clustering (SC).
- The ANFIS-GP/ANFIS-SC model indicated an excellent prediction accuracy (>99%) for H_{2}S emission.
- The ANFIS-GP model was found to be computationally simpler than the ANFIS-SC model due to the creation of fewer fuzzy rules.

## INTRODUCTION

Domestic wastewater is transported to water resource recovery facilities via underground pipelines (or tunnels) called sanitary sewers. During transportation, microorganisms attach to and subsequently grow on the submerged surfaces of the sewer, forming a biological slime layer, more often called biofilm. The biofilm's deeper regions (closer to the sewer walls) are mostly inhabited by anaerobic microorganisms because oxygen penetration is limited by the outer region of the biofilm. Microbial activity of sulfate-reducing organisms (obligate anaerobes) within the biofilm's deeper regions reduces sulfate (SO_{4}^{2−}) to hydrogen sulfide (H_{2}S) (Gutierrez *et al.* 2016; Li *et al.* 2019). The formation of H_{2}S depends on the presence of readily biodegradable chemical oxygen demand and sulfur compounds in the sewers (Park *et al.* 2014). According to Carrera *et al.* (2016), domestic wastewater usually contains SO_{4}^{2−} in a concentration range of 40–200 mg SO_{4}^{2−}/L (equivalent to 13.3–66.7 mg SO_{4}^{2−}-S/L).

The H_{2}S produced in the biofilm can diffuse into the bulk liquid phase and, when the sewer runs partially full, can then be released to the overlying gas phase if supporting factors such as highly turbulent flow, low pH, and high temperature of the liquid phase prevail (Yongsiri *et al.* 2004a; Jiang *et al.* 2017; Wu *et al.* 2018; Zuo *et al.* 2019). When the sewer is sufficiently ventilated, H_{2}S can be released to the air; this gas has an extremely unpleasant smell (rotten egg odor) with a very low odor detection threshold (0.0001–0.03 ppm) and has a detrimental impact on human health, ranging from eye irritation and headaches at lower levels (1–10 ppm) to sudden death at levels >1,000 ppm (Salehi & Chaiprapat 2019). When the sewer is insufficiently ventilated, H_{2}S accumulates in the sewer atmosphere (the confined air space above the flow surface) and is subsequently adsorbed on the moist (unsubmerged) sewer walls, where it can be oxidized by the action of *Thiobacilli*, generating sulfuric acid (H_{2}SO_{4}) (Huber *et al.* 2016; Jiang *et al.* 2017). The H_{2}SO_{4} produced is the major cause of the corrosion of cement and metal materials in sewers (Jiang *et al.* 2015, 2016; Wu *et al.* 2018; Fytianos *et al.* 2020). The corrosion of sewer networks is a growing worldwide concern because it has the potential to cost the wastewater industry billions of dollars to maintain and rehabilitate the damaged sewers. As an example, the Sanitation Districts of Los Angeles County estimated an annual total cost of US$13.75 billion for the rehabilitation and maintenance of corroded sanitary sewers in the United States (García *et al.* 2020).

Several predictive models to estimate H_{2}S emission from gravity sewers have been developed since the early 1970s. They can be categorized into the following two main groups: (1) modified sewer re-aeration models in which the overall mass transfer coefficient of H_{2}S replaces the overall mass transfer coefficient of oxygen and (2) models estimating H_{2}S emission rate as a function of hydraulic conditions (Carrera *et al.* 2016). Some examples of the first group are those developed by Parkhurst & Pomeroy (1972), United States Environmental Protection Agency (USEPA) (1974), Taghizadeh-Nasser (1986), Jensen (1995), and Yongsiri *et al.* (2003, 2004a, 2004b, 2005). Examples of the second group of the models are the ones proposed by Lahav *et al.* (2004, 2006). The critical parameter in these second group models is velocity gradient – adapted from the mixing theory – that is linked to the head loss in the sewer.

Yet, despite all these developments and efforts to model H_{2}S emission from gravity sewers, to the best of the authors' knowledge, an adaptive neuro-fuzzy inference system (ANFIS) has never been applied to this problem. ANFIS models have advantages over theoretical/empirical models for complex nonlinear systems because they are constructed based only on a dataset of the input and output variables of the system under consideration, without requiring prior knowledge of the interrelationship between the variables. In addition, they have proven to be a robust modeling tool for nonlinear complex systems with high generalization power. Therefore, the present study aimed, for the first time, to fill this research gap. ANFIS models based on grid partitioning (GP) and subtractive clustering (SC) of the input domain (hereafter referred to as ANFIS-GP and ANFIS-SC, respectively) were developed to predict H_{2}S emission from a gravity sewer using the experimental data reported in Lahav *et al.* (2006). The predictive abilities of the ANFIS-GP and ANFIS-SC models were compared with each other and with that of conventional multiple regression by means of two descriptive statistics, namely the coefficient of determination (*R*^{2}) and the root-mean-squared error (*RMSE*). In addition, the two ANFIS models were compared with each other in terms of complexity (i.e., number of rules and parameters).

This paper will present the ANFIS model structure followed by the methodology describing the data processing and modeling approach for the ANFIS-GP, ANFIS-SC, and multiple regression models. Then, results of the three models are compared and discussed, and finally, the paper ends with the conclusions.

## ANFIS MODELING STRUCTURE

ANFIS, originally introduced by Jang (1993), is a knowledge-based system that deals with a number of conditional statements (commonly known as IF-THEN rules) and a set of input–output data of the system under consideration. It is used to describe the input–output behavior of complex nonlinear and uncertain problems that are difficult to model using conventional mathematical approaches. The ANFIS model combines two soft computing techniques, artificial neural network (ANN) and fuzzy inference system (FIS), in a single framework. The ANFIS model can thus capture the strengths of these two techniques: the adaptive and learning ability of ANN and the ability of FIS in knowledge representation and interpretation with the help of fuzzy IF-THEN rules (Jang 1993; Jang *et al.* 1997). The mathematical background of the ANFIS model is described in detail below.

For simplicity, let us suppose that the FIS under consideration has two inputs (each with two fuzzy sets) and one output. For the first-order Takagi–Sugeno FIS (Takagi & Sugeno 1985), a typical rule set with two fuzzy IF-THEN rules can be expressed in the following form:

$$\text{Rule 1: IF } x_1 \text{ is } A_1 \text{ AND } x_2 \text{ is } B_1, \text{ THEN } y_1 = \alpha_1 x_1 + \beta_1 x_2 + \gamma_1 \tag{1}$$

$$\text{Rule 2: IF } x_1 \text{ is } A_2 \text{ AND } x_2 \text{ is } B_2, \text{ THEN } y_2 = \alpha_2 x_1 + \beta_2 x_2 + \gamma_2 \tag{2}$$

where *x*_{1} and *x*_{2} are the inputs; *A*_{i} and *B*_{i} (*i* = 1 and 2) represent the fuzzy sets of the inputs; the fuzzy sets are characterized by appropriate membership functions (MFs), noting that an MF is a curve representing how each data point within the input space is mapped to a value between 0 and 1; *y*_{i} (*i* = 1 and 2) stands for the output function of the *i*th rule; and *α*_{i}, *β*_{i}, and *γ*_{i} are the parameters of the output function *y*_{i}.

As seen from expressions (1) and (2), each fuzzy IF-THEN rule is composed of two parts, namely the IF part, which is known as the antecedent (or premise) part, and the THEN part, which is called the consequent (or conclusion) part. Hence, the parameters pertaining to the fuzzy sets *A*_{i} and *B*_{i} are referred to as antecedent parameters, and the set {*α*_{i}, *β*_{i}, *γ*_{i}} is referred to as the consequent parameters.

Figure 1 illustrates the architecture of a typical ANFIS model, which is functionally equivalent to a two-input, single-output first-order Takagi–Sugeno FIS with two fuzzy rules.

As seen in Figure 1, the ANFIS model is a feed-forward neural network consisting of six distinct functional blocks (more often called layers), in which each layer is composed of a set of processing elements named nodes (neurons or elements). The nodes are connected by links that indicate the direction of signal flow from one node to another. Each node applies a specific function to its incoming signals and sends out the result, which is an input for the next layer. Owing to space limitations, the detailed description of each layer is provided in Supplementary Material.
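As a rough illustration of how a two-rule, first-order Takagi–Sugeno FIS maps inputs to an output, the following Python sketch may help. It is not the authors' MATLAB implementation: it assumes Gaussian MFs, the product operator for the AND connective, and a firing-strength-weighted average of the rule outputs, as in the standard formulation of Jang (1993). The function names and all parameter values are hypothetical.

```python
import numpy as np

def gaussmf(x, sigma, c):
    """Gaussian membership function with width sigma and center c."""
    return np.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))

def sugeno_two_rules(x1, x2, mf_a, mf_b, consequents):
    """Evaluate a two-rule, first-order Takagi-Sugeno FIS (illustrative only).

    mf_a, mf_b  : lists of (sigma, c) pairs for the fuzzy sets A_i and B_i.
    consequents : list of (alpha_i, beta_i, gamma_i) triples.
    """
    # Firing strength of rule i: product of the two membership grades.
    w = np.array([gaussmf(x1, *mf_a[i]) * gaussmf(x2, *mf_b[i]) for i in range(2)])
    # Normalized firing strengths sum to one.
    w_bar = w / w.sum()
    # Rule outputs y_i = alpha_i * x1 + beta_i * x2 + gamma_i.
    y = np.array([a * x1 + b * x2 + g for a, b, g in consequents])
    # Overall output: weighted average of the rule outputs.
    return float(np.dot(w_bar, y))

# Hypothetical parameters chosen only for illustration.
mf_a = [(0.3, 0.0), (0.3, 1.0)]
mf_b = [(0.3, 0.0), (0.3, 1.0)]
consequents = [(1.0, 1.0, 0.0), (2.0, -1.0, 0.5)]
y_hat = sugeno_two_rules(0.5, 0.5, mf_a, mf_b, consequents)
```

At the midpoint (0.5, 0.5) both rules fire equally in this toy setup, so the output is simply the mean of the two rule outputs.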

### ANFIS structure identification

One of the most crucial steps in the establishment of an ANFIS model is the structure identification, which involves the selection of an appropriate MF, the assignment of an optimal number of MFs to each input variable, and the determination of the number of fuzzy rules. To achieve this aim, two different techniques have been developed with focus on partitioning of the input space: (i) GP and (ii) SC. A brief description of the theoretical background of these two techniques is presented in the following subsections.

#### Grid partitioning

The GP technique divides a multi-dimensional input space into rectangular subspaces using an axis-parallel partition according to the number and type of MFs in each dimension, which are set by the user. Each rectangular subspace, called a fuzzy region, is specified by a fuzzy rule. The total number of fuzzy rules is equal to the number of all possible combinations of the MFs of all inputs (Benmouiza & Cheknane 2019; Babanezhad *et al.* 2020). For instance, applying GP to a data space comprising *N* inputs (*x*_{1}, *x*_{2}, …, *x*_{N}) generates *M*_{1} × *M*_{2} × … × *M*_{N} fuzzy rules, where *M*_{i} denotes the number of MFs for the *i*th input *x*_{i} for *i* = 1 to *N*. Therefore, for a given input space, an increase in the number of MFs per input leads to a sharp increase in the number of fuzzy rules and, consequently, in the number of trainable parameters, so that the computational load of the model rises.
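The growth in rules and parameters under GP can be made concrete with a short Python sketch. The parameter count below rests on two assumptions that are not taken from this section: each Gaussian MF carries two parameters, and each first-order Takagi–Sugeno rule has one coefficient per input plus a constant. The function names are hypothetical.

```python
from itertools import product
from math import prod

def gp_rule_count(mfs_per_input):
    """Number of GP rules: M_1 x M_2 x ... x M_N."""
    return prod(mfs_per_input)

def gp_parameter_count(mfs_per_input, params_per_mf=2):
    """Return (antecedent, consequent) parameter counts.

    Assumes params_per_mf parameters per MF (2 for a Gaussian MF) and
    N + 1 consequent parameters per first-order rule.
    """
    n_inputs = len(mfs_per_input)
    antecedent = params_per_mf * sum(mfs_per_input)
    consequent = gp_rule_count(mfs_per_input) * (n_inputs + 1)
    return antecedent, consequent

rules_two_mfs = gp_rule_count([2] * 7)    # 2**7 = 128 rules for seven inputs
rules_three_mfs = gp_rule_count([3] * 7)  # 3**7 = 2187 rules for seven inputs

# Each rule corresponds to one combination of MF indices, one per input.
rule_grid = list(product(range(2), repeat=3))  # 8 rules for three inputs
```

Going from two to three MFs per input multiplies the rule base of a seven-input model by roughly 17, which is why the models in this kind of study are usually kept to two MFs per input.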

#### Subtractive clustering

SC, originally proposed by Chiu (1994), is an effective and rapid one-pass technique by which a large dataset is organized and categorized. In other words, the dataset is split into a number of homogeneous groups, called clusters, such that similar data points belong to the same cluster. The number of generated clusters is the same as the number of fuzzy rules (Rahnema *et al.* 2019). Applying the SC technique to a data space creates a number of clusters whose centers are used to identify the MFs, in the form of ‘gaussmf’, for the inputs (Asadi *et al.* 2020). This technique has gained remarkable attention because the fuzzy rules are created automatically, whereas the GP technique requires the number of input MFs to be selected manually in advance in order to generate the fuzzy rules. In addition, the SC technique prevents the combinatorial explosion of the fuzzy rules in the case of a high-dimensional dataset. In other words, it reduces the rule-base complexity of the ANFIS models (Kumar *et al.* 2017). A brief description of the SC technique is given below (Heddam *et al.* 2019).

Consider a collection of *n* data points {*x*_{1}, *x*_{2}, …, *x*_{n}} in a multi-dimensional space. Each data point is a candidate cluster center, and a data point with more neighboring data points has a greater chance of becoming a cluster center than a data point with fewer neighboring data points. To define the first cluster center, the density measure of each data point is computed according to the following equation:

$$D_i = \sum_{j=1}^{n} \exp\left(-\frac{\lVert x_i - x_j \rVert^{2}}{(r/2)^{2}}\right) \tag{3}$$

where *D*_{i} is the density measure of the *i*th data point (*x*_{i}), *n* is the total number of data points, ‖·‖ is the Euclidean distance between *x*_{i} (the candidate for the first cluster center) and the *j*th data point (*x*_{j}), and *r* is a positive constant representing a neighborhood for each cluster, the so-called range of influence of a cluster center or cluster radius. ‘*r*’ is set to a value between 0 and 1 by the user, considering that a large value of ‘*r*’ generates fewer clusters in the data space, resulting in fewer rules, and vice versa (Banda *et al.* 2018; Heddam *et al.* 2019).

After the density measure of each data point is determined, the data point with the highest density measure is selected to be the first cluster center. Suppose that *x*_{1}^{*} has been identified as the first cluster center and *D*_{1}^{*} as its density measure. The density measure of each data point *x*_{i} is then revised in accordance with the following equation:

$$D_i \leftarrow D_i - D_1^{*} \exp\left(-\frac{\lVert x_i - x_1^{*} \rVert^{2}}{(r'/2)^{2}}\right) \tag{4}$$

where *r*′ is a constant, greater than ‘*r*’, which is used to prevent obtaining closely spaced cluster centers. In other words, it reduces the density measure of data points near the cluster center *x*_{1}^{*}. It is obtained as *r*′ = *λr*, where *λ* is the squash factor that multiplies the radius value. A smaller *λ* usually yields more (and smaller) clusters because the potential for outlying points to be considered part of a cluster is decreased. According to the literature, a recommended value for *λ* is in the range of 1.25–1.50 (Chiu 1994; Astari 2018).

This process of searching for new cluster centers is repeated with respect to the three criteria given in Table 1. Once the clustering process is complete, a fuzzy IF-THEN rule is assigned to each cluster; the antecedents of the fuzzy rules are created by projecting the clusters onto each dimension of the input space (Chiu 1994, 1997).

| Criterion | Description | Remarks |
|---|---|---|
| I | *D*_{k}^{*} > *AR* × *D*_{1}^{*} | *x*_{k}^{*} is accepted as a cluster center, and the process is continued to search new cluster centers |
| II | *D*_{k}^{*} < *RR* × *D*_{1}^{*} | *x*_{k}^{*} is rejected as a cluster center, and the process is terminated |
| III | *RR* × *D*_{1}^{*} ≤ *D*_{k}^{*} ≤ *AR* × *D*_{1}^{*} and *L*/*r* + *D*_{k}^{*}/*D*_{1}^{*} ≥ 1 | *x*_{k}^{*} is accepted as a cluster center, and the process is continued |
| | *RR* × *D*_{1}^{*} ≤ *D*_{k}^{*} ≤ *AR* × *D*_{1}^{*} and *L*/*r* + *D*_{k}^{*}/*D*_{1}^{*} < 1 | *x*_{k}^{*} is rejected as a cluster center, its density measure is set to 0, and the process is continued |

SC, subtractive clustering.

*D*_{1}^{*} and *D*_{k}^{*} are the density measures corresponding to *x*_{1}^{*} and *x*_{k}^{*}, respectively, where *x*_{1}^{*} is the first cluster center and *x*_{k}^{*} is the cluster center candidate at step *k*.

*AR*, a threshold for the density measure above which the data point is accepted as a cluster center. This parameter ranges between 0 and 1 with an optimal value of 0.5 (note: the larger *AR*, the fewer cluster centers (Chiu 1994)).

*RR*, a threshold for the density measure below which the data point is rejected as a cluster center. This parameter ranges between 0 and 1 with an optimal value of 0.15 (note: the smaller *RR*, the more cluster centers (Chiu 1994)).

‘*L*’, minimal distance between *x*_{k}^{*} and all previously defined cluster centers (*x*_{1}^{*}, *x*_{2}^{*}, …, *x*_{k−1}^{*}).

‘*r*’, cluster radius.
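The SC procedure described above can be sketched in Python as follows. This is a minimal illustration in the spirit of Chiu (1994), not MATLAB's implementation: it assumes the data are already scaled to the unit hypercube, uses a Gaussian density kernel with radius *r*, takes *r*′ = *λr*, and applies the accept/reject thresholds in the usual three-way test. The function name, argument names, and toy data are hypothetical.

```python
import numpy as np

def subtractive_clustering(X, r=0.5, squash=1.25, accept=0.5, reject=0.15):
    """Return cluster centers found by a basic subtractive clustering pass.

    X is an (n, d) array assumed to be scaled to [0, 1] in every dimension.
    """
    X = np.asarray(X, dtype=float)
    # Squared Euclidean distances between all pairs of points.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    density = np.exp(-d2 / (r / 2.0) ** 2).sum(axis=1)
    r_prime = squash * r

    # First center: the point with the highest density measure.
    first = int(np.argmax(density))
    d1_star = density[first]
    centers = [first]
    density = density - d1_star * np.exp(-d2[first] / (r_prime / 2.0) ** 2)

    while True:
        k = int(np.argmax(density))
        dk_star = density[k]
        if dk_star > accept * d1_star:        # criterion I: accept
            accepted = True
        elif dk_star < reject * d1_star:      # criterion II: reject and stop
            break
        else:                                 # criterion III: gray zone
            L = np.sqrt(d2[k, centers].min()) # distance to nearest center
            accepted = (L / r + dk_star / d1_star) >= 1.0
        if accepted:
            centers.append(k)
            density = density - dk_star * np.exp(-d2[k] / (r_prime / 2.0) ** 2)
        else:
            density[k] = 0.0                  # discard candidate, keep searching
    return X[centers]

# Toy data: two tight 4 x 4 grids of points near (0.1, 0.1) and (0.9, 0.9).
offsets = np.array([[0.01 * i, 0.01 * j] for i in range(4) for j in range(4)])
X = np.vstack([offsets + 0.10, offsets + 0.90])
centers = subtractive_clustering(X)
```

Each returned center would seed one fuzzy rule, so the two well-separated groups in the toy data lead to a two-rule model.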

### ANFIS parameter identification

Once the structure of the ANFIS model is identified, its antecedent and consequent parameters must be determined. The numbers of these parameters are given by the following equations:

$$N_a = N_{p,MF} \times N_{input} \times N_{MF,input} \tag{5}$$

$$N_c = N_r \times (N_{input} + 1) \tag{6}$$

where *N*_{a} is the number of antecedent parameters; *N*_{p,MF} is the number of parameters pertaining to the MF, which depends on the type of MF (for instance, in the case of ‘gaussmf’ and ‘gbellmf’, *N*_{p,MF} is equal to 2 and 3, respectively); *N*_{input} is the number of inputs to the ANFIS model; *N*_{MF,input} is the number of MFs assigned to each input; *N*_{c} is the number of consequent parameters; and *N*_{r} is the number of rules.

Numerous studies have shown that the hybrid algorithm, which integrates the error backpropagation and the least squares estimation (LSE) methods, is highly efficient in optimizing ANFIS parameters (Chen *et al.* 2018; Kashyap *et al.* 2019). Each iteration (epoch) of the hybrid algorithm is composed of two passes: a forward pass and a backward pass. In the forward pass, the functional signals (the node outputs) go forward until the defuzzification layer, wherein the LSE method is used to determine the consequent parameters while the antecedent parameters are held fixed. Then, the error between the target and ANFIS-predicted values is computed. If this error is greater than the pre-specified threshold, the backward pass starts. In the backward pass, the error signals are propagated from the output layer backward to the input layer, and gradient descent is used to tune the antecedent parameters while the consequent parameters remain fixed. The output of the ANFIS is computed by employing the consequent parameters determined in the forward pass. The details and mathematical background of this algorithm can be found in Jang (1993) and Jang *et al.* (1997).
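The forward pass of the hybrid algorithm can be illustrated with a small Python sketch. With the antecedent parameters held fixed, the model output is linear in the consequent parameters, so they can be estimated in one least-squares solve. The sketch assumes first-order Takagi–Sugeno rules and that the normalized firing strengths have already been computed; the function name and the synthetic data are hypothetical, and the backward (gradient-descent) pass is not shown.

```python
import numpy as np

def fit_consequents_lse(X, y, normalized_strengths):
    """Least-squares estimate of first-order Sugeno consequent parameters.

    X                    : (n, d) input patterns.
    y                    : (n,) target outputs.
    normalized_strengths : (n, R) normalized firing strengths (antecedents fixed).
    Returns an (R, d + 1) array; row i holds rule i's coefficients and constant.
    """
    n, d = X.shape
    n_rules = normalized_strengths.shape[1]
    X1 = np.hstack([X, np.ones((n, 1))])  # append the constant term
    # Design matrix: column block i holds w_bar_i * [x_1, ..., x_d, 1].
    A = (normalized_strengths[:, :, None] * X1[:, None, :]).reshape(n, n_rules * (d + 1))
    theta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return theta.reshape(n_rules, d + 1)

# Synthetic check: generate targets from known consequents and recover them.
rng = np.random.default_rng(0)
X = rng.random((50, 2))
w = rng.random((50, 2))
w_bar = w / w.sum(axis=1, keepdims=True)
true_theta = np.array([[1.0, 1.0, 0.0], [2.0, -1.0, 0.5]])
X1 = np.hstack([X, np.ones((50, 1))])
y = (w_bar * (X1 @ true_theta.T)).sum(axis=1)
estimated_theta = fit_consequents_lse(X, y, w_bar)
```

Because this step has a closed-form solution, only the antecedent parameters need iterative tuning, which is one reason the hybrid scheme tends to converge in relatively few epochs.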

## METHODOLOGY

### Dataset

The data used in this study were obtained from the study of Lahav *et al.* (2006). Briefly, Lahav *et al.* (2006) operated an artificial gravity-flow sewer (a 27-m-long PVC pipe with an internal diameter of 0.16 m) using sulfide-containing water with an initial sulfide concentration of 20.4–27.2 mg S/L. The authors established an empirical equation describing dissolved sulfide in the aqueous phase of the sewer pipe as a function of temperature and hydraulic conditions (see Table 2 for the ranges of temperature and hydraulic conditions used in the experimental runs).

| Parameter | Symbol | Operating range | Units |
|---|---|---|---|
| Temperature | *x*_{1} | 16.8–31.3 | °C |
| Pipe's slope | *x*_{2} | 1–3 | % m/m |
| Flow rate | *x*_{3} | 0.0014–0.0078 | m^{3}/s |
| Hydraulic depth^{a} | *x*_{4} | 0.014–0.089 | m^{2}/m |
| Mean flow velocity | *x*_{5} | 0.65–1.55 | m/s |
| Liquid volume fraction in the pipe^{b} | *x*_{6} | 0.024–0.183 | m^{3}/m^{3} |
| Time | *x*_{7} | 0–410 | min |

^{a}Hydraulic depth represents the cross-sectional area of the flow in the pipe divided by the flow surface width.

^{b}Volume of liquid in the pipe divided by the total volume of liquid in the system; the system includes the pipe, an upstream container, and a downstream container (for more information, see section 2 (Materials and Methods) in Lahav *et al.* 2006).

To construct an ANFIS or multiple regression model in this study, *x*_{1}–*x*_{7} were considered as the input variables, while the total sulfide in the aqueous phase of the sewer pipe (*y*) was the only output variable (see Table 2 for the notation *x*_{i} for *i* = 1–7). The dataset consists of a total of 596 input–output data pairs, as given in Supplementary Material, Table S1. The input–output data pairs are referred to hereafter as patterns; the *j*th pattern contains a collection of eight data points {*x*_{1j}, *x*_{2j}, …, *x*_{7j}, *y*_{j}} for *j* = 1–596.

### Data pre-processing

Before using the original (raw) dataset to construct ANFIS models, a two-step data pre-processing approach was applied. The first step involved normalizing the entire data points, while the second step dealt with splitting the normalized dataset. A detailed description of each respective step is given below.

#### Step 1: data normalization

The data points were normalized according to the following equation:

$$Z_N = \frac{z - z_{min}}{z_{max} - z_{min}} \tag{7}$$

where ‘*z*’ is the actual value of variable ‘*x*_{ij}’ (and ‘*y*_{j}’) in the dataset (*x*_{ij} denotes the value of the *i*th input variable in the *j*th pattern and ‘*y*_{j}’ represents the value of the output variable in the *j*th pattern; *i* = 1–7, *j* = 1–596); ‘*z*_{min}’ and ‘*z*_{max}’ represent the minimum and maximum values of ‘*z*’, respectively; and *Z*_{N} denotes the normalized value of ‘*z*’.
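A minimal Python sketch of this normalization step is given below. It assumes min–max scaling of each variable (column) to the range 0–1 using that variable's own minimum and maximum, and it includes the reverse transformation needed later to return predictions to their original units. The function names and the three sample rows are hypothetical; the sample values merely span the temperature and time ranges of Table 2.

```python
import numpy as np

def minmax_normalize(data):
    """Scale each column to [0, 1]; also return the per-column min and max."""
    data = np.asarray(data, dtype=float)
    z_min = data.min(axis=0)
    z_max = data.max(axis=0)
    return (data - z_min) / (z_max - z_min), z_min, z_max

def minmax_denormalize(normalized, z_min, z_max):
    """Reverse the scaling to recover values in their original units."""
    return np.asarray(normalized, dtype=float) * (z_max - z_min) + z_min

# Hypothetical rows: [temperature (deg C), time (min)].
raw = np.array([[16.8, 0.0], [24.05, 205.0], [31.3, 410.0]])
scaled, z_min, z_max = minmax_normalize(raw)
restored = minmax_denormalize(scaled, z_min, z_max)
```

Keeping `z_min` and `z_max` from the normalization step matters, because the same values must be reused when the model outputs are converted back.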

#### Step 2: data splitting

The normalized dataset was randomized and subsequently divided into three disjoint subsets: training, validation, and testing. In this study, 416 patterns, corresponding to about 70% of the dataset, were assigned to the training subset, while the remaining 30% of the dataset (i.e., 180 patterns) were split into two equal halves of 90 patterns each, termed the validation and testing subsets. The training subset was used to develop the ANFIS models. The validation subset was used in conjunction with the training subset to prevent the ANFIS models from overfitting the training data. The testing subset was used to assess the accuracy and effectiveness of the trained (developed) ANFIS models for predicting the output.

The training, validation, and testing subsets were stored in the workspace of MATLAB^{®} (version 8.3.0.532, R2014a) (The MathWorks Inc., Natick, MA, USA) in the form of arrays, in which each row consists of a pattern with the last column (from left to right) indicating the output value and the remaining columns representing inputs' values.
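The splitting step can be sketched in Python as follows. The subset sizes (416, 90, and 90 patterns) follow the text; the random seed, the function name, and the placeholder array standing in for the normalized dataset are assumptions made for illustration, and the actual shuffling used in the study is not known.

```python
import numpy as np

def split_patterns(patterns, n_train=416, n_val=90, seed=0):
    """Shuffle the patterns and split them into three disjoint subsets."""
    patterns = np.asarray(patterns)
    rng = np.random.default_rng(seed)
    shuffled = patterns[rng.permutation(len(patterns))]
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

# Placeholder for the 596 normalized patterns: seven inputs, output in the last column.
patterns = np.arange(596 * 8, dtype=float).reshape(596, 8)
train, val, test = split_patterns(patterns)
```

Each row keeps the layout described above, with the output value in the last column.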

### Modeling approaches

This section presents three different modeling approaches, including ANFIS-GP, ANFIS-SC, and multiple regression models, applied for predicting H_{2}S emission from the gravity sewer.

#### ANFIS-GP model

To develop the ANFIS-GP model, the ANFIS Editor GUI (graphical user interface) of the Fuzzy Logic Toolbox in MATLAB^{®} (version 8.3.0.532, R2014a) was used. The command ‘*anfisedit*’ was typed in the MATLAB command window to display the ANFIS Editor GUI, which includes four distinct panels, namely (i) load data, (ii) generate FIS, (iii) train FIS, and (iv) test FIS. Initially, the training, validation, and testing subsets were loaded from the MATLAB workspace into the ANFIS Editor GUI, and then an initial FIS was created by choosing the GP technique. For the system under consideration in this study, various ANFIS-GP models were tried. The following points were taken into account to choose the best model, i.e., the one with minimum error:

- All possible combinations of the input variables were tested. Table 3 presents the combinations of input variables used for the model's development.
- The ‘gaussmf’, one of the most commonly used MFs in the literature, was tested for each input (note that only two MFs were assigned to each input in order to make the models as simple as possible).
- Takagi–Sugeno FIS in the form of a first-order polynomial function was used.

Model . | Input combination . | Model . | Input combination . | Model . | Input combination . |
---|---|---|---|---|---|

M_{1} | [X_{1}, X_{7}] | M_{22} | [X_{1}, X_{2}, X_{3}, X_{7}] | M_{43} | [X_{1}, X_{2}, X_{3}, X_{5}, X_{7}] |

M_{2} | [X_{2}, X_{7}] | M_{23} | [X_{1}, X_{2}, X_{4}, X_{7}] | M_{44} | [X_{1}, X_{2}, X_{3}, X_{6}, X_{7}] |

M_{3} | [X_{3}, X_{7}] | M_{24} | [X_{1}, X_{2}, X_{5}, X_{7}] | M_{45} | [X_{1}, X_{2}, X_{4}, X_{5}, X_{7}] |

M_{4} | [X_{4}, X_{7}] | M_{25} | [X_{1}, X_{2}, X_{6}, X_{7}] | M_{46} | [X_{1}, X_{2}, X_{4}, X_{6}, X_{7}] |

M_{5} | [X_{5}, X_{7}] | M_{26} | [X_{1}, X_{3}, X_{4}, X_{7}] | M_{47} | [X_{1}, X_{2}, X_{5}, X_{6}, X_{7}] |

M_{6} | [X_{6}, X_{7}] | M_{27} | [X_{1}, X_{3}, X_{5}, X_{7}] | M_{48} | [X_{1}, X_{3}, X_{4}, X_{5}, X_{7}] |

M_{7} | [X_{1}, X_{2}, X_{7}] | M_{28} | [X_{1}, X_{3}, X_{6}, X_{7}] | M_{49} | [X_{1}, X_{3}, X_{4}, X_{6}, X_{7}] |

M_{8} | [X_{1}, X_{3}, X_{7}] | M_{29} | [X_{1}, X_{4}, X_{5}, X_{7}] | M_{50} | [X_{1}, X_{3}, X_{5}, X_{6}, X_{7}] |

M_{9} | [X_{1}, X_{4}, X_{7}] | M_{30} | [X_{1}, X_{4}, X_{6}, X_{7}] | M_{51} | [X_{1}, X_{4}, X_{5}, X_{6}, X_{7}] |

M_{10} | [X_{1}, X_{5}, X_{7}] | M_{31} | [X_{1}, X_{5}, X_{6}, X_{7}] | M_{52} | [X_{2}, X_{3}, X_{4}, X_{5}, X_{7}] |

M_{11} | [X_{1}, X_{6}, X_{7}] | M_{32} | [X_{2}, X_{3}, X_{4}, X_{7}] | M_{53} | [X_{2}, X_{3}, X_{4}, X_{6}, X_{7}] |

M_{12} | [X_{2}, X_{3}, X_{7}] | M_{33} | [X_{2}, X_{3}, X_{5}, X_{7}] | M_{54} | [X_{2}, X_{3}, X_{5}, X_{6}, X_{7}] |

M_{13} | [X_{2}, X_{4}, X_{7}] | M_{34} | [X_{2}, X_{3}, X_{6}, X_{7}] | M_{55} | [X_{2}, X_{4}, X_{5}, X_{6}, X_{7}] |

M_{14} | [X_{2}, X_{5}, X_{7}] | M_{35} | [X_{2}, X_{4}, X_{5}, X_{7}] | M_{56} | [X_{3}, X_{4}, X_{5}, X_{6}, X_{7}] |

M_{15} | [X_{2}, X_{6}, X_{7}] | M_{36} | [X_{2}, X_{4}, X_{6}, X_{7}] | M_{57} | [X_{1}, X_{2}, X_{3}, X_{4}, X_{5}, X_{7}] |

M_{16} | [X_{3}, X_{4}, X_{7}] | M_{37} | [X_{2}, X_{5}, X_{6}, X_{7}] | M_{58} | [X_{1}, X_{2}, X_{3}, X_{4}, X_{6}, X_{7}] |

M_{17} | [X_{3}, X_{5}, X_{7}] | M_{38} | [X_{3}, X_{4}, X_{5}, X_{7}] | M_{59} | [X_{1}, X_{2}, X_{3}, X_{5}, X_{6}, X_{7}] |

M_{18} | [X_{3}, X_{6}, X_{7}] | M_{39} | [X_{3}, X_{4}, X_{6}, X_{7}] | M_{60} | [X_{1}, X_{2}, X_{4}, X_{5}, X_{6}, X_{7}] |

M_{19} | [X_{4}, X_{5}, X_{7}] | M_{40} | [X_{3}, X_{5}, X_{6}, X_{7}] | M_{61} | [X_{1}, X_{3}, X_{4}, X_{5}, X_{6}, X_{7}] |

M_{20} | [X_{4}, X_{6}, X_{7}] | M_{41} | [X_{4}, X_{5}, X_{6}, X_{7}] | M_{62} | [X_{2}, X_{3}, X_{4}, X_{5}, X_{6}, X_{7}] |

M_{21} | [X_{5}, X_{6}, X_{7}] | M_{42} | [X_{1}, X_{2}, X_{3}, X_{4}, X_{7}] | M_{63} | [X_{1}, X_{2}, X_{3}, X_{4}, X_{5}, X_{6}, X_{7}] |

Model . | Input combination . | Model . | Input combination . | Model . | Input combination . |
---|---|---|---|---|---|

M_{1} | [X_{1}, X_{7}] | M_{22} | [X_{1}, X_{2}, X_{3}, X_{7}] | M_{43} | [X_{1}, X_{2}, X_{3}, X_{5}, X_{7}] |

M_{2} | [X_{2}, X_{7}] | M_{23} | [X_{1}, X_{2}, X_{4}, X_{7}] | M_{44} | [X_{1}, X_{2}, X_{3}, X_{6}, X_{7}] |

M_{3} | [X_{3}, X_{7}] | M_{24} | [X_{1}, X_{2}, X_{5}, X_{7}] | M_{45} | [X_{1}, X_{2}, X_{4}, X_{5}, X_{7}] |

M_{4} | [X_{4}, X_{7}] | M_{25} | [X_{1}, X_{2}, X_{6}, X_{7}] | M_{46} | [X_{1}, X_{2}, X_{4}, X_{6}, X_{7}] |

M_{5} | [X_{5}, X_{7}] | M_{26} | [X_{1}, X_{3}, X_{4}, X_{7}] | M_{47} | [X_{1}, X_{2}, X_{5}, X_{6}, X_{7}] |

M_{6} | [X_{6}, X_{7}] | M_{27} | [X_{1}, X_{3}, X_{5}, X_{7}] | M_{48} | [X_{1}, X_{3}, X_{4}, X_{5}, X_{7}] |

M_{7} | [X_{1}, X_{2}, X_{7}] | M_{28} | [X_{1}, X_{3}, X_{6}, X_{7}] | M_{49} | [X_{1}, X_{3}, X_{4}, X_{6}, X_{7}] |

M_{8} | [X_{1}, X_{3}, X_{7}] | M_{29} | [X_{1}, X_{4}, X_{5}, X_{7}] | M_{50} | [X_{1}, X_{3}, X_{5}, X_{6}, X_{7}] |

M_{9} | [X_{1}, X_{4}, X_{7}] | M_{30} | [X_{1}, X_{4}, X_{6}, X_{7}] | M_{51} | [X_{1}, X_{4}, X_{5}, X_{6}, X_{7}] |

M_{10} | [X_{1}, X_{5}, X_{7}] | M_{31} | [X_{1}, X_{5}, X_{6}, X_{7}] | M_{52} | [X_{2}, X_{3}, X_{4}, X_{5}, X_{7}] |

M_{11} | [X_{1}, X_{6}, X_{7}] | M_{32} | [X_{2}, X_{3}, X_{4}, X_{7}] | M_{53} | [X_{2}, X_{3}, X_{4}, X_{6}, X_{7}] |

M_{12} | [X_{2}, X_{3}, X_{7}] | M_{33} | [X_{2}, X_{3}, X_{5}, X_{7}] | M_{54} | [X_{2}, X_{3}, X_{5}, X_{6}, X_{7}] |

M_{13} | [X_{2}, X_{4}, X_{7}] | M_{34} | [X_{2}, X_{3}, X_{6}, X_{7}] | M_{55} | [X_{2}, X_{4}, X_{5}, X_{6}, X_{7}] |

M_{14} | [X_{2}, X_{5}, X_{7}] | M_{35} | [X_{2}, X_{4}, X_{5}, X_{7}] | M_{56} | [X_{3}, X_{4}, X_{5}, X_{6}, X_{7}] |

M_{15} | [X_{2}, X_{6}, X_{7}] | M_{36} | [X_{2}, X_{4}, X_{6}, X_{7}] | M_{57} | [X_{1}, X_{2}, X_{3}, X_{4}, X_{5}, X_{7}] |

M_{16} | [X_{3}, X_{4}, X_{7}] | M_{37} | [X_{2}, X_{5}, X_{6}, X_{7}] | M_{58} | [X_{1}, X_{2}, X_{3}, X_{4}, X_{6}, X_{7}] |

M_{17} | [X_{3}, X_{5}, X_{7}] | M_{38} | [X_{3}, X_{4}, X_{5}, X_{7}] | M_{59} | [X_{1}, X_{2}, X_{3}, X_{5}, X_{6}, X_{7}] |

M_{18} | [X_{3}, X_{6}, X_{7}] | M_{39} | [X_{3}, X_{4}, X_{6}, X_{7}] | M_{60} | [X_{1}, X_{2}, X_{4}, X_{5}, X_{6}, X_{7}] |

M_{19} | [X_{4}, X_{5}, X_{7}] | M_{40} | [X_{3}, X_{5}, X_{6}, X_{7}] | M_{61} | [X_{1}, X_{3}, X_{4}, X_{5}, X_{6}, X_{7}] |

M_{20} | [X_{4}, X_{6}, X_{7}] | M_{41} | [X_{4}, X_{5}, X_{6}, X_{7}] | M_{62} | [X_{2}, X_{3}, X_{4}, X_{5}, X_{6}, X_{7}] |

M_{21} | [X_{5}, X_{6}, X_{7}] | M_{42} | [X_{1}, X_{2}, X_{3}, X_{4}, X_{7}] | M_{63} | [X_{1}, X_{2}, X_{3}, X_{4}, X_{5}, X_{6}, X_{7}] |

*Note*: see Table 2 for the notations *X*_{1}, *X*_{2}, …, *X*_{7}.
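The 63 input combinations in Table 3 follow a simple pattern: every model includes time (*X*_{7}) together with a non-empty subset of the other six variables, which gives 2^6 − 1 = 63 models. The short Python sketch below enumerates them in the same order as the table; the function and variable names are hypothetical.

```python
from itertools import combinations

def input_combinations(optional=("X1", "X2", "X3", "X4", "X5", "X6"), always=("X7",)):
    """List every non-empty subset of the optional inputs, each joined with X7."""
    combos = []
    for size in range(1, len(optional) + 1):
        for subset in combinations(optional, size):
            combos.append(list(subset) + list(always))
    return combos

models = input_combinations()  # models[0] is M1, models[62] is M63
```

Enumerating the combinations programmatically is a convenient way to loop over candidate models when the screening is scripted rather than done by hand in a GUI.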

Each ANFIS-GP model was trained with the training subset using the hybrid algorithm, and the model error (called training error) was determined; the training error goal was set to zero. At each epoch, the model validation error was also calculated. An epoch at which the validation error started to rise while the training error continued to decrease was taken as the point at which to terminate training, because this behavior signals overfitting. If both the training and validation errors continued to decrease as the number of epochs increased, the learning algorithm stopped iterating once the training error goal was achieved; otherwise, training proceeded up to the epoch beyond which the training error remained constant. When none of these three stopping criteria was met, the learning algorithm continued iterating up to a pre-defined number of epochs (100 epochs).
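The stopping logic described above can be sketched in Python as a function that inspects the recorded error curves. This is an interpretation rather than the authors' procedure: in particular, returning the epoch just before the validation error rose, and the tolerance used to decide that the training error is "constant", are assumptions. The function name and the sample error curves are hypothetical.

```python
def stopping_epoch(train_errors, val_errors, goal=0.0, max_epochs=100, tol=1e-12):
    """Return (epoch, reason) for the first stopping criterion met; epochs start at 1."""
    n = min(len(train_errors), len(val_errors), max_epochs)
    for epoch in range(1, n + 1):
        i = epoch - 1
        if train_errors[i] <= goal:
            return epoch, "goal reached"
        if i > 0:
            d_train = train_errors[i] - train_errors[i - 1]
            d_val = val_errors[i] - val_errors[i - 1]
            if d_val > 0 and d_train < 0:
                # Validation error rose while training error fell: overfitting.
                return epoch - 1, "overfitting"
            if abs(d_train) <= tol:
                return epoch, "training error constant"
    return n, "maximum epochs"

# Hypothetical error curves: validation error starts rising at epoch 4.
train_curve = [0.50, 0.40, 0.30, 0.25, 0.20]
val_curve = [0.60, 0.50, 0.45, 0.50, 0.55]
stop = stopping_epoch(train_curve, val_curve)
```

With a training error goal of zero, the first criterion is rarely triggered in practice, so the validation-based check and the epoch limit usually decide when training ends.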

The model output was obtained by using the ‘*evalfis*’ function in the MATLAB command window. Equation (8) describes the ‘*evalfis*’ function.

$$\mathit{output} = \mathit{evalfis}(\mathit{input}, \mathit{fismat}) \tag{8}$$

where ‘*output*’ is a vector specifying the model output; ‘*input*’ is a matrix indicating the input values (the ‘*evalfis*’ function takes each row of the ‘*input*’ as an input vector and returns a vector to the ‘*output*’); and ‘*fismat*’ represents the created FIS.

The performance of each model was evaluated by means of two statistical indices, namely the coefficient of determination (*R*^{2}) and *RMSE*. *R*^{2}, which indicates the goodness of fit between the actual (target) values and the model-predicted values, is given by the following equation:

$$R^{2} = 1 - \frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^{2}}{\sum_{i=1}^{n} (Y_i - \bar{Y})^{2}} \tag{9}$$

*RMSE*, which measures the average magnitude of the error, is defined by the following equation:

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^{2}} \tag{10}$$

where *Y*_{i} represents the actual (target) value of the output and $\hat{Y}_i$ is the corresponding model prediction for the *i*th pattern; $\bar{Y}$ is the average value of *Y*_{i} (*i* = 1, 2, …, *n*); and ‘*n*’ is the total number of patterns (in the training, validation, or testing subset) on which *R*^{2} and *RMSE* are computed.

From Equations (9) and (10), it can be seen that the closer the *RMSE* to zero and *R*^{2} to unity, the smaller the difference between *Y*_{i} and $\hat{Y}_i$. In other words, the model perfectly fits the data when the *R*^{2} value is equal to unity and the *RMSE* value is equal to zero.
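The two performance indices can be computed with a few lines of Python. The sketch assumes that *R*^{2} is defined as one minus the ratio of the residual sum of squares to the total sum of squares about the mean of the target values, and that *RMSE* is the square root of the mean squared residual; the function names and the sample values are hypothetical.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def rmse(y_true, y_pred):
    """Root-mean-squared error between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Hypothetical targets and predictions.
targets = [1.0, 2.0, 3.0, 4.0]
predictions = [1.1, 1.9, 3.2, 3.8]
r2_value = r_squared(targets, predictions)   # about 0.98
rmse_value = rmse(targets, predictions)      # about 0.158
```

Because *RMSE* carries the units of the output, it should be compared across models only when they are evaluated on the same (normalized or original) scale.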

Once the training process was complete, the trained (developed) model was subjected to the testing phase, so that the testing subset (unseen data during the training/validation process) was fed to the model in order to assess the predictive ability of the model by means of the two aforementioned statistical indices (*R*^{2} and *RMSE*; see Equations (9) and (10)).

Note that when the model training and testing phases were complete, the normalized output values of the model were converted back to their original scale by inverting the transformation in Equation (7).

#### ANFIS-SC model

The ANFIS-SC model was constructed in the same manner as the ANFIS-GP model, except that the number of MFs per input was determined automatically rather than specified by the user. In the FIS generation panel of the ANFIS Editor GUI, the SC technique was selected with the MATLAB default values for the clustering parameters (*r* = 0.50, *λ* = 1.25, *AR* = 0.50, *RR* = 0.15; MATLAB^{®} version 8.3.0.532, R2014a). Note that the training phase parameters, including the number of training epochs, the training error goal, the type of output MF, the training algorithm, and the training stopping criteria, were the same as those used for the ANFIS-GP model. In addition, the model testing phase was performed in the same way as for the ANFIS-GP model.

#### Multiple regression-based approach

Multiple regression analysis was also performed to predict H_{2}S emission from the gravity sewer. The experimental data were evaluated by means of DataFit software (trial version 9.1.32, Copyright^{©} 1995–2014, Oakdale Engineering, PA, USA), which contains 242 types of regression models. As the regression models were solved, they were automatically ranked based on the goodness of fit. In addition, *t*-ratios and *p*-values were estimated to assess the significance of the model coefficients (*p* < 0.05 was considered statistically significant). The training subset was used to estimate the regression model coefficients, while the testing subset was used to evaluate the models’ prediction accuracy in terms of *R*^{2} and *RMSE*, given by Equations (9) and (10).

## RESULTS AND DISCUSSIONS

In this study, various modeling approaches were applied to predict H_{2}S emission from a gravity-flow sewer as a function of temperature and hydraulic conditions.

First, two different ANFIS models, namely ANFIS-GP and ANFIS-SC, were constructed based on the first-order Takagi–Sugeno FIS. The hybrid-learning algorithm was employed to train the models. The inputs to the models were temperature, pipe slope, flow rate, hydraulic depth, mean flow velocity, liquid volume fraction in the pipe, and time, while the models’ output was H_{2}S emission from the aqueous phase of the sewer. The data used were taken from the experimental study of Lahav *et al.* (2006). The models’ performance was assessed with two descriptive statistical indicators, namely *R*^{2} and *RMSE*.

Second, multiple regression models were established whose results were compared with those of the ANFIS-GP and ANFIS-SC models.

### ANFIS-GP model

To estimate H_{2}S emission from the gravity-flow sewer pipe, 63 ANFIS-GP models were developed whose training, validation, and testing results in terms of *R*^{2} and *RMSE* are presented in Table 4. In addition, the number of fuzzy rules and the total number of parameters (linear and nonlinear) for each model are given.

| Model | *N*_{r} | *N*_{tp} | Training *R*^{2} | Training *RMSE* | Validation *R*^{2} | Validation *RMSE* | Testing *R*^{2} | Testing *RMSE* |
|---|---|---|---|---|---|---|---|---|
| M_{1} | 4 | 20 | 0.9692 | 5.8842 | 0.9714 | 6.1405 | 0.9715 | 5.4812 |
| M_{2} | 4 | 20 | 0.9606 | 6.6505 | 0.9585 | 7.4065 | 0.9516 | 7.1492 |
| M_{3} | 4 | 20 | 0.9685 | 5.9450 | 0.9647 | 6.8247 | 0.9597 | 6.5223 |
| M_{4} | 4 | 20 | 0.9549 | 7.1114 | 0.9514 | 8.0133 | 0.9457 | 7.5703 |
| M_{5} | 4 | 20 | 0.9681 | 5.9855 | 0.9682 | 6.4825 | 0.9605 | 6.4558 |
| M_{6} | 4 | 20 | 0.9542 | 7.1701 | 0.9516 | 7.9945 | 0.9442 | 7.6705 |
| M_{7} | 8 | 44 | 0.9824 | 4.4410 | 0.9860 | 4.2984 | 0.9810 | 4.4800 |
| M_{8} | 8 | 44 | 0.9878 | 3.6949 | 0.9887 | 3.8674 | 0.9868 | 3.7291 |
| M_{9} | 8 | 44 | 0.9838 | 4.2709 | 0.9824 | 4.8159 | 0.9798 | 4.6155 |
| M_{10} | 8 | 44 | 0.9884 | 3.6029 | 0.9920 | 3.2604 | 0.9888 | 3.4450 |
| M_{11} | 8 | 44 | 0.9836 | 4.2853 | 0.9824 | 4.8207 | 0.9798 | 4.6151 |
| M_{12} | 8 | 44 | 0.9855 | 4.0379 | 0.9815 | 4.9393 | 0.9775 | 4.8711 |
| M_{13} | 8 | 44 | 0.9823 | 4.4597 | 0.9787 | 5.3068 | 0.9714 | 5.4918 |
| M_{14} | 8 | 44 | 0.9860 | 3.9614 | 0.9817 | 4.9285 | 0.9810 | 4.4883 |
| M_{15} | 8 | 44 | 0.9825 | 4.4293 | 0.9785 | 5.3297 | 0.9716 | 5.4740 |
| M_{16} | 8 | 44 | 0.9835 | 4.3069 | 0.9861 | 4.2828 | 0.9780 | 4.8210 |
| M_{17} | 8 | 44 | 0.9876 | 3.7322 | 0.9889 | 3.8281 | 0.9843 | 4.0680 |
| M_{18} | 8 | 44 | 0.9837 | 4.2730 | 0.9860 | 4.2935 | 0.9782 | 4.7954 |
| M_{19} | 8 | 44 | 0.9825 | 4.4277 | 0.9816 | 4.9270 | 0.9776 | 4.8661 |
| M_{20} | 8 | 44 | 0.9698 | 5.8235 | 0.9741 | 5.8510 | 0.9638 | 6.1810 |
| M_{21} | 8 | 44 | 0.9828 | 4.4001 | 0.9822 | 4.8446 | 0.9776 | 4.8566 |
| M_{22} | 16 | 96 | 0.9953 | 2.2945 | 0.9956 | 2.4106 | 0.9940 | 2.5171 |
| M_{23} | 16 | 96 | 0.9953 | 2.2992 | 0.9957 | 2.3785 | 0.9939 | 2.5363 |
| ***M_{24}*** | 16 | 96 | 0.9953 | 2.2883 | 0.9955 | 2.4467 | 0.9943 | 2.4593 |
| M_{25} | 16 | 96 | 0.9953 | 2.2989 | 0.9957 | 2.3871 | 0.9939 | 2.5353 |
| M_{26} | 16 | 96 | 0.9953 | 2.3026 | 0.9949 | 2.5971 | 0.9930 | 2.7272 |
| M_{27} | 16 | 96 | 0.9953 | 2.2953 | 0.9953 | 2.4959 | 0.9940 | 2.5183 |
| M_{28} | 16 | 96 | 0.9954 | 2.2810 | 0.9950 | 2.5813 | 0.9931 | 2.7065 |
| M_{29} | 16 | 96 | 0.9954 | 2.2659 | 0.9954 | 2.4595 | 0.9939 | 2.5334 |
| M_{30} | 16 | 96 | 0.9929 | 2.8208 | 0.9916 | 3.3308 | 0.9897 | 3.2941 |
| M_{31} | 16 | 96 | 0.9954 | 2.2640 | 0.9953 | 2.4783 | 0.9939 | 2.5381 |
| M_{32} | 16 | 96 | 0.9954 | 2.2640 | 0.9954 | 2.4783 | 0.9939 | 2.5381 |
| M_{33} | 16 | 96 | 0.9950 | 2.3711 | 0.9956 | 2.4070 | 0.9939 | 2.5289 |
| M_{34} | 16 | 96 | 0.9948 | 2.4067 | 0.9956 | 2.4083 | 0.9933 | 2.6596 |
| M_{35} | 16 | 96 | 0.9946 | 2.4538 | 0.9949 | 2.5868 | 0.9933 | 2.6673 |
| M_{36} | 16 | 96 | 0.9943 | 2.5198 | 0.9952 | 2.5154 | 0.9921 | 2.8802 |
| M_{37} | 16 | 96 | 0.9948 | 2.4097 | 0.9955 | 2.4339 | 0.9936 | 2.6090 |
| M_{38} | 16 | 96 | 0.9939 | 2.6197 | 0.9948 | 2.6299 | 0.9915 | 2.9984 |
| M_{39} | 16 | 96 | 0.9934 | 2.7167 | 0.9937 | 2.8775 | 0.9884 | 3.4951 |
| M_{40} | 16 | 96 | 0.9945 | 2.4938 | 0.9946 | 2.6683 | 0.9925 | 2.8217 |
| M_{41} | 16 | 96 | 0.9941 | 2.5667 | 0.9937 | 2.8764 | 0.9904 | 3.1762 |
| M_{42} | 32 | 212 | 0.9956 | 2.2328 | 0.9951 | 2.5358 | 0.9939 | 2.5465 |
| M_{43} | 32 | 212 | 0.9958 | 2.1752 | 0.9954 | 2.4536 | 0.9939 | 2.5311 |
| M_{44} | 32 | 212 | 0.9956 | 2.2249 | 0.9951 | 2.5518 | 0.9938 | 2.5509 |
| M_{45} | 32 | 212 | 0.9956 | 2.2227 | 0.9954 | 2.4580 | 0.9940 | 2.5198 |
| M_{46} | 32 | 212 | 0.9956 | 2.2317 | 0.9952 | 2.5211 | 0.9838 | 2.5484 |
| M_{47} | 32 | 212 | 0.9956 | 2.2159 | 0.9953 | 2.4816 | 0.9939 | 2.5463 |
| M_{48} | 32 | 212 | 0.9957 | 2.2003 | 0.9952 | 2.5153 | 0.9936 | 2.6085 |
| M_{49} | 32 | 212 | 0.9956 | 2.2294 | 0.9953 | 2.4952 | 0.9930 | 2.7269 |
| M_{50} | 32 | 212 | 0.9957 | 2.1974 | 0.9952 | 2.5273 | 0.9936 | 2.6060 |
| M_{51} | 32 | 212 | 0.9957 | 2.2091 | 0.9953 | 2.4998 | 0.9933 | 2.6591 |
| M_{52} | 32 | 212 | 0.9955 | 2.2479 | 0.9953 | 2.4950 | 0.9936 | 2.6081 |
| M_{53} | 32 | 212 | 0.9952 | 2.3314 | 0.9954 | 2.4741 | 0.9934 | 2.6466 |
| M_{54} | 32 | 212 | 0.9955 | 2.2395 | 0.9953 | 2.4876 | 0.9935 | 2.6228 |
| M_{55} | 32 | 212 | 0.9953 | 2.3068 | 0.9953 | 2.5042 | 0.9930 | 2.7091 |
| M_{56} | 32 | 212 | 0.9953 | 2.3026 | 0.9960 | 2.3061 | 0.9938 | 2.5606 |
| M_{57}^{*} | 64 | 472 | – | – | – | – | – | – |
| M_{58}^{*} | 64 | 472 | – | – | – | – | – | – |
| M_{59}^{*} | 64 | 472 | – | – | – | – | – | – |
| M_{60}^{*} | 64 | 472 | – | – | – | – | – | – |
| M_{61}^{*} | 64 | 472 | – | – | – | – | – | – |
| M_{62}^{*} | 64 | 472 | – | – | – | – | – | – |
| M_{63}^{*} | 128 | 1,052 | – | – | – | – | – | – |

^{*}The number of parameters for the models marked with an asterisk was greater than the number of patterns available for training, and hence, these models were not considered in this study.

See Table 3 for the model inputs.

*N*_{r}, number of fuzzy rules; *N*_{tp}, total number of parameters; *R*^{2}, coefficient of determination; *RMSE*, root-mean-squared error.

It can be seen from Table 4 that the ANFIS-GP models M_{1}, M_{2},…, M_{56} offer good performance, with *R*^{2} values in the range of 0.9514–0.9960 and 0.9442–0.9943, and *RMSE* values in the range of 2.306–8.013 and 2.459–7.671, for the validation and testing phases, respectively. Models M_{57} to M_{63} were found to be inappropriate because models M_{57} to M_{62} and model M_{63} have totals of 472 and 1,052 parameters, respectively, which exceed the size of the training subset (416 input–output data pairs). From the results in Table 4, the ANFIS-GP models M_{1}, M_{2},…, M_{6}, which use only one of the input variables (*X*_{1}, *X*_{2},…, or *X*_{6}) in combination with the input variable *X*_{7}, can efficiently estimate H_{2}S emission. Among these models, model M_{1} is ranked as the best, giving *R*^{2} and *RMSE* values of 0.9714 and 6.141, respectively, for the validation phase, and an *R*^{2} value of 0.9715 and an *RMSE* value of 5.481 for the testing phase. This suggests that the input variable *X*_{1} (temperature) is the most effective single variable for predicting H_{2}S emission.
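The rule and parameter counts reported for the ANFIS-GP models can be reproduced with a short calculation. The Python sketch below assumes two Gaussian MFs per input (two nonlinear parameters each) and a first-order Sugeno consequent with one linear coefficient per input plus a constant for each rule. It also assumes that every candidate model combines *X*_{7} with a non-empty subset of *X*_{1}–*X*_{6}, which is inferred here from the total of 63 models. The function names are hypothetical.

```python
from itertools import combinations

TRAINING_PAIRS = 416  # size of the training subset reported in the text


def gp_structure(n_inputs, mfs_per_input=2, params_per_mf=2):
    """Return (rules, total parameters) for a grid-partitioned first-order Sugeno ANFIS."""
    rules = mfs_per_input ** n_inputs                   # one rule per combination of MFs
    premise = n_inputs * mfs_per_input * params_per_mf  # nonlinear (Gaussian MF) parameters
    consequent = rules * (n_inputs + 1)                 # linear parameters of all rules
    return rules, premise + consequent


def candidate_input_sets(optional=("X1", "X2", "X3", "X4", "X5", "X6"), always="X7"):
    """Enumerate every non-empty subset of the optional inputs, joined with `always`."""
    sets = []
    for size in range(1, len(optional) + 1):
        for combo in combinations(optional, size):
            sets.append(combo + (always,))
    return sets


if __name__ == "__main__":
    print("candidate models:", len(candidate_input_sets()))
    for k in range(2, 8):
        rules, total = gp_structure(k)
        status = "usable" if total <= TRAINING_PAIRS else "exceeds training subset"
        print(k, "inputs:", rules, "rules,", total, "parameters,", status)
```

Under these assumptions, six and seven inputs give 472 and 1,052 parameters, which is why the corresponding models exceed the 416 available training pairs.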

Regarding the ANFIS-GP models that use more than two input variables (models M_{7}, M_{8},…,M_{56}), model M_{24} (shown in bold italic in Table 4) whose input is a set of *X*_{1}, *X*_{2}, *X*_{5}, and *X*_{7} variables yields the smallest *RMSE* of 2.459 for the testing phase (*R*^{2} = 0.9943).

It can be concluded that the prediction performance of the ANFIS-GP model M_{1} whose input variables are *X*_{1} and *X*_{7} is enhanced by the addition of variables *X*_{2} and *X*_{5} to the input data.

A scatter diagram of the measured and predicted values of H_{2}S emission for the testing subset using the ANFIS-GP model M_{24} is shown in Figure 2.

It is evident from Figure 2(a) that the data points lie close to the 45° line (often called the 1:1 line or the 100% correlation line), with an *R*^{2} value of 0.994. This indicates that only 0.6% of the total variability in the response could not be explained by the model. In addition, the measured data and the model predictions are plotted versus the pattern number in Figure 2(b). It is clear from Figure 2(b) that there is only a small discrepancy between the measured data and the predictions, which confirms the high predictive ability of the ANFIS-GP model M_{24}.

### ANFIS-SC model

A total of 63 ANFIS-SC models were developed to predict H_{2}S emission. The performance of each model through training, validation, and testing phases in terms of *R*^{2} and *RMSE* is presented in Table 5. The number of fuzzy rules and the total number of parameters (linear and nonlinear) for each model are also provided.

| Model | *N*_{r} | *N*_{tp} | Training *R*^{2} | Training *RMSE* | Validation *R*^{2} | Validation *RMSE* | Testing *R*^{2} | Testing *RMSE* |
|---|---|---|---|---|---|---|---|---|
| M_{1} | 4 | 28 | 0.9711 | 5.7001 | 0.9749 | 5.7599 | 0.9748 | 5.1555 |
| M_{2} | 7 | 49 | 0.9610 | 6.6173 | 0.9582 | 7.4337 | 0.9522 | 7.1025 |
| M_{3} | 7 | 49 | 0.9794 | 4.8054 | 0.9792 | 5.2359 | 0.9737 | 5.2725 |
| M_{4} | 2 | 14 | 0.9489 | 7.5731 | 0.9456 | 8.4789 | 0.9382 | 8.0740 |
| M_{5} | 6 | 42 | 0.9771 | 5.0685 | 0.9697 | 6.3225 | 0.9685 | 5.7641 |
| M_{6} | 2 | 14 | 0.9489 | 7.5706 | 0.9455 | 8.4804 | 0.9382 | 8.0732 |
| M_{7} | 10 | 100 | 0.9841 | 4.2305 | 0.9874 | 4.0871 | 0.9822 | 4.3337 |
| M_{8} | 10 | 100 | 0.9940 | 2.5938 | 0.9949 | 2.6027 | 0.9908 | 3.1224 |
| M_{9} | 6 | 60 | 0.9720 | 5.6025 | 0.9683 | 6.4665 | 0.9802 | 4.5749 |
| M_{10} | 9 | 90 | 0.9891 | 3.4987 | 0.9905 | 3.5509 | 0.9872 | 3.6789 |
| M_{11} | 6 | 60 | 0.9711 | 5.6982 | 0.9666 | 6.6435 | 0.9781 | 4.8076 |
| M_{12} | 14 | 140 | 0.9945 | 2.4838 | 0.9944 | 2.7249 | 0.9918 | 2.9414 |
| M_{13} | 9 | 90 | 0.9647 | 6.2961 | 0.9594 | 7.3238 | 0.9626 | 6.2823 |
| M_{14} | 12 | 120 | 0.9916 | 3.0655 | 0.9890 | 3.8067 | 0.9860 | 3.8426 |
| M_{15} | 9 | 90 | 0.9648 | 6.2863 | 0.9596 | 7.3040 | 0.9621 | 6.3273 |
| M_{16} | 8 | 80 | 0.9779 | 4.9797 | 0.9689 | 6.4049 | 0.9771 | 4.9141 |
| M_{17} | 9 | 90 | 0.9910 | 3.1738 | 0.9896 | 3.7003 | 0.9875 | 3.6315 |
| M_{18} | 8 | 80 | 0.9777 | 5.0027 | 0.9685 | 6.4472 | 0.9763 | 4.9964 |
| M_{19} | 7 | 70 | 0.9737 | 5.4351 | 0.9604 | 7.2357 | 0.9691 | 5.7105 |
| M_{20} | 2 | 20 | 0.9506 | 7.4443 | 0.9483 | 8.2636 | 0.9403 | 7.9356 |
| M_{21} | 7 | 70 | 0.9758 | 5.2093 | 0.9633 | 6.9584 | 0.9741 | 5.2313 |
| M_{22} | 24 | 312 | 0.9963 | 2.0429 | 0.9949 | 2.5829 | 0.9934 | 2.6326 |
| M_{23} | 12 | 156 | 0.9876 | 3.7265 | 0.9805 | 5.0804 | 0.9922 | 2.8630 |
| M_{24} | 17 | 221 | 0.9961 | 2.1057 | 0.9954 | 2.4745 | 0.9939 | 2.5304 |
| M_{25} | 12 | 156 | 0.9851 | 4.0927 | 0.9780 | 5.3924 | 0.9781 | 4.8037 |
| M_{26} | 11 | 143 | 0.9955 | 2.2369 | 0.9948 | 2.6148 | 0.9936 | 2.6006 |
| M_{27} | 18 | 234 | 0.9964 | 2.0211 | 0.9953 | 2.4923 | 0.9927 | 2.7692 |
| M_{28} | 15 | 195 | 0.9957 | 2.1946 | 0.9952 | 2.5257 | 0.9933 | 2.6522 |
| M_{29} | 10 | 130 | 0.9877 | 3.7178 | 0.9815 | 4.9442 | 0.9927 | 2.7807 |
| M_{30} | 6 | 78 | 0.9757 | 5.2227 | 0.9714 | 6.1470 | 0.9834 | 4.1897 |
| M_{31} | 11 | 143 | 0.9874 | 3.7556 | 0.9811 | 4.9916 | 0.9926 | 2.7942 |
| M_{32} | 15 | 195 | 0.9955 | 2.2508 | 0.9952 | 2.5263 | 0.9937 | 2.5777 |
| M_{33} | 17 | 221 | 0.9958 | 2.1818 | 0.9944 | 2.7129 | 0.9936 | 2.6009 |
| M_{34} | 16 | 208 | 0.9954 | 2.2802 | 0.9953 | 2.4796 | 0.9936 | 2.6005 |
| M_{35} | 12 | 156 | 0.9873 | 3.7714 | 0.9806 | 5.0616 | 0.9917 | 2.9682 |
| M_{36} | 9 | 117 | 0.9864 | 3.9120 | 0.9792 | 5.2411 | 0.9888 | 3.4428 |
| M_{37} | 12 | 156 | 0.9872 | 3.7968 | 0.9795 | 5.2018 | 0.9901 | 3.2367 |
| M_{38} | 11 | 143 | 0.9951 | 2.3478 | 0.9944 | 2.7154 | 0.9930 | 2.7272 |
| M_{39} | 8 | 104 | 0.9848 | 4.1349 | 0.9780 | 5.3900 | 0.9889 | 3.4282 |
| M_{40} | 11 | 143 | 0.9945 | 2.4850 | 0.9939 | 2.8298 | 0.9921 | 2.8812 |
| M_{41} | 9 | 117 | 0.9850 | 4.1054 | 0.9784 | 5.3444 | 0.9849 | 3.9936 |
| M_{42} | 26 | 416 | 0.9963 | 2.0509 | 0.9952 | 2.5301 | 0.9934 | 2.6396 |
| M_{43}^{*} | 27 | 432 | – | – | – | – | – | – |
| M_{44} | 26 | 416 | 0.9964 | 2.0040 | 0.9940 | 2.8086 | 0.9928 | 2.7604 |
| M_{45} | 17 | 272 | 0.9961 | 2.1048 | 0.9954 | 2.4640 | 0.9930 | 2.7202 |
| M_{46} | 14 | 224 | 0.9884 | 3.6116 | 0.9812 | 4.9854 | 0.9891 | 3.3883 |
| M_{47} | 18 | 288 | 0.9962 | 2.055 | 0.9950 | 2.5649 | 0.9938 | 2.5487 |
| M_{48} | 20 | 320 | 0.9961 | 2.0895 | 0.9956 | 2.4049 | 0.9940 | 2.5217 |
| M_{49} | 15 | 240 | 0.9884 | 3.6021 | 0.9808 | 5.0392 | 0.9923 | 2.8493 |
| ***M_{50}*** | 20 | 320 | 0.9961 | 2.0949 | 0.9955 | 2.4288 | 0.9940 | 2.5143 |
| M_{51} | 12 | 192 | 0.9876 | 3.7328 | 0.9806 | 5.0594 | 0.9925 | 2.8186 |
| M_{52} | 18 | 288 | 0.9956 | 2.2199 | 0.9943 | 2.7480 | 0.9936 | 2.5970 |
| M_{53} | 16 | 256 | 0.9956 | 2.2251 | 0.9953 | 2.4842 | 0.9940 | 2.5219 |
| M_{54} | 18 | 288 | 0.9949 | 2.3940 | 0.9948 | 2.6291 | 0.9923 | 2.8553 |
| M_{55} | 13 | 208 | 0.9944 | 2.4961 | 0.9934 | 2.9428 | 0.9898 | 3.2747 |
| M_{56} | 12 | 192 | 0.9881 | 3.6537 | 0.9808 | 5.0343 | 0.9920 | 2.8982 |
| M_{57}^{*} | 29 | 551 | – | – | – | – | – | – |
| M_{58}^{*} | 26 | 494 | – | – | – | – | – | – |
| M_{59}^{*} | 29 | 551 | – | – | – | – | – | – |
| M_{60} | 19 | 361 | 0.9963 | 2.0513 | 0.9950 | 2.5784 | 0.9931 | 2.6957 |
| M_{61} | 19 | 361 | 0.9961 | 2.0971 | 0.9954 | 2.4667 | 0.9936 | 2.5952 |
| M_{62} | 19 | 361 | 0.9951 | 2.3574 | 0.9947 | 2.6547 | 0.9931 | 2.6969 |
| M_{63}^{*} | 29 | 638 | – | – | – | – | – | – |

^{*}The number of parameters for the models marked with an asterisk was greater than the number of patterns available for training, and hence, these models were not considered in this study.

See Table 3 for the model inputs.

*N*_{r}, number of fuzzy rules; *N*_{tp}, total number of parameters; *R*^{2}, coefficient of determination; *RMSE*, root-mean-squared error.

As seen in Table 5, the ANFIS-SC models M_{1}, M_{2},…, M_{6}, which use only one of the input variables (*X*_{1}, *X*_{2},…, or *X*_{6}) in combination with the input variable *X*_{7}, display good prediction accuracy for estimating H_{2}S emission.

Among these models, model M_{1}, which uses {*X*_{1}, *X*_{7}} as its input set, was found to be the best, yielding *R*^{2} and *RMSE* values of 0.9749 and 5.760, respectively, for the validation phase, and an *R*^{2} value of 0.9748 and an *RMSE* value of 5.156 for the testing phase. This finding is in good agreement with the results obtained using the ANFIS-GP model M_{1}, suggesting that the input variable *X*_{1} (temperature) is the major variable for predicting H_{2}S emission for the system under consideration.

Concerning the ANFIS-SC models that have more than two input variables (models M_{7}, M_{8},…, M_{63}), model M_{50} (shown in bold italic in Table 5), whose input set is {*X*_{1}, *X*_{3}, *X*_{5}, *X*_{6}, *X*_{7}}, produces the smallest *RMSE* value of 2.514 for the testing phase (*R*^{2} = 0.9940). This indicates that {*X*_{1}, *X*_{3}, *X*_{5}, *X*_{6}, *X*_{7}} is the optimum set of input variables for the ANFIS-SC approach when estimating H_{2}S emission from the gravity-flow sewer pipe under the conditions considered in this study.
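The total parameter counts listed for the ANFIS-SC models in Table 5 follow from the number of rules and inputs. The sketch below assumes that each cluster found by subtractive clustering yields one rule with one Gaussian MF per input (two nonlinear parameters each) and a first-order Sugeno consequent (one linear coefficient per input plus a constant). The function name `sc_total_parameters` is hypothetical.

```python
def sc_total_parameters(n_rules, n_inputs):
    """Total parameters of a cluster-based first-order Sugeno ANFIS (sketch)."""
    premise = n_rules * n_inputs * 2        # centre and width of each Gaussian MF
    consequent = n_rules * (n_inputs + 1)   # linear coefficients plus a constant per rule
    return premise + consequent


if __name__ == "__main__":
    # Model M50 in Table 5: 20 rules and five inputs.
    print(sc_total_parameters(20, 5))
```

Because every rule carries its own set of MFs, the parameter count grows with the number of clusters, which explains why an ANFIS-SC model can hold more parameters than an ANFIS-GP model with a similar number of rules.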

The prediction performance of the trained model (ANFIS-SC model M_{50}) against the testing subset is visualized in Figure 3. This figure indicates an excellent agreement between the measured data and the model-predicted values with the *R*^{2} value of 0.9940.

### Multiple regression-based model

In the regression analysis, among the 242 regression models examined for predicting H_{2}S emission from the gravity sewer, an exponential model (called Model 1), whose mathematical definition is given in Table 6, offered the best performance. A summary of the regression analysis of Model 1, including the standard error of the estimate, the residual sum of squares (*RSS*), and *R*^{2}, is given in Table 6. Table 6 also presents a summary of the regression analysis of two first-order polynomial models (called Models 2 and 3). As seen in this table, Model 1 clearly outperformed Models 2 and 3, with *R*^{2} and *RSS* values of 0.966 and 1.235, respectively, against an *R*^{2} value of 0.742 and an *RSS* value of 9.458 for Model 2, and an *R*^{2} value of 0.547 and an *RSS* value of 16.601 for Model 3 (the smaller the *RSS* and the larger the *R*^{2}, the better the model performs). The estimated regression coefficients, together with their standard errors, *t*-ratios, and the corresponding *p*-values for the best-fit model (Model 1), are summarized in Table 7. The larger the absolute value of the *t*-ratio, the more significant the parameter in the regression model. Likewise, the parameter with the smallest *p*-value is considered the most significant parameter affecting the model response. As seen in Table 7, parameters *X*_{1}, *X*_{3}, *X*_{4}, *X*_{6}, and *X*_{7} showed a significant influence (*p* < 0.05) on the model response, among which *X*_{1} (temperature) and *X*_{7} (time) were more important than the other parameters, whereas parameters *X*_{2} and *X*_{5} were found to be statistically non-significant (*p* > 0.05).
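As a quick consistency check on the values in Table 7, each *t*-ratio can be recomputed as the coefficient divided by its standard error. The Python sketch below uses the rounded table values, so small differences from the reported *t*-ratios are expected; the dictionary name `TABLE7` and the function `t_ratio` are hypothetical and are not DataFit output.

```python
# (coefficient, standard error, reported t-ratio) as listed in Table 7
TABLE7 = {
    "a0": (-0.0650, 0.0449, -1.4465),
    "a1": (-0.2943, 0.0337, -8.7284),
    "a2": (0.1698, 0.0870, 1.9520),
    "a3": (-0.4884, 0.1645, -2.9698),
    "a4": (-4.1215, 1.0987, -3.7513),
    "a5": (-0.0582, 0.1613, -0.3610),
    "a6": (4.7302, 1.2447, 3.8002),
    "a7": (-7.9190, 0.1315, -60.243),
}


def t_ratio(coefficient, standard_error):
    """t-ratio of a regression coefficient: estimate divided by its standard error."""
    return coefficient / standard_error


def rank_by_significance(table):
    """Coefficient names ordered from largest to smallest |t-ratio|."""
    return sorted(table, key=lambda name: abs(t_ratio(*table[name][:2])), reverse=True)


if __name__ == "__main__":
    for name, (coef, se, reported) in TABLE7.items():
        print(name, round(t_ratio(coef, se), 4), reported)
    print(rank_by_significance(TABLE7))
```

Ranking by the absolute *t*-ratio places *a*_{7} (time) and *a*_{1} (temperature) first, in line with the discussion above.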

| Rank | Model definition | *SEE* | *RSS* | *R*^{2} |
|---|---|---|---|---|
| **1** | | **0.0550** | **1.2352** | **0.9663** |
| 2 | | 0.1523 | 9.4583 | 0.7420 |
| 3 | | 0.2015 | 16.601 | 0.5471 |

See Table 7 for the values of the coefficients *a*_{0}–*a*_{7}.

The row shown in bold represents the best-fit model.

The training subset was used to construct the regression models.

*SEE*, standard error of the estimate; *RSS*, residual sum of squares; *R*^{2}, coefficient of multiple determination.

*Y* represents H_{2}S emission from the sewer, and *X*_{i} (*i* = 1, 2, …, 7) are the values of the input variables (see Table 2 for the notations).

| Model^{a} | Coefficient | *SEE* | *t*-ratio | *p*-value |
|---|---|---|---|---|
| 1 | *a*_{0} = −0.0650 | 0.0449 | −1.4465 | 0.1488 |
| 1 | *a*_{1} = −0.2943 | 0.0337 | −8.7284 | 0.0000 |
| 1 | *a*_{2} = 0.1698 | 0.0870 | 1.9520 | 0.0516 |
| 1 | *a*_{3} = −0.4884 | 0.1645 | −2.9698 | 0.0032 |
| 1 | *a*_{4} = −4.1215 | 1.0987 | −3.7513 | 0.0002 |
| 1 | *a*_{5} = −0.0582 | 0.1613 | −0.3610 | 0.7183 |
| 1 | *a*_{6} = 4.7302 | 1.2447 | 3.8002 | 0.0002 |
| 1 | *a*_{7} = −7.9190 | 0.1315 | −60.243 | 0.0000 |

Values of *p* < 0.05 were considered statistically significant.

*SEE*, standard error of the estimate.

^{a}See Table 6 for the model equations.

The scatter diagram of the measured data and the values predicted by Model 1 (see Tables 6 and 7 for the model equation and the estimated coefficient values) for the testing subset is depicted in Figure 4. As seen in Figure 4(a), a good linear correlation (*R*^{2} = 0.964 and *RMSE* = 6.14) was obtained between the measured data and the model-predicted values. This indicates that only about 3.6% of the variability in the response could not be explained by the model.

In addition, a comparative graphical representation of the measured data and the model-predicted values is displayed in Figure 4(b). It appears from Figure 4(b) that there is only a small difference between the measured and predicted values. It can be deduced that Model 1 is accurate enough to predict H_{2}S emission from the gravity sewer considered in this study.

### Comparison of the models

It can be seen from Tables 4 and 5 that both the best ANFIS-GP model (M_{24}) and the best ANFIS-SC model (M_{50}), evaluated on the testing subset, produced *R*^{2} and *RMSE* values of 0.99 and approximately 2.5, respectively, whereas the best nonlinear regression model (Model 1), tested against the same dataset, provided an *R*^{2} of 0.96 and an *RMSE* of 6.14 (the smaller the *RMSE* and the larger the *R*^{2}, the better the model performs). Therefore, the proposed ANFIS-GP model M_{24} and ANFIS-SC model M_{50} performed better than Model 1 in predicting H_{2}S emission from the sewer. When comparing the results of the ANFIS-GP model M_{24} and the ANFIS-SC model M_{50} (cf. Tables 4 and 5), it is clear that the *R*^{2} and *RMSE* values of the two models are similar. However, the ANFIS-GP model M_{24} possessed 16 fuzzy rules and 96 parameters, fewer than the ANFIS-SC model M_{50} (20 fuzzy rules and 320 parameters). This implies that the structure of the ANFIS-GP model M_{24} was simpler than that of the ANFIS-SC model M_{50}. Therefore, between these two models, the ANFIS-GP model M_{24} was the better choice for the prediction of H_{2}S emission.

### Implication of the models

For a gravity-flow sewer system such as the one considered in this study, the relationship between the input variables and the output (H_{2}S emission) is described by complex nonlinear mathematical formulas, which are often expensive to solve. In addition, the overall mass transfer coefficient of H_{2}S is difficult to determine accurately and usually requires laboratory and pilot trials. Hence, simplifying assumptions are often incorporated, which may result in an underestimation of the H_{2}S emission rate. The ANFIS models proposed here, which showed strong generalization and prediction ability, were established using only a measured set of input variables and the corresponding output, without any information on the relationship between the input variables and the H_{2}S emission rate. This implies that these easy-to-use black-box models could be an attractive and useful tool for predicting H_{2}S emission from gravity sewers. The models could offer great benefits because engineers and asset managers can use them to evaluate possible odor and corrosion problems during the design and operation of sewers. In addition, the models enable them to formulate appropriate strategies to control and mitigate H_{2}S emission to the air or its buildup in the sewer atmosphere, in order to reduce health risks and minimize sewer corrosion.

## CONCLUSIONS

This study demonstrated the construction of competing models to predict H_{2}S emission from a gravity sewer without the need for in-depth knowledge of the H_{2}S emission mechanism. For the first time, the ability of the ANFIS-GP and ANFIS-SC approaches was shown for this application. The ANFIS-GP and ANFIS-SC models developed were compared with linear and nonlinear regression models. The ANFIS-GP and ANFIS-SC models performed better than the regression models, with a prediction accuracy of >99%. However, the ANFIS-GP model was found to be much simpler because it creates fewer fuzzy rules. This validates the ANFIS-GP model as a valuable computational tool for predicting H_{2}S emission from gravity sewers.

## ACKNOWLEDGEMENTS

This research was funded by Prince of Songkla University and the Ministry of Higher Education, Science, Research and Innovation, Thailand, under the Reinventing University Project (Grant Number REV64061). The authors acknowledge the support of the Department of Civil and Environmental Engineering and the Research and Development Office, Prince of Songkla University, Thailand. We also acknowledge the support of the Biogas and Biorefinery Laboratory at the Faculty of Engineering and the PSU Energy Systems Research Institute, Prince of Songkla University, Thailand.

## AUTHOR CONTRIBUTIONS

R.S. (Ph.D., a postdoctoral fellow) developed the models, analyzed and interpreted the results, and wrote the manuscript. S.C. (Professor) reviewed and edited the manuscript.

## CONFLICTS OF INTEREST

The authors declare no conflicts of interest.

## DATA AVAILABILITY STATEMENT

All relevant data are included in the paper or its Supplementary Information.
