## Abstract

Determination of the longitudinal dispersion coefficient (LDC) is fundamental to developing strategies for the environmental management of river systems. This paper presents an integrated model for estimating the longitudinal dispersion coefficient by fusing optimized intelligent models (an optimized neural network (ONN), an optimized fuzzy inference system (OFIS), and optimized support vector regression (OSVR)) via a committee machine (CM), with optimization performed by the Bat-inspired algorithm (BA). The optimization eliminates the loss of accuracy in the intelligent models that results directly from an improper adjustment of parameters (weights and biases in the neural network, membership functions in the fuzzy inference system, and user-defined parameters in support vector regression). Data gathered from the literature are employed to validate the proposed integrated model. A comparison of the optimized models with the committee machine, based on statistical parameters, shows that the committee machine model attains high accuracy. Sensitivity analysis (SA) quantifies the contribution of each optimized model to the committee machine and ranks the contributions in ascending order as optimized fuzzy inference system, optimized neural network, and optimized support vector regression, each significantly correlated with the accuracy of longitudinal dispersion coefficient prediction.

## HIGHLIGHTS

In this study, the longitudinal dispersion coefficient (LDC) is modeled.

Optimized models (OMs; OSVR, ONN, and OFIS) are employed for modeling purposes.

The bat-inspired algorithm (BA) is utilized for improving intelligent models.

The OMs are integrated by a committee machine (CM) with a BA combiner.

The CM generates outstanding results.

### Graphical Abstract

## INTRODUCTION

*et al.* 2009; Huai *et al.* 2018) as well as human health (Tayfur & Singh 2005). Contaminants flow in the vertical, longitudinal, and transverse directions. However, away from the source, in one-dimensional (1D) river water quality modeling, only the longitudinal dispersion of pollutants is usually considered (Chatila & Townsend 1998; Alizadeh *et al.* 2017), which assumes that the longitudinal dispersion coefficient is the most important parameter. Several water quality models, such as QUAL2E, QUAL2K, SIAQUA, and WASP, employ the 1D advection–dispersion equation with the longitudinal dispersion coefficient as the principal parameter (Fisher 1967; Wang *et al.* 2012; Hadgu *et al.* 2014; Moses *et al.* 2016; Noori *et al.* 2016; Parveen & Singh 2016; Dehghani *et al.* 2020):

$$\frac{\partial C}{\partial t} + u\,\frac{\partial C}{\partial x} = \mathrm{LDC}\,\frac{\partial^{2} C}{\partial x^{2}} \quad (1)$$

where *C*, *u*, LDC, *x*, and *t*, respectively, are the cross-sectionally averaged concentration, the longitudinal velocity, the longitudinal dispersion coefficient, the longitudinal coordinate parallel to the mean flow direction, and the time. For selecting the best strategy for the environmental management of river systems, an accurate determination of this parameter is vital (Perucca *et al.* 2009; Shen *et al.* 2010; Wang & Huai 2016; Wang & Chen 2017).

Researchers have endeavored to obtain the longitudinal dispersion coefficient value for natural rivers through field experimentation (Gaganis *et al.* 2005; Perucca *et al.* 2009; Wang & Huai 2016). Field measurements of the longitudinal dispersion coefficient have four limitations: (1) there is an environmental concern, mainly because of the adverse impacts of the soluble tracer (fluorescent dyes, sulfur hexafluoride, and natural–artificial radiotracers) on aquatic ecosystems; (2) they are time-consuming; (3) their cost is high; and (4) their execution is arduous (Smart & Laidlaw 1977; Clark *et al.* 1996; Ho *et al.* 2002, 2006). The question arises: what happens if, for the abovementioned reasons, field measurements of the longitudinal dispersion coefficient are not feasible? In this situation, experimental investigation must be supplanted by predictive models with good estimation capability. A literature survey shows that two groups of models have been proposed for the determination of the longitudinal dispersion coefficient. The first group is based on empirical correlations (Elder 1959; Fischer *et al.* 1979; Seo & Cheong 1998; Deng *et al.* 2001; Kashefipour & Falconer 2002; Seo & Baek 2004; Tayfur & Singh 2005; Li *et al.* 2013), and numerous studies have been performed to develop such correlations for estimating the longitudinal dispersion coefficient. Although these models produce valuable information about the longitudinal dispersion coefficient, they underperform because of the highly nonlinear nature of the regression task in longitudinal dispersion coefficient modeling and a lack of mathematical foundation (Dehghani *et al.* 2020). Thus, these types of models may produce erroneous values. The second group consists of intelligent models (Sattar & Gharabaghi 2015; Alizadeh *et al.* 2017; Rezaie Balf *et al.* 2018; Riahi-Madvar *et al.* 2019; Memarzadeh *et al.* 2020).
These models have recently emerged as an alternative to regression and are being increasingly used for solving complex input–output problems as well as for modeling different phenomena in water engineering (Riahi-Madvar & Seifi 2018; Emamgholizadeh & Demneh 2019; Kisi & Yaseen 2019; Roushangar & Shahnazi 2020; Zounemat-Kermani *et al.* 2020). From statistical indices, it becomes apparent that the results obtained from these studies are satisfactory, leading to the conclusion that intelligent-based models are suitable for the prediction of the longitudinal dispersion coefficient. Nevertheless, it is worth developing a more robust mathematical model for computing the longitudinal dispersion coefficient. This study, therefore, developed an integrated model by a fusion of three optimized intelligent models through the Bat-inspired algorithm for achieving an improved estimation of the longitudinal dispersion coefficient. To this end, optimized models first extracted the relationship between the longitudinal dispersion coefficient and its input parameters and compared the results with the actual values. Then, a committee machine was employed for obtaining the integrated model by integrating the outputs of the optimized models for computing the longitudinal dispersion coefficient. A comparison with the predictions of the optimized models showed that the predictions of the committee machine were more accurate. Finally, a simple sensitivity analysis was performed to obtain the percent contribution of each optimized model in the committee machine. Figure 1 displays a flowchart for this model implementation.

## MODEL DESCRIPTION

This paper aimed to assess the performance of a committee machine that integrates three optimized models into a combined model capable of calculating the longitudinal dispersion coefficient with satisfactory accuracy. First, all intelligent models were optimized through the Bat-inspired algorithm. Then, their outputs were used for building the committee machine, in which this algorithm was employed to achieve two objectives: (1) optimizing the intelligent models by obtaining the optimal values of their parameters and (2) calculating the optimum weight of each optimized model in the final prediction. All these models were constructed in MATLAB.

### Optimized neural network

Neural networks (NNs), inspired by the biological neural networks of the brain (Golden & Golden 1996), are widely employed to extract complex relationships between input and output variables (Haddadchi *et al.* 2013; Kakaei Lafdani *et al.* 2013; Afshar *et al.* 2014; Lian *et al.* 2015; Kisi *et al.* 2017). The artificial neural networks method can be used for approximating a mapping function but may get trapped into local minima instead of global minima (Asoodeh *et al.* 2014a; Zargar *et al.* 2015; Gholami *et al.* 2020). In order to overcome this shortcoming, a hybridization of the optimization algorithm and the artificial neural network has been suggested (Gholami & Ansari 2017). Therefore, in this study, the Bat-inspired algorithm was included in the artificial neural network structure for searching for the global minima that led to the optimal values of its weights and biases.
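To illustrate how such a hybridization can work, the sketch below (an illustrative assumption, not the paper's MATLAB code) flattens the weights and biases of a 4-7-1 network into a single vector that a metaheuristic can search. The architecture is inferred from Table 2's 43 optimization variables (4×7 + 7 + 7 + 1 = 43) and Table 3's seven hidden nodes; inputs and targets are assumed to be normalized, since the TANSIG output layer is bounded in (−1, 1).

```python
import numpy as np

def unpack(theta, n_in=4, n_hid=7):
    """Split a flat 43-element parameter vector into the weights and
    biases of a 4-7-1 network (4*7 + 7 + 7 + 1 = 43, matching Table 2)."""
    i = 0
    W1 = theta[i:i + n_in * n_hid].reshape(n_hid, n_in); i += n_in * n_hid
    b1 = theta[i:i + n_hid]; i += n_hid
    W2 = theta[i:i + n_hid]; i += n_hid
    b2 = theta[i]
    return W1, b1, W2, b2

def forward(theta, X):
    """TANSIG (tanh) in both the hidden and output layers, as in the paper."""
    W1, b1, W2, b2 = unpack(theta)
    H = np.tanh(X @ W1.T + b1)        # hidden-layer activations
    return np.tanh(H @ W2 + b2)       # bounded network output

def fitness(theta, X, y):
    """MSE fitness used by the optimizer to rank candidate weight vectors."""
    return float(np.mean((forward(theta, X) - y) ** 2))
```

The optimizer then only needs to minimize `fitness` over a 43-dimensional real vector, with no knowledge of the network internals.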

### Optimized fuzzy inference system

The fuzzy inference system emanates from Zadeh's fuzzy sets theory, whereby nonlinear relationships between inputs and outputs are appropriately mapped (Zadeh 1965). In this method, every fuzzy set is represented by a membership function. In the fuzzy inference system, three components, namely, the fuzzifier, the inference engine (or fuzzy rule base), and the defuzzifier, together form a useful combination of features to tackle complex problems. The fuzzifier employs a set of input membership functions to transform input data into a space with a 0 to 1 range. The inference engine applies fuzzy rules to the fuzzifier's transformed data. Outputs of the inference engine are fused into one fuzzy output distribution and eventually transformed into a crisp output by means of the output membership functions in the defuzzifier process. Mamdani and Sugeno are the two main types of fuzzy inference system. In solving regression problems with the fuzzy inference system, one of the important tasks that impacts model accuracy is finding the optimum values of the membership functions (Asoodeh *et al.* 2014a, 2015; Gholami & Bodaghi 2016; Ahmadi *et al.* 2017). In this study, the Bat-inspired algorithm was merged with a Sugeno-type fuzzy inference system to achieve the best values of its membership functions, thereby furthering its ability to map the functional dependency between the longitudinal dispersion coefficient and its influencing parameters. The principle behind such hybridization is illustrated by Gholami *et al.* (2020).
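The Sugeno structure can be sketched as below. This minimal two-rule system with Gaussian input membership functions mirrors the parameter layout of Table 5 (one σ and one mean per input per membership function, plus five linear consequent coefficients per rule), but it is an illustrative assumption rather than the paper's implementation.

```python
import numpy as np

def gauss(x, sigma, mean):
    """Gaussian membership function value in [0, 1]."""
    return np.exp(-0.5 * ((x - mean) / sigma) ** 2)

def sugeno_predict(x, mf_params, rule_params):
    """First-order Sugeno FIS with 2 rules over 4 inputs.
    mf_params[k][r] holds (sigma, mean) of input k under rule r;
    rule_params[r] holds the linear consequent coefficients [a1..a4, c]."""
    x = np.asarray(x, dtype=float)
    out, wsum = 0.0, 0.0
    for r, coeffs in enumerate(rule_params):
        # Firing strength: product of the input membership degrees.
        w = np.prod([gauss(x[k], *mf_params[k][r]) for k in range(len(x))])
        f = np.dot(coeffs[:4], x) + coeffs[4]   # linear consequent
        out += w * f
        wsum += w
    return out / wsum   # weighted-average defuzzification
```

Optimizing such a system means searching over the (σ, mean) pairs and consequent coefficients, which is exactly the 26-dimensional search space reported for the OFIS in Table 2 (16 MF parameters + 10 consequent coefficients).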

### Optimized support vector regression

Support vector regression is a type of intelligent model that was proposed by Vapnik (1995) to determine a regression function for the relationship between independent variables and the output–response variable. This method is a modification of the support vector machine for accomplishing regression. Recently, this method has been used for solving different regression problems and has been found to produce output close to the actual output (Asoodeh *et al.* 2014b; Gholami *et al.* 2014a, 2016; Bagheripour *et al.* 2015; Gholami 2016). This method generalizes better than artificial neural networks: support vector regression uses the structural risk minimization principle to lessen the upper bound on the expected risk, whereas the neural network's empirical risk minimization principle reduces the error on the training data (Gholami *et al.* 2020). Furthermore, in the support vector regression method, a kernel function is embedded, whereby a nonlinear learning problem is transformed into a linear one. The kernel function employed for this transformation is linear, polynomial, radial basis function (RBF), and/or sigmoid, of which the RBF is the most suitable. One of the key issues during the construction and training of the support vector regression model is the adjustment of the penalty parameters, because their effect on performance is very high (Ansari & Gholami 2015a, 2015b; Fattahi *et al.* 2015; Gholami & Bodaghi 2017; Gholami *et al.* 2017). Therefore, determining the optimal values of these parameters is vital. In this study, the Bat-inspired algorithm was hybridized with support vector regression to realize improved performance by extracting the optimum values of the free parameters. The resulting optimized support vector regression served as one of the optimized elements of the committee machine.
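For concreteness, the trained model's decision function has the standard dual form sketched below. Only the RBF kernel and the functional form are taken from the text; the support vectors, dual coefficients, and bias passed in are placeholders for whatever training (here, with C, gamma, and epsilon tuned by the Bat-inspired algorithm, cf. Table 6) produces.

```python
import numpy as np

def rbf_kernel(x, z, gamma):
    """RBF kernel K(x, z) = exp(-gamma * ||x - z||^2)."""
    return np.exp(-gamma * np.sum((np.asarray(x) - np.asarray(z)) ** 2, axis=-1))

def svr_predict(x, support_vectors, dual_coefs, bias, gamma):
    """SVR decision function f(x) = sum_i alpha_i * K(x_i, x) + b,
    evaluated over the support vectors retained after training."""
    k = np.array([rbf_kernel(sv, x, gamma) for sv in support_vectors])
    return float(dual_coefs @ k + bias)
```

Note how gamma enters every kernel evaluation while C and epsilon shape which dual coefficients survive training; this is why all three must be tuned jointly.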

### Committee machine

Committee machine is a combined model that integrates outputs containing useful information of individual models to achieve a single response (Chen & Lin 2006). Therefore, when different predictive models are available for the estimation of the response variable, this method produces combined models reaping the benefits of individual models. Studies have indicated that the committee machine is more accurate than individual models (Gholami *et al.* 2016, 2018, 2020; Gholami & Ansari 2017).
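A minimal sketch of such a combiner follows. The paper's Equation (2) is not reproduced in this excerpt, so the affine per-model weighting used here (one α scale and one β shift per optimized model, matching the six parameters of Table 7) is only one plausible form, shown to make the idea concrete.

```python
import numpy as np

def committee_predict(preds, alphas, betas):
    """Fuse the three optimized models' predictions into one response.
    Each model's output y_i is scaled by alpha_i and shifted by beta_i,
    then the terms are summed. The alphas/betas are the six parameters
    the combiner's optimizer tunes; the paper's actual Equation (2)
    may take a different form."""
    preds = np.asarray(preds, dtype=float)
    return float(np.sum(np.asarray(alphas) * preds + np.asarray(betas)))
```

Whatever its exact form, the combiner is just another fitness-driven search: the optimizer picks the six coefficients that minimize the MSE of `committee_predict` against the observed targets.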

### Bat-inspired algorithm

The Bat-inspired algorithm, introduced by Yang (2010), is a meta-heuristic algorithm with excellent global-search performance for handling diverse sets of optimization problems. This method aims to determine the best solution of an optimization problem by simulating the echolocation behavior of bats in search of food and prey (Yang 2010). In nature, bats use hearing to estimate the extent and location of surrounding objects; by using the echo reflected from an object, bats precisely locate it. This approach to surveying the surroundings is the basis for the Bat-inspired algorithm (Yang 2013). For full details on this subject, one can refer to Yang (2010). Numerous studies have compared the performance of the Bat-inspired algorithm with other optimization approaches in terms of speed, ability to reach the optimal value, and simplicity. These studies have indicated that the Bat-inspired algorithm outperforms other techniques when judged by these criteria (Ansari & Gholami 2015a) and significantly reduces the output error (Yang 2010, 2013; Ansari & Gholami 2015a; Gholami & Ansari 2017). Hence, this algorithm was employed here for achieving two objectives: first, improving the efficiency of the intelligent models by finding the optimal values of their parameters and, second, extracting the optimal weight of each optimized model in the final prediction.
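A compact, illustrative implementation of the algorithm is given below. It uses the fixed loudness, pulse rate, and frequency range of Table 2 and omits the adaptive loudness/pulse-rate updates of the full algorithm, so it is a simplified sketch of Yang's (2010) method rather than the exact optimizer used in the paper.

```python
import numpy as np

def bat_optimize(fitness, dim, n_bats=10, n_iter=200,
                 A=0.5, r=0.5, f_min=0.0, f_max=2.0, seed=0):
    """Minimal bat algorithm (after Yang 2010): minimizes `fitness`
    over an unconstrained real vector of length `dim`."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, (n_bats, dim))   # bat positions
    v = np.zeros((n_bats, dim))                 # bat velocities
    fit = np.array([fitness(b) for b in x])
    best = x[fit.argmin()].copy()
    best_fit = float(fit.min())
    for _ in range(n_iter):
        for i in range(n_bats):
            # Global move: frequency-tuned flight relative to the best bat.
            f = f_min + (f_max - f_min) * rng.random()
            v[i] += (x[i] - best) * f
            cand = x[i] + v[i]
            # Local move: small random walk around the current best solution.
            if rng.random() > r:
                cand = best + 0.01 * rng.normal(size=dim)
            cf = fitness(cand)
            # Accept improving candidates with probability tied to loudness.
            if cf < fit[i] and rng.random() < A:
                x[i], fit[i] = cand, cf
            if cf < best_fit:
                best, best_fit = cand.copy(), float(cf)
    return best, best_fit
```

Plugging in an MSE fitness over a model's flattened parameter vector reproduces the two roles this algorithm plays in the paper: tuning each intelligent model and weighting the committee machine.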

## DATA INPUT/OUTPUT SPACE

This study aimed at demonstrating the efficiency of the committee machine in integrating optimized models to achieve a combined model for the quantitative estimation of the longitudinal dispersion coefficient. For longitudinal dispersion coefficient modeling with interrelated influencing parameters, a proper selection of input parameters is important (Dehghani *et al.* 2020). Since an intelligent model learns the relationship between input variables and the output variable, it can estimate the target value on unseen data (Gholami *et al.* 2014b, 2018); for training and testing, it requires data points. In this study, data from 30 streams were collected from the open-source literature (Deng *et al.* 2001; Kashefipour & Falconer 2002; Carr & Rehmann 2007; Riahi-Madvar *et al.* 2009; Ahmad 2013). These streams provided 495 sample points, which were split into a training set of 396 sample points and a testing set of 99 sample points. Every sample contained the influencing factors as well as the target variable. The results of statistical analysis for the longitudinal dispersion coefficient and its influencing factors, including minimum value, maximum value, mean value, mode, standard deviation (SD), skewness, kurtosis, and coefficient of variation (CV) of the data used for constructing and evaluating committee machine models, are shown in Table 1. To gain better insight into the correlations between the parameters, correlation matrices and histograms for the input parameters and the target value of the training, testing, and all data are shown in Figures 2–4, respectively. In these figures, scatter plots are displayed in the lower left triangle, correlation coefficients are shown in the upper right triangle, and the remainder are histograms.
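The summary statistics reported in Table 1 can be reproduced for any data column with a helper like the one below. The exact skewness and kurtosis estimators used in the paper are not stated, so the simple moment-based forms here are an assumption.

```python
import numpy as np

def describe(x):
    """Descriptive statistics as reported in Table 1: min, max, mean,
    sample SD, moment-based skewness and (non-excess) kurtosis, and
    coefficient of variation CV = 100 * SD / mean."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    sd = x.std(ddof=1)              # sample standard deviation
    z = (x - mean) / sd             # standardized values
    return {"min": float(x.min()), "max": float(x.max()), "mean": float(mean),
            "sd": float(sd), "skewness": float(np.mean(z ** 3)),
            "kurtosis": float(np.mean(z ** 4)), "cv": float(100.0 * sd / mean)}
```

Running this on each of the five columns (B, H, U, U*, LDC) for the total, training, and testing subsets yields a table of the same shape as Table 1.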

| Parameter | Unit | Allocation | Maximum | Minimum | Mean | Mode | SD | Skewness | Kurtosis | CV |
|---|---|---|---|---|---|---|---|---|---|---|
| Channel width (B) | m | Total | 867.000 | 0.200 | 55.042 | 0.200 | 109.248 | 5.047 | 30.450 | 198.480 |
| | | Training | 867.000 | 0.200 | 56.061 | 0.200 | 110.329 | 4.874 | 28.658 | 196.803 |
| | | Testing | 867.000 | 0.200 | 50.969 | 0.200 | 105.259 | 5.929 | 41.156 | 206.516 |
| Flow depth (H) | m | Total | 19.900 | 0.034 | 1.389 | 0.400 | 2.285 | 4.778 | 29.159 | 164.568 |
| | | Training | 19.900 | 0.034 | 1.372 | 0.400 | 2.289 | 4.965 | 31.230 | 166.862 |
| | | Testing | 16.760 | 0.049 | 1.455 | 0.410 | 2.279 | 4.084 | 22.044 | 156.619 |
| Flow velocity (U) | m/s | Total | 1.740 | 0.022 | 0.480 | 0.210 | 0.303 | 1.244 | 2.357 | 63.226 |
| | | Training | 1.710 | 0.023 | 0.479 | 0.320 | 0.299 | 1.186 | 2.243 | 62.312 |
| | | Testing | 1.740 | 0.022 | 0.483 | 0.210 | 0.324 | 1.438 | 2.765 | 67.020 |
| Shear velocity (U_{*}) | m/s | Total | 0.990 | 0.001 | 0.065 | 0.070 | 0.068 | 7.512 | 83.984 | 104.552 |
| | | Training | 0.990 | 0.002 | 0.064 | 0.070 | 0.069 | 7.963 | 90.899 | 109.215 |
| | | Testing | 0.510 | 0.001 | 0.069 | 0.055 | 0.060 | 4.650 | 30.798 | 86.728 |
| Longitudinal dispersion coefficient (LDC) | m²/s | Total | 1,490.000 | 0.005 | 60.730 | 13.940 | 157.256 | 5.975 | 44.915 | 258.944 |
| | | Training | 1,490.000 | 0.008 | 64.248 | 13.940 | 169.215 | 5.777 | 40.802 | 263.376 |
| | | Testing | 668.880 | 0.005 | 46.655 | 9.100 | 94.842 | 4.274 | 22.387 | 203.283 |


The performance in computing the longitudinal dispersion coefficient was judged by using four criteria, as given by Equations (3)–(6):

- 1. Coefficient of determination:
$$R^{2} = 1 - \frac{\sum_{i=1}^{n}\left(Y_{i,\mathrm{obs}} - Y_{i,\mathrm{pred}}\right)^{2}}{\sum_{i=1}^{n}\left(Y_{i,\mathrm{obs}} - \bar{Y}\right)^{2}} \quad (3)$$
- 2. Mean square error (MSE):
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(Y_{i,\mathrm{obs}} - Y_{i,\mathrm{pred}}\right)^{2} \quad (4)$$
- 3. Mean absolute error (MAE):
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|Y_{i,\mathrm{obs}} - Y_{i,\mathrm{pred}}\right| \quad (5)$$
- 4. Percent bias (PB):
$$\mathrm{PB} = 100 \times \frac{\sum_{i=1}^{n}\left(Y_{i,\mathrm{pred}} - Y_{i,\mathrm{obs}}\right)}{\sum_{i=1}^{n} Y_{i,\mathrm{obs}}} \quad (6)$$

where $Y_{i,\mathrm{obs}}$ is the measured value of sample *i*, $Y_{i,\mathrm{pred}}$ is the estimated value of sample *i*, $\bar{Y}$ is the average of the measured values, and *n* is the number of samples. When the values of MSE, MAE, and PB are close to zero and the value of $R^{2}$ is close to 1, a model with superior performance is achieved.
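These criteria can be computed directly; the standard definitions below are assumed, since the exact algebraic forms vary slightly between papers (in particular, the sign convention of PB).

```python
import numpy as np

def r2(obs, pred):
    """Coefficient of determination: 1 minus residual/total sum of squares."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    ss_res = np.sum((obs - pred) ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def mse(obs, pred):
    """Mean square error."""
    return float(np.mean((np.asarray(obs, float) - np.asarray(pred, float)) ** 2))

def mae(obs, pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(obs, float) - np.asarray(pred, float))))

def pb(obs, pred):
    """Percent bias; positive values indicate overall overestimation."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return float(100.0 * np.sum(pred - obs) / np.sum(obs))
```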

## RESULTS AND DISCUSSION

This paper investigated the efficiency of three intelligent models optimized by the Bat-inspired algorithm, i.e. the optimized neural network, the optimized fuzzy inference system, and the optimized support vector regression, and their combination in estimating the longitudinal dispersion coefficient. The Bat-inspired algorithm served both as an optimizer improving the robustness of the intelligent models and as a combiner determining the optimal contribution of the optimized models to the committee machine. Prior to optimization, the regulation parameters of the Bat-inspired algorithm themselves must be adjusted; the settings used for each model are tabulated in Table 2. MSE was adopted as the fitness function for optimization.

| Parameter | ONN | OFIS | OSVR | CM |
|---|---|---|---|---|
| Number of variables for optimization | 43 | 26 | 3 | 6 |
| Population size | 200 | 100 | 10 | 10 |
| Maximum iteration | 1,000 | 1,000 | 1,000 | 1,000 |
| Loudness (A) | 0.5 | 0.5 | 0.5 | 0.5 |
| Pulse rate (r) | 0.5 | 0.5 | 0.5 | 0.5 |
| Minimum frequency (f_{min}) | 0 | 0 | 0 | 0 |
| Maximum frequency (f_{max}) | 2 | 2 | 2 | 2 |


In the training state of the three optimized models (optimized neural network, optimized fuzzy inference system, and optimized support vector regression), a 4-fold cross-validation technique was used in order to train the intelligent models and generate an accurate predictive model with more stable results. Two performance evaluation criteria were used to verify the validity of the constructed models. Moreover, the results of the committee machine were compared with the outputs of the available empirical correlations based on statistical parameters.
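The 4-fold splitting itself is simple to sketch: shuffle the 396 training indices once, cut them into four equal folds, and rotate which fold is held out. This is a generic illustration of the technique, not the paper's code.

```python
import numpy as np

def kfold_indices(n, k=4, seed=0):
    """Yield (train, validation) index arrays for k-fold cross-validation
    over n samples, e.g. the 4-fold scheme used to train the models."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)          # shuffle once so folds are random
    folds = np.array_split(idx, k)
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, val
```

With n = 396 and k = 4, each rotation trains on 297 samples and validates on 99, and every sample is validated exactly once.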

### Optimized neural network

In the first step, the neural network optimized by the Bat-inspired algorithm was utilized for mapping the functional relationship between the input parameters and the longitudinal dispersion coefficient, which kept the neural network from becoming trapped in local minima instead of reaching the global minimum. The training dataset was used in a fully supervised manner to construct the optimized neural network model, in which the hyperbolic tangent sigmoid (TANSIG) was adopted as the transfer function in the hidden and output layers. Figure 5(a) shows the process of optimizing the proposed neural network by the Bat-inspired algorithm. All optimal weights and biases of the optimized neural network are presented in Table 3. Finding the optimum number of neurons in the hidden layer is one of the critical concerns in developing a neural network model for a regression task. The neural network was therefore evaluated with different numbers of neurons in the hidden layer based on statistical criteria, as shown in Figure 6, and 7 was concluded to be the best number of neurons in the hidden layer. Figure 7(a) compares the outputs of the optimized neural network with the real values in a cross-plot; the optimized neural network shows reasonable goodness-of-fit in the prediction of the longitudinal dispersion coefficient. Figure 8 compares the two for each sample: the optimized neural network produced computed values close to the observed data, which is also shown statistically in Table 4. In conclusion, the optimized neural network was considered an excellent technique for approximating the value of the longitudinal dispersion coefficient.

| Layer | | Weights: Input 1 | Input 2 | Input 3 | Input 4 | Bias |
|---|---|---|---|---|---|---|
| Hidden layer | Node 1 | −1.6126 | −1.7624 | 0.6688 | −0.3623 | 4.0429 |
| | Node 2 | −1.2181 | −6.5410 | 6.3940 | −5.6203 | 9.4622 |
| | Node 3 | 2.1314 | 0.1241 | −0.0212 | −2.7719 | −1.1187 |
| | Node 4 | 9.1947 | 0.1701 | 0.8137 | −4.3928 | 1.9225 |
| | Node 5 | 4.4683 | 0.2569 | 0.0292 | −3.8549 | 0.6129 |
| | Node 6 | 3.8795 | −0.5350 | 3.5005 | −2.0780 | −2.1947 |
| | Node 7 | −3.4348 | 1.3214 | −1.2613 | 4.3989 | −0.0965 |

| Output layer | Weights: Node 1 | Node 2 | Node 3 | Node 4 | Node 5 | Node 6 | Node 7 | Bias |
|---|---|---|---|---|---|---|---|---|
| Node 1 | −0.5800 | −1.3987 | −1.5259 | 0.3967 | 0.7158 | 0.5926 | −0.0418 | 1.0183 |


| Model | Allocation | R^{2} | MSE |
|---|---|---|---|
| ONN | Training | 0.91788 | 2,346 |
| | Testing | 0.89107 | 970 |
| | Total | 0.91611 | 2,070 |
| OFIS | Training | 0.90047 | 2,843 |
| | Testing | 0.88820 | 995 |
| | Total | 0.89978 | 2,473 |
| OSVR | Training | 0.94202 | 1,656 |
| | Testing | 0.89197 | 962 |
| | Total | 0.93853 | 1,517 |
| CM | Training | 0.94507 | 1,569 |
| | Testing | 0.90252 | 868 |
| | Total | 0.94211 | 1,429 |


### Optimized fuzzy inference system

Inappropriate adjustment of the membership functions degrades the performance of the fuzzy inference system. To avoid this problem, the Bat-inspired algorithm was hybridized with the fuzzy inference system to achieve the optimal values of its membership functions. This algorithm was able to extract proper values, as shown in Figure 5(b); the parameter values achieved by optimization are given in Table 5. The estimated values of the longitudinal dispersion coefficient were plotted against the real values, as shown in Figure 7(b), which shows that the optimized fuzzy inference system was capable of producing promising results. For each sample, the predicted values were compared with the real values, as shown in Figure 9, which highlights the performance of the optimized fuzzy inference system in correlating the longitudinal dispersion coefficient with its conditioning factors. The statistical indices of this model are given in Table 4. The results revealed that the optimized fuzzy inference system was well suited for capturing the functional dependency between the longitudinal dispersion coefficient and its influencing parameters.

| Layer | Variable name | MF1: σ | MF1: mean | MF2: σ | MF2: mean |
|---|---|---|---|---|---|
| Input layer | B | 0.5020 | 1.2702 | 1.0687 | 0.9034 |
| | H | 0.8778 | 1.0922 | 1.1244 | 1.0764 |
| | U | 0.4592 | 0.6913 | 0.4637 | 1.8889 |
| | V | 1.1431 | 0.7386 | 0.8648 | 1.2587 |
| Output layer | Longitudinal dispersion coefficient | MF1: [0.1664, −0.0237, 0.3311, 0.0112, −0.5640] | | MF2: [1.5586, 0.9912, 1.1790, −0.5717, 0.7711] | |


### Optimized support vector regression

As in the previous cases, the Bat-inspired algorithm was integrated with support vector regression to determine the optimum values of its parameters and improve accuracy, as shown in Figure 5(c); the model's optimal parameters are given in Table 6. Figure 7(c) plots the values generated by the optimized support vector regression versus the target values. As seen from this figure, the prediction accuracy of the optimized support vector regression was higher than that of the other optimized models when judged by the correlation coefficient. Figure 10 compares the predicted and observed values against the sample number. Statistical criteria showed that the optimized support vector regression was both viable and effective for the prediction of the longitudinal dispersion coefficient, as shown in Table 4. A comparison between the optimized models indicated that although all three were capable of modeling the longitudinal dispersion coefficient, the optimized support vector regression outperformed the other two (the optimized neural network and the optimized fuzzy inference system) in finding the relationship between the longitudinal dispersion coefficient and its conditioning parameters.

| Parameter | Value |
|---|---|
| Gamma | 2.2348 |
| C | 19.5726 |
| Epsilon | 0.0010 |


### Committee machine

After calculating the outputs of the optimized models, the committee machine was employed for fusing these results and computing the values of the longitudinal dispersion coefficient. This integration made full use of all the optimized models, with the optimal contribution of each determined by the Bat-inspired algorithm. Therefore, this combination led to an improvement in prediction accuracy over the individual predictive models computing the target. Equation (2) was introduced to the Bat-inspired algorithm for extracting the optimal weights, as displayed in Figure 5(d). The weights the Bat-inspired algorithm determined for each of the optimized models are tabulated in Table 7. According to these weights, the Bat-inspired algorithm reduced the share of the optimized fuzzy inference system (the optimized model with the least accuracy) and augmented the contribution of the optimized support vector regression (the optimized model with the highest accuracy) to the final prediction of the committee machine. Figure 7(d) shows the cross-plot of observed versus predicted values. Figure 11 depicts the committee machine results together with its optimized model elements for each sample, where the committee machine estimates were observed to be close to the real values. Table 4 shows the statistical measurements of each model, from which the effectiveness of the optimized models, as well as that of the committee machine, was evaluated. A comparison of the statistical criteria in Table 4 indicated that the committee machine provided an advantage in terms of accuracy and reliability.

| Parameter | Value |
|---|---|
| α_{1} | 0.9226 |
| α_{2} | 0.8801 |
| α_{3} | 0.7984 |
| β_{1} | 0.7001 |
| β_{2} | 0.2262 |
| β_{3} | 1.0176 |


### Sensitivity analysis

The percent contribution of each optimized model to the committee machine output was computed through a simple one-at-a-time sensitivity analysis (*et al.* 2013):

$$S_{i}\,(\%) = \frac{y_{\max,i} - y_{\min,i}}{\sum_{j=1}^{3}\left(y_{\max,j} - y_{\min,j}\right)} \times 100 \quad (7)$$

In the above formula, the maximum and minimum values of the committee machine output with respect to the *i*th optimized model output are denoted by $y_{\max,i}$ and $y_{\min,i}$, respectively. During the calculation of the contribution of each optimized model to the committee machine, the other optimized model outputs are held at their mean values. The result of this contribution analysis is shown in Table 8, in which the optimized support vector regression has the greatest effect on the final prediction of the longitudinal dispersion coefficient. This result is in accord with the capability of the optimized models: the optimized model with higher accuracy had a larger weight and contribution in the committee machine.

| Model | ONN | OFIS | OSVR |
|---|---|---|---|
| Sensitivity (%) | 9.0011 | 2.0775 | 88.9214 |

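The one-at-a-time procedure described above can be sketched as follows. The function signature and the generic combiner callable are our assumptions; the procedure sweeps each model's output over its observed range while the other models are fixed at their means, then normalizes the resulting output spreads to percentages.

```python
# One-at-a-time sensitivity sketch: each optimized model's output is swept
# over its observed values while the other models are held at their means;
# the spread of the committee-machine output is normalized to a percentage.

def sensitivities(model_outputs, cm):
    """model_outputs: one list of predictions per optimized model.
    cm: callable taking one value per model, returning the CM output."""
    means = [sum(vals) / len(vals) for vals in model_outputs]
    spreads = []
    for i, vals in enumerate(model_outputs):
        def cm_at(x):
            args = means[:]          # other models fixed at their means
            args[i] = x              # sweep only the i-th model
            return cm(*args)
        outs = [cm_at(x) for x in vals]
        spreads.append(max(outs) - min(outs))  # D_max,i - D_min,i
    total = sum(spreads)
    return [100.0 * s / total for s in spreads]
```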

### Comparative study

To demonstrate the superiority of the constructed model over the best previously published empirical correlations (linear and nonlinear models) (Table 9), their performance is analyzed in terms of statistical parameters (mean square error (MSE), mean absolute error (MAE), and percentage bias (PB)) and tabulated in Table 10. As seen in this table, compared with the empirical correlations developed in previous studies, the committee machine model yields smaller values of MSE, MAE, and PB, which is an indication of its advantage.

| No. | Author | Equation | Reference |
|---|---|---|---|
| 1 | Elder (1959) | | Azamathulla & Ghani (2011) |
| 2 | Iwasa & Aya (1991) | | Azamathulla & Ghani (2011) |
| 3 | Li et al. (1998) | | Azamathulla & Ghani (2011) |
| 4 | Memarzadeh et al. (2020) | | Memarzadeh et al. (2020) |
| 5 | Memarzadeh et al. (2020) | | Memarzadeh et al. (2020) |


| Model | MSE | MAE | PB |
|---|---|---|---|
| CM | 1,428.630 | 19.026 | 3.670 |
| Correlation (1) | 28,182.852 | 60.202 | −99.117 |
| Correlation (2) | 26,903.214 | 53.997 | −6.506 |
| Correlation (3) | 28,352.118 | 59.878 | −95.228 |
| Correlation (4) | 20,370.532 | 45.868 | −25.238 |
| Correlation (5) | 20,955.838 | 46.930 | −18.823 |

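The three criteria used in the comparison above can be computed as sketched below. The PB definition used here, Σ(pred − obs)/Σobs × 100 (negative values indicating systematic under-prediction), is a common convention but an assumption on our part, as the paper does not reproduce the formula.

```python
# Sketch of the comparison metrics: mean square error (MSE),
# mean absolute error (MAE), and percentage bias (PB).

def mse(obs, pred):
    """Mean of squared prediction errors."""
    return sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs)

def mae(obs, pred):
    """Mean of absolute prediction errors."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)

def percent_bias(obs, pred):
    """Assumed PB definition: net error relative to total observed value."""
    return 100.0 * sum(p - o for o, p in zip(obs, pred)) / sum(obs)
```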

## CONCLUSION

The current study aimed at optimizing three intelligent models, namely, the neural network, the fuzzy inference system, and support vector regression, by the Bat-inspired algorithm for calculating the longitudinal dispersion coefficient from influencing factors. The following conclusions are drawn from the study:

1. The Bat-inspired algorithm improves the intelligent models and accurately computes the share of each optimized model in the final prediction.

2. The optimized models map the interactions and patterns between the longitudinal dispersion coefficient and its influencing factors.

3. The optimized support vector regression is more dependable than the optimized neural network and the optimized fuzzy inference system and has the greatest share in the prediction of the longitudinal dispersion coefficient.

4. The committee machine predicts the longitudinal dispersion coefficient with higher reliability than its optimized elements (the optimized neural network, the optimized fuzzy inference system, and the optimized support vector regression).

5. Compared with models in the literature, the committee machine produces more accurate results.

6. Sensitivity analysis indicates that the optimized support vector regression has the greatest effect on the final prediction of the longitudinal dispersion coefficient.

7. By constructing the committee machine, high prediction accuracy is achieved at only a modest additional computational cost.

## ETHICAL APPROVAL

This paper does not contain any studies with animals performed by any of the authors.

## AUTHORS’ CONTRIBUTIONS

All authors contributed to the study conception, design, and revisions. Conceptualization and coding: Mahsa Gholami. Data and methods: Amin Gholami. Writing – original draft preparation: Mahsa Gholami and Amin Gholami. Writing – review and editing: Vijay P. Singh.

## FUNDING

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

## COMPETING INTERESTS

The authors declare that they have no conflict of interest.

## DATA AVAILABILITY STATEMENT

All relevant data are available from an online repository or repositories. Data were collected from open-source literature (the source papers are cited in the input/output data section).