## Abstract

In this research, the discharge in compound open channels with convergent and divergent floodplains was estimated using soft computing methods, including the neuro-fuzzy group method of data handling (NF-GMDH), support vector regression (SVR), and the M5 tree algorithm. For this purpose, the geometric and hydraulic characteristics of the flow, including relative roughness (*f*_{f}), relative area (*A*_{r}), relative hydraulic radius (*R*_{r}), relative dimension of the flow aspects (*δ**), relative width (*β*), relative flow depth (*D*_{r}), relative longitudinal distance (*X*_{r}), convergent or divergent angle (*θ*) of the floodplain, and longitudinal slope (*S*_{o}) of the bed, were used as input variables, and discharge was considered as the target (output) variable. The results showed that the statistical indices in the testing stage are RMSE_{NF-GMDH} = 0.004 and *R*^{2}_{NF-GMDH} = 0.923 for the NF-GMDH, RMSE_{SVR} = 0.002 and *R*^{2}_{SVR} = 0.941 for the SVR, and RMSE_{M5} = 0.002 and *R*^{2}_{M5} = 0.931 for the M5 tree algorithm. The evaluation of the structure of the M5 tree algorithm showed that the most effective parameters are *f*_{f}, *D*_{r}, *R*_{r}, *δ**, and *θ*, which confirms the important parameters specified by the MARS, GMDH, and GEP algorithms used by previous researchers.

## HIGHLIGHTS

Comparison of the NF-GMDH, ANFIS, SVM, GEP, MARS, and M5 algorithms for predicting discharge in compound channels with convergent and divergent floodplains.

## NOTATION

- *Q*: flow discharge
- GMDH: group method of data handling
- NF-GMDH: neuro-fuzzy group method of data handling
- SVM: support vector machine
- SVR: support vector regression
- MLPNN: multilayer perceptron neural networks
- MARS: multivariate adaptive regression splines
- GEP: gene expression programming
- *D*_{r}: relative flow depth
- *X*_{r}: relative longitudinal distance
- *A*_{r}: relative area
- *f*_{f}: relative roughness
- *R*_{r}: relative hydraulic radius
- *δ**: relative dimension of the flow aspects
- *θ*: angle of divergence and convergence of the section
- *β*: relative width
- *S*_{o}: longitudinal slope
- *R*^{2}: coefficient of determination
- RMSE: root mean square error

## INTRODUCTION

The study of the flow characteristics of rivers has always been one of the most important issues in hydraulic engineering. The cross-section of a river changes continually as it passes through different terrain, both in the mountains and in the plains. Normally, the flow in rivers is steady and nonuniform. However, when floods occur, the flow becomes unsteady and nonuniform, which makes the hydraulics of river flow more complicated (Graf & Altinakar 1998). In addition, rivers, especially in the plains, twist and turn along their course, adding to these complications. Nowadays, the concept of the compound open channel is used for hydraulic modeling of rivers, as this approach considers both the main channel and the floodplain (Sahu 2011). Usually, the flow velocity in the floodplain is lower than in the main channel; this causes sedimentation and, as a result, the floodplain is rougher than the main channel. Because of the velocity difference between the main channel and the floodplains, a shear stress develops at their interface, which causes the formation of eddies (Mohanta *et al.* 2020). Several studies have been carried out on the hydraulics of compound open channels (Singh & Tang 2020; Kumar Singh *et al.* 2022), starting with the flow structure in open channels with prismatic floodplains (Naik *et al.* 2017), followed by nonprismatic floodplains (Singh *et al.* 2019a, 2019b), including skewed, convergent, and divergent floodplains; nowadays, meandering compound open channels are of interest. Sellin (1964) showed that the interaction of the flow in the main channel and the floodplains causes eddies at their boundaries, which results in a loss of flow energy and a corresponding decrease in total discharge.

Mohanty *et al.* (2011) studied the shear stress variations in the compound channel, focusing on the boundary between the main channel and the floodplain; their studies showed that the shear layer depends on the geometric and hydraulic conditions of the flow. They stated that as the relative width ratio (the ratio of the width of the floodplain to that of the main channel) increases, the shear stress decreases. Bousmar *et al.* (2006) studied the flow hydraulics in a compound open channel with convergent floodplains. Their results showed that at high relative depths, lateral mass transfer in the second half of the convergent region is greater than in the first half. Naik *et al.* (2017) studied the hydraulics of flow in a compound channel with a nonprismatic floodplain. Their results showed that the depth-averaged velocity and boundary shear stress increase along the channel convergence. In addition to laboratory studies, the numerical modeling of flow in prismatic and nonprismatic open channels has also received attention. Rezaei & Knight (2009) investigated the accuracy of the SKM (Shiono and Knight model) in compound open channels with nonprismatic floodplains. They found that this method was not accurate enough for hydraulic modeling of flow in such sections. They therefore modified the SKM and presented the modified Shiono-Knight model (M-SKM) to estimate flow parameters, including depth-averaged velocity and boundary shear stress, and to determine the stage–discharge relationship.

Nowadays, because of the limited accuracy of numerical models, researchers have used soft computing methods to model and estimate the hydraulic parameters of flow, especially in compound open channels with nonprismatic floodplains (Das & Khatua 2018; Kaushik & Kumar 2022, 2023; Naik *et al.* 2022; Bijanvand *et al.* 2023). For example, the flow discharge in compound open channels with prismatic floodplains has been predicted by an artificial neural network (Sahu 2011), a fuzzy adaptive neural network model (Parsaie *et al.* 2017; Das *et al.* 2020), multivariate adaptive regression splines (Parsaie & Haghiabi 2017), and gene expression programming (Das *et al.* 2021). The discharge in meandering open channels has been predicted with the MARS model by Mohanta *et al.* (2020) and Pradhan & Khatua (2019). Finally, the discharge in compound channels with convergent and divergent floodplains was estimated using soft computing models by Yonesi *et al.* (2022).

The literature review shows that the hydraulic study of compound open channels is mainly based on laboratory experiments. Numerical modeling has also been used; however, according to the reports, it is not sufficiently accurate for floodplains with complex geometry. On the other hand, researchers have tried soft computing methods to estimate the flow characteristics in such waterways, and, according to the reports, their accuracy was reasonable in all types of compound open channels. For example, MLPNN, ANFIS, MARS, and GEP models have been used for flow discharge prediction in compound open channels.

An essential point in the development of the neuro-fuzzy model is the use of fuzzy logic in the neural network model to increase its reliability. The neuro-fuzzy and GMDH models have each been used successfully to estimate flow in compound open channels with divergent and convergent floodplains. Tree family algorithms such as MARS have also been used successfully. A notable feature of the GMDH model is the simplicity and clarity of its structure. Given the confirmed accuracy of the GMDH model, this research uses the concept of fuzzy logic to increase its reliability. Among the tree models, the M5 model, which develops a simpler structure when modeling complex processes, was also investigated.

Therefore, in this research, the neuro-fuzzy group method of data handling (NF-GMDH), the support vector machine (SVM) model, and the M5 tree algorithm were developed to estimate the discharge in compound open channels with convergent and divergent floodplains. Two scenarios are considered: development of these soft computing models based on the important parameters only, and development based on all involved parameters.

## MATERIALS AND METHODS

In this part, the parameters involved in predicting the discharge in the compound open channels with divergent and convergent floodplains are reviewed. Then, the statistical characteristics of the collected data are calculated. The soft computing models used in this research including the NF-GMDH, the SVR, and the M5 algorithm are reviewed. Finally, the strategies considered for the modeling of discharges are presented.

### Compound open channels with convergent and divergent floodplains

To develop the mentioned soft computing methods, the data related to the mentioned parameters were collected from Bousmar (2002), Bousmar *et al.* (2006), Rezaei (2006), Yonesi *et al.* (2013) and Naik & Khatua (2016) and their statistical characteristics are given in Table 1.

| Source | Range | f_{f} | A_{r} | R_{r} | D | S_{0} (×10^{−3}) | δ* | α | x_{r} | θ | Q |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Rezaei (2006) | Max | 0.830 | 9.760 | 4.590 | 0.522 | 2.003 | 6.540 | 3.020 | 1.000 | −3.81 | 0.040 |
| | Min | 0.070 | 0.873 | 0.869 | 0.114 | | 0.905 | 0.366 | 0.000 | | 0.004 |
| | St. dev. | 0.278 | 2.275 | 1.219 | 0.143 | | 1.940 | 0.900 | 0.308 | | 0.009 |
| | Avg | 0.591 | 2.716 | 2.505 | 0.305 | | 4.313 | 2.043 | 0.482 | | 0.018 |
| | Median | 0.719 | 2.240 | 2.610 | 0.348 | | 5.150 | 2.263 | 0.500 | | 0.017 |
| Bousmar (2002) | Max | 0.837 | 10.720 | 4.400 | 0.538 | 0.9 | 6.360 | 3.000 | 0.833 | −11.3 | 0.020 |
| | Min | 0.059 | 0.930 | 0.591 | 0.101 | | 0.808 | 0.571 | 0.000 | −3.81 | 0.003 |
| | St. dev. | 0.295 | 3.505 | 1.185 | 0.161 | | 1.831 | 0.866 | 0.253 | | 0.005 |
| | Avg | 0.605 | 3.734 | 2.248 | 0.345 | | 3.980 | 1.884 | 0.256 | | 0.012 |
| | Median | 0.747 | 2.514 | 2.392 | 0.416 | | 4.585 | 2.167 | 0.188 | | 0.012 |
| Bousmar et al. (2006) | Max | 0.832 | 12.800 | 4.200 | 0.539 | 0.9 | 6.290 | 3.000 | 1.000 | 5.71 | 0.020 |
| | Min | 0.052 | 0.950 | 0.561 | 0.102 | | 0.819 | 0.380 | 0.167 | 3.81 | 0.003 |
| | St. dev. | 0.289 | 4.084 | 1.187 | 0.152 | | 1.926 | 0.805 | 0.297 | | 0.006 |
| | Avg | 0.592 | 4.591 | 2.361 | 0.321 | | 4.166 | 1.694 | 0.541 | | 0.013 |
| | Median | 0.722 | 3.016 | 2.663 | 0.347 | | 4.931 | 1.834 | 0.583 | | 0.016 |
| Yonesi et al. (2013) | Max | 0.806 | 20.600 | 35.090 | 0.364 | 0.88 | 1.900 | 3.000 | 1.000 | 11.31 | 0.062 |
| | Min | 0.143 | 1.370 | 1.910 | 0.103 | | 0.229 | 0.576 | 0.096 | 3.81 | 0.011 |
| | St. dev. | 0.236 | 5.825 | 10.046 | 0.096 | | 0.622 | 0.866 | 0.285 | | 0.018 |
| | Avg | 0.482 | 6.128 | 10.192 | 0.224 | | 1.373 | 1.903 | 0.359 | | 0.043 |
| | Median | 0.552 | 4.224 | 6.600 | 0.252 | | 1.656 | 2.165 | 0.257 | | 0.051 |
| Naik & Khatua (2016) | Max | 0.716 | 22.590 | 7.260 | 0.325 | 1.1 | 4.450 | 1.800 | 0.595 | −13.38 | 0.045 |
| | Min | 0.047 | 3.545 | 0.850 | 0.059 | | 0.293 | 0.178 | 0.000 | −5 | 0.003 |
| | St. dev. | 0.248 | 6.134 | 1.849 | 0.094 | | 1.489 | 0.622 | 0.190 | | 0.015 |
| | Avg | 0.519 | 8.665 | 3.604 | 0.199 | | 3.138 | 1.353 | 0.212 | | 0.031 |
| | Median | 0.634 | 7.060 | 3.595 | 0.227 | | 3.795 | 1.647 | 0.193 | | 0.037 |


### Neuro-fuzzy group method of data handling

Ivakhnenko (1971) developed the GMDH theory using Kolmogorov–Gabor polynomials. The relationship between the input and output parameters of a system can be expressed by a Volterra functional series, whose discretized form is the Kolmogorov–Gabor polynomial, as given in the following equation:

$$y = a_0 + \sum_{i=1}^{n} a_i x_i + \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij} x_i x_j + \sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{n} a_{ijk} x_i x_j x_k + \cdots$$

where $X = (x_1, x_2, \ldots, x_n)$ and $A = (a_0, a_i, a_{ij}, a_{ijk}, \ldots)$ are the vectors of input parameters and weight coefficients, respectively. Drawing on the power of the MLPNN, Ivakhnenko proposed a second-degree polynomial for each pair of input parameters. He also found that quadratic polynomials arranged in a network of perceptrons can form a Kolmogorov–Gabor polynomial. This method is more accurate than the MLPNN because, in the GMDH algorithm, the calculations performed in each neuron are classified as useful and nonuseful data. The GMDH structure is created in the form of a multilayer feed-forward neural network with some support neurons. Each neuron has two inputs. The relationship between the input and output variables in each neuron can be a linear or nonlinear polynomial, using the transfer function described in the following equation:

$$\hat{y} = a_0 + a_1 x_i + a_2 x_j + a_3 x_i x_j + a_4 x_i^{2} + a_5 x_j^{2}$$
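To make the role of a single GMDH neuron concrete, the following minimal sketch fits the six coefficients of a two-input quadratic polynomial by ordinary least squares. It assumes the commonly used quadratic form with terms 1, x_i, x_j, x_i x_j, x_i^2, and x_j^2; the function names and the synthetic data are hypothetical and are not taken from the study, and the fuzzy component of the NF-GMDH is not covered.

```python
def quad_features(xi, xj):
    """Six terms of the quadratic polynomial for one pair of inputs."""
    return [1.0, xi, xj, xi * xj, xi * xi, xj * xj]


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_neuron(xi_values, xj_values, targets):
    """Least-squares coefficients of one neuron via the normal equations."""
    rows = [quad_features(a, b) for a, b in zip(xi_values, xj_values)]
    k = len(rows[0])
    ata = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    aty = [sum(r[p] * t for r, t in zip(rows, targets)) for p in range(k)]
    return solve_linear(ata, aty)


def predict(coeffs, xi, xj):
    """Evaluate the fitted quadratic polynomial at one input pair."""
    return sum(c * f for c, f in zip(coeffs, quad_features(xi, xj)))


# Hypothetical noise-free data generated from y = 1 + 2*xi - xj + 0.5*xi*xj.
grid = [(float(i), float(j)) for i in range(4) for j in range(4)]
xi_values = [p[0] for p in grid]
xj_values = [p[1] for p in grid]
targets = [1.0 + 2.0 * a - b + 0.5 * a * b for a, b in grid]

coeffs = fit_neuron(xi_values, xj_values, targets)
print([round(c, 6) for c in coeffs])
```

In a full GMDH network, one such neuron would be fitted for every pair of inputs, and only the neurons with the lowest validation error would feed the next layer; that selection step is omitted here.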

### Support vector machine

The input *x* is first mapped onto an *m*-dimensional feature space using some fixed (nonlinear) mapping, and then a linear model is constructed in this feature space. The naive way of making a nonlinear classifier out of a linear classifier is to map the data from the input space *x* to a feature space *F* using a nonlinear function $\varphi$. In the space *F*, the discriminant function is:

$$f(x) = w^{T}\varphi(x) + b$$

Using mathematical notation, the linear model (in the feature space) *f*(*x*, *w*) is given by

$$f(x, w) = \sum_{j=1}^{m} w_j g_j(x) + b$$

where $g_j(x)$, $j = 1, \ldots, m$, denotes the set of nonlinear transformations and $b$ is the bias term.

There are many kernel functions in SVM, so how to select a good kernel function is also a research issue. However, for general purposes, there are some popular kernel functions.

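- I. Linear kernel: $K(x_i, x_j) = x_i^{T} x_j$

- II. Polynomial kernel: $K(x_i, x_j) = (\gamma x_i^{T} x_j + r)^{d}$, $\gamma > 0$

- III. Radial basis function (RBF) kernel: $K(x_i, x_j) = \exp(-\gamma \lVert x_i - x_j \rVert^{2})$, $\gamma > 0$

- IV. Sigmoid kernel: $K(x_i, x_j) = \tanh(\gamma x_i^{T} x_j + r)$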

It is well known that SVM generalization performance (estimation accuracy) depends on a good setting of the meta-parameters *C*, *γ*, and *r*, together with the kernel parameters. The choice of *C*, *γ*, and *r* controls the complexity of the prediction (regression) model. The problem of optimal parameter selection is further complicated because the complexity of the SVM model (and hence its generalization performance) depends on all three parameters. Kernel functions are used to change the dimensionality of the input space so that the classification or regression can be performed.
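The four kernel families listed above can be sketched in a few lines. The forms below assume the common LIBSVM-style parameterization with a scale *γ*, an offset *r*, and a polynomial degree *d*; the function names and default values are illustrative assumptions and are not the settings used in the study.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    return dot(u, v)


def polynomial_kernel(u, v, gamma=1.0, r=0.0, d=3):
    return (gamma * dot(u, v) + r) ** d


def rbf_kernel(u, v, gamma=1.0):
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(u, v, gamma=1.0, r=0.0):
    return math.tanh(gamma * dot(u, v) + r)


# Two hypothetical input vectors (not data from the study).
u = [1.0, 2.0]
v = [3.0, 4.0]
print(linear_kernel(u, v))
print(polynomial_kernel(u, v, gamma=1.0, r=1.0, d=2))
print(rbf_kernel(u, v, gamma=0.5))
print(sigmoid_kernel(u, v, gamma=0.01, r=0.0))
```

A larger *γ* makes the RBF kernel decay faster with distance, which is one way the kernel parameters influence model complexity alongside *C*.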

### M5 tree model

The splitting criterion of the M5 tree algorithm is the standard deviation reduction (SDR):

$$\mathrm{SDR} = \mathrm{sd}(T) - \sum_{i} \frac{|T_i|}{|T|}\,\mathrm{sd}(T_i)$$

where *T* is the set of samples that have reached the node, *T*_{i} is the subset of samples obtained by splitting the node on the selected attribute, and sd is the standard deviation. The M5 tree algorithm examines all possible splits on each attribute and selects the one that maximizes the expected reduction in error. Once the tree is complete, a multivariate linear regression model is fitted to the samples in the subtree of each internal node. Figure 2 shows examples of the M5 tree algorithm.

The predictions are then smoothed along the path from the leaf to the root:

$$P = \frac{nP' + kq}{n + k} \qquad (13)$$

where *P* is the predicted value passed to the higher node, *P*′ is the prediction passed from below to this node, *q* is the value predicted by the model at this node, *n* is the number of training samples that have reached the node below, and *k* is the smoothing constant, which is 15 by default.
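As an illustration of the two M5 ingredients described in this section, the sketch below computes the standard deviation reduction of a candidate split, searches for the best binary split on a single attribute, and applies the smoothing step. It assumes the population standard deviation and the usual smoothing form (n·p + k·q)/(n + k), with p the prediction from below and q the prediction of the model at the current node; all function names and sample values are hypothetical.

```python
from statistics import pstdev


def sdr(parent, subsets):
    """Standard deviation reduction achieved by splitting `parent` into `subsets`."""
    n = len(parent)
    return pstdev(parent) - sum(len(s) / n * pstdev(s) for s in subsets)


def best_split(xs, ys):
    """Return (threshold, sdr) of the best binary split on one attribute."""
    pairs = sorted(zip(xs, ys))
    best_threshold, best_gain = None, float("-inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical attribute values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        gain = sdr(ys, [left, right])
        if gain > best_gain:
            best_threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_gain = gain
    return best_threshold, best_gain


def smooth(p_below, q_node, n, k=15.0):
    """Blend the prediction from below with the model at the current node."""
    return (n * p_below + k * q_node) / (n + k)


# Hypothetical attribute values and targets with an obvious break near x = 6.5.
xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
ys = [1.0, 1.1, 0.9, 5.0, 5.1, 4.9]
threshold, gain = best_split(xs, ys)
print(threshold, round(gain, 3))
print(smooth(p_below=2.0, q_node=4.0, n=15))
```

With few samples below a node (small n), the smoothed value leans toward the model of the higher node; with many samples, it stays close to the prediction from below.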

### Modeling strategies

As shown in Equation (1), nine parameters can be used as input variables to model and predict the flow discharge in compound open channels with nonprismatic floodplains using soft computing models. Therefore, any combination of one to nine parameters can be considered when designing the pattern of input variables. Different approaches can be used to reduce the effort of finding the best input combination. One is the Gamma test, previously applied by Das *et al.* (2020). The second is to examine the structure of developed models such as MARS, GEP, and GMDH, which identify the most important parameters and give them more weight while deriving the mathematical formula. In this research, two scenarios are considered, i.e. development based on the most important parameters and development based on all parameters involved. The coefficient of determination (*R*^{2}) and the root mean square error (RMSE) were used to check the accuracy of the models. To develop the models, the collected data must first be divided into two sets: training and testing. A total of 196 data points were collected; 80% were allocated to training and the remaining 20% to testing. The training data are used for calibration and the test data for validation. Since the collected data are not a time series, the data points were assigned to the two sets at random. The range of the allocated data is shown in Table 2.

| Stage | Range | f_{f} | A_{r} | R_{r} | β | S_{0} | δ* | α | x_{r} | θ | Q |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Train | Minimum | 0.38 | 0.93 | 1.70 | 0.11 | 0.00 | 1.41 | 1.33 | 0.00 | −13.38 | 0.01 |
| | Maximum | 0.84 | 22.59 | 17.69 | 0.54 | 0.00 | 6.54 | 3.02 | 1.00 | 11.31 | 0.06 |
| | Average | 0.70 | 4.45 | 3.30 | 0.34 | 0.00 | 4.35 | 2.08 | 0.44 | | 0.02 |
| | Variance | 0.01 | 13.94 | 3.74 | 0.01 | 0.00 | 1.57 | 0.29 | 0.10 | | 0.00 |
| Test | Minimum | 0.31 | 0.94 | 1.72 | 0.15 | 0.00 | 1.44 | 1.33 | 0.00 | −13.38 | 0.01 |
| | Maximum | 0.84 | 20.60 | 35.09 | 0.53 | 0.00 | 6.43 | 3.02 | 1.00 | 11.31 | 0.06 |
| | Average | 0.69 | 4.05 | 4.36 | 0.33 | 0.00 | 4.62 | 2.20 | 0.34 | | 0.02 |
| | Variance | 0.01 | 13.43 | 33.74 | 0.01 | 0.00 | 1.73 | 0.31 | 0.10 | | 0.00 |
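The evaluation procedure described in this section (a random 80/20 split and the two accuracy indices) can be sketched as follows. This is a minimal illustration under stated assumptions: *R*^{2} is computed as 1 − SS_res/SS_tot, although the squared correlation coefficient is also common in this literature; the split size is obtained by rounding, which gives 157 training and 39 testing samples for 196 data points; the function names, the seed, and the sample values are hypothetical.

```python
import math
import random


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def random_split(n_samples, train_fraction=0.8, seed=0):
    """Shuffle sample indices and split them into training and testing sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(train_fraction * n_samples)
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = random_split(196)
print(len(train_idx), len(test_idx))

# Hypothetical observed and predicted discharges (not data from the study).
observed = [0.010, 0.020, 0.030, 0.045, 0.060]
predicted = [0.012, 0.019, 0.033, 0.043, 0.058]
print(rmse(observed, predicted), r_squared(observed, predicted))
```

Fixing the seed keeps the split reproducible, which matters when several models are compared on the same training and testing sets.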


## RESULTS AND DISCUSSION

First, the results of the M5 model are presented. This model was developed for both scenarios (introduced in the Modeling strategies section), and its results are presented in Equations (14) and (15). The M5 results are presented before those of the other models because the development process of the M5 model identifies the most effective parameters. The GMDH model shares this feature, but the NF-GMDH model does not. It is, of course, possible to implement the fuzzy adaptive model within the conventional GMDH model, which would combine the identification of the most effective parameters with adaptive parameters.

The structure of the M5 model shows that the most effective parameters are *f*_{f}, *D*_{r}, *R*_{r}, *δ**, and *θ*, which have also been identified by the computational approaches used by previous researchers, including MARS, the Gamma test, GMDH, and GEP. This result shows that the M5 model has been developed correctly. The statistical indices of the M5 model based on the first scenario are *R*^{2} = 0.979 and RMSE = 0.002 in the training phase and *R*^{2} = 0.957 and RMSE = 0.002 in the testing phase. For the second scenario, they are *R*^{2} = 0.955 and RMSE = 0.002 in the training phase and *R*^{2} = 0.931 and RMSE = 0.002 in the testing phase (Table 3). Comparing the performance of the M5 model in both scenarios shows that the development based on the main effective parameters does not change the error statistical indices significantly in either the training or the testing stage. The M5 tree has 23 branches in the first scenario and 18 branches in the second. This means that in the first scenario, the developed model was based on important parameters, so in the second scenario only five branches of the initial model were pruned. The results of the M5 model in the training and testing stages are shown in Figures 3 and 4, where they are plotted against the observed data.

| Model | Scenario | Train R^{2} | Train RMSE | Test R^{2} | Test RMSE |
|---|---|---|---|---|---|
| M5 | 1 | 0.979 | 0.002 | 0.957 | 0.002 |
| | 2 | 0.955 | 0.002 | 0.931 | 0.002 |
| NF-GMDH | 1 | 0.927 | 0.004 | 0.934 | 0.003 |
| | 2 | 0.931 | 0.004 | 0.923 | 0.004 |
| SVR | 1 | 0.982 | 0.002 | 0.965 | 0.002 |
| | 2 | 0.971 | 0.002 | 0.941 | 0.002 |


Next, the performance of the SVR model in estimating the flow discharge was checked for both scenarios. The structure of the SVR model developed for the second scenario is shown in Figure 4. To develop the SVR model, different kernel functions (discussed in the Materials and Methods section) were investigated, and the results showed that the radial basis function (RBF) kernel is more accurate than the others. The statistical indices of the SVR model in the training and testing phases are shown in Table 3. For scenario two, they are *R*^{2} = 0.971 and RMSE = 0.002 in the training phase and *R*^{2} = 0.941 and RMSE = 0.002 in the testing phase. Comparing the SVR model with the NF-GMDH and M5 models shows that, in both scenarios, the accuracy of the SVR model is slightly higher than that of the NF-GMDH model and almost equal to that of the M5 model.

This section presents the performance of the SVR, NF-GMDH, and M5 tree algorithms and then compares their statistical indices with those of models proposed by previous researchers. Das *et al.* (2020, 2021) developed ANFIS and GEP models to estimate the discharge in compound open channels with divergent and convergent floodplains. They used the Gamma test to determine the main effective parameters. The statistical indices of the ANFIS model at the test stage were and and the statistical indices of the GEP model at the same stage were and . Yonesi *et al.* (2022) developed the MARS, GMDH, and MLPNN models to estimate discharge in such waterways. Taylor's diagram was used to compare these models with those used in this study (SVR, NF-GMDH, and M5 algorithms). The study of the structure of the MARS and GMDH models showed that the most important parameters are *f*_{f}, *D*_{r}, *R*_{r}, *δ**, and *θ*, which were confirmed by the Gamma test and by the structure obtained with the M5 model.
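A Taylor diagram places each model according to three statistics that are tied together by a single identity: the standard deviations of the observed and predicted series, their correlation coefficient, and the centered root mean square difference. The sketch below computes these quantities and checks the identity E′² = σ_p² + σ_o² − 2σ_pσ_oR. The function name and the sample values are hypothetical and are not results from the study.

```python
import math


def taylor_statistics(observed, predicted):
    """Return (sigma_obs, sigma_pred, correlation, centered_rms_difference)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    sigma_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed) / n)
    sigma_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted) / n)
    covariance = sum(
        (o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted)
    ) / n
    correlation = covariance / (sigma_o * sigma_p)
    centered_rms = math.sqrt(
        sum(((p - mean_p) - (o - mean_o)) ** 2 for o, p in zip(observed, predicted)) / n
    )
    return sigma_o, sigma_p, correlation, centered_rms


# Hypothetical observed and predicted discharges (not data from the study).
observed = [0.010, 0.020, 0.030, 0.045, 0.060]
predicted = [0.012, 0.019, 0.033, 0.043, 0.058]

sigma_o, sigma_p, corr, crmsd = taylor_statistics(observed, predicted)
identity_rhs = sigma_p ** 2 + sigma_o ** 2 - 2.0 * sigma_p * sigma_o * corr
print(sigma_o, sigma_p, corr, crmsd)
print(abs(crmsd ** 2 - identity_rhs) < 1e-12)
```

Because the centered difference removes the mean of each series, a Taylor diagram does not show overall bias, so it complements RMSE rather than replacing it.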

## CONCLUSIONS

In this research, the flow discharge in compound open channels with convergent and divergent floodplains was modeled and estimated using soft computing models including the SVR, NF-GMDH, and M5 models. For this purpose, the geometric and hydraulic characteristics of the flow, including relative roughness, relative area, relative hydraulic radius, relative dimension of the flow aspects, relative width, relative depth, relative longitudinal distance, convergence or divergence angle, and longitudinal bed slope, were used. The performance of the models was then compared with the ANFIS, MARS, MLPNN, and GEP models developed by previous researchers. Two scenarios were considered for modeling and estimating flow in such watercourses. The first scenario included the development of the mentioned models based on all involved parameters and the second scenario included the development of the models based on the effective parameters. The results showed that, at the testing stage and based on the first scenario, the error statistical indices are *R*^{2} = 0.934 and RMSE = 0.003 for the NF-GMDH model, *R*^{2} = 0.965 and RMSE = 0.002 for the SVR model, and *R*^{2} = 0.957 and RMSE = 0.002 for the M5 model. Examination of the structure of the MARS, GMDH, and M5 models and of the Gamma test results, from the current and previous research, showed that the most important parameters involved in the estimation of discharge are the relative roughness, the relative depth, the relative hydraulic radius, the relative dimension of the flow aspects, and the angle of convergence or divergence of the floodplains.

## ACKNOWLEDGEMENTS

We are grateful to the Research Council of Urmia University.

## DATA AVAILABILITY STATEMENT

All relevant data are available from https://cdnsciencepub.com/doi/abs/10.1139/cjce-2018-0038.

## CONFLICT OF INTEREST

The authors declare there is no conflict.