## Abstract

In this study, the least square support vector machines (LS-SVM) method was used to predict the longitudinal dispersion coefficient (D_{L}) in natural streams in comparison with the empirical equations in various datasets. To do this, three datasets of field data including hydraulic and geometrical characteristics of different rivers, with various statistical characteristics, were applied to evaluate the performance of LS-SVM and 15 empirical equations. The LS-SVM was evaluated and compared with developed empirical equations using statistical indices of root mean square error (RMSE), standard error (SE), mean bias error (MBE), discrepancy ratio (DR), Nash-Sutcliffe efficiency (NSE) and coefficient of determination (R^{2}). The results demonstrated that LS-SVM method has a high capability to predict the D_{L} in different datasets with RMSE = 58–82 m^{2} s^{−1}, SE = 24–39 m^{2} s^{−1}, MBE = −1.95–2.6 m^{2} s^{−1}, DR = 0.08–0.13, R^{2} = 0.76–0.88, and NSE = 0.75–0.87 as compared with previous empirical equations. It can be concluded that the proposed LS-SVM model can be successfully applied to predict the D_{L} for a wide range of river characteristics.

## HIGHLIGHTS

Least square support vector machines and 15 empirical equations were selected to predict longitudinal dispersion coefficient in natural streams.

Experimental datasets from streams around the world, consisting of the depth, width, mean velocity, shear velocity, and longitudinal dispersion coefficient, were used.

Comprehensive statistical analysis was performed to evaluate the applied model accuracy.

## ABBREVIATIONS

Measured longitudinal dispersion

Predicted longitudinal dispersion

- *C*: Cross-sectional average concentration
- *D*_{L}: Longitudinal dispersion coefficient
- DR: Discrepancy ratio
- *g*: Acceleration due to gravity
- G1-65: Group 1 data including 65 data sets
- G2-116: Group 2 data including 116 data sets
- G3-188: Group 3 data including 188 data sets
- *H*: Mean cross-sectional depth
- LIU
- LS-SVM: Least square support vector machines
- MAE: Mean absolute error
- MBE: Mean bias error
- *N*: Number of observations
- NSE: Nash-Sutcliffe efficiency
- POMGGP: Pareto-optimal-multigene genetic programming
- *R*: Hydraulic radius
- *R*^{2}: Coefficient of determination
- RMSE: Root mean square error
- *S*: Slope of the total energy line in downstream direction
- *t*: Time of observation
- *U*: Mean longitudinal velocity
- *U*^{*}: Bottom shear friction velocity
- *x*: Longitudinal distance

## INTRODUCTION

Determining the longitudinal dispersion coefficient (*D _{L}*) is a pre-condition for the solution of the advection-dispersion equation (ADE). Decision makers in the field of environmental science consider *D _{L}* an essential parameter for studying the longitudinal transport of pollution in river systems. Accurate estimation of *D _{L}* is required in several practical applications such as river engineering, design of water intakes, environmental engineering, and assessing the injection of hazardous contaminants in rivers (Alizadeh *et al.* 2017a; Memarzadeh *et al.* 2020). In a pioneering study, Taylor (1953, 1954) presented the concept of the longitudinal dispersion coefficient. He derived the following equation for one-dimensional dispersion in laminar pipe flow:

$$\frac{\partial C}{\partial t} + U\frac{\partial C}{\partial x} = D_{L}\frac{\partial^{2} C}{\partial x^{2}} \tag{1}$$

where *C* is the cross-sectional averaged pollutant concentration, *t* is the time of observation, *U* is the cross-sectional averaged velocity, *x* is the distance from the injection point, and *D _{L}* is the longitudinal dispersion coefficient of the pollutant.

Until now, many analytical and numerical solutions have been developed for the ADE (Equation (1)) with different boundary conditions. It was generally found that the value of *D _{L}* can be estimated from the pollutant concentration profile, the stream velocity profile, or channel and flow parameters.
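As an illustration of how *D _{L}* controls the spreading of a pollutant cloud, the following minimal sketch evaluates the classical closed-form solution of the one-dimensional ADE for an instantaneous injection into a uniform channel. The function name and all numerical values (mass, cross-sectional area, velocity, *D _{L}*) are hypothetical and chosen only for demonstration; they are not taken from the datasets of this study.

```python
import numpy as np

def ade_instantaneous_pulse(x, t, mass, area, velocity, d_l):
    """Concentration C(x, t) after an instantaneous injection of `mass`
    at x = 0, t = 0 in a uniform channel (constant U and D_L)."""
    spread = np.sqrt(4.0 * np.pi * d_l * t)
    return mass / (area * spread) * np.exp(-(x - velocity * t) ** 2 / (4.0 * d_l * t))

# Hypothetical values, for illustration only.
mass = 10.0        # injected mass (kg)
area = 50.0        # cross-sectional area (m^2)
velocity = 0.5     # mean velocity U (m/s)
d_l = 80.0         # longitudinal dispersion coefficient (m^2/s)
t = 4 * 3600.0     # time since injection (s)

dx = 10.0
x = np.linspace(0.0, 20000.0, 2001)   # 10 m spacing
c = ade_instantaneous_pulse(x, t, mass, area, velocity, d_l)

peak_x = x[np.argmax(c)]              # the cloud centre travels with U * t
total_mass = area * np.sum(c) * dx    # mass is conserved by the solution
```

The peak travels with the mean velocity while the width of the cloud grows with the square root of *D _{L}* times *t*, which is why an error in *D _{L}* translates directly into an error in predicted peak concentration.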

In general, studies on the prediction of the longitudinal dispersion coefficient could be divided into four main categories: tracing experiments, empirical equations, artificial intelligence (AI), and combining machine learning and evolutionary algorithms.

Prediction of *D _{L}* using tracer experiments has been performed by many researchers (Palancar *et al.* 2003; Seo & Baek 2004; Disley *et al.* 2015). Despite their high accuracy in predicting *D _{L}*, these methods have many limitations, such as non-uniform flow characteristics along the river and large variations of velocity and concentration across the width and depth of the flow; they are also costly and time-consuming in field and experimental studies.

In the six decades since Elder (1959) presented the first empirical equation to predict *D _{L}*, many researchers have proposed methods for highly accurate prediction of *D _{L}* in natural streams. These methods include analytical, numerical and data-driven methods. In addition, there has been a great deal of research over the last decade on machine learning approaches to predict *D _{L}*.

Fischer (1968) proposed the routing approach using an analytical solution of Equation (1) by numerical integration. Subsequently, numerical solutions were applied for the prediction of *D _{L}* (Ramezani *et al.* 2019).

Since the publication of Fischer (1975), new and highly accurate equations have been introduced for determining *D _{L}* (Deng *et al.* 2002; Kashefipour & Falconer 2002; Sahay & Dutta 2009; Etemad-Shahidi & Taghipour 2012; Sattar & Gharabaghi 2015; Haghiabi 2016; Wang & Huai 2016; Alizadeh *et al.* 2017b).

In this study, by reviewing the literature, a complete set of empirical equations for predicting *D _{L}* were selected (Table 1) and their results were compared with the least square support vector machine (LS-SVM) method.

| Model | Developed equation | #dataset | Abbreviation |
|---|---|---|---|
| Elder (1959) | | – | EL |
| Fischer (1975) | | – | FI |
| Liu (1977) | | – | LIU |
| Seo & Cheong (1998) | | 59 | SC |
| Deng et al. (2001) | | – | DE |
| Kashefipour & Falconer (2002) | | 81 | KF |
| Sahay & Dutta (2009) | | 65 | SD |
| Etemad-Shahidi & Taghipour (2012) | | 149 | ET |
| Li et al. (2013) | | 65 | LI |
| Zeng & Huai (2014) | | 116 | ZH |
| Disley et al. (2015) | | 56 | DI |
| Sattar & Gharabaghi (2015) | | 150 | SG |
| Wang et al. (2017) | | 116 | WA |
| Alizadeh et al. (2017a) | | 164 | AL |
| Riahi-Madvar et al. (2019) | | 503 | RI |

*Note*: *B*, *H*, *U*, *U**, *g*, *F _{r}*, and *D _{L}* are, respectively, width, depth, cross-sectional averaged velocity, shear velocity, acceleration due to gravity, Froude number and longitudinal dispersion coefficient.

In recent decades, artificial intelligence approaches have been used for predicting *D _{L}*. For example, artificial neural networks (ANN) (Toprak & Cigizoglu 2008; Alizadeh *et al.* 2017b; Riahi-Madvar *et al.* 2019), adaptive neuro-fuzzy inference systems (ANFIS) (Riahi-Madvar *et al.* 2009; Noori *et al.* 2016), support vector machines (Noori 2009; Azamathulla & Wu 2011) and genetic programming (Sahay & Dutta 2009; Azamathulla & Ghani 2011) were used. These studies reported satisfactory results in predicting the longitudinal dispersion coefficient.

In addition, new methods such as AI (Toprak & Cigizoglu 2008; Azamathulla & Wu 2011; Etemad-Shahidi & Taghipour 2012; Noori *et al.* 2016), evolutionary algorithms (Sahay & Dutta 2009; Li *et al.* 2013; Sattar & Gharabaghi 2015; Alizadeh *et al.* 2017b; Riahi-Madvar *et al.* 2019) and hybrid models (Najafzadeh & Tafarojnoruz 2016; Alizadeh *et al.* 2017a; Seifi & Riahi-Madvar 2019) have been used for predicting *D _{L}*.

It should be mentioned that previous research has mainly focused on developing a highly accurate method to predict *D _{L}* for one specific dataset, and the accuracy of each method has not been investigated across different datasets. There are two general aspects to the prediction of *D _{L}*: (1) the method used for prediction and (2) the datasets used to evaluate the accuracy of that method.

In this study, the LS-SVM method was used to predict *D _{L}* in natural streams. Furthermore, the results were compared with empirical equations (Table 1) on the various datasets. Although the new methods have increased the prediction accuracy of the longitudinal dispersion coefficient, the increase in accuracy depends greatly on the statistical characteristics of the datasets used. Therefore, one of the aims of this paper is to evaluate the accuracy of different methods on different datasets and to compare the results of previous methods with those of the new methods.

## MATERIAL AND METHODS

### Experimental data

To evaluate the accuracy of the empirical equations and LS-SVM models for the prediction of *D _{L}*, a wide range of longitudinal dispersion laboratory and field data was collected from different sources (Toprak & Cigizoglu 2008; Sahay & Dutta 2009; Li *et al.* 2013; Zeng & Huai 2014; Alizadeh *et al.* 2017a; Wang *et al.* 2017). The details of these datasets are presented in Table 8 in the appendix. Following the literature, these datasets were divided into three categories: (i) Table 8, rows 1–65, ‘G1-65’ (Sahay & Dutta 2009); (ii) Table 8, rows 1–116, ‘G2-116’ (Zeng & Huai 2014); and (iii) Table 8, rows 1–188, ‘G3-188’ (Alizadeh *et al.* 2017b). The goal of this categorization is to evaluate the accuracy of the models in predicting *D _{L}* across various datasets.

The studied parameters consist of the depth, *H* (m), the width, *B* (m), the mean velocity, *U* (m s^{−1}), the shear velocity of the flow, *U** (m s^{−1}) and the longitudinal dispersion coefficient, *D _{L}* (m^{2} s^{−1}) (Table 8). There are two reasons for selecting these datasets: (1) they have been used by many researchers (Seo & Cheong 1998; Deng *et al.* 2001; Kashefipour & Falconer 2002; Toprak & Cigizoglu 2008; Sahay & Dutta 2009; Etemad-Shahidi & Taghipour 2012; Li *et al.* 2013; Zeng & Huai 2014; Disley *et al.* 2015; Wang & Huai 2016; Alizadeh *et al.* 2017a), so the results are comparable with those obtained by other researchers; and (2) these data represent a wide range of geometrical (B, H) and hydrodynamic (U, U*) parameters of natural streams. Differences in statistical properties across datasets can reveal the ability of different methods to estimate longitudinal dispersion coefficients. In the following, the datasets used in the present study are analyzed in terms of statistical properties, frequency ratios and regression relationships.

### Statistical characteristics of data groups

Statistical characteristics of the studied data groups (G1-65, G2-116 and G3-188), such as minimum (Min), maximum (Max), average (AVG), standard deviation (STD), coefficient of variation (CV), skewness (SKW) and kurtosis (KUT) of the datasets are shown in Table 2.

| Data category | Par. | B | H | U | U* | D_{L} | B/H | U/U* | D_{L}/HU* |
|---|---|---|---|---|---|---|---|---|---|
| G1-65 | Min-Max | 11.9–202.7 | 0.2–4.0 | 0.0–1.7 | 0.0–0.6 | 1.9–836.1 | 13.6–151.1 | 1.3–17.0 | 5.9–8625.0 |
| | AVG | 53.80 | 1.24 | 0.49 | 0.10 | 80.51 | 49.12 | 6.80 | 1083.17 |
| | STD | 47.30 | 0.99 | 0.35 | 0.09 | 131.82 | 29.89 | 3.70 | 1408.53 |
| | CV | 0.90 | 0.79 | 0.71 | 1.01 | 1.64 | 0.61 | 0.54 | 1.30 |
| | SKW | 1.70 | 1.23 | 1.62 | 3.82 | 3.86 | 1.45 | 1.12 | 3.32 |
| | KUT | 2.50 | 0.57 | 2.69 | 16.69 | 17.99 | 1.89 | 0.79 | 13.54 |
| G2-116 | Min-Max | 11.9–711.2 | 0.2–19.9 | 0.0–1.7 | 0.0–1.0 | 1.9–1486.5 | 12.5–1000.0 | 0.2–62.9 | 6.2–40,183.9 |
| | AVG | 90.90 | 1.76 | 0.56 | 0.09 | 130.73 | 72.67 | 9.55 | 1937.89 |
| | STD | 111.20 | 2.26 | 0.41 | 0.12 | 230.60 | 133.39 | 9.06 | 5267.28 |
| | CV | 1.20 | 1.29 | 0.73 | 1.35 | 1.76 | 1.84 | 0.95 | 2.72 |
| | SKW | 2.90 | 5.03 | 1.20 | 5.22 | 3.23 | 5.77 | 3.29 | 6.00 |
| | KUT | 11.10 | 36.42 | 0.55 | 32.43 | 12.55 | 34.20 | 14.70 | 38.71 |
| G3-188 | Min-Max | 1.4–711.2 | 0.1–19.9 | 0.0–1.7 | 0.0–1.0 | 0.2–1486.5 | 2.2–1000.0 | 0.2–62.9 | 3.1–40,183.9 |
| | AVG | 73.30 | 1.70 | 0.52 | 0.09 | 109.10 | 58.46 | 8.61 | 1533.40 |
| | STD | 94.43 | 2.04 | 0.37 | 0.10 | 206.67 | 107.71 | 7.81 | 4484.85 |
| | CV | 1.29 | 1.20 | 0.71 | 1.16 | 1.89 | 1.84 | 0.91 | 2.92 |
| | SKW | 3.39 | 4.65 | 1.30 | 5.35 | 3.44 | 7.07 | 3.45 | 6.70 |
| | KUT | 15.86 | 34.58 | 1.28 | 38.17 | 14.38 | 53.52 | 18.16 | 49.59 |

*Note*: Bold and Italic numbers show maximums among the datasets.
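The descriptive statistics reported in Table 2 can be reproduced for any parameter column with a few lines of code. The sketch below is only an assumption about the estimators: it uses the sample standard deviation for STD and CV, and moment-based skewness and excess kurtosis, which may differ slightly from the estimators used to build the table. The function name `describe` is hypothetical.

```python
import numpy as np

def describe(values):
    """Min, Max, AVG, STD, CV, SKW and KUT for one parameter column."""
    v = np.asarray(values, dtype=float)
    avg = v.mean()
    std = v.std(ddof=1)                    # sample standard deviation (assumed)
    dev = v - avg
    m2 = np.mean(dev ** 2)                 # second central moment
    skw = np.mean(dev ** 3) / m2 ** 1.5    # moment-based skewness (assumed)
    kut = np.mean(dev ** 4) / m2 ** 2 - 3  # excess kurtosis (assumed)
    return {
        "Min": v.min(), "Max": v.max(), "AVG": avg, "STD": std,
        "CV": std / avg, "SKW": skw, "KUT": kut,
    }

# Toy column, not taken from Table 8.
stats = describe([1.0, 2.0, 3.0, 4.0, 5.0])
```

Because CV is simply STD divided by AVG, the tabulated values can be cross-checked directly; for example, for *D _{L}* in G3-188, 206.67 / 109.10 ≈ 1.89.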

It can be seen in Table 2 that the ranges (Min-Max) in the G3-188 datasets are wider than in the other two datasets (G1-65 and G2-116). The values of AVG and STD in the G2-116 data category are higher than in the other two data groups (G1-65 and G3-188). The values of CV are highest in G2-116 for some parameters (H, U, U* and U/U*) and in G3-188 for others (B, *D _{L}*, B/H and *D _{L}*/HU*). SKW and KUT for most of the parameters (e.g. B, U*, B/H, U/U* and *D _{L}*/HU*) in the G3-188 data group are greater than in the other two data groups (G1-65 and G2-116).

### Frequency analysis

In order to provide an accurate interpretation of the data, the histograms of the ratios of B/H, U/U* and *D _{L}*/HU* are shown in Figure 1(a)–1(c) for three datasets.

Figure 1 shows the variations of the frequency percentages for the ratios B/H, U/U*, and *D _{L}*/HU* for the studied datasets. As can be seen, the values of B/H, U/U* and *D _{L}*/HU* for about 30, 40, and 50% of the datasets range between 30–50, 5–10 and 200–1000, respectively. About 70% of the collected dataset (Table 8) was used for training (chosen randomly until the best training performance was obtained), while the remaining data (about 30%) were used for testing the models.

### Regression analysis

The relationships between *D _{L}*/HU* and B/H, U/U* and *F _{r}* are investigated for the G1-65, G2-116 and G3-188 data groups in Figure 2(a)–2(c), respectively. The coefficient of determination (R^{2}) for the relationship between *D _{L}*/HU* and U/U* is 0.253 (Figure 2(a)), the highest correlation in the G1-65 datasets.

In addition, as can be seen in Figure 2(b), the maximum value of R^{2} (= 0.351) in the G2-116 datasets was observed between *D _{L}*/HU* and B/H. It should be mentioned that the relationship between *D _{L}*/HU* and *F _{r}* in the G1-65 datasets is very poor. In general, the relationships between *D _{L}*/HU* and B/H, U/U* and *F _{r}* are not strong in any of the datasets, so a method is needed that can estimate *D _{L}* accurately for any dataset.

### Least square support vector machines (LS-SVM)

Support vector machines were first used for classification (Vapnik 1999); a regression version of SVMs was then proposed by Drucker *et al.* (1997). In this method, a concept known as structural risk minimization is used to minimize the error of the model, while other methods (such as ANN) use the principle of empirical risk minimization (Cristianini *et al.* 2000; Dibike *et al.* 2001). In general, the SVM is used in classification problems with two or more groups and in regression analysis. In the SVM model, quadratic programming is used to solve the equations, making the problem complex and time-consuming (Seyedzadeh *et al.* 2019). The LS-SVM replaces the inequality constraints of the SVM with equality constraints, so that training reduces to solving a system of linear equations.

One of the factors affecting the prediction accuracy of the LS-SVM model is the selection of an appropriate kernel function. In this study, the radial basis function (RBF) kernel was investigated. Figure 3 shows a schematic view of the flowchart used for the LS-SVM model. Table 3 presents the LS-SVM characteristics applied in the present study.

| Input variables | Width (B), depth (H), velocity (U), and shear velocity (U*) of the flow |
|---|---|
| Target variable | Longitudinal dispersion coefficient (*D _{L}*) |
| Function estimation | Gaussian |
| Kernel function | Radial basis function (RBF) |
| Tuning parameters (γ, σ^{2}) | γ = 10 and σ^{2} = 0.2 |
| Selection function | Randomized selection (Randper's function) |
| Datasets ratio in train and test phases | 70 and 30% |
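To make the settings in Table 3 concrete, the following is a minimal sketch of LS-SVM regression with an RBF kernel and a random 70/30 split. It is not the implementation used in this study: the kernel form exp(−‖x − z‖²/σ²), the absence of input scaling, the function names, and the synthetic four-column input are all assumptions made for illustration; only γ = 10, σ² = 0.2 and the 70/30 ratio come from Table 3.

```python
import numpy as np

def rbf_kernel(X, Z, sigma2):
    """RBF kernel exp(-||x - z||^2 / sigma2) between the rows of X and Z."""
    d2 = (X ** 2).sum(1)[:, None] + (Z ** 2).sum(1)[None, :] - 2.0 * X @ Z.T
    return np.exp(-np.maximum(d2, 0.0) / sigma2)

def lssvm_fit(X, y, gamma=10.0, sigma2=0.2):
    """Solve the LS-SVM linear system for the dual weights alpha and bias b."""
    n = X.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0                       # constraint: sum(alpha) = 0
    A[1:, 0] = 1.0                       # bias column
    A[1:, 1:] = rbf_kernel(X, X, sigma2) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[1:], sol[0]

def lssvm_predict(X_train, alpha, b, X_new, sigma2=0.2):
    """Evaluate the fitted LS-SVM model at the rows of X_new."""
    return rbf_kernel(X_new, X_train, sigma2) @ alpha + b

def random_split(n, train_frac=0.7, seed=0):
    """Random permutation split into train and test indices."""
    idx = np.random.default_rng(seed).permutation(n)
    n_train = int(round(train_frac * n))
    return idx[:n_train], idx[n_train:]

# Synthetic stand-in for (B, H, U, U*) and D_L; not the data of Table 8.
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(65, 4))
y = np.sin(3.0 * X[:, 0]) + X[:, 1] * X[:, 2]

train_idx, test_idx = random_split(len(X))
alpha, b = lssvm_fit(X[train_idx], y[train_idx])
y_hat = lssvm_predict(X[train_idx], alpha, b, X[test_idx])
```

The single call to `np.linalg.solve` is the practical difference from a standard SVM, which would require a quadratic programming solver at this step.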

### Evaluation performance criteria

Many performance criteria have been used to evaluate model results for the prediction of *D _{L}*. In the present study, six statistical criteria, root mean square error (RMSE), standard error (SE), mean bias error (MBE), discrepancy ratio (DR), R^{2}, and Nash-Sutcliffe efficiency (NSE), were applied to evaluate the accuracy of each model in predicting the longitudinal dispersion coefficient. The indices used are explained in Table 4.

| Criteria | Formula | References | Best value |
|---|---|---|---|
| Standard error (SE) | | Alizadeh et al. (2017a) | 0 |
| Root mean square error (RMSE) | | Ma & Iqbal (1984) | 0 |
| Mean bias error (MBE) | | Ma & Iqbal (1984) | 0 |
| Discrepancy ratio (DR) | | Zeng & Huai (2014) | 0 |
| Nash-Sutcliffe efficiency (NSE) | | Alizadeh et al. (2017a) | 1 |
| Coefficient of determination (R^{2}) | | Behar et al. (2015) | 1 |
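For readers who wish to compute the indices listed in Table 4, the sketch below implements five of them using their commonly used definitions. These definitions are assumptions, not taken from the table: MBE is taken as the mean of predicted minus measured, DR as the base-10 logarithm of the predicted-to-measured ratio, and R^{2} as the squared Pearson correlation. SE is omitted because its exact form is not stated in the text. The sample values are hypothetical.

```python
import numpy as np

def rmse(measured, predicted):
    """Root mean square error."""
    return float(np.sqrt(np.mean((predicted - measured) ** 2)))

def mbe(measured, predicted):
    """Mean bias error (assumed sign: predicted minus measured)."""
    return float(np.mean(predicted - measured))

def discrepancy_ratio(measured, predicted):
    """Per-sample DR, assumed to be log10(predicted / measured)."""
    return np.log10(predicted / measured)

def nse(measured, predicted):
    """Nash-Sutcliffe efficiency."""
    resid = np.sum((measured - predicted) ** 2)
    spread = np.sum((measured - measured.mean()) ** 2)
    return float(1.0 - resid / spread)

def r2(measured, predicted):
    """Coefficient of determination as squared Pearson correlation (assumed)."""
    return float(np.corrcoef(measured, predicted)[0, 1] ** 2)

# Hypothetical D_L values in m^2/s, for illustration only.
measured = np.array([10.0, 50.0, 100.0, 200.0])
predicted = np.array([12.0, 45.0, 110.0, 180.0])

scores = {
    "RMSE": rmse(measured, predicted),
    "MBE": mbe(measured, predicted),
    "mean DR": float(discrepancy_ratio(measured, predicted).mean()),
    "NSE": nse(measured, predicted),
    "R2": r2(measured, predicted),
}
```

Note that NSE can become negative when a model predicts worse than the mean of the measurements, whereas R^{2} defined this way stays between 0 and 1; this is why the two indices can diverge strongly for a biased model.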

In addition to statistical indices presented in Table 4, the model performance was assessed using predicted and measured *D _{L}* values to calculate the standard deviation (STD), centered root mean square difference (RMSD), and correlation coefficient (R), as summarized by the Taylor diagram (Taylor 2001).
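The three quantities summarized by a Taylor diagram are not independent. As a reminder of why they can be drawn on a single plot, the relation below links the centered root mean square difference to the two standard deviations and the correlation coefficient. The symbols σ_m and σ_p (standard deviations of the measured and predicted values) and E′ (centered RMSD) are notation introduced here for illustration.

$$E'^{2} = \sigma_{m}^{2} + \sigma_{p}^{2} - 2\,\sigma_{m}\,\sigma_{p}\,R$$

This has the form of the law of cosines, so a model plotted at radius σ_p and angle arccos(R) lies at a distance E′ from the measured point located at radius σ_m on the horizontal axis.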

## RESULTS AND DISCUSSION

### Results of LS-SVM

The results of predicting the value of *D _{L}* using LS-SVM model in 10 consecutive runs are shown in Table 5. In addition, the best results among 10 consecutive runs of LS-SVM models for G1-65 datasets are presented in Figure 4.

| Data group | Eq. Abb. | RMSE | SE | MBE | DR | R^{2} | NSE |
|---|---|---|---|---|---|---|---|
| G1-65 | KF | 51 | 35 | −10.7 | −0.16 | 0.85 | 0.85 |
| | SD | 46 | 31 | 8.9 | 0.03 | 0.90 | 0.88 |
| | ET | 68 | 38 | −13.2 | −0.09 | 0.76 | 0.73 |
| | LI | 39 | 26 | −4.3 | −0.07 | 0.91 | 0.91 |
| | ZH | 52 | 33 | −10.9 | −0.06 | 0.88 | 0.84 |
| | DI | 63 | 33 | −10.1 | 0.00 | 0.84 | 0.76 |
| | SG | 52 | 32 | −0.6 | 0.03 | 0.85 | 0.84 |
| | WA | 76 | 40 | −21.9 | −0.12 | 0.77 | 0.66 |
| | AL | 49 | 32 | −8.3 | −0.15 | 0.88 | 0.86 |
| | LS-SVM | 58 | 24 | −0.6 | 0.10 | 0.76 | 0.75 |
| G2-116 | KF | 252 | 94 | 34.9 | −0.06 | 0.46 | −0.20 |
| | SD | 316 | 116 | 80.7 | 0.14 | 0.43 | −0.90 |
| | ET | 190 | 81 | −9.5 | −0.03 | 0.40 | 0.31 |
| | LI | 274 | 97 | 55.5 | 0.05 | 0.46 | −0.43 |
| | ZH | 201 | 80 | 9.5 | 0.03 | 0.44 | 0.23 |
| | DI | 239 | 93 | 6.7 | 0.07 | 0.27 | −0.09 |
| | SG | 285 | 100 | 37.5 | 0.11 | 0.31 | −0.54 |
| | WA | 180 | 78 | −27.1 | −0.05 | 0.41 | 0.39 |
| | AL | 283 | 95 | 45.7 | −0.08 | 0.48 | −0.52 |
| | LS-SVM | 82 | 39 | 2.6 | 0.08 | 0.88 | 0.87 |
| G3-188 | KF | 213 | 86 | 25.1 | 0.01 | 0.45 | −0.07 |
| | SD | 257 | 92 | 49.1 | 0.15 | 0.43 | −0.56 |
| | ET | 167 | 69 | −16.1 | 0.00 | 0.40 | 0.34 |
| | LI | 226 | 80 | 31.4 | 0.07 | 0.46 | −0.20 |
| | ZH | 172 | 68 | 0.6 | 0.07 | 0.44 | 0.31 |
| | DI | 201 | 78 | 0.9 | 0.14 | 0.29 | 0.05 |
| | SG | 236 | 87 | 27.9 | 0.19 | 0.32 | −0.31 |
| | WA | 161 | 71 | −21.9 | 0.03 | 0.41 | 0.39 |
| | AL | 234 | 77 | 20.6 | −0.07 | 0.47 | −0.29 |
| | LS-SVM | 76 | 33 | −1.95 | 0.13 | 0.87 | 0.86 |

Note: Rank 1 (Bold, Italic, and Underline), Rank 2 (Bold and Italic) and Rank 3 (Bold).

As can be seen in Figure 4(a), the accuracy of the LS-SVM model in the training phase is better than in the test phase. Figure 4(b) clearly shows that the error in the prediction of *D _{L}* is greater in the test phase than in the training phase. Figure 5 shows the best results for the LS-SVM model in 10 consecutive runs of the model for G2-116 datasets.

As can be seen in Figure 5(a), based on the values of RMSE and R^{2}, the accuracy of the model in the training phase is better than in the testing phase; however, the value of R^{2} in G2-116 datasets in the testing stage is higher than that of the G1-65 datasets. Figure 5(b) shows that errors in the prediction of *D _{L}* are similar in train and test phases (G2-116 datasets). Furthermore, the best results of the LS-SVM model in 10 consecutive runs for G3-188 datasets are shown in Figure 6.

Figure 6(a) shows similar results to those in Figures 4(a) and 5(a). In general, as can be seen in Figures 4–6, the LS-SVM model has an appropriate performance in the prediction of longitudinal dispersion coefficient for all datasets. In addition, the results show that as the number of datasets increases, the value of the R^{2} increases, especially at the testing phase, so that the value of R^{2} increases from 0.46 in the G1-65 datasets to 0.81 in the G3-188 datasets.

### Comparing the results of models

The statistical criteria were calculated for the nine best-performing empirical equations and the results are presented in Table 5. In addition, the results of the LS-SVM model were calculated based on the average of 10 consecutive runs.

Table 5 shows the statistical criteria for the top-ranked empirical equations among those listed in Table 1. Six empirical equations, namely Elder (1959), Fischer (1975), Liu (1977), Seo & Cheong (1998), Deng *et al.* (2001) and Riahi-Madvar *et al.* (2019), are omitted from Table 5 because of their poor results. The results in Table 5 can be interpreted from two different aspects: (1) the values of the statistical indices in different datasets and (2) the best equations in different datasets.

The best statistical criteria in Table 5 were obtained for the G1-65 datasets: the average values of RMSE and R^{2} for G1-65 are 55 and 0.84, while they are 230 and 0.45 for G2-116, respectively. The results indicate that the average and standard deviation of a dataset have a significant impact on the prediction of *D _{L}*. The reason is that, based on Table 2, the AVG and STD of the G2-116 datasets are higher than those of the other two datasets. Another reason for the low accuracy of the empirical equations in predicting *D _{L}* in the G2-116 datasets is the low correlation of *D _{L}* with the river and flow characteristics (Figure 2(b)).

The best equations in the G1-65 datasets (Table 5) are Li *et al.* (2013) (LI) and Sahay & Dutta (2009) (SD); the main reason is that these equations were derived from the same datasets (G1-65).

In G2-116 and G3-188 datasets, the best results are related to the LS-SVM method, so that LS-SVM in G2-116 and G3-188 datasets obtained the best values of RMSE, SE, MBE, DR, R^{2} and NSE. After the LS-SVM method, Etemad-Shahidi & Taghipour (2012) (ET), Zeng & Huai (2014) (ZH) and Wang *et al.* (2017) (WA) are the most accurate methods. The error indices (RMSE and SE) in the prediction of *D _{L}* using empirical equations and the LS-SVM method are shown in Figure 7.

As can be seen in Figure 7(a), the LS-SVM method has the lowest SE (SE = 22) for the prediction of *D _{L}* with the G1-65 datasets, although the Li *et al.* (2013) (LI) equation (with RMSE = 39) achieved the lowest RMSE in these datasets. In addition, Figure 7(b) shows that the LS-SVM method, with RMSE = 62 and SE = 36, has the best performance in the prediction of *D _{L}* in the G2-116 datasets. Furthermore, as shown in Figure 7(c), the LS-SVM method (with RMSE = 60 and SE = 31) has the best results for the prediction of *D _{L}* in the G3-188 datasets.

Figure 7(b) and 7(c) illustrate the low accuracy of the empirical equations in predicting the longitudinal dispersion coefficient, especially in wide rivers (three measurements at the Mississippi River, rows 91–93, in the appendix Table 8). This shows that the empirical equations are inappropriate for the prediction of *D _{L}* in wide rivers. Figure 8 shows the variation of the dispersion coefficient with river width for the three datasets.

As can be seen in Figure 8(a)–8(c), the prediction accuracy of LS-SVM improved as the river width increased. On the other hand, Figure 8(c) shows that the lowest accuracy of LS-SVM in predicting the longitudinal dispersion coefficient occurs at the lowest measured values (rows No. 117, 143 and G3-188 in the appendix Table 8). For further comparison, the percentage of the DR values in different ranges of discrepancy is shown in Figure 9.

The accuracy of each model may be categorized by the number of DR values between −0.3 and 0.3, relative to the total number of datasets. This range was selected because the maximum acceptable error in predicting *D _{L}* relative to the corresponding measured values is ±100% (Kashefipour & Falconer 2002). As can be seen in Figure 9(a)–9(c), the LS-SVM method obtained the best accuracy: in the G1-65, G2-116 and G3-188 datasets about 83, 72 and 70% of DR values, respectively, are in the range of −0.3 to 0.3. Figure 10(a)–10(c) shows the consistency of the measured and predicted values of longitudinal dispersion coefficients in the G1-65, G2-116, and G3-188 datasets, based on the values of R^{2} and NSE.
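The accuracy measure described above, the share of DR values falling between −0.3 and 0.3, can be sketched as follows. The sketch assumes that DR is the base-10 logarithm of the predicted-to-measured ratio, under which the limits ±0.3 correspond roughly to predictions between half and twice the measured value. The function name and the sample values are hypothetical.

```python
import numpy as np

def dr_accuracy(measured, predicted, limit=0.3):
    """Percentage of samples whose DR lies within [-limit, limit]."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    dr = np.log10(predicted / measured)   # assumed definition of DR
    return 100.0 * float(np.mean(np.abs(dr) <= limit))

# Hypothetical values: ratios of 1.0, 1.5, 2.5 and 0.4.
accuracy = dr_accuracy([10.0, 10.0, 10.0, 10.0], [10.0, 15.0, 25.0, 4.0])
```

In this toy case two of the four predictions fall inside the band, so the accuracy is 50%.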

It can be generally observed from Figure 10(a)–10(c) that the LS-SVM model yields suitable predictions of the measured data. The best correlation between measured and predicted values of *D _{L}* is achieved when the LS-SVM method is applied, with values of 0.92, 0.93 and 0.92 in the G1-65, G2-116 and G3-188 datasets, respectively.

The Taylor diagram (Figure 11) was used to compare different performance indices. According to Taylor's diagram, the closer a model is to the measured point, the more accurate it is in predicting the longitudinal dispersion coefficient.

Figure 11(b) and 11(c) indicate that the LS-SVM method considerably increased the accuracy of the prediction of *D _{L}*. Furthermore, the models can be ranked by efficiency as follows: LS-SVM, Etemad-Shahidi & Taghipour (2012), Zeng & Huai (2014), Wang *et al.* (2017), Alizadeh *et al.* (2017a) and Li *et al.* (2013). The superiority of the LS-SVM method over the previous equations is clear from Taylor's diagram.

To compare the model results for *D _{L}* estimation from narrow rivers (B/H < 10) to very wide rivers (B/H > 100), the statistical criteria RMSE, MBE, and R^{2} were calculated for five categories of aspect ratio (B/H < 10; 10 < B/H < 30; 30 < B/H < 50; 50 < B/H < 100 and B/H > 100). The results of the 10 superior models by aspect ratio range are given in Table 6.

| Model | RMSE (B/H < 10) | MBE (B/H < 10) | R^{2} (B/H < 10) | RMSE (10 < B/H < 30) | MBE (10 < B/H < 30) | R^{2} (10 < B/H < 30) | RMSE (30 < B/H < 50) | MBE (30 < B/H < 50) | R^{2} (30 < B/H < 50) | RMSE (50 < B/H < 100) | MBE (50 < B/H < 100) | R^{2} (50 < B/H < 100) | RMSE (B/H > 100) | MBE (B/H > 100) | R^{2} (B/H > 100) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| KF | 201 | 166 | 0.34 | 110 | 20 | 0.16 | 225 | 21 | 0.17 | 303 | 36 | 0.53 | 123 | −12 | 0.68 |
| SD | 28 | 7 | 0.95 | 91 | −3 | 0.31 | 178 | 16 | 0.20 | 368 | 94 | 0.45 | 471 | 253 | 0.97 |
| ET | 14 | 8 | 0.99 | 107 | −21 | 0.13 | 136 | 2 | 0.22 | 255 | −42 | 0.41 | 113 | 20 | 0.74 |
| LI | 40 | 15 | 0.90 | 93 | −5 | 0.29 | 178 | 8 | 0.19 | 337 | 68 | 0.48 | 318 | 149 | 0.94 |
| ZH | 166 | −135 | 0.96 | 55 | −24 | 0.30 | 97 | −21 | 0.21 | 165 | −48 | 0.44 | 105 | 71 | 0.85 |
| DI | 81 | 63 | 0.90 | 87 | 4 | 0.37 | 227 | 20 | 0.13 | 279 | −37 | 0.33 | 152 | 34 | 0.69 |
| SG | 196 | 162 | 0.82 | 105 | 29 | 0.26 | 283 | 31 | 0.12 | 308 | 15 | 0.39 | 185 | 19 | 0.58 |
| WA | 166 | 128 | 0.94 | 94 | 1 | 0.27 | 126 | −12 | 0.22 | 254 | −80 | 0.43 | 60 | 1 | 0.88 |
| AL | 12 | −6 | 0.99 | 97 | −24 | 0.33 | 234 | 24 | 0.17 | 352 | 75 | 0.52 | 175 | 26 | 0.73 |
| LS-SVM | 26 | 10 | 0.97 | 47 | 4 | 0.87 | 41 | −1 | 0.92 | 89 | −14 | 0.93 | 50 | 4 | 0.91 |

*Note*: bold numbers show the best results for each criterion between models.

Table 6 demonstrates that in narrow rivers (B/H < 10) the AL equation (Alizadeh *et al.* 2017a) was the most accurate model for *D*_{L} estimation, with RMSE = 12 m^{2} s^{−1}, MBE = −6 m^{2} s^{−1}, and R^{2} = 0.99. In every other aspect ratio range, the LS-SVM model was the most accurate model for *D*_{L} estimation. Its coefficient of determination stays high from narrow to very wide rivers, ranging from R^{2} = 0.87 (10 < B/H < 30) to 0.97 (B/H < 10), which indicates high performance in comparison with the other models. In wide rivers (50 < B/H < 100), the LS-SVM model provides the best results with RMSE = 89 m^{2} s^{−1}, MBE = −14 m^{2} s^{−1} and R^{2} = 0.93. The comparison of the predictions of this study's model with those of the existing models shows that the LS-SVM model is superior to the others: it has the lowest RMSE in every range above B/H = 10 and the highest R^{2} for 10 < B/H < 100, and the gap is largest in wide rivers (B/H > 50). For B/H > 100 its R^{2} of 0.91 is exceeded only by the SD (0.97) and LI (0.94) equations, whose RMSE values are far larger.
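The aspect ratio breakdown described above can be sketched as follows. This is a minimal illustration, assuming MBE is the mean of predicted minus measured values and R^{2} is the squared Pearson correlation; the function names, the handling of values that fall exactly on a class boundary, and the sample numbers are hypothetical and do not come from the paper.

```python
import numpy as np

# Class boundaries taken from the text: B/H = 10, 30, 50, 100
EDGES = [10.0, 30.0, 50.0, 100.0]
LABELS = ["B/H < 10", "10 < B/H < 30", "30 < B/H < 50",
          "50 < B/H < 100", "B/H > 100"]

def metrics(measured, predicted):
    """RMSE, MBE and R^2 for one group of samples."""
    err = predicted - measured
    rmse = float(np.sqrt(np.mean(err ** 2)))
    mbe = float(np.mean(err))               # assumed sign: predicted - measured
    r2 = float(np.corrcoef(measured, predicted)[0, 1] ** 2)
    return rmse, mbe, r2

def metrics_by_aspect_ratio(width, depth, measured, predicted):
    """Group samples by B/H class and evaluate each class separately."""
    width, depth, measured, predicted = (
        np.asarray(a, dtype=float) for a in (width, depth, measured, predicted)
    )
    # np.digitize puts a value equal to an edge into the upper class
    classes = np.digitize(width / depth, EDGES)
    out = {}
    for k, label in enumerate(LABELS):
        mask = classes == k
        if mask.sum() >= 2:                 # a correlation needs two or more points
            out[label] = metrics(measured[mask], predicted[mask])
    return out

# Hypothetical samples, for illustration only
width = [5.0, 8.0, 9.0, 40.0, 50.0, 44.0]            # B (m)
depth = [1.0, 1.0, 1.0, 2.0, 2.0, 2.0]               # H (m)
measured = [10.0, 20.0, 30.0, 100.0, 200.0, 150.0]   # D_L (m^2/s)
predicted = [12.0, 18.0, 33.0, 110.0, 230.0, 140.0]  # D_L (m^2/s)
result = metrics_by_aspect_ratio(width, depth, measured, predicted)
for label, (rmse, mbe, r2) in result.items():
    print(f"{label}: RMSE={rmse:.1f}, MBE={mbe:.1f}, R2={r2:.2f}")
```

Evaluating each class separately, rather than pooling all rivers, is what exposes models that do well on narrow rivers but degrade on wide ones.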

### Comparison with previous studies

Riahi-Madvar *et al.* (2019) presented the best equations for predicting the longitudinal dispersion coefficients as Pareto optimal multigene genetic programming (POMGGP) (Liu 1977; Sattar & Gharabaghi 2015; Alizadeh *et al.* 2017a). They concluded that the AL and SG equations were among the best equations for predicting the longitudinal dispersion coefficient. Alizadeh *et al.* (2017a) reported that DR fell between −0.3 and 0.3 in 63, 60 and 58% of the datasets for the particle swarm optimization (PSO), ET and SG methods, respectively. In the present study, the AL and ET equations, each with 61%, were the best empirical equations after the LS-SVM method with 72%. A comparison of the results of the present study with the previous studies is summarized in Table 7.

Researchers | No. of datasets | Method | RMSE | SE | Accuracy, −0.3 < DR < 0.3 (%) | R^{2} | NSE | d | PA (%) |
---|---|---|---|---|---|---|---|---|---|
Riahi-Madvar et al. (2019) | 503 | Pareto optimal multigene genetic programming (POMGGP) | 720 | 373 | – | 0.417 | 0.39 | 0.75 | – |
Alizadeh et al. (2017a) | 164 | Multi-objective particle swarm optimization (MOPSO) | 86.57 | 37.5 | 63 | 0.761 | 0.75 | – | – |
Sattar & Gharabaghi (2015) | 150 | Gene expression programming (GEP) | 464 | – | – | 0.80 | 0.80 | 0.93 | – |
Li et al. (2013) | 65 | Differential evolution (DE) | 38.96 | – | – | 0.913 | – | – | 56.92 |
Etemad-Shahidi & Taghipour (2012) | 149 | M5′ model tree (MT) | – | – | 63.1 | 0.36 | – | – | – |
Sahay & Dutta (2009) | 65 | Genetic algorithms (GA) | 45 | – | – | 0.902 | – | – | – |
This study | G1-65 | Least square support vector machines (LS-SVM) | 52 | 22 | 83 | 0.92 | 0.84 | 0.94 | 72 |
This study | G2-116 | Least square support vector machines (LS-SVM) | 62 | 36 | 72 | 0.93 | 0.94 | 0.98 | 63 |
This study | G3-188 | Least square support vector machines (LS-SVM) | 60 | 31 | 70 | 0.92 | 0.93 | 0.98 | 57 |

As Table 7 shows, the LS-SVM method predicts the longitudinal dispersion coefficient in natural streams robustly across a wide range of datasets.
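The accuracy column compared above, the share of samples with −0.3 < DR < 0.3, can be computed as in the sketch below. It assumes the definition of the discrepancy ratio that is common in this literature, DR = log10(predicted / measured); that definition, the function names and the sample values are assumptions for illustration rather than details confirmed by this section.

```python
import numpy as np

def discrepancy_ratio(measured, predicted):
    """DR = log10(predicted / measured); 0 is exact, positive is over-prediction."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.log10(predicted / measured)

def dr_accuracy(measured, predicted, limit=0.3):
    """Percentage of samples whose DR lies strictly inside (-limit, limit)."""
    dr = discrepancy_ratio(measured, predicted)
    return float(100.0 * np.mean(np.abs(dr) < limit))

# Hypothetical dispersion coefficients (m^2/s), for illustration only
measured = [10.0, 20.0, 40.0, 80.0]
predicted = [10.0, 45.0, 30.0, 20.0]
accuracy = dr_accuracy(measured, predicted)
print(f"accuracy = {accuracy:.0f}%")
```

Under this definition, |DR| < 0.3 means the prediction is within roughly a factor of two of the measurement, since 10^0.3 is about 2.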

## CONCLUSIONS

In this study, an LS-SVM method was applied to predict the longitudinal dispersion coefficient in natural streams. The performance of the LS-SVM was evaluated on three datasets (G1-65, G2-116 and G3-188) with different hydraulic and geometrical characteristics. From each dataset, 70% of the samples were used for the training phase, while the remaining 30% were used for the testing phase. In addition, the performance of previous empirical equations in predicting the longitudinal dispersion coefficient was evaluated on the same collected datasets. The most important findings of the research can be summarized as follows:

- The performance of the empirical equations depends significantly on the statistical properties of the datasets, and most of the empirical equations showed large errors on datasets with high variability. For example, in predicting *D*_{L}, RMSE ranged from 39 to 332 for G1-65 and from 180 to 53,256 for G2-116, the datasets with the lowest and highest mean and standard deviation, respectively.
- The LS-SVM method has a high capability in predicting the longitudinal dispersion coefficient in different datasets; however, when only a small number of data were used for the training and testing phases, its accuracy decreased, especially in the testing phase.
- In general, the Etemad-Shahidi & Taghipour (2012) (ET), Li *et al.* (2013) (LI), Zeng & Huai (2014) (ZH), Alizadeh *et al.* (2017a) (AL), Elder (1959) (EL), Fischer (1975) (FI), and Liu (1977) (LIU) equations, in that order, gave the best predictions of *D*_{L} across all three datasets (G1-65, G2-116 and G3-188).
- Some empirical equations, such as ET and AL, have two different formulas based on the ratio of B/H and show the most accurate results in predicting longitudinal dispersion coefficients in different datasets.
- The accuracy of the empirical equations in estimating *D*_{L} depends on the data series from which each equation was developed. For example, in the G1-65 and G2-116 datasets, empirical equations such as Sahay & Dutta (2009) (SD), Li *et al.* (2013) (LI), Zeng & Huai (2014) (ZH), and Wang *et al.* (2017) (WA) had the best accuracy in estimating *D*_{L}, because these equations were developed from the 65 and 116 data series.
- The LS-SVM method was strongest in predicting *D*_{L} in wide rivers (B = 533–711 m) and weakest at low longitudinal dispersion (*D*_{L} = 0.2–0.5 m^{2} s^{−1}).
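The 70/30 training and testing protocol summarized above can be sketched as follows. This is a minimal sketch, assuming the standard LS-SVM regression formulation with an RBF kernel, in which training reduces to solving one linear system. The kernel choice, the values of `gamma` and `sigma`, the function names and the synthetic data are assumptions for illustration; they are not the authors' software, tuning procedure or river datasets.

```python
import numpy as np

def rbf_kernel(a, b, sigma):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def lssvm_fit(x, y, gamma, sigma):
    """Solve the LS-SVM linear system for the bias b and multipliers alpha."""
    n = len(y)
    system = np.zeros((n + 1, n + 1))
    system[0, 1:] = 1.0                     # constraint row: sum(alpha) = 0
    system[1:, 0] = 1.0                     # column multiplying the bias b
    system[1:, 1:] = rbf_kernel(x, x, sigma) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    solution = np.linalg.solve(system, rhs)
    return solution[0], solution[1:]

def lssvm_predict(x_train, b, alpha, x_new, sigma):
    """Evaluate the fitted model at new points."""
    return rbf_kernel(x_new, x_train, sigma) @ alpha + b

# Synthetic stand-in data (NOT the river datasets): 100 samples, 2 features
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=(100, 2))
y = np.sin(2.0 * np.pi * x[:, 0]) + x[:, 1]

# Random 70/30 split into training and testing samples
order = rng.permutation(len(y))
n_train = int(round(0.7 * len(y)))
train, test = order[:n_train], order[n_train:]

gamma, sigma = 100.0, 0.3                   # hypothetical, untuned values
b, alpha = lssvm_fit(x[train], y[train], gamma, sigma)
y_pred = lssvm_predict(x[train], b, alpha, x[test], sigma)
rmse_test = float(np.sqrt(np.mean((y_pred - y[test]) ** 2)))
print(f"test RMSE = {rmse_test:.3f}")
```

Holding out the testing samples before fitting is what lets the testing-phase error reveal the loss of accuracy on small datasets noted in the findings.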

## CONFLICTS OF INTEREST

The authors declare no conflict of interest.

## CODE AVAILABILITY

The software used in this research will be made available by the corresponding author upon reasonable request.

## AUTHORS' CONTRIBUTIONS

Mehdi Mohammadi Ghaleni, Mahmood Akbari, Saeed Sharafi and Mohammad Javad Nahvinia conceived the study. MMG led the data analysis and prepared all figures. MA, SS, and MJN wrote the paper. All authors reviewed the paper and contributed to the discussions.

## DATA AVAILABILITY STATEMENT

All relevant data are included in the paper or its Supplementary Information.