In this study, the least square support vector machines (LS-SVM) method was used to predict the longitudinal dispersion coefficient (DL) in natural streams in comparison with the empirical equations in various datasets. To do this, three datasets of field data including hydraulic and geometrical characteristics of different rivers, with various statistical characteristics, were applied to evaluate the performance of LS-SVM and 15 empirical equations. The LS-SVM was evaluated and compared with developed empirical equations using statistical indices of root mean square error (RMSE), standard error (SE), mean bias error (MBE), discrepancy ratio (DR), Nash-Sutcliffe efficiency (NSE) and coefficient of determination (R2). The results demonstrated that LS-SVM method has a high capability to predict the DL in different datasets with RMSE = 58–82 m2 s−1, SE = 24–39 m2 s−1, MBE = −1.95–2.6 m2 s−1, DR = 0.08–0.13, R2 = 0.76–0.88, and NSE = 0.75–0.87 as compared with previous empirical equations. It can be concluded that the proposed LS-SVM model can be successfully applied to predict the DL for a wide range of river characteristics.

  • Least square support vector machines and 15 empirical equations were selected to predict longitudinal dispersion coefficient in natural streams.

  • Experimental datasets from streams around the world, comprising depth, width, mean velocity, shear velocity, and the longitudinal dispersion coefficient, were used.

  • Comprehensive statistical analysis was performed to evaluate the applied model accuracy.

Measured longitudinal dispersion
Predicted longitudinal dispersion
C: Cross-sectional average concentration
DL: Longitudinal dispersion coefficient
DR: Discrepancy ratio
g: Acceleration due to gravity
G1-65: Group 1 data including 65 data sets
G2-116: Group 2 data including 116 data sets
G3-188: Group 3 data including 188 data sets
H: Mean cross-sectional depth
LS-SVM: Least square support vector machines
MAE: Mean absolute error
MBE: Mean bias error
N: Number of observations
NSE: Nash-Sutcliffe efficiency
POMGGP: Pareto-optimal-multigene genetic programming
R: Hydraulic radius
R2: Coefficient of determination
RMSE: Root mean square error
S: Slope of the total energy line in downstream direction
t: Time of observation
U: Mean longitudinal velocity
U*: Bottom shear friction velocity
x: Longitudinal distance

Prediction of the longitudinal dispersion coefficient (DL) is a precondition for solving the advection-dispersion equation (ADE). Decision makers in the field of environmental science regard DL as an essential parameter for studying the longitudinal transport of pollutants in river systems. Accurate estimation of DL is required in several practical applications such as river engineering, design of water intakes, environmental engineering, and assessing the injection of hazardous contaminants in rivers (Alizadeh et al. 2017a; Memarzadeh et al. 2020). In pioneering studies, Taylor (1953, 1954) introduced the concept of the longitudinal dispersion coefficient and derived the following equation for one-dimensional dispersion in laminar pipe flow:
$$\frac{\partial C}{\partial t} + U\frac{\partial C}{\partial x} = D_L\frac{\partial^2 C}{\partial x^2} \tag{1}$$
where C is the cross-sectional averaged pollutant concentration, t is the time of observation, U is the cross-sectional averaged velocity, x is the distance from the injection point, and DL is the longitudinal dispersion coefficient of the pollutant.
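
To illustrate how DL enters Equation (1), a minimal Python sketch is given here. It assumes the standard one-dimensional advection-dispersion form, ∂C/∂t + U ∂C/∂x = DL ∂²C/∂x², with constant U and DL, and evaluates the classical solution for an instantaneous release of mass M over a cross-section of area A. The function name and all numerical values are illustrative assumptions, not taken from the study.

```python
import numpy as np

def ade_instantaneous(x, t, U, DL, M=1.0, A=1.0):
    """Concentration at distance x and time t > 0 after an instantaneous release.

    Classical Gaussian solution of the 1D advection-dispersion equation
    with constant velocity U and constant dispersion coefficient DL.
    """
    return M / (A * np.sqrt(4.0 * np.pi * DL * t)) * np.exp(
        -(x - U * t) ** 2 / (4.0 * DL * t)
    )

# Illustrative values only (not from the datasets of this study)
U, DL, t = 0.5, 30.0, 3600.0              # m/s, m2/s, s
x = np.linspace(-2000.0, 6000.0, 8001)    # m, 1 m spacing
C = ade_instantaneous(x, t, U, DL)

x_peak = x[np.argmax(C)]                  # the peak travels with the mean velocity
mass = np.sum(C) * (x[1] - x[0])          # integral of C dx recovers M / A
```

A larger DL spreads the same mass over a longer reach and lowers the peak concentration, which is why an accurate DL matters when assessing pollutant releases.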

Until now, many analytical and numerical solutions have been developed for ADE (Equation (1)) with different boundary conditions. It was generally found that the value of DL can be estimated by the pollutant concentration profile, stream velocity profile, or channel and flow parameters.

In general, studies on the prediction of the longitudinal dispersion coefficient can be divided into four main categories: tracer experiments, empirical equations, artificial intelligence (AI), and hybrid approaches combining machine learning and evolutionary algorithms.

Prediction of DL using tracer experiments has been performed by many researchers (Palancar et al. 2003; Seo & Baek 2004; Disley et al. 2015). Despite their high accuracy, these methods have many limitations, such as non-uniform flow characteristics along the river and large variations of velocity and concentration across the width and depth of the flow; they are also costly and time-consuming in both field and laboratory studies.

In the six decades since Elder (1959) presented the first empirical equations to predict DL, many researchers have proposed methods for highly accurate prediction of DL in natural streams. These methods include analytical, numerical and data-driven methods. In addition, there has been a great deal of research over the last decade on machine learning approaches to predict DL.

Fischer (1968) proposed the routing approach using an analytical solution of Equation (1) by numerical integration. Subsequently, numerical solutions were applied for the prediction of DL (Ramezani et al. 2019).

Since the publication of Fischer (1975), new and highly accurate equations have been introduced for determining DL (Deng et al. 2002; Kashefipour & Falconer 2002; Sahay & Dutta 2009; Etemad-Shahidi & Taghipour 2012; Sattar & Gharabaghi 2015; Haghiabi 2016; Wang & Huai 2016; Alizadeh et al. 2017b).

In this study, a complete set of empirical equations for predicting DL was selected from the literature (Table 1), and their results were compared with those of the least square support vector machine (LS-SVM) method.

Table 1

Empirical equations for prediction of DL

| Model | Developed equation | # dataset | Abbreviation |
|---|---|---|---|
| Elder (1959) |  | – | EL |
| Fischer (1975) |  | – | FI |
| Liu (1977) |  | – | LIU |
| Seo & Cheong (1998) |  | 59 | SC |
| Deng et al. (2001) |  | – | DE |
| Kashefipour & Falconer (2002) |  | 81 | KF |
| Sahay & Dutta (2009) |  | 65 | SD |
| Etemad-Shahidi & Taghipour (2012) |  | 149 | ET |
| Li et al. (2013) |  | 65 | LI |
| Zeng & Huai (2014) |  | 116 | ZH |
| Disley et al. (2015) |  | 56 | DI |
| Sattar & Gharabaghi (2015) |  | 150 | SG |
| Wang et al. (2017) |  | 116 | WA |
| Alizadeh et al. (2017a) |  | 164 | AL |
| Riahi-Madvar et al. (2019) |  | 503 | RI |

Note: B, H, U, U*, g, Fr, and DL are, respectively, width, depth, cross-sectional averaged velocity, shear velocity, acceleration due to gravity, Froude number and longitudinal dispersion coefficient. In the equation of Table 1, , , , and .

In recent decades, artificial intelligence approaches have been used for predicting DL, for example artificial neural networks (ANN) (Toprak & Cigizoglu 2008; Alizadeh et al. 2017b; Riahi-Madvar et al. 2019), adaptive neuro-fuzzy inference systems (ANFIS) (Riahi-Madvar et al. 2009; Noori et al. 2016), support vector machines (Noori 2009; Azamathulla & Wu 2011) and genetic programming (Sahay & Dutta 2009; Azamathulla & Ghani 2011). These studies reported satisfactory results in predicting the longitudinal dispersion coefficient.

On the other hand, such new methods as AI (Toprak & Cigizoglu 2008; Azamathulla & Wu 2011; Etemad-Shahidi & Taghipour 2012; Noori et al. 2016), evolutionary algorithms (Sahay & Dutta 2009; Li et al., 2013; Sattar & Gharabaghi 2015; Alizadeh et al. 2017b; Riahi-Madvar et al. 2019) and hybrid models (Najafzadeh & Tafarojnoruz 2016; Alizadeh et al. 2017a; Seifi & Riahi-Madvar 2019) have been used for predicting the DL.

It should be mentioned that previous research has mainly focused on developing a highly accurate method to predict DL for one specific dataset, and the accuracy of each method has not been investigated across different datasets. There are two general aspects to the prediction of DL: (1) the method used for prediction and (2) the datasets used to evaluate the accuracy of that method.

In this study, the LS-SVM method was used to predict the DL in natural streams. Furthermore, the results were compared with empirical equations (Table 1) in the various datasets. Although the new methods have increased the prediction accuracy of the longitudinal dispersion coefficient, the increase in accuracy greatly depends on the statistical characteristics of the used datasets. Therefore, one of the aims of this paper is to evaluate the accuracy of different methods for different datasets and to compare the results of previous methods with the new methods.

Experimental data

To evaluate the accuracy of the empirical equations and LS-SVM models for the prediction of DL, a wide range of longitudinal dispersion laboratory and field data was collected from different sources (Toprak & Cigizoglu 2008; Sahay & Dutta 2009; Li et al. 2013; Zeng & Huai 2014; Alizadeh et al. 2017a; Wang et al. 2017). The details of these datasets are presented in the appendix Table 8. Following the literature, these datasets were divided into three categories: (i) Table 8, rows 1–65, 'G1-65' (Sahay & Dutta 2009); (ii) Table 8, rows 1–116, 'G2-116' (Zeng & Huai 2014); and (iii) Table 8, rows 1–188, 'G3-188' (Alizadeh et al. 2017b). The goal of this categorization is to evaluate the accuracy of the models in predicting DL for various datasets.

The studied parameters consist of the depth, H (m), the width, B (m), the mean velocity, U (m s−1), the shear velocity of the flow, U* (m s−1) and the longitudinal dispersion coefficient, DL (m2 s−1) (Table 8). There are two reasons for selecting these datasets: (1) they have been used by many researchers (Seo & Cheong 1998; Deng et al. 2001; Kashefipour & Falconer 2002; Toprak & Cigizoglu 2008; Sahay & Dutta 2009; Etemad-Shahidi & Taghipour 2012; Li et al. 2013; Zeng & Huai 2014; Disley et al. 2015; Wang & Huai 2016; Alizadeh et al. 2017a), so the results are comparable with those obtained by other researchers; and (2) they represent a wide range of geometrical (B, H) and hydrodynamic (U, U*) parameters of natural streams. Differences in statistical properties across the datasets can reveal the ability of different methods to estimate longitudinal dispersion coefficients. In the following, the datasets used in the present study are analyzed in terms of statistical properties, frequency ratios and regression relationships.

Statistical characteristics of data groups

Statistical characteristics of the studied data groups (G1-65, G2-116 and G3-188), such as minimum (Min), maximum (Max), average (AVG), standard deviation (STD), coefficient of variation (CV), skewness (SKW) and kurtosis (KUT) of the datasets are shown in Table 2.

Table 2

Statistical characteristics of the studied datasets

| Data category | Par. | B | H | U | U* | DL | B/H | U/U* | DL/HU* |
|---|---|---|---|---|---|---|---|---|---|
| G1-65 | Min–Max | 11.9–202.7 | 0.2–4.0 | 0.0–1.7 | 0.0–0.6 | 1.9–836.1 | 13.6–151.1 | 1.3–17.0 | 5.9–8625.0 |
|  | AVG | 53.80 | 1.24 | 0.49 | 0.10 | 80.51 | 49.12 | 6.80 | 1083.17 |
|  | STD | 47.30 | 0.99 | 0.35 | 0.09 | 131.82 | 29.89 | 3.70 | 1408.53 |
|  | CV | 0.90 | 0.79 | 0.71 | 1.01 | 1.64 | 0.61 | 0.54 | 1.30 |
|  | SKW | 1.70 | 1.23 | 1.62 | 3.82 | 3.86 | 1.45 | 1.12 | 3.32 |
|  | KUT | 2.50 | 0.57 | 2.69 | 16.69 | 17.99 | 1.89 | 0.79 | 13.54 |
| G2-116 | Min–Max | 11.9–711.2 | 0.2–19.9 | 0.0–1.7 | 0.0–1.0 | 1.9–1486.5 | 12.5–1000.0 | 0.2–62.9 | 6.2–40,183.9 |
|  | AVG | 90.90 | 1.76 | 0.56 | 0.09 | 130.73 | 72.67 | 9.55 | 1937.89 |
|  | STD | 111.20 | 2.26 | 0.41 | 0.12 | 230.60 | 133.39 | 9.06 | 5267.28 |
|  | CV | 1.20 | 1.29 | 0.73 | 1.35 | 1.76 | 1.84 | 0.95 | 2.72 |
|  | SKW | 2.90 | 5.03 | 1.20 | 5.22 | 3.23 | 5.77 | 3.29 | 6.00 |
|  | KUT | 11.10 | 36.42 | 0.55 | 32.43 | 12.55 | 34.20 | 14.70 | 38.71 |
| G3-188 | Min–Max | 1.4–711.2 | 0.1–19.9 | 0.0–1.7 | 0.0–1.0 | 0.2–1486.5 | 2.2–1000.0 | 0.2–62.9 | 3.1–40,183.9 |
|  | AVG | 73.30 | 1.70 | 0.52 | 0.09 | 109.10 | 58.46 | 8.61 | 1533.40 |
|  | STD | 94.43 | 2.04 | 0.37 | 0.10 | 206.67 | 107.71 | 7.81 | 4484.85 |
|  | CV | 1.29 | 1.20 | 0.71 | 1.16 | 1.89 | 1.84 | 0.91 | 2.92 |
|  | SKW | 3.39 | 4.65 | 1.30 | 5.35 | 3.44 | 7.07 | 3.45 | 6.70 |
|  | KUT | 15.86 | 34.58 | 1.28 | 38.17 | 14.38 | 53.52 | 18.16 | 49.59 |

Note: Bold and Italic numbers show maximums among the datasets.

It can be seen in Table 2 that the dataset ranges (Min-Max) in the G3-188 datasets are greater than in the other two datasets (G1-65 and G2-116). The values of AVG and STD in the G2-116 data category are higher than in the other two data groups (G1-65 and G3-188). The values of CV, for some parameters (H, U, U* and U/U*) in G2-116 and for some parameters (B, DL, B/H and DL/HU*) in G3-188 are high. SKW and KUT for most of the parameters (e.g. B, U*, B/H, U/U* and DL/HU*) in the G3-188 data group are greater than in the other two data groups (G1-65 and G2-116).
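
The descriptive statistics of Table 2 can be reproduced for any column of data with a few lines of Python. The sketch below is a minimal illustration: it assumes the sample standard deviation (ddof = 1) for STD and CV, and simple moment-based skewness and excess kurtosis. Whether the original study used these or bias-corrected estimators is not stated, so small differences from Table 2 would be expected. The function name and the input values are hypothetical.

```python
import numpy as np

def describe(values):
    """Min, Max, AVG, STD, CV, skewness and excess kurtosis of a 1D sample."""
    v = np.asarray(values, dtype=float)
    mean = v.mean()
    std = v.std(ddof=1)                  # sample standard deviation (assumed)
    z = (v - mean) / v.std(ddof=0)       # standardized values for the shape measures
    return {
        "Min": v.min(),
        "Max": v.max(),
        "AVG": mean,
        "STD": std,
        "CV": std / mean,                # coefficient of variation
        "SKW": np.mean(z ** 3),          # moment-based skewness
        "KUT": np.mean(z ** 4) - 3.0,    # excess kurtosis (0 for a normal distribution)
    }

# Hypothetical river widths in metres, for illustration only
stats = describe([12.0, 25.0, 40.0, 60.0, 200.0])
```

A strongly positive SKW and KUT, as reported for most parameters in G2-116 and G3-188, indicate a few very large values (for example, very wide rivers) dominating the tail of the distribution.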

Frequency analysis

In order to provide an accurate interpretation of the data, the histograms of the ratios of B/H, U/U* and DL/HU* are shown in Figure 1(a)–1(c) for three datasets.

Figure 1

Frequency histogram of the datasets: (a) B/H; (b) U/U*; and (c) DL/HU*.

Figure 1 shows the variations of the frequency percentages for the ratios of B/H, U/U*, and DL/HU* for the studied datasets. As can be seen, the values of B/H, U/U* and DL/HU* for about 30, 40, and 50% of the data fall in the ranges 30–50, 5–10 and 200–1000, respectively. About 70% of the collected data (Table 8) were used for training (chosen randomly until the best training performance was obtained), while the remaining data (about 30%) were used for testing the models.

Regression analysis

The relationships between DL/HU* and B/H, U/U* and Fr are investigated for the G1-65, G2-116 and G3-188 data groups in Figure 2(a)–2(c), respectively. The coefficient of determination (R2) for the relationship between DL/HU* and U/U* is 0.253 (Figure 2(a)), which is the highest correlation found in the G1-65 datasets.

Figure 2

Relationship between DL/HU* vs. B/H, U/U* and Fr = U/√gH in (a) G1-65; (b) G2-116; and (c) G3-188 datasets.

In addition, as can be seen in Figure 2(b), the maximum value of R2 (=0.351) in the G2-116 datasets was observed between DL/HU* and B/H. It should be mentioned that the relationship between DL/HU* and Fr in the G1-65 datasets is very poor. In general, the relationships between DL/HU* and B/H, U/U* and Fr are not strong in any of the datasets, so methods that can estimate DL accurately for any dataset are needed.
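
The dimensionless groups and R2 values discussed above can be computed as in the following Python sketch. It uses Fr = U/√(gH) as given in the caption of Figure 2 and, as an assumption, a straight-line least-squares fit for R2; the regression form actually used in Figure 2 is not stated in the text. The function names and the input records are hypothetical and are not rows of Table 8.

```python
import numpy as np

G = 9.81  # acceleration due to gravity (m s-2)

def dimensionless_groups(B, H, U, Ustar, DL):
    """Return the ratios B/H, U/U*, Fr and DL/HU* for arrays of river records."""
    B, H, U, Ustar, DL = (np.asarray(a, dtype=float) for a in (B, H, U, Ustar, DL))
    return {
        "B/H": B / H,
        "U/U*": U / Ustar,
        "Fr": U / np.sqrt(G * H),
        "DL/HU*": DL / (H * Ustar),
    }

def r2_linear(x, y):
    """R2 of an ordinary least-squares straight line y = a x + b (assumed form)."""
    r = np.corrcoef(x, y)[0, 1]
    return r ** 2

# Hypothetical records, for illustration only
g = dimensionless_groups(
    B=[20, 35, 60, 90], H=[0.5, 1.0, 1.5, 2.0],
    U=[0.3, 0.4, 0.6, 0.8], Ustar=[0.05, 0.06, 0.08, 0.09],
    DL=[10, 25, 60, 150],
)
r2_aspect = r2_linear(g["B/H"], g["DL/HU*"])
```

Low R2 values from such single-variable fits are what motivates a multivariate, nonlinear model such as LS-SVM.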

Least square support vector machines (LS-SVM)

Support vector machines (SVMs) were first used for classification (Vapnik 1999); a version for regression was then proposed by Drucker et al. (1997). In this method, structural risk minimization is used to minimize the model error, whereas other methods (such as ANN) rely on empirical risk minimization (Cristianini et al. 2000; Dibike et al. 2001). In general, the SVM is used for classification into two or more groups and for regression analysis. In the standard SVM, the model is obtained by solving a quadratic programming problem, which makes training complex and time-consuming (Seyedzadeh et al. 2019). The LS-SVM instead uses equality constraints with a squared-error loss, so that training reduces to solving a set of linear equations.

One of the factors affecting the prediction accuracy of the LS-SVM model is the selection of an appropriate kernel function. In this study, the radial basis function (RBF) kernel was used. Figure 3 shows the flowchart of the LS-SVM model, and Table 3 presents the LS-SVM characteristics applied in the present study.

Table 3

Characteristics of LS-SVM in the study

Input variables: Width (B), Depth (H), Velocity (U), and Shear velocity (U*) of the flow
Target variable: Longitudinal dispersion coefficient (DL)
Function estimation: Gaussian
Kernel function: Radial basis function (RBF)
Tuning parameters (γ, σ2): γ = 10 and σ2 = 0.2
Selection function: Randomize selection (Randper's function)
Datasets ratio in train and test phases: 70 and 30%

Figure 3

A schematic of the flowchart used for the LS-SVM model.
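
The core of an LS-SVM regression can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: it solves the standard LS-SVM linear system for the bias b and the coefficients α, uses an RBF kernel of the form exp(−‖x − z‖²/σ2) (some toolboxes use 2σ2 in the denominator), takes γ = 10 and σ2 = 0.2 from Table 3, and applies a random 70/30 split. The data are synthetic stand-ins for scaled B, H, U and U*, and all function names are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, sigma2):
    """RBF kernel matrix between the rows of A and the rows of B."""
    d2 = np.sum(A ** 2, axis=1)[:, None] + np.sum(B ** 2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / sigma2)

def lssvm_fit(X, y, gamma=10.0, sigma2=0.2):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = X.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma2) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]               # bias b, coefficients alpha

def lssvm_predict(X_train, alpha, b, X_new, sigma2=0.2):
    """Prediction f(x) = sum_i alpha_i K(x, x_i) + b."""
    return rbf_kernel(X_new, X_train, sigma2) @ alpha + b

# Synthetic data standing in for scaled (B, H, U, U*) and DL
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(60, 4))
y = np.sin(3.0 * X[:, 0]) + X[:, 1] * X[:, 2] + 0.5 * X[:, 3]

idx = rng.permutation(len(X))            # random 70/30 split
n_train = round(0.7 * len(X))
tr, te = idx[:n_train], idx[n_train:]

b, alpha = lssvm_fit(X[tr], y[tr])
y_fit = lssvm_predict(X[tr], alpha, b, X[tr])
y_test = lssvm_predict(X[tr], alpha, b, X[te])
```

Because training is a single linear solve, repeated runs with different random splits are cheap. A larger γ fits the training data more closely, while σ2 controls how quickly the influence of each training point decays.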

Evaluation performance criteria

Many performance criteria have been used to evaluate models for the prediction of DL. In the present study, six statistical criteria, namely root mean square error (RMSE), standard error (SE), mean bias error (MBE), discrepancy ratio (DR), R2, and Nash-Sutcliffe efficiency (NSE), were applied to evaluate the accuracy of each model in predicting the longitudinal dispersion coefficient. The indices used are described in Table 4.

Table 4

Characteristics of evaluation performance criteria used in the present study

| Criteria | Formula | References | Best value |
|---|---|---|---|
| Standard error (SE) |  | Alizadeh et al. (2017a) |  |
| Root mean square error (RMSE) |  | Ma & Iqbal (1984) |  |
| Mean bias error (MBE) |  | Ma & Iqbal (1984) |  |
| Discrepancy ratio (DR) |  | Zeng & Huai (2014) |  |
| Nash-Sutcliffe efficiency (NSE) |  | Alizadeh et al. (2017a) |  |
| Coefficient of determination (R2) |  | Behar et al. (2015) |  |

In addition to statistical indices presented in Table 4, the model performance was assessed using predicted and measured DL values to calculate the standard deviation (STD), centered root mean square difference (RMSD), and correlation coefficient (R), as summarized by the Taylor diagram (Taylor 2001).
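
Several of the indices named above have widely used textbook definitions, which can be sketched in Python as follows. These are assumed standard forms and may differ in detail from the formulas of Table 4: MBE is taken here as predicted minus measured, and R2 as the squared Pearson correlation. SE is omitted because its exact definition cannot be recovered from the text. The function names and the sample values are hypothetical.

```python
import numpy as np

def rmse(obs, pred):
    """Root mean square error."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return np.sqrt(np.mean((pred - obs) ** 2))

def mbe(obs, pred):
    """Mean bias error, predicted minus measured (sign convention assumed)."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return np.mean(pred - obs)

def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 is perfect, below 0 is worse than the mean."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)

def r2(obs, pred):
    """Coefficient of determination as the squared Pearson correlation."""
    return np.corrcoef(obs, pred)[0, 1] ** 2

# Hypothetical measured and predicted DL values (m2 s-1)
obs = np.array([10.0, 20.0, 30.0, 40.0])
pred = np.array([12.0, 18.0, 33.0, 37.0])
scores = {"RMSE": rmse(obs, pred), "MBE": mbe(obs, pred),
          "NSE": nse(obs, pred), "R2": r2(obs, pred)}
```

Note that R2 can remain high when predictions are biased, whereas NSE penalizes bias, which is why a model can show a moderate R2 together with a negative NSE.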

Results of LS-SVM

The results of predicting the value of DL using LS-SVM model in 10 consecutive runs are shown in Table 5. In addition, the best results among 10 consecutive runs of LS-SVM models for G1-65 datasets are presented in Figure 4.

Table 5

Results of empirical equations for prediction of longitudinal dispersion coefficient

| Data group | Eq. Abb. | RMSE | SE | MBE | DR | R2 | NSE |
|---|---|---|---|---|---|---|---|
| G1-65 | KF | 51 | 35 | −10.7 | −0.16 | 0.85 | 0.85 |
|  | SD | 46 | 31 | 8.9 | 0.03 | 0.90 | 0.88 |
|  | ET | 68 | 38 | −13.2 | −0.09 | 0.76 | 0.73 |
|  | LI | 39 | 26 | −4.3 | −0.07 | 0.91 | 0.91 |
|  | ZH | 52 | 33 | −10.9 | −0.06 | 0.88 | 0.84 |
|  | DI | 63 | 33 | −10.1 | 0.00 | 0.84 | 0.76 |
|  | SG | 52 | 32 | −0.6 | 0.03 | 0.85 | 0.84 |
|  | WA | 76 | 40 | −21.9 | −0.12 | 0.77 | 0.66 |
|  | AL | 49 | 32 | −8.3 | −0.15 | 0.88 | 0.86 |
|  | LS-SVM | 58 | 24 | −0.6 | 0.10 | 0.76 | 0.75 |
| G2-116 | KF | 252 | 94 | 34.9 | −0.06 | 0.46 | −0.20 |
|  | SD | 316 | 116 | 80.7 | 0.14 | 0.43 | −0.90 |
|  | ET | 190 | 81 | −9.5 | −0.03 | 0.40 | 0.31 |
|  | LI | 274 | 97 | 55.5 | 0.05 | 0.46 | −0.43 |
|  | ZH | 201 | 80 | 9.5 | 0.03 | 0.44 | 0.23 |
|  | DI | 239 | 93 | 6.7 | 0.07 | 0.27 | −0.09 |
|  | SG | 285 | 100 | 37.5 | 0.11 | 0.31 | −0.54 |
|  | WA | 180 | 78 | −27.1 | −0.05 | 0.41 | 0.39 |
|  | AL | 283 | 95 | 45.7 | −0.08 | 0.48 | −0.52 |
|  | LS-SVM | 82 | 39 | 2.6 | 0.08 | 0.88 | 0.87 |
| G3-188 | KF | 213 | 86 | 25.1 | 0.01 | 0.45 | −0.07 |
|  | SD | 257 | 92 | 49.1 | 0.15 | 0.43 | −0.56 |
|  | ET | 167 | 69 | −16.1 | 0.00 | 0.40 | 0.34 |
|  | LI | 226 | 80 | 31.4 | 0.07 | 0.46 | −0.20 |
|  | ZH | 172 | 68 | 0.6 | 0.07 | 0.44 | 0.31 |
|  | DI | 201 | 78 | 0.9 | 0.14 | 0.29 | 0.05 |
|  | SG | 236 | 87 | 27.9 | 0.19 | 0.32 | −0.31 |
|  | WA | 161 | 71 | −21.9 | 0.03 | 0.41 | 0.39 |
|  | AL | 234 | 77 | 20.6 | −0.07 | 0.47 | −0.29 |
|  | LS-SVM | 76 | 33 | −1.95 | 0.13 | 0.87 | 0.86 |

Note: Rank 1 (Bold, Italic, and Underline), Rank 2 (Bold and Italic) and Rank 3 (Bold).

Figure 4

(a) Measured and predicted values of DL, (b) Errors in prediction, and (c) Distribution of errors of LS-SVM model for prediction of DL in G1-65 datasets.

As can be seen in Figure 4(a), the accuracy of the LS-SVM model in the training phase is better than in the test phase. Figure 4(b) clearly shows that the error in the prediction of DL is greater in the test phase than in the training phase. Figure 5 shows the best results of the LS-SVM model in 10 consecutive runs for the G2-116 datasets.

Figure 5

(a) Measured and predicted values of DL, (b) Errors in prediction, and (c) Distribution of errors of LS-SVM model for prediction of DL in G2-116 datasets.

As can be seen in Figure 5(a), based on the values of RMSE and R2, the accuracy of the model in the training phase is better than in the testing phase; however, the value of R2 in G2-116 datasets in the testing stage is higher than that of the G1-65 datasets. Figure 5(b) shows that errors in the prediction of DL are similar in train and test phases (G2-116 datasets). Furthermore, the best results of the LS-SVM model in 10 consecutive runs for G3-188 datasets are shown in Figure 6.

Figure 6

(a) Measured and predicted values of DL, (b) Errors in prediction, and (c) Distribution of errors of LS-SVM model for prediction of DL in G3-188 datasets. LDC, longitudinal dispersion coefficient.

Figure 6(a) shows similar results to those in Figures 4(a) and 5(a). In general, as can be seen in Figures 4–6, the LS-SVM model performs well in the prediction of the longitudinal dispersion coefficient for all datasets. In addition, the results show that as the number of datasets increases, the value of R2 increases, especially in the testing phase: R2 increases from 0.46 in the G1-65 datasets to 0.81 in the G3-188 datasets.

Comparing the results of models

The statistical criteria were calculated for the nine best-performing empirical equations, and the results are presented in Table 5. In addition, the results of the LS-SVM model were calculated as the average of 10 consecutive runs.

Table 5 shows the statistical criteria for the ten top-ranked models: LS-SVM and nine of the empirical equations listed in Table 1. Six empirical equations, namely Elder (1959), Fischer (1975), Liu (1977), Seo & Cheong (1998), Deng et al. (2001) and Riahi-Madvar et al. (2019), are not included in Table 5 because of their poor results. The results in Table 5 can be interpreted from two different aspects: (1) the values of the statistical indices in different datasets and (2) the best equations in different datasets.

The best statistical criteria in Table 5 were obtained for the G1-65 datasets: the average values of RMSE and R2 are 55 and 0.84 for G1-65, compared with 230 and 0.45 for G2-116. The results indicate that the average and standard deviation of the data have a significant impact on the prediction of DL, since the AVG and STD of the G2-116 datasets are higher than those of the other two datasets (Table 2). Another reason for the low accuracy of the empirical equations in the G2-116 datasets is the low correlation of DL with the river and flow characteristics (Figure 2(b)).

The best equations in the G1-65 datasets (Table 5) are those of Li et al. (2013) (LI) and Sahay & Dutta (2009) (SD); the main reason is that these equations were derived from the same datasets (G1-65).

In the G2-116 and G3-188 datasets, the best results are obtained by the LS-SVM method, which achieved the best values of RMSE, SE, R2 and NSE in both datasets, together with small values of MBE and DR. After the LS-SVM method, Etemad-Shahidi & Taghipour (2012) (ET), Zeng & Huai (2014) (ZH) and Wang et al. (2017) (WA) are the most accurate methods. The error indices (RMSE and SE) in the prediction of DL using the empirical equations and the LS-SVM method are shown in Figure 7.

Figure 7

Measured and predicted values of DL by empirical equations and LS-SVM model for (a) G1-65, (b) G2-116, and (c) G3-188 datasets.

As can be seen in Figure 7(a), the LS-SVM method has the lowest SE (SE = 22) for the prediction of DL in the G1-65 datasets, although the Li et al. (2013) (LI) equation achieved the lowest RMSE (RMSE = 39) in these datasets. In addition, Figure 7(b) shows that the LS-SVM method, with RMSE = 62 and SE = 36, has the best performance in the prediction of DL in the G2-116 datasets. Furthermore, as shown in Figure 7(c), the LS-SVM method (with RMSE = 60 and SE = 31) has the best results for the prediction of DL in the G3-188 datasets.

Figure 7(b) and 7(c) illustrate the low accuracy of the empirical equations in predicting the longitudinal dispersion coefficient, especially in wide rivers (three measurements at the Mississippi River, rows 91–93, in the appendix Table 8). This shows that the empirical equations are inappropriate for the prediction of DL in wide rivers. Figure 8 shows the variation of the dispersion coefficient with river width for the three datasets.

Figure 8

The variation of dispersion coefficient according to the river width for (a) G1-65, (b) G2-116 and (c) G3-188 datasets.

As can be seen in Figure 8(a)–8(c), the accuracy of the LS-SVM predictions improves with increasing river width. On the other hand, Figure 8(c) shows that the lowest accuracy of LS-SVM in predicting the longitudinal dispersion coefficient is associated with the lowest measured values (rows No. 117, 143 and G3-188 in the appendix Table 8). For further comparison, the percentage of DR values in different ranges of discrepancy is shown in Figure 9.

Figure 9

The histogram of DR values for six models (a) G1-65, (b) G2-116, and (c) G3-188 datasets.

The accuracy of each model may be characterized by the proportion of DR values between −0.3 and 0.3 relative to the total number of datasets. This range was selected because the maximum acceptable error in predicting DL relative to the corresponding measured values is ±100% (Kashefipour & Falconer 2002). As can be seen in Figure 9(a)–9(c), the LS-SVM method obtained the best accuracy: in the G1-65, G2-116 and G3-188 datasets, about 83, 72 and 70% of the DR values, respectively, lie between −0.3 and 0.3. Figure 10(a)–10(c) shows the consistency of the measured and predicted values of the longitudinal dispersion coefficient in the G1-65, G2-116, and G3-188 datasets, based on the values of R2 and NSE.
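
The accuracy measure described above can be sketched in Python as follows. The sketch assumes the definition DR = log10(predicted DL / measured DL), which is common in this literature but is not reproduced in the text here. Under that assumption, the band of ±0.3 corresponds to predictions within roughly a factor of two of the measurement, since 10^0.3 ≈ 2.0. The function names and sample values are hypothetical.

```python
import numpy as np

def discrepancy_ratio(obs, pred):
    """DR = log10(pred / obs) for each record (assumed definition)."""
    return np.log10(np.asarray(pred, float) / np.asarray(obs, float))

def accuracy_percent(obs, pred, band=0.3):
    """Percentage of records whose DR lies within [-band, band]."""
    dr = discrepancy_ratio(obs, pred)
    return 100.0 * np.mean(np.abs(dr) <= band)

# Hypothetical measured and predicted DL values (m2 s-1)
obs = np.array([10.0, 50.0, 100.0, 400.0])
pred = np.array([15.0, 20.0, 180.0, 900.0])
acc = accuracy_percent(obs, pred)   # ratios 1.5 and 1.8 fall inside the band
```

Because DR is a logarithmic ratio, it weights over- and under-prediction by the same factor equally, unlike RMSE, which is dominated by the largest DL values.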

Figure 10

Scatterplots of the measured and predicted values of longitudinal dispersion coefficients (a) G1-65, (b) G2-116, and (c) G3-188 datasets.

It can be generally observed from Figure 10(a)–10(c) that the LS-SVM model yields suitable predictions of the measured data. The best correlation between the measured and predicted values of DL is achieved by the LS-SVM method, with values of 0.92, 0.93 and 0.92 in the G1-65, G2-116 and G3-188 datasets, respectively.

The Taylor diagram (Figure 11) was used to compare different performance indices. According to Taylor's diagram, the closer the model to the measured point, the more accurate the model will be in predicting the longitudinal dispersion coefficient.
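
The three statistics summarized by a Taylor diagram are linked by a law-of-cosines relation, E′² = σp² + σm² − 2σpσmR, where E′ is the centered root mean square difference, σp and σm are the standard deviations of the predicted and measured values, and R is their correlation coefficient. A minimal Python sketch that computes the statistics and checks the relation is given here; the function name and sample values are hypothetical.

```python
import numpy as np

def taylor_stats(obs, pred):
    """Standard deviations, correlation and centered RMS difference."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    s_obs, s_pred = obs.std(), pred.std()          # population standard deviations
    r = np.corrcoef(obs, pred)[0, 1]
    crmsd = np.sqrt(np.mean(((pred - pred.mean()) - (obs - obs.mean())) ** 2))
    return s_obs, s_pred, r, crmsd

# Hypothetical measured and predicted DL values (m2 s-1)
s_obs, s_pred, r, crmsd = taylor_stats([10, 20, 30, 40], [12, 18, 33, 37])

# The same distance obtained from the law of cosines
crmsd_from_law = np.sqrt(s_obs ** 2 + s_pred ** 2 - 2.0 * s_obs * s_pred * r)
```

This relation is what allows a single point per model to encode all three statistics, with the distance to the measured point equal to the centered RMS difference.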

Figure 11

Taylor diagram (a) G1-65, (b) G2-116 and (c) G3-188 datasets.

Figure 11(b) and 11(c) indicate that the LS-SVM method considerably increased the accuracy of the prediction of DL. Furthermore, the models can be ranked by efficiency as follows: LS-SVM, Etemad-Shahidi & Taghipour (2012), Zeng & Huai (2014), Wang et al. (2017), Alizadeh et al. (2017a) and Li et al. (2013). The superiority of the LS-SVM method over the previous equations is clear from the Taylor diagram.

For comparing the model results for DL estimation from narrow rivers (B/H < 10) to very wide rivers (B/H > 100), the statistical criteria including RMSE, MBE, and R2 were calculated based on five categories of aspect ratio (B/H < 10; 10 < B/H < 30; 30 < B/H < 50; 50 < B/H < 100 and B/H > 100). The results of the 10 superior models based on aspect ratio ranges are illustrated in Table 6.

Table 6

Models’ results for DL estimation in aspect ratio class

| Models | B/H < 10: RMSE | B/H < 10: MBE | B/H < 10: R2 | 10 < B/H < 30: RMSE | 10 < B/H < 30: MBE | 10 < B/H < 30: R2 | 30 < B/H < 50: RMSE | 30 < B/H < 50: MBE | 30 < B/H < 50: R2 | 50 < B/H < 100: RMSE | 50 < B/H < 100: MBE | 50 < B/H < 100: R2 | B/H > 100: RMSE | B/H > 100: MBE | B/H > 100: R2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| KF | 201 | 166 | 0.34 | 110 | 20 | 0.16 | 225 | 21 | 0.17 | 303 | 36 | 0.53 | 123 | −12 | 0.68 |
| SD | 28 |  | 0.95 | 91 | −3 | 0.31 | 178 | 16 | 0.20 | 368 | 94 | 0.45 | 471 | 253 | 0.97 |
| ET | 14 |  | 0.99 | 107 | −21 | 0.13 | 136 |  | 0.22 | 255 | −42 | 0.41 | 113 | 20 | 0.74 |
| LI | 40 | 15 | 0.90 | 93 | −5 | 0.29 | 178 |  | 0.19 | 337 | 68 | 0.48 | 318 | 149 | 0.94 |
| ZH | 166 | −135 | 0.96 | 55 | −24 | 0.30 | 97 | −21 | 0.21 | 165 | −48 | 0.44 | 105 | 71 | 0.85 |
| DI | 81 | 63 | 0.90 | 87 |  | 0.37 | 227 | 20 | 0.13 | 279 | −37 | 0.33 | 152 | 34 | 0.69 |
| SG | 196 | 162 | 0.82 | 105 | 29 | 0.26 | 283 | 31 | 0.12 | 308 | 15 | 0.39 | 185 | 19 | 0.58 |
| WA | 166 | 128 | 0.94 | 94 | 1 | 0.27 | 126 | −12 | 0.22 | 254 | −80 | 0.43 | 60 | 1 | 0.88 |
| AL | 12 | −6 | 0.99 | 97 | −24 | 0.33 | 234 | 24 | 0.17 | 352 | 75 | 0.52 | 175 | 26 | 0.73 |
| LS-SVM | 26 | 10 | 0.97 | 47 |  | 0.87 | 41 | −1 | 0.92 | 89 | −14 | 0.93 | 50 |  | 0.91 |

Note: bold numbers show the best results for each criterion between models.

Table 6 shows that in narrow rivers (B/H < 10) the AL equation (Alizadeh et al. 2017a) was the most accurate model for DL estimation, with RMSE = 12 m2 s−1, MBE = −6 m2 s−1, and R2 = 0.99. In every other aspect ratio class, the LS-SVM model was the most accurate. Its coefficient of determination remains high from narrow to very wide rivers, ranging from R2 = 0.87 (10 < B/H < 30) to R2 = 0.97 (B/H < 10), in comparison with the other models. In wide rivers (50 < B/H < 100), the LS-SVM model provides the best results, with RMSE = 89 m2 s−1, MBE = −14 m2 s−1 and R2 = 0.93. Compared with the existing models, the LS-SVM model of this study is superior because it has the highest R2 and the lowest RMSE, especially in wide rivers (B/H > 50).
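The per-class evaluation behind Table 6 can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: MBE is taken as the mean of predicted minus measured values, R2 as the squared Pearson correlation, a ratio that falls exactly on a class boundary is assigned to the upper class, and all records are hypothetical.

```python
import math

# Aspect ratio classes from Table 6 as (label, lower, upper). Assigning a
# ratio that falls exactly on a boundary to the upper class is an assumption.
CLASSES = [
    ("B/H < 10", 0.0, 10.0),
    ("10 < B/H < 30", 10.0, 30.0),
    ("30 < B/H < 50", 30.0, 50.0),
    ("50 < B/H < 100", 50.0, 100.0),
    ("B/H > 100", 100.0, math.inf),
]

def rmse(meas, pred):
    return math.sqrt(sum((p - m) ** 2 for m, p in zip(meas, pred)) / len(meas))

def mbe(meas, pred):
    # Assumed sign convention: a positive value means overestimation.
    return sum(p - m for m, p in zip(meas, pred)) / len(meas)

def r2(meas, pred):
    # Assumed definition: squared Pearson correlation; NaN when undefined.
    n = len(meas)
    mean_m, mean_p = sum(meas) / n, sum(pred) / n
    sxx = sum((m - mean_m) ** 2 for m in meas)
    syy = sum((p - mean_p) ** 2 for p in pred)
    if sxx == 0.0 or syy == 0.0:
        return float("nan")
    sxy = sum((m - mean_m) * (p - mean_p) for m, p in zip(meas, pred))
    return sxy ** 2 / (sxx * syy)

def metrics_by_aspect_ratio(records):
    """records: iterable of (B, H, DL_measured, DL_predicted) tuples."""
    groups = {label: ([], []) for label, _, _ in CLASSES}
    for width, depth, dl_meas, dl_pred in records:
        ratio = width / depth
        for label, lower, upper in CLASSES:
            if lower <= ratio < upper:
                groups[label][0].append(dl_meas)
                groups[label][1].append(dl_pred)
                break
    # Report only the classes that actually contain data
    return {
        label: {"n": len(meas), "RMSE": rmse(meas, pred),
                "MBE": mbe(meas, pred), "R2": r2(meas, pred)}
        for label, (meas, pred) in groups.items() if meas
    }

# Hypothetical records: width B (m), depth H (m), measured and predicted DL
records = [
    (20.0, 4.0, 15.0, 18.0),
    (30.0, 4.0, 22.0, 20.0),
    (24.0, 3.0, 30.0, 33.0),
    (60.0, 3.0, 40.0, 52.0),
    (50.0, 2.0, 55.0, 47.0),
    (200.0, 1.5, 300.0, 280.0),
]
results = metrics_by_aspect_ratio(records)
for label, stats in results.items():
    print(label, stats)
```

Classes with very few samples give unstable statistics (R2 is undefined for a single point), which is worth keeping in mind when reading the narrow and very wide classes.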

Comparison with previous studies

Riahi-Madvar et al. (2019) used Pareto optimal multigene genetic programming (POMGGP) to identify the best equations for predicting the longitudinal dispersion coefficient (Liu 1977; Sattar & Gharabaghi 2015; Alizadeh et al. 2017a). They concluded that the AL and SG equations were among the best equations for predicting the longitudinal dispersion coefficient. Alizadeh et al. (2017a) reported that DR fell between −0.3 and 0.3 in 63, 60 and 58% of the datasets for the particle swarm optimization (PSO), ET and SG methods, respectively. In the present study, the AL and ET equations, each with 61%, were the best empirical equations after the LS-SVM method, with 72%. A comparison of the results of the present study with those of previous studies is summarized in Table 7.

Table 7

Comparison of this study with the results of previous studies

| Researchers | No. of datasets | Methods | RMSE | SE | Accuracy, −0.3 < DR < 0.3 (%) | R2 | NSE | d | PA (%) |
|---|---|---|---|---|---|---|---|---|---|
| Riahi-Madvar et al. (2019) | 503 | Pareto optimal multigene genetic programming (POMGGP) | 720 | 373 | – | 0.417 | 0.39 | 0.75 | – |
| Alizadeh et al. (2017a) | 164 | Multi-objective particle swarm optimization (MOPSO) | 86.57 | 37.5 | 63 | 0.761 | 0.75 | – | – |
| Sattar & Gharabaghi (2015) | 150 | Gene expression programming (GEP) | 464 | – | – | 0.80 | 0.80 | 0.93 | – |
| Li et al. (2013) | 65 | Differential evolution (DE) | 38.96 | – | – | 0.913 | – | – | 56.92 |
| Etemad-Shahidi & Taghipour (2012) | 149 | M5′ model tree (MT) | – | – | 63.1 | 0.36 | – | – | – |
| Sahay & Dutta (2009) | 65 | Genetic algorithms (GA) | 45 | – | – | 0.902 | – | – | – |
| This study | G1-65 | Least square support vector machines (LS-SVM) | 52 | 22 | 83 | 0.92 | 0.84 | 0.94 | 72 |
| This study | G2-116 | Least square support vector machines (LS-SVM) | 62 | 36 | 72 | 0.93 | 0.94 | 0.98 | 63 |
| This study | G3-188 | Least square support vector machines (LS-SVM) | 60 | 31 | 70 | 0.92 | 0.93 | 0.98 | 57 |

As can be seen in Table 7, the LS-SVM method shows a robust capability to predict the longitudinal dispersion coefficient in natural streams across a wide range of datasets.
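The accuracy column of Table 7 (the share of predictions with −0.3 < DR < 0.3) can be illustrated with a short sketch. It assumes the definition of the discrepancy ratio commonly used in this literature, DR = log10(predicted DL / measured DL); the function names and the sample values are hypothetical.

```python
import math

def discrepancy_ratios(measured, predicted):
    # Assumed definition: DR = log10(predicted / measured), so DR = 0 is a
    # perfect prediction, DR > 0 overestimates and DR < 0 underestimates.
    return [math.log10(p / m) for m, p in zip(measured, predicted)]

def accuracy_percent(measured, predicted, lower=-0.3, upper=0.3):
    """Percentage of predictions whose DR lies strictly inside (lower, upper)."""
    dr = discrepancy_ratios(measured, predicted)
    inside = sum(1 for d in dr if lower < d < upper)
    return 100.0 * inside / len(dr)

# Hypothetical DL values (m2 s-1), for illustration only
measured = [10.0, 20.0, 50.0, 100.0, 200.0]
predicted = [12.0, 45.0, 40.0, 30.0, 210.0]

accuracy = accuracy_percent(measured, predicted)
print(accuracy)  # 3 of the 5 predictions fall inside the band
```

Since 10^0.3 is approximately 2, the band −0.3 < DR < 0.3 corresponds to predictions within roughly a factor of two of the measured value.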

In this study, the LS-SVM method was applied to predict the longitudinal dispersion coefficient in natural streams. Its performance was evaluated on three datasets (G1-65, G2-116 and G3-188) with different hydraulic and geometrical characteristics. From each dataset, 70% of the samples were used for the training phase and the remaining 30% for the testing phase. In addition, the performance of previous empirical equations in predicting the longitudinal dispersion coefficient was evaluated on the collected datasets. The most important findings of the research can be summarized as follows:

  • The performance of the empirical equations depends significantly on the statistical properties of the datasets, and most of the empirical equations showed large errors on datasets with high variability. For example, RMSE ranges from 39 to 332 m2 s−1 for G1-65 and from 180 to 53,256 m2 s−1 for G2-116, the datasets with, respectively, the lowest and highest mean and standard deviation of DL.

  • The LS-SVM method has a high capability in predicting the longitudinal dispersion coefficient in different datasets; however, when only a small number of data are available for the training and testing phases, its accuracy is reduced, especially in the testing phase.

  • In general, among the empirical equations, those of Etemad-Shahidi & Taghipour (2012) (ET), Li et al. (2013) (LI), Zeng & Huai (2014) (ZH), Alizadeh et al. (2017a) (AL), Elder (1959) (EL), Fischer (1975) (FI), and Liu (1977) (LIU) ranked, in that order, as the best performing in predicting DL across all three datasets (G1-65, G2-116 and G3-188).

  • Some empirical equations, such as ET and AL, use two different formulas depending on the aspect ratio B/H, and these show the most accurate results in predicting the longitudinal dispersion coefficient in different datasets.

  • The accuracy of an empirical equation in estimating DL depends on the data series from which the equation was developed. For example, in the G1-65 and G2-116 datasets, empirical equations such as Sahay & Dutta (2009) (SD), Li et al. (2013) (LI), Zeng & Huai (2014) (ZH), and Wang et al. (2017) (WA) estimated DL most accurately, because these equations were developed from the 65 and 116 data series.

  • The LS-SVM method was strongest in predicting DL in wide rivers (B = 533–711 m) and weakest at low longitudinal dispersion (DL = 0.2–0.5 m2 s−1).
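The 70/30 training and testing procedure summarized above can be sketched with a compact LS-SVM regression in its standard linear-system form. Everything specific here is an assumption made for illustration: the RBF kernel, the values of GAMMA and SIGMA, the two synthetic input features and the helper names. The actual model uses the hydraulic and geometrical variables of the datasets and hyperparameters that are not restated in this section.

```python
import math
import random

def rbf(x, z, sigma):
    # Assumed kernel: Gaussian RBF with width sigma
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, z)) / (2.0 * sigma ** 2))

def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def lssvm_fit(X, y, gamma, sigma):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        A.append([1.0] + [rbf(X[i], X[j], sigma) + (1.0 / gamma if i == j else 0.0)
                          for j in range(n)])
    sol = solve(A, [0.0] + list(y))
    return sol[0], sol[1:]  # bias b, multipliers alpha

def lssvm_predict(X_train, b, alpha, sigma, x):
    return b + sum(a * rbf(x, xi, sigma) for a, xi in zip(alpha, X_train))

# Synthetic data standing in for scaled inputs and the target (hypothetical)
rng = random.Random(0)
X = [[rng.random(), rng.random()] for _ in range(20)]
y = [math.sin(3.0 * x1) + 0.5 * x2 for x1, x2 in X]

# 70% of the samples for training, the remaining 30% for testing
indices = list(range(len(X)))
rng.shuffle(indices)
n_train = (7 * len(X)) // 10
train_idx, test_idx = indices[:n_train], indices[n_train:]
X_train = [X[i] for i in train_idx]
y_train = [y[i] for i in train_idx]

GAMMA, SIGMA = 50.0, 0.5  # assumed hyperparameters, not tuned
b, alpha = lssvm_fit(X_train, y_train, GAMMA, SIGMA)

test_errors = [lssvm_predict(X_train, b, alpha, SIGMA, X[i]) - y[i] for i in test_idx]
test_rmse = math.sqrt(sum(e ** 2 for e in test_errors) / len(test_errors))
print(len(train_idx), len(test_idx), round(test_rmse, 4))
```

Training reduces to one linear solve whose size grows with the number of training samples, so every training point contributes to the fitted function. In practice GAMMA and SIGMA would be tuned, for example by cross-validation on the training set.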

The authors declare no conflict of interest.

The software used in this research is available from the corresponding author upon reasonable request.

Mehdi Mohammadi Ghaleni, Mahmood Akbari, Saeed Sharafi and Mohammad Javad Nahvinia conceived the study. MMG led the data analysis and prepared all figures. MA, SS, and MJN wrote the paper. All authors reviewed the paper and contributed to the discussions.

All relevant data are included in the paper or its Supplementary Information.

Alizadeh M. J., Ahmadyar D. & Afghantoloee A. 2017a Improvement on the existing equations for predicting longitudinal dispersion coefficient. Water Resources Management 31 (6), 1777–1794.

Alizadeh M. J., Shabani A. & Kavianpour M. R. 2017b Predicting longitudinal dispersion coefficient using ANN with metaheuristic training algorithms. International Journal of Environmental Science and Technology 14 (11), 2399–2410.

Azamathulla H. M. & Ghani A. A. 2011 Genetic programming for predicting longitudinal dispersion coefficients in streams. Water Resources Management 25 (6), 1537–1544.

Azamathulla H. M. & Wu F. C. 2011 Support vector machine approach for longitudinal dispersion coefficients in natural streams. Applied Soft Computing Journal 11 (2), 2902–2905. doi:10.1016/j.asoc.2010.11.026.

Cristianini N. & Shawe-Taylor J. 2000 An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge University Press, Cambridge.

Deng Z.-Q., Singh V. P. & Bengtsson L. 2001 Longitudinal dispersion coefficient in straight rivers. Journal of Hydraulic Engineering 127 (11), 919–927.

Deng Z.-Q., Bengtsson L., Singh V. P. & Adrian D. D. 2002 Longitudinal dispersion coefficient in single-channel streams. Journal of Hydraulic Engineering 128 (10), 901–916.

Dibike Y. B., Velickov S., Solomatine D. & Abbott M. B. 2001 Model induction with support vector machines: introduction and applications. Journal of Computing in Civil Engineering 15 (3), 208–216.

Disley T., Gharabaghi B., Mahboubi A. A. & McBean E. A. 2015 Predictive equation for longitudinal dispersion coefficient. Hydrological Processes 29 (2), 161–172.

Drucker H., Burges C. J. C., Kaufman L., Smola A. & Vapnik V. 1996 Support vector regression machines. Advances in Neural Information Processing Systems 9, 155–161.

Elder J. 1959 The dispersion of marked fluid in turbulent shear flow. Journal of Fluid Mechanics 5 (4), 544–560.

Etemad-Shahidi A. & Taghipour M. 2012 Predicting longitudinal dispersion coefficient in natural streams using M5′ model tree. Journal of Hydraulic Engineering 138 (6), 542–554.

Fisher H. B. 1968 Dispersion predictions in natural streams. Journal of the Sanitary Engineering Division 94 (5), 927–943.

Fischer H. B. 1975 Discussion of ‘simple method for predicting dispersion in streams’. Journal of the Environmental Engineering Division 101 (3), 453–455.

Kashefipour S. M. & Falconer R. A. 2002 Longitudinal dispersion coefficients in natural channels. Water Research 36 (6), 1596–1608.

Li X., Liu H. & Yin M. 2013 Differential evolution for prediction of longitudinal dispersion coefficients in natural streams. Water Resources Management 27 (15), 5245–5260.

Liu H. 1977 Predicting dispersion coefficient of streams. Journal of the Environmental Engineering Division 103 (1), 59–69.

Memarzadeh R., Zadeh H. G., Dehghani M., Riahi-Madvar H., Seifi A. & Mortazavi S. M. 2020 A novel equation for longitudinal dispersion coefficient prediction based on the hybrid of SSMD and whale optimization algorithm. Science of The Total Environment 716, 137007.

Noori R. 2009 Predicting the longitudinal dispersion coefficient using support vector. Environmental Engineering Science 26 (10), 1503–1510.

Noori R., Deng Z., Kiaghadi A. & Kachoosangi F. T. 2016 How reliable are ANN, ANFIS, and SVM techniques for predicting longitudinal dispersion coefficient in natural rivers? Journal of Hydraulic Engineering 142 (1), 04015039.

Palancar M. C., Aragón J. M., Sánchez F. & Gil R. 2003 The determination of longitudinal dispersion coefficients in rivers. Water Environment Research 75 (4), 324–335.

Ramezani M., Noori R., Hooshyaripor F., Deng Z. & Sarang A. 2019 Numerical modelling-based comparison of longitudinal dispersion coefficient formulas for solute transport in rivers. Hydrological Sciences Journal 64 (7), 808–819. https://doi.org/10.1080/02626667.2019.1605240.

Riahi-Madvar H., Ayyoubzadeh S. A., Khadangi E. & Ebadzadeh M. M. 2009 An expert system for predicting longitudinal dispersion coefficient in natural streams by using ANFIS. Expert Systems with Applications 36 (4), 8589–8596.

Riahi-Madvar H., Dehghani M., Seifi A. & Singh V. P. 2019 Pareto optimal multigene genetic programming for prediction of longitudinal dispersion coefficient. Water Resources Management 33 (3), 905–921.

Sattar A. M. A. & Gharabaghi B. 2015 Gene expression models for prediction of longitudinal dispersion coefficient in streams. Journal of Hydrology 524, 587–596. http://dx.doi.org/10.1016/j.jhydrol.2015.03.016.

Seifi A. & Riahi-Madvar H. 2019 Improving one-dimensional pollution dispersion modeling in rivers using ANFIS and ANN-based GA optimized models. Environmental Science and Pollution Research 26 (1), 867–885.

Seo I. W. & Cheong T. S. 1998 Predicting longitudinal dispersion coefficient in natural streams. Journal of Hydraulic Engineering 124, 25–32.

Seyedzadeh S., Rahimian F. P., Rastogi P. & Glesk I. 2019 Tuning machine learning models for prediction of building energy loads. Sustainable Cities and Society 47, 101484.

Taylor G. I. 1953 Dispersion of soluble matter in solvent flowing slowly through a tube. Proceedings of the Royal Society of London. Series A. Mathematical and Physical Sciences 219 (1137), 186–203.

Taylor G. 1954 The dispersion of matter in turbulent flow through a pipe. Proceedings of the Royal Society of London. Series A. Mathematical and Physical Sciences 223 (1155), 446–468.

Taylor K. E. 2001 Summarizing multiple aspects of model performance in a single diagram. Journal of Geophysical Research: Atmospheres 106 (D7), 7183–7192.

Toprak Z. F. & Cigizoglu H. K. 2008 Predicting longitudinal dispersion coefficient in natural streams by artificial intelligence methods. Hydrological Processes: An International Journal 22 (20), 4106–4129.

Vapnik V. N. 1999 An overview of statistical learning theory. IEEE Transactions on Neural Networks 10 (5), 988–999.

Wang Y. & Huai W. 2016 Estimating the longitudinal dispersion coefficient in straight natural rivers. Journal of Hydraulic Engineering 142 (11), 04016048.

Wang Y.-F., Huai W.-X. & Wang W.-J. 2017 Physically sound formula for longitudinal dispersion coefficients of natural rivers. Journal of Hydrology 544, 511–523.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Licence (CC BY 4.0), which permits copying, adaptation and redistribution, provided the original work is properly cited (http://creativecommons.org/licenses/by/4.0/).
