ABSTRACT
Estimating the sediment transport rate in rivers is important because direct measurement is difficult and costly, and the problem has drawn the attention of experts in water engineering. In this study, Gaussian process regression (GPR) is applied to predict the sediment transport rate for 19 gravel-bed rivers in the United States. To benchmark the performance of GPR, a support vector machine (SVM), a common kernel-based model, was also developed. Model inputs were prepared under two scenarios: the first considers only hydraulic characteristics, and the second combines hydraulic and sediment properties. The results revealed that the GPR models perform better than the SVM models and the empirical sediment transport formulas. Using the second scenario as input also led to better predictions. In addition, sensitivity analysis showed that the ratio of average velocity to shear velocity is the most effective parameter in predicting the sediment transport rate of gravel-bed rivers.
INTRODUCTION
For over half a century, there have been continuing efforts to enhance the understanding of the sediment transport process. Many investigations have proposed empirical relationships for predicting sediment transport in alluvial rivers (Yang et al. 2009). Despite the extensive application of empirical formulas, their prediction error is reported to be extremely high (Barry et al. 2004; Bathurst 2007; Roushangar et al. 2014). The transport rate of the sediment load carried by surface flow plays a major role in controlling river ecosystems and is one of the main parameters in the design, implementation and operation of hydraulic structures, irrigation, transfer and treatment of water, watershed management, and flood control. Owing to the importance of the sediment transport phenomenon, extensive research has been carried out in recent years using artificial intelligence (AI) methods. Among them, significant applications of the artificial neural network (ANN) method in sediment transport rate estimation have been reported in the literature (Bhattacharya et al. 2007; Doğan et al. 2007; Sasal et al. 2009; Yang et al. 2009; Kumar 2012). The remarkable performance of AI methods has motivated hydraulic and river engineers to develop more effective techniques with greater generalizability. Accordingly, Azamathulla et al. (2009) suggested the adaptive neuro-fuzzy inference system (ANFIS) as a flexible and more effective technique for predicting bed load. Azamathulla et al. (2010) conducted a case study on sediment load prediction and demonstrated the encouraging performance of the support vector machine (SVM). Ghani & Azamathulla (2014) offered gene expression programming (GEP) for developing a functional relationship for total sediment load in three Malaysian rivers. Having used ANFIS with GEP to model the total bed material load of the Qotur River, Roushangar et al. (2014) showed that models based on the stream power approach are more reliable than those based on the shear stress approach. Okcu et al. (2016) applied polynomial best subset regression (PBSR) to a database containing both river and flume measurements and developed a new equation for predicting total sediment load. Kitsikoudis et al. (2014) found that ANN and ANFIS surpass symbolic regression (SR) in bed load prediction for gravel-bed rivers. Roushangar & Koosheh (2015) introduced a hybrid method based on support vector regression (SVR) coupled with a genetic algorithm (GA) for quantifying the bed load transport rate in three gravel-bed rivers. Their hybrid model was more accurate in predicting low transport rates. Sahraei et al. (2017) introduced a prediction method based on least square support vector regression (LSSVR) with particle swarm optimization (PSO) for predicting total sediment load.
Gaussian process regression (GPR) is a relatively recent learning approach based on kernel functions. GPR yields probabilistic models, meaning that the Gaussian process provides a measure of reliability for its responses to the given input data (Yuan et al. 2008). In addition, the GPR method is flexible, as it can handle nonlinear problems, and non-parametric, as it does not assume a fixed parametric form for the regression function. Some previous studies have used GPR as a probabilistic stream flow forecaster (Sun et al. 2014; Zhu et al. 2018). In addition, promising applications of GPR in forecasting the daily seepage discharge of an earth dam (Roushangar et al. 2016), predicting stream water temperature (Grbić et al. 2013), and predicting urban water consumption (Roushangar & Alizadeh 2018) have been reported in the literature.
A detailed literature review showed that although some research has been conducted on GPR, none has applied it to sediment load prediction across a wide range of flow and sediment characteristics. The present study aims to investigate the capability of GPR in predicting the bed load and total load of gravel-bed rivers. An extensive database compiled from 19 gravel-bed rivers (King et al. 2004) was used to train and test the GPR models. Moreover, since the SVM is closely related to GPR in its use of kernel functions, the performance of the GPR approach was compared with SVM-based regression. The optimum input combination and the most important parameters in predicting the sediment transport rate are determined using sensitivity analysis.
MATERIALS AND METHODS
Study area and used data
The present study covers 19 gravel-bed rivers, for which data were collected by the US Forest Service in cooperation with other agencies. This database has become a robust source for engineers and researchers working on sediment transport (Recking 2010; Schneider et al. 2015). The dataset includes bed load, suspended load, and hydraulic measurements of gravel-bed rivers; additional information about the dataset and details of the methods used to measure the various types of data are presented in King et al. (2004). Parallel measurements of suspended load and bed load from 19 streams within the Snake River basin, with discharge ranging between 0.05 m³/s and 30 m³/s and varied hydraulic and sediment properties, were selected. Notably, at all sites the diameters d50 and d90 of the surface material were larger than those of the subsurface material, indicating the presence of an armor layer, which is the main characteristic of gravel-bed rivers. An armor layer establishes a stable boundary at low flows but creates complicated hydraulic conditions in floods due to sudden scouring of the finer subsurface material (Wang & Liu 2009). Some characteristics of the selected rivers are presented in Table 1.
River | Drainage area (km²) | Data for training | Data for testing | Total data | Slope (m/m) | d50,sur (mm) | Date of sampling | Range of discharge (m³/s) |
---|---|---|---|---|---|---|---|---|
Big Wood River | 349.7 | 17 | 8 | 26 | 0.0091 | 119 | 1999–2000 | 9.6–30.8 |
Bruneau River | 989 | 18 | 9 | 27 | 0.0054 | 27 | 1998–2002 | 4.7–20.9 |
Fourth Of July | 44.28 | 17 | 8 | 25 | 0.0202 | 51 | 1994–1995 | 0.2–3.8 |
Herd Creek | 292.6 | 15 | 7 | 22 | 0.0077 | 67 | 1994–1995 | 0.5–8.1 |
Jarbidge River | 79.25 | 18 | 9 | 26 | 0.0160 | 89 | 1998–2002 | 1.4–8 |
Johns Creek | 293.1 | 14 | 7 | 22 | 0.207 | 199.2 | 1986–1995 | 0.97–26 |
Little Slate Creek | 168.5 | 55 | 24 | 79 | 0.0268 | 98.1 | 1986–1997 | 0.52–15.7 |
Lolo Creek | 107.7 | 28 | 13 | 41 | 0.0097 | 67 | 1980–1997 | 1.8–16.2 |
Main Fork Red River | 129.3 | 77 | 33 | 110 | 0.0059 | 50.5 | 1986–1999 | 0.29–18.2 |
Marsh Creek | 191.5 | 18 | 9 | 27 | 0.0060 | 57 | 1994–1995 | 3.36–23.2 |
Rapid River | 279.5 | 50 | 22 | 72 | 0.0108 | 61.8 | 1986–2000 | 0.91–36.8 |
South Fork Red River | 97.8 | 67 | 30 | 97 | 0.0146 | 105.7 | 1986–1999 | 0.2–11 |
South Fork Salmon River | 853.6 | 35 | 16 | 51 | 0.0025 | 35 | 1985–1997 | 3.8–124.3 |
Squaw Creek (USGS)a | 192 | 22 | 10 | 32 | 0.0100 | 46.6 | 1994–1995 | 0.4–7.5 |
Thompson Creek | 58.1 | 16 | 8 | 24 | 0.0153 | 67.1 | 1994–1995 | 0.4–3.5 |
Trapper Creek | 22.2 | 60 | 27 | 87 | 0.0414 | 86.1 | 1985–1997 | 0.05–2.8 |
Hawley Creek | 104.8 | 45 | 20 | 65 | 0.0233 | 40 | 1990–1996 | 0.27–2.6 |
Salmon River near Obsidian | 243.9 | 14 | 6 | 19 | 0.0066 | 61.8 | 1990 | 11.44–20.9 |
Squaw Creek (USFS)b | 37.6 | 26 | 12 | 38 | 0.0240 | 23 | 1990–1996 | 0.18–1.5 |
aUSGS: United States Geological Survey.
bUSFS: United States Forest Service.
Gaussian process regression
If there are $n$ training data and $n_*$ test data, then $K(X, X_*)$ represents the matrix of covariances evaluated at all pairs of training and test data, and similarly for $K(X, X)$, $K(X_*, X)$, and $K(X_*, X_*)$; here, $X$ and $Y$ are the vectors of the training data and the training data labels $y_i$.
To acquire the hyperparameters, the partial derivative of Equation (3) can be taken with respect to $\sigma^2$ and $k$, and the minimization can be carried out by gradient descent. For more information about GPR and different covariance functions, readers are referred to Kuss (2006).
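The covariance blocks described above can be illustrated with a minimal sketch of GPR prediction. It assumes a zero-mean prior and an RBF covariance, which is a common textbook setup rather than a confirmed description of the models in this study. The function names, `gamma`, `noise_var` and the toy data are all hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def rbf_kernel(A, B, gamma=1.0):
    """RBF covariance between the rows of A and the rows of B (assumed form)."""
    sq_dist = (
        np.sum(A ** 2, axis=1)[:, None]
        + np.sum(B ** 2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def gp_predict(X, y, X_star, gamma=1.0, noise_var=1e-2):
    """Posterior mean and latent variance of a zero-mean GP at X_star."""
    K = rbf_kernel(X, X, gamma) + noise_var * np.eye(len(X))  # K(X, X) + noise
    K_s = rbf_kernel(X, X_star, gamma)                        # K(X, X*)
    K_ss = rbf_kernel(X_star, X_star, gamma)                  # K(X*, X*)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(K_ss) - np.sum(v ** 2, axis=0)
    return mean, var


# Toy one-dimensional data, not taken from the river database.
X = np.linspace(0.0, 1.0, 8).reshape(-1, 1)
y = np.sin(2.0 * np.pi * X).ravel()
X_star = np.array([[0.25], [0.5]])
mean, var = gp_predict(X, y, X_star)
```

The returned variance is what makes the prediction probabilistic: it quantifies how reliable each predicted value is, in addition to the mean itself.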
Support vector machine
The SVM formulation (Equation (4)) is similar to the Gaussian process formulation for regression. In fact, GPR is inspired by SVM's structure and formulation, and the two approaches offer different but equivalent perspectives on regression by applying a function f(.) directly to the input data points. Put differently, in the GPR case the data are assumed to be generated with Gaussian white noise around the function f, whereas in the SVM case the ɛ-insensitive error function can be considered a non-Gaussian likelihood or noise model.
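The contrast between the two noise models can be made concrete with a short sketch. The squared loss below corresponds to a Gaussian noise assumption up to a scaling constant, while the ɛ-insensitive loss ignores residuals smaller than ɛ. The function names, the value `eps=0.1` and the residuals are illustrative assumptions, not values from the study.

```python
def squared_loss(residual):
    """Loss implied by Gaussian noise (up to a constant factor)."""
    return 0.5 * residual ** 2


def eps_insensitive_loss(residual, eps=0.1):
    """SVM regression loss: zero inside the tube of half-width eps."""
    return max(0.0, abs(residual) - eps)


# Hypothetical residuals (observed minus predicted).
residuals = [-0.30, -0.05, 0.0, 0.08, 0.25]
svm_losses = [eps_insensitive_loss(r) for r in residuals]
gpr_losses = [squared_loss(r) for r in residuals]
```

Residuals inside the tube contribute nothing to the SVM objective, which is why only a subset of the training points (the support vectors) determines the fitted function.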
Among the various kernel functions presented in Table 2, the RBF kernel is reported to perform better than the others and was used in the present study (Pal et al. 2014; Azamathulla et al. 2016; Komasi et al. 2018).
Kernel type | Function | Kernel parameter |
---|---|---|
Linear | | – |
Polynomial | | d |
RBF | | γ |
Sigmoid | | , c |
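For orientation, the four kernel types listed in Table 2 can be sketched in their common textbook forms. These forms are assumptions: the polynomial offset of 1, the RBF written as exp(−γ‖u − v‖²) and the sigmoid parameter name `alpha` follow one widespread convention, and other sources parameterize the RBF kernel by a width instead.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    return dot(u, v)


def polynomial_kernel(u, v, d=2):
    # Offset of 1 is an assumed convention.
    return (dot(u, v) + 1.0) ** d


def rbf_kernel(u, v, gamma=1.0):
    # gamma scales the squared Euclidean distance (assumed convention).
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def sigmoid_kernel(u, v, alpha=1.0, c=0.0):
    # alpha is a hypothetical name for the slope parameter.
    return math.tanh(alpha * dot(u, v) + c)


u, v = [1.0, 2.0], [3.0, 4.0]
values = {
    "linear": linear_kernel(u, v),
    "polynomial": polynomial_kernel(u, v, d=2),
    "rbf": rbf_kernel(u, v, gamma=0.5),
    "sigmoid": sigmoid_kernel(u, v, alpha=0.1, c=0.0),
}
```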
Experimental setup
In the present study, the prediction process was coded in the MATLAB® environment. The regression learning toolbox, which is available for free, was partly used for prediction of the sediment transport rate. The optimal values of the capacity constant (C), the size of the error-insensitive zone (ɛ), the Gaussian noise and, most importantly, the kernel parameter (γ) considerably affect the modeling process. The optimum values of the Gaussian noise and the kernel parameter (γ) were obtained by trial and error. Furthermore, the parameters C and ɛ were optimized by a systematic grid search using cross-validation on the dimensionless training data.
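A systematic grid search with cross-validation can be sketched as follows. This is an illustration of the search procedure only: the stand-in model is a one-parameter ridge regression through the origin, not an SVM, and the names `fit`, `predict`, `lam` and the toy data are hypothetical. In the setting described above, the grid would instead span C and ɛ.

```python
import itertools
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices and deal them into k folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def grid_search_cv(X, y, param_grid, fit, predict, k=5):
    """Return the parameter set with the lowest cross-validated RMSE."""
    folds = k_fold_indices(len(X), k)
    names = sorted(param_grid)
    best_params, best_rmse = None, float("inf")
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        sq_err, count = 0.0, 0
        for held_out in folds:
            held = set(held_out)
            train = [i for i in range(len(X)) if i not in held]
            model = fit([X[i] for i in train], [y[i] for i in train], **params)
            for i in held_out:
                sq_err += (predict(model, X[i]) - y[i]) ** 2
                count += 1
        rmse = (sq_err / count) ** 0.5
        if rmse < best_rmse:
            best_params, best_rmse = params, rmse
    return best_params, best_rmse


def ridge_fit(xs, ys, lam):
    """Stand-in model: slope of a ridge regression through the origin."""
    return sum(a * b for a, b in zip(xs, ys)) / (sum(a * a for a in xs) + lam)


def ridge_predict(slope, x):
    return slope * x


# Toy data lying exactly on y = 2x, so the unpenalized fit is best.
X = [float(i) for i in range(1, 11)]
y = [2.0 * x for x in X]
best_params, best_rmse = grid_search_cv(
    X, y, {"lam": [0.0, 0.1, 1.0]}, ridge_fit, ridge_predict, k=5
)
```

Scoring every candidate on held-out folds, rather than on the data used for fitting, is what keeps the selected parameters from simply rewarding overfitting.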
Different models were defined using the mentioned parameters and after trial-and-error procedure, the best models were selected. Table 3 sums up the selected input combination for modeling bed load and total load in two scenarios based on flow conditions and sediment properties.
Scenario 1 (parameters of flow conditions), all states: models | Input variables | Scenario 2 (parameters of flow conditions and sediment properties), bed load: models | Input variables | Scenario 2, total load: models | Input variables |
---|---|---|---|---|---|
(I) | | BL(I) | | TL(I) | |
(II) | | BL(II) | | TL(II) | |
(III) | | BL(III) | | TL(III) | |
(IV) | | | | | |
Empirical approaches
Owing to the plethora of equations developed for the sediment transport rate, the results of the studies by Khorram & Ergil (2010a, 2010b) were used to choose the empirical approaches. They used more than 2,000 laboratory and 700 field data to investigate the efficiency of 75 different formulas for predicting bed load and total sediment load, and introduced the most appropriate equations for sand-bed and gravel-bed rivers separately. The selected formulas are presented in Tables 4 and 5.
Name | Approach | Formula |
---|---|---|
Parker et al. (1982); Pitlick et al. (2009) | Deterministic equal mobility method | |
Wilcock (2001); Pitlick et al. (2009) | Deterministic equal mobility method | |
Rottner (1959); Yang (1996) | Regression method | |
Engelund & Hansen (1967) | Regression method | |
Name | Approach | Formula |
---|---|---|
Ackers & White (1973); Yang (2006) | Energy balance concept | |
Graf & Acaroglu (1968) | Shear intensity | |
Karim (1998) | Regression method | |
Bhattacharya et al. (2007) | Regression analysis via machine learning |
, and : bed, suspended and total load transport rate per unit width (L²T⁻¹).
Q, : discharge of water and total sediment load (L³T⁻¹).
: sediment mobility parameter (–).
U*: shear velocity (LT⁻¹).
: sediment specific gravity (–).
and : density of water and sediment (ML⁻³).
and : specific weight of water and sediment (ML⁻²T⁻²).
: energy slope (m/m).
and : shear and critical shear stress at the bed (ML⁻¹T⁻²).
R: hydraulic radius (L).
V: average velocity (LT⁻¹).
D50: particle median size; 50% of the sample is finer (L).
: median particle size for subsurface bed zone (L).
: sediment particle diameter (L).
: dimensionless particle parameter (–).
: transport stage parameter (–).
: fall velocity of sediment particles (LT⁻¹).
g: acceleration due to gravity (LT⁻²).
: dimensionless intensity of the bed load transport rate (–).
: shear intensity parameter (–).
: transport parameter (–).
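Several of the quantities listed above combine into the dimensionless inputs used by the models (Fr, V/U*, R/D50 and θ). The sketch below uses standard open-channel definitions, which are assumptions here: shear velocity as the square root of g·R·S, the Froude number based on flow depth, and θ taken as the Shields-type dimensionless shear stress with a sediment specific gravity of 2.65. The function names and the example values are hypothetical.

```python
import math

G = 9.81  # acceleration due to gravity (m/s^2)


def shear_velocity(hydraulic_radius, slope, g=G):
    """U* = sqrt(g R S) for uniform flow (assumed definition)."""
    return math.sqrt(g * hydraulic_radius * slope)


def froude_number(velocity, depth, g=G):
    """Fr = V / sqrt(g h), with h the flow depth."""
    return velocity / math.sqrt(g * depth)


def mobility_parameter(hydraulic_radius, slope, d50, specific_gravity=2.65):
    """theta = R S / ((s - 1) D50), a Shields-type parameter (assumed)."""
    return hydraulic_radius * slope / ((specific_gravity - 1.0) * d50)


def dimensionless_inputs(velocity, depth, hydraulic_radius, slope, d50):
    """Assemble the four dimensionless inputs from SI measurements."""
    u_star = shear_velocity(hydraulic_radius, slope)
    return {
        "Fr": froude_number(velocity, depth),
        "V/U*": velocity / u_star,
        "R/D50": hydraulic_radius / d50,
        "theta": mobility_parameter(hydraulic_radius, slope, d50),
    }


# Hypothetical measurement, not a record from the database.
example = dimensionless_inputs(
    velocity=1.2, depth=0.5, hydraulic_radius=0.45, slope=0.01, d50=0.06
)
```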
Performance criteria
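The four criteria reported in the results (R, NSE, RMSE and RE) can be computed as sketched below. The correlation coefficient, Nash–Sutcliffe efficiency and root mean square error follow their usual definitions. The relative error is assumed here to be the mean absolute relative error in percent, which may differ from the definition used in the study. The function names and sample values are hypothetical.

```python
import math


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def pearson_r(obs, pred):
    """Pearson correlation coefficient between observed and predicted."""
    mo, mp = sum(obs) / len(obs), sum(pred) / len(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov / math.sqrt(var_o * var_p)


def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit."""
    mo = sum(obs) / len(obs)
    num = sum((o - p) ** 2 for o, p in zip(obs, pred))
    den = sum((o - mo) ** 2 for o in obs)
    return 1.0 - num / den


def relative_error_percent(obs, pred):
    """Mean absolute relative error in percent (assumed definition).

    Requires nonzero observed values.
    """
    return 100.0 * sum(abs(o - p) / abs(o) for o, p in zip(obs, pred)) / len(obs)


# Hypothetical observed and predicted values.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
scores = {
    "R": pearson_r(observed, predicted),
    "NSE": nse(observed, predicted),
    "RMSE": rmse(observed, predicted),
    "RE": relative_error_percent(observed, predicted),
}
```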
RESULTS AND DISCUSSION
Owing to the limited availability of information about sediment transport and the smaller number of flow variables, scenario 1 was defined based on flow conditions alone. In scenario 2, six models based on flow conditions and sediment properties were developed for predicting bed load and total load. The values of the performance criteria obtained from the GPR and SVM models based on scenario 1 are presented in Table 6. The statistical parameters (RMSE, R, NSE, and RE) show that the scenario 1 models are not accurate enough for predicting the bed load transport rate, whereas the estimated and observed values for total load are in good agreement; a good approximation of the total sediment load can therefore be achieved using only hydraulic characteristics. The inability of the first scenario to predict the bed load transport rate appears to arise because bed load transport is more strongly affected by characteristics of the bed layer, such as the median particle diameter, so that using hydraulic characteristics alone as inputs gives insufficient results. Conversely, suspended load, which is more affected by flow conditions, dominates in the studied gravel-bed rivers and constitutes almost 75% of the total sediment load on average. Hydraulic parameters therefore appear to be the decisive factors for prediction of total sediment load. In scenario 1, the most accurate estimations correspond to model (II), which uses three input parameters (Table 3). In scenario 2, different combinations of input variables were developed after a trial-and-error process according to flow conditions and sediment properties. Comparing the results of the two scenarios demonstrates the superiority of scenario 2 in quantifying the bed load and total load transport rates. The results of the employed methods for the models based on scenario 2 are listed in Table 7.
The best models for predicting bed load and total load were, respectively, BL(II) and TL(II), with input parameters Fr, V/U*, R/D50, and θ for bed load and Fr, V/U*, θ, and one further parameter for total sediment load. According to the results presented in Table 7, using four inputs ensures the best performance, and increasing the number of inputs did not improve the accuracy of the employed methods. According to the NSE values for total load, comparing TL(I) and TL(II) shows that introducing the particle mobility parameter θ and one further parameter in place of two others improves the accuracy of the models to NSE = 0.894 (GPR) and NSE = 0.858 (SVM) for the test set. Furthermore, in the case of bed load, adding two parameters and omitting two others leads to better outcomes, with NSE = 0.831 (GPR) and NSE = 0.806 (SVM). However, according to the performance statistics presented in Table 7, GPR shows more flexibility and provides better prediction capability for both the BL(I) and BL(II) models. The results of models BL(I) and TL(I) revealed that combining two sediment-related parameters with Fr and one other influential parameter produces relatively accurate predictions of the sediment transport rate in gravel-bed rivers. Therefore, it can be assumed that the dimensionless shear stress and the dimensionless median particle size are effective parameters in the prediction of sediment transport rates. Among the kernel-based methods used, a detailed comparison of overall performance shows that GPR predicts the sediment transport rate reasonably better than SVM. The scatter plots of the model predictions for the test set, which includes 278 points, are depicted in Figure 1. Because the data are highly dispersed at low sediment transport rates, and to allow a clearer comparison, the scatter plots are shown on a logarithmic scale.
Load | Model | Method | Train R | Train NSE | Train RMSE (t/day) | Train RE (%) | Test R | Test NSE | Test RMSE (t/day) | Test RE (%) |
---|---|---|---|---|---|---|---|---|---|---|
Bed load | (I) | SVM | 0.796 | 0.575 | 0.042 | 5.31 | 0.781 | 0.564 | 0.046 | 8.07 |
Bed load | (I) | GPR | 0.824 | 0.676 | 0.037 | 5.92 | 0.814 | 0.663 | 0.040 | 8.23 |
Bed load | (II) | SVM | 0.809 | 0.615 | 0.040 | 5.88 | 0.779 | 0.565 | 0.045 | 7.99 |
Bed load | (II) | GPR | 0.842 | 0.705 | 0.035 | 5.53 | 0.819 | 0.672 | 0.039 | 6.48 |
Bed load | (III) | SVM | 0.792 | 0.590 | 0.041 | 4.62 | 0.765 | 0.570 | 0.045 | 8.39 |
Bed load | (III) | GPR | 0.829 | 0.685 | 0.036 | 5.90 | 0.814 | 0.660 | 0.040 | 8.32 |
Bed load | (IV) | SVM | 0.807 | 0.620 | 0.040 | 5.55 | 0.788 | 0.601 | 0.044 | 8.54 |
Bed load | (IV) | GPR | 0.841 | 0.706 | 0.035 | 5.35 | 0.830 | 0.688 | 0.038 | 6.55 |
Total load | (I) | SVM | 0.916 | 0.794 | 0.026 | 6.35 | 0.876 | 0.713 | 0.037 | 9.39 |
Total load | (I) | GPR | 0.921 | 0.845 | 0.022 | 5.61 | 0.903 | 0.775 | 0.032 | 8.54 |
Total load | (II) | SVM | 0.948 | 0.895 | 0.018 | 7.27 | 0.919 | 0.843 | 0.027 | 9.09 |
Total load | (II) | GPR | 0.947 | 0.893 | 0.018 | 4.50 | 0.927 | 0.850 | 0.026 | 5.48 |
Total load | (III) | SVM | 0.914 | 0.784 | 0.027 | 4.61 | 0.910 | 0.775 | 0.029 | 7.34 |
Total load | (III) | GPR | 0.934 | 0.871 | 0.020 | 5.17 | 0.899 | 0.749 | 0.034 | 8.89 |
Total load | (IV) | SVM | 0.934 | 0.868 | 0.021 | 8.98 | 0.912 | 0.827 | 0.028 | 11.77 |
Total load | (IV) | GPR | 0.942 | 0.884 | 0.019 | 4.79 | 0.915 | 0.820 | 0.029 | 6.20 |
Load | Model | Method | Train R | Train NSE | Train RMSE (t/day) | Train RE (%) | Test R | Test NSE | Test RMSE (t/day) | Test RE (%) |
---|---|---|---|---|---|---|---|---|---|---|
Bed load | BL(I) | SVM | 0.912 | 0.823 | 0.027 | 8.88 | 0.887 | 0.786 | 0.032 | 12.43 |
Bed load | BL(I) | GPR | 0.941 | 0.883 | 0.022 | 4.79 | 0.934 | 0.870 | 0.025 | 7.24 |
Bed load | BL(II) | SVM | 0.916 | 0.810 | 0.028 | 7.21 | 0.898 | 0.806 | 0.030 | 10.14 |
Bed load | BL(II) | GPR | 0.926 | 0.854 | 0.024 | 5.78 | 0.916 | 0.831 | 0.028 | 8 |
Bed load | BL(III) | SVM | 0.916 | 0.835 | 0.026 | 10.06 | 0.874 | 0.738 | 0.035 | 15.36 |
Bed load | BL(III) | GPR | 0.914 | 0.831 | 0.026 | 6.04 | 0.899 | 0.800 | 0.030 | 9.15 |
Total load | TL(I) | SVM | 0.953 | 0.908 | 0.017 | 9.66 | 0.888 | 0.775 | 0.032 | 15.68 |
Total load | TL(I) | GPR | 0.965 | 0.928 | 0.015 | 4.77 | 0.930 | 0.865 | 0.025 | 8.21 |
Total load | TL(II) | SVM | 0.955 | 0.892 | 0.019 | 6 | 0.932 | 0.858 | 0.026 | 9.07 |
Total load | TL(II) | GPR | 0.968 | 0.935 | 0.014 | 4.15 | 0.948 | 0.894 | 0.022 | 6.56 |
Total load | TL(III) | SVM | 0.946 | 0.888 | 0.019 | 13.16 | 0.891 | 0.778 | 0.032 | 17.31 |
Total load | TL(III) | GPR | 0.977 | 0.954 | 0.012 | 4.39 | 0.941 | 0.882 | 0.023 | 9.44 |
Figure 2 illustrates NSE values of different γ values of the employed GPR and SVM models (fed with the BL(II) and TL(II) as the best input combinations). In the case of RBF kernel, γ indicates the optimal width of kernel function. From the figure it can be seen that the NSE values fluctuate with varying γ values. Considering the SVM approach, small values of γ lead to the risk of overfitting (as a result of ignoring most of the support vectors). Conversely, GPR provides better performance with smaller γ values and is less threatened by the danger of overfitting. Moreover, in contrast to the SVM method, a clear smooth change of NSE values with variation of γ values can be seen in utilization of the GPR approach.
Results of empirical equations
The results of the selected empirical equations, compared with those of GPR and SVM for predicting bed load and total load, are shown in Figure 3. Based on the RMSE values, none of the empirical equations is sufficiently precise. An important point about empirical methods is that the existing equations were developed in laboratories under specific flow conditions and sediment particle features; these equations therefore show acceptable results under particular conditions, but their applicability to field data with varied hydraulic conditions is questionable. Moreover, developing a single equation that quantifies the bed load and total load rates for all streams seems to be impossible.
Sensitivity analysis
In this step, sensitivity analysis is used to investigate the effect of the different parameters on the sediment transport process. The superior four-input models for bed load and total load were selected, and the importance of each parameter was evaluated by eliminating it in turn. According to the results of the sensitivity analysis presented in Table 8, the ratio of average to shear velocity (V/U*), which represents the flow resistance in open channels, has the most significant effect in quantifying the sediment transport rate in gravel-bed rivers. Furthermore, the Froude number is common to the superior models and plays an important role in predicting bed and total sediment load. Meanwhile, the results of the sensitivity analysis (Table 8) show that, for prediction of total sediment load, eliminating θ from the list of input variables improves the generalization ability of the GPR approach (R = 0.951, NSE = 0.901, RMSE = 0.021, and RE = 7.26%). From this analysis, it can be inferred that the GPR approach with three inputs is able to successfully predict total load transport in a great variety of gravel-bed rivers.
Best model | Eliminated variable | Method | Train R | Train NSE | Train RMSE (t/day) | Train RE (%) | Test R | Test NSE | Test RMSE (t/day) | Test RE (%) |
---|---|---|---|---|---|---|---|---|---|---|
BL(II) | Fr | SVM | 0.833 | 0.658 | 0.038 | 5.89 | 0.804 | 0.641 | 0.041 | 7.83 |
BL(II) | Fr | GPR | 0.889 | 0.789 | 0.029 | 5.14 | 0.874 | 0.706 | 0.037 | 7.69 |
BL(II) | V/U* | SVM | 0.446 | 0.173 | 0.059 | 21.84 | 0.233 | 0.002 | 0.069 | 24.48 |
BL(II) | V/U* | GPR | 0.530 | 0.278 | 0.055 | 13.70 | 0.400 | 0.154 | 0.064 | 15.52 |
BL(II) | | SVM | 0.896 | 0.760 | 0.031 | 6.37 | 0.877 | 0.737 | 0.035 | 9.57 |
BL(II) | | GPR | 0.933 | 0.869 | 0.023 | 4.89 | 0.900 | 0.807 | 0.030 | 6.59 |
BL(II) | | SVM | 0.903 | 0.792 | 0.029 | 10.31 | 0.884 | 0.778 | 0.032 | 13.60 |
BL(II) | | GPR | 0.921 | 0.845 | 0.025 | 6.14 | 0.900 | 0.781 | 0.032 | 8.71 |
TL(II) | Fr | SVM | 0.879 | 0.727 | 0.030 | 3.17 | 0.847 | 0.603 | 0.043 | 7.56 |
TL(II) | Fr | GPR | 0.962 | 0.924 | 0.016 | 4 | 0.947 | 0.882 | 0.023 | 6.04 |
TL(II) | V/U* | SVM | 0.607 | 0.205 | 0.051 | 8.32 | 0.273 | 0.027 | 0.068 | 9.75 |
TL(II) | V/U* | GPR | 0.801 | 0.640 | 0.034 | 9.55 | 0.332 | 0.064 | 0.067 | 13.87 |
TL(II) | θ | SVM | 0.951 | 0.897 | 0.018 | 13.35 | 0.863 | 0.738 | 0.035 | 16.92 |
TL(II) | θ | GPR | 0.970 | 0.940 | 0.014 | 4.34 | 0.951 | 0.901 | 0.021 | 7.26 |
TL(II) | | SVM | 0.966 | 0.932 | 0.015 | 9.09 | 0.891 | 0.782 | 0.032 | 12.57 |
TL(II) | | GPR | 0.970 | 0.940 | 0.014 | 4.5 | 0.947 | 0.887 | 0.023 | 7.48 |
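The elimination procedure used in the sensitivity analysis can be sketched as a loop that drops one input at a time and records how much the score falls. The `score` callable is hypothetical: in practice it would retrain the model on the reduced input set and return a test-set criterion such as NSE. The toy scorer and its weights below are made up for illustration and are not results from the study.

```python
def elimination_sensitivity(inputs, score):
    """Drop each input in turn and record the resulting loss in score."""
    baseline = score(tuple(inputs))
    drops = {}
    for name in inputs:
        reduced = tuple(v for v in inputs if v != name)
        drops[name] = baseline - score(reduced)
    # Inputs whose removal hurts most come first.
    ranked = sorted(drops, key=drops.get, reverse=True)
    return baseline, drops, ranked


# Hypothetical additive contributions, chosen only to exercise the loop.
TOY_WEIGHTS = {"Fr": 0.15, "V/U*": 0.60, "R/D50": 0.05, "theta": 0.08}


def toy_score(subset):
    """Stand-in for 'retrain the model and return the test NSE'."""
    return sum(TOY_WEIGHTS[name] for name in subset)


baseline, drops, ranked = elimination_sensitivity(
    ["Fr", "V/U*", "R/D50", "theta"], toy_score
)
```

A negative entry in `drops` would indicate that the model generalizes better without that input, which is the situation reported above for θ in the total load model.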
CONCLUSION
In this study, datasets from 19 gravel-bed streams and rivers located in the State of Idaho, USA, were used to demonstrate the capability of machine learning methods in predicting the sediment transport rate. Different combinations of non-dimensional parameters were developed under two scenarios, and the results were compared with empirical approaches. The employed GPR and SVM methods clearly outperform the empirical formulas. In predicting the bed load transport rate, the second scenario, based on flow conditions and sediment properties, is more accurate, while for total sediment load both scenarios lead to good outcomes. The inclusion of Fr, V/U*, R/D50, and θ as inputs resulted in the best accuracy for prediction of the bed load transport rate, whereas for total sediment load, using Fr, V/U*, θ, and one further parameter yielded the best results. Sensitivity analysis demonstrates the significant effect of V/U* on the sediment transport rate of gravel-bed rivers. The GPR model was quite accurate in predicting the sediment transport rates of gravel-bed rivers and performed better than the common SVM method. In addition, the Froude number plays an important role and is a common parameter in all superior models. However, GPR and SVM are data-driven models and the results presented here are data sensitive, so further studies using data from different rivers worldwide should be carried out to evaluate the effectiveness of the recommended models.