## Abstract

The sediment load transported by streams is a vital but highly nonlinear dynamic process in water resources management. In the current paper, two optimum predictive models based on artificial neural networks (ANNs) were developed. The employed inputs were then prioritized using diverse sensitivity analysis (SA) methods to obtain updated and more efficient ANN structures. The models were built on 263 processed datasets from three rivers in Idaho, USA, using nine different measured flow and sediment variables (e.g., channel geometry, geomorphology, hydraulics) over a period of 11 years. The parameters were selected based on prior knowledge from conventional analyses, in which the effect of suspended load on bed load was also investigated. Accuracy assessments using different criteria exhibited improved predictability in the updated models, which can lead to an advanced understanding of the parameters used. Although different SA methods were employed in evaluating the model parameters, almost similar results were observed, and these were then verified using relevant sensitivity indices. It was demonstrated that parameters ranked using SA can be more reliable because more uncertainties are covered. Evaluating the models using sensitivity indices showed that the contribution of the suspended load to the predicted bed load is not significant.

## INTRODUCTION

The sediment transported by rivers, as a complicated set of processes between stream flow and geologic, geomorphic, and organic factors, is an important but regionally specific concern in hydrology for understanding how rivers work (e.g., Melesse *et al.* 2011; Hajbabaei *et al.* 2017; Sari *et al.* 2017; Jin *et al.* 2018). Such sediments can be very informative in the assessment of engineering works (e.g., channels, reservoirs, and dams), geo-environmental and ecosystem impacts (e.g., protection of fish and wildlife habitats), and river basin management (e.g., soil erosion, transported sediments, and pollutants) (e.g., Kisi *et al.* 2012; Bouzeria *et al.* 2017; Jin *et al.* 2018). Thus, prediction of sediment loads has become an important issue in many countries when introducing schemes for river water monitoring. Modeling approaches are the common way to estimate transported sediment loads. However, the effects of the involved parameters on the predicted sediment loads, arising from the model structure and from hydrological, time-series, geological, geomorphological, and hydraulic input features, should be considered. The wide variety of involved parameters means that no universally accepted approach exists for predicting all types of sediment loads (Ma *et al.* 2017; Leimgruber *et al.* 2018; Asheghi & Hosseini 2020). This indicates why several modeling tools for simulating sediment loads have been developed and evaluated (e.g., Kisi *et al.* 2012; Bouzeria *et al.* 2017; Leimgruber *et al.* 2018; Asheghi & Hosseini 2020). However, the development of transported sediment models often requires identifying the uncertainty and the sensitivity of the system performance to any changes in the possible input data. This process assists not only in reducing the level of uncertainty but also, to an extent, in improving practicability (Gevrey *et al.* 2003; Saltelli *et al.* 2008; Razavi & Gupta 2015; Vu-Bac *et al.* 2016; Asheghi & Hosseini 2020).

Almost all of the developed conventional and analytical predictive sediment load models rely on regression techniques based on hydrologic engineering parameters or landscape features (e.g., Camenen & Larson 2008; Ahmad *et al.* 2010; Kumar 2012). Meanwhile, the deficiency of regression techniques in simulating the effects of auxiliary factors, the uncertainty involved in experimental tests, and their inaccurate predictions over wide ranges of expanded data (Cao *et al.* 2016; Asheghi *et al.* 2019) should be considered. The developed equations are therefore regionally specific, and their applicability to other areas can never be guaranteed. Owing to such drawbacks in the adopted equations, prediction of sediment loads using different variables is a challenging task in the field of computational hydrology (e.g., Melesse *et al.* 2011; Bouzeria *et al.* 2017; Jin *et al.* 2018). Despite the increased computing power for creating more sophisticated mathematical models, identifying the parameters most important to the predicted sediment loads using sensitivity analysis (SA) techniques can lead to more accurate predictive models for the sediment carried by a river.

In recent years, soft computing and data mining techniques and, in particular, artificial neural networks (ANNs) have successfully been applied, not only to capture complex nonlinear predictive sediment load models but also to overcome the inefficiencies of conventional methods and produce more precise results (e.g., Kisi *et al.* 2012; Afan *et al.* 2015; Bouzeria *et al.* 2017; Toriman *et al.* 2018; Asheghi *et al.* 2019). The main goal of ANN technology in dynamic environments such as rivers is to build a system that can change, adapt, and render the underlying processes computable using many different types of machine learning. For designing adaptive models that follow the evolving complexity of dynamic environments, ANN-based models are therefore an appropriate choice.

Such developed ANN-based models can then be analyzed by different SA methods to identify the importance of the input variables. The SA methods allow understanding of the behavior of scientific codes (Rabitz 1989) and play a crucial role in providing essential insights into model behavior, structure, and response to inputs (Razavi & Gupta 2015; Borgonovo & Plischke 2016; Jin *et al.* 2018). Subsequently, removing the less effective factors not only leads to simpler and more cost-effective models but also reduces the design and analysis time (Storlie *et al.* 2009; Abbaszadeh Shahri 2016; Asheghi *et al.* 2019). This issue is gaining importance in water resource engineering for explaining the nonlinear relationships between the explicative and response variables of a problem (e.g., Bahremand & De Smedt 2008; Razavi & Gupta 2015; Leimgruber *et al.* 2018).

Applying SA compels the decision-maker to identify the variables that affect the forecasts and indicates the critical variables for which additional information may be obtained. It helps to expose inappropriate estimations and thus guides the decision-maker to concentrate on the relevant variables. Owing to the influence of various uncertain parameters on the behavior of transported sediment loads, there is a need to identify and rank the importance of the input factors on the model output.

In this paper, two ANN-based predictive models for suspended and bed load were developed using datasets compiled from 11 years of measurements of three rivers in Idaho, USA. Discharge (*Q*), mean grain size (*D _{50}*), slope (*S*), flow velocity (*V*), area (*A*), depth (*d*), width (*W*), and shear flow velocity (*U**) were the selected inputs according to prior knowledge from conventional analyses. The models were then updated using different SA methods and examined by means of different external sensitivity indices. The compared performances indicated the appropriate predictability level of the updated models, which can lead to an advanced understanding of the parameters used for model improvement.

## STUDY AREAS AND DATA SOURCE

The Main Fork Red River (MFRR), South Fork Red River (SFRR), and Little Slate Creek (LSC) belong to the streams category of the state of Idaho (Figure 1). The MFRR in northern Idaho forms a confluence with the SFRR in the Nez Perce National Forest, and its watershed predominantly lies on metamorphic rocks. The LSC also flows on land administered by the Nez Perce Forest Service, but the geology of its watershed is mostly intrusive igneous. A unified dataset of flow records and sediment transport measurements was screened from the United States Department of Agriculture (USDA) and the United States Geological Survey (USGS). The dataset covers a period of 11 years (1986–1997) for both suspended and bed load sediments, comprising 263 sets of discharge (*Q*), mean grain size (*D _{50}*), slope (*S*), river area (*A*), velocity (*V*), river depth (*d*), river width (*W*), and shear flow velocity (*U**). The components of the database were then categorized into channel geometry, geomorphological, and hydraulic sets. Descriptive statistics of the compiled datasets are given in Table 1. Because of the wide range of precipitation in the recorded years and the consequently significant variation observed in *Q* and *A*, higher standard deviations for these factors are to be expected (Table 1). The datasets were normalized within the range [0, 1] as a necessary step to improve learning speed and model stability. To organize the training, testing, and validation sets for the ANN models, the datasets were randomized into 55%, 25%, and 20% subsets. These proportions were chosen because, compared with several other tested percentages, they produced more accurate results.
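The normalization and random partitioning steps described above can be sketched as follows (a minimal illustration; the function names, random seed, and rounding choices are assumptions, not the code used in the study):

```python
import numpy as np

def min_max_normalize(x):
    """Scale each column into [0, 1], as applied to the river datasets."""
    x = np.asarray(x, dtype=float)
    xmin, xmax = x.min(axis=0), x.max(axis=0)
    return (x - xmin) / (xmax - xmin)

def split_dataset(x, fractions=(0.55, 0.25, 0.20), seed=0):
    """Randomly partition rows into training, testing, and validation subsets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    n_train = int(round(fractions[0] * len(x)))
    n_test = int(round(fractions[1] * len(x)))
    return x[idx[:n_train]], x[idx[n_train:n_train + n_test]], x[idx[n_train + n_test:]]
```

For 263 records, this split yields roughly 145 training, 66 testing, and 52 validation samples.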

| River | Variable | Mean | Mean SE | St. dev. | Min | Max | Skewness |
|---|---|---|---|---|---|---|---|
| MFRR | Q (ft^{3}/s) | 151.50 | 10.50 | 105.3 | 13.3 | 487 | 1.36 |
| | D_{50} (mm) | 1.343 | 0.06 | 0.634 | 0.54 | 5.279 | 2.6 |
| | W (ft) | 32.184 | 0.29 | 2.925 | 22 | 40.3 | 0.4 |
| | V (ft/s) | 2.867 | 0.089 | 0.903 | 0.99 | 5.01 | 0.22 |
| | d (ft) | 1.423 | 0.044 | 0.445 | 0.34 | 2.86 | 0.57 |
| | A (ft^{2}) | 46.4 | 2.41 | 24.22 | 9.3 | 126 | 0.94 |
| | S (ft/ft) | 0.004 | 0.000001 | 0.000085 | 0.0038 | 0.0041 | −0.06 |
| | U* (ft/s) | 0.42 | 0.007 | 0.06938 | 0.204 | 0.612 | −0.08 |
| SFRR | Q (ft^{3}/s) | 109.78 | 9.95 | 93.85 | 7.25 | 458 | 1.69 |
| | D_{50} (mm) | 0.886 | 0.048 | 0.454 | 0.13 | 2.7 | 1.03 |
| | W (ft) | 26.942 | 0.364 | 3.432 | 20 | 40 | 1.28 |
| | V (ft/s) | 2.572 | 0.107 | 1.014 | 0.553 | 5.293 | 0.52 |
| | d (ft) | 1.281 | 0.045 | 0.423 | 0.37 | 2.28 | 0.46 |
| | A (ft^{2}) | 37.37 | 1.66 | 15.69 | 10.3 | 78.95 | 0.94 |
| | S (ft/ft) | 0.0014 | 0.000005 | 0.000044 | 0.0013 | 0.00146 | −0.13 |
| | U* (ft/s) | 0.236 | 0.0044 | 0.042 | 0.125 | 0.32657 | 0.07 |
| LSC | Q (ft^{3}/s) | 194.4 | 14.5 | 123.9 | 18.7 | 534 | 0.81 |
| | D_{50} (mm) | 1.118 | 0.147 | 1.26 | 0.42 | 6.65 | 3.73 |
| | W (ft) | 37.775 | 0.487 | 4.16 | 22 | 44 | −1.37 |
| | V (ft/s) | 2.537 | 0.127 | 1.085 | 0.68 | 5.39 | 0.52 |
| | d (ft) | 1.619 | 0.044 | 0.374 | 0.81 | 2.67 | 0.18 |
| | A (ft^{2}) | 69.3 | 2.28 | 19.46 | 27.1 | 112 | −0.14 |
| | S (ft/ft) | 0.0261 | 0.000052 | 0.00044 | 0.025 | 0.0267 | −0.26 |
| | U* (ft/s) | 1.158 | 0.016 | 0.136 | 0.825 | 1.47 | −0.21 |


*Note*: Units follow the US customary measurement system. SE, standard error; St. dev., standard deviation.

## MODELING BY ANN

The ANNs are recognized as applicable and robust computational models for prediction and classification purposes. Typically, such structures are configured by an appropriate combination of artificial neurons and activation functions to improve the quality of the processed information (e.g., Kisi *et al.* 2012; Bouzeria *et al.* 2017; Toriman *et al.* 2018; Asheghi & Hosseini 2020). In each artificial neuron (Figure 2), the inputs (*x _{i}*), weights (*w _{i,j}*), bias (*b _{j}*), activation function (*f _{act}*), and output (*O _{j}*) are the components involved in information transfer. The data from the input layer are projected to the intermediate (hidden) layers, while the final hidden layer projects the information to the output neurons. The *j*th network output (*net _{j}*), using the set of inputs *X* = {*x*_{1}, *x*_{2}, …, *x _{n}*} and the corresponding adaptive weights *w _{i,j}*, can be expressed through the propagation function (*f _{prop}*) as:

$$net_{j}=f_{prop}\left(X,w\right)=\sum_{i=1}^{n} w_{i,j}\,x_{i}+b_{j}$$

where *b _{j}* denotes the bias, a type of connection weight with a constant nonzero value, set up in all the neurons of the back-propagation and transfer functions except for the input layer. The activation state *a _{j}*(*t*) is explicitly assigned to any given *j*th neuron and transforms *net _{j}* from the previous activation state *a _{j}*(*t* − 1) into a new *a _{j}*(*t*) using:

$$a_{j}(t)=f_{act}\big(a_{j}(t-1),\,net_{j}(t),\,\theta_{j}\big)$$

where *θ _{j}* denotes the threshold value uniquely assigned to the *j*th neuron and marks the position of the maximum gradient of the activation function. Then, the output value *O _{j}* of neuron *j* is calculated from its activation state *a _{j}* as:

$$O_{j}=f_{out}\big(a_{j}(t)\big)$$
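The neuron computation and the resulting feed-forward pass can be illustrated with a short sketch (tanh is used purely as an example activation function; the helper names are illustrative, not the implementation used in the paper):

```python
import numpy as np

def neuron_output(x, w, b, f_act=np.tanh):
    """One artificial neuron: net = sum_i w_i * x_i + b, then O = f_act(net)."""
    net = np.dot(w, x) + b
    return f_act(net)

def mlp_forward(x, layers):
    """Feed-forward pass through a list of (W, b, f) layers,
    each layer applying propagation followed by activation."""
    a = np.asarray(x, dtype=float)
    for W, b, f in layers:
        a = f(W @ a + b)
    return a
```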

## SENSITIVITY ANALYSES TO ASSESS MODEL PARAMETERS

In recent years, different SA techniques have been developed to evaluate quantitative models and address the contribution of parameters to the produced output (Borgonovo & Plischke 2016). The SA methods, owing to their ability to determine the effectiveness of input parameters on the produced outputs, are important in a simulation process (Calver 1988; Saltelli *et al.* 2000, 2008). According to the literature (e.g., Jacomino & Fields 1997; Saltelli 2002; Borgonovo & Plischke 2016), the SA methods are categorized into quantitative techniques, graphical methods, sensitivity-index approaches, and specifically tailored mathematical models. These methods facilitate finding a simplified but robust calibrated model from a large number of parameters, identify important connections between observations and model output, and allow investigation of the effects and impacts of the uncertainties on the output of a mathematical model (Wang *et al.* 2000; Saltelli 2002; Gevrey *et al.* 2003; Helton *et al.* 2006; Bahremand & De Smedt 2008).

The one-at-a-time (OAT/OFAT) method (Czitrom 1999), the local methods including adjoint modeling (Cacuci *et al.* 2005) and automated differentiation (Griewank 2000), scatter plots (Paruolo *et al.* 2013), regression analysis and variance-based methods (Sobol 1993), variogram-based methods (Haghnegahdar & Razavi 2017), screening (Campolongo *et al.* 2007), emulators (data-modeling/machine learning approaches) (Storlie *et al.* 2009), and probabilistic methods (Oakley & O'Hagan 2004; Vu-Bac *et al.* 2016) are some of the used or introduced SA methods.

In ANN-based models, the SA is conducted by analyzing the adjusted weights through the equation method (EM) (Hashem 1992), the weight magnitude analysis method (WMAM) (Garson 1991; Poh *et al.* 1998), the variable perturbation method (VPM) (e.g., Gedeon 1997; Poh *et al.* 1998; Montaño & Palmer 2003; Zeng & Yeung 2003), the partial derivative algorithm (PaD) (Dimopoulos *et al.* 1995), the profile method (PM) (Lek *et al.* 1996), the stepwise method (SM) (Sung 1998; Gevrey *et al.* 2003), and the cosine amplitude method (CAM) (Ross 1995). Among the suggested SA techniques, the PaD and the VPM have presented superior performance compared with techniques based on the WMAM (Wang *et al.* 2000; Zeng & Yeung 2003). However, the success of the CAM in different engineering applications has also been confirmed (Abbaszadeh Shahri 2016; Abbaszadeh Shahri & Asheghi 2018; Abbaszadeh Shahri *et al.* 2019).

In the EM (Hashem 1992), the overall influence of the *i*th input on the output (*I _{i}*) can be calculated as:

$$I_{i}=\sum_{k} w_{1_{i,k}}\;y_{k}\;w_{2_{k,O}}$$

where $w_{a_{b,c}}$ denotes the weight from the *b*th node in the *a*th layer to the *c*th node in the next layer, *O* is the output node, $w_{2_{k,O}}$ expresses the outgoing weight of the *k*th node in the second layer, $y_{k}$ is the output value of the *k*th node in the second layer, and $w_{1_{i,k}}$ represents the connection weight between the *i*th and *k*th nodes of the first and hidden layers. Poh *et al.* (1998) indicated that, by normalizing the connecting weights between the input and hidden layers to the largest weight magnitude, the influence of the variables on the output can be ranked as:

$$I_{i}^{\,norm}=\frac{I_{i}}{\max_{i}\left|I_{i}\right|}$$

In the WMAM, the relative importance (*Q _{ik}*) can be found through the connection weights between the input neuron *i* and the hidden neuron *j* (*W _{ij}*), and between the hidden neuron *j* and the output neuron *k* (*V _{jk}*), summed over each of the hidden neurons of the network:

$$Q_{ik}=\frac{\sum_{j=1}^{n_{h}}\left(\left|W_{ij}\right|\big/\sum_{r=1}^{N}\left|W_{rj}\right|\right)\left|V_{jk}\right|}{\sum_{i=1}^{N}\sum_{j=1}^{n_{h}}\left(\left|W_{ij}\right|\big/\sum_{r=1}^{N}\left|W_{rj}\right|\right)\left|V_{jk}\right|}$$

where $\sum_{r=1}^{N}\left|W_{rj}\right|$ denotes the sum of the connection weights between the *N* input neurons and the hidden neuron *j*. Gevrey *et al.* (2003) showed that the relative contribution (RC) of each input to the output can be calculated using the number of input (*n _{i}*) and hidden (*n _{j}*) neurons and the corresponding weight between input neuron *i* and hidden neuron *j* (*w _{ij}*):

$$RC_{i}=\frac{\sum_{j=1}^{n_{j}}\left|w_{ij}\right|}{\sum_{i=1}^{n_{i}}\sum_{j=1}^{n_{j}}\left|w_{ij}\right|}\times 100$$
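A weight-based importance measure of this kind can be sketched for a single-output network (a simplified Garson-style illustration under assumed weight shapes, not the exact implementation used in the paper):

```python
import numpy as np

def garson_importance(W, V):
    """Relative importance from input-hidden weights W (n_in x n_h)
    and hidden-output weights V (n_h,); importances sum to 1."""
    W, V = np.abs(W), np.abs(V)
    # share of each input within each hidden neuron, scaled by the outgoing weight
    contrib = (W / W.sum(axis=0)) * V
    Q = contrib.sum(axis=1)
    return Q / Q.sum()
```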

In the CAM, the data pairs (*X* = (*x _{i}*, *y _{j}*)) are expressed in a common *X*-space to provide a data array *X* = {*X*_{1}, *X*_{2}, …, *X _{n}*}, in which each *X _{i}* is a vector of length *m*, *X _{i}* = {*x _{i1}*, *x _{i2}*, …, *x _{im}*}, and the similarity measure exhibits the dot product of the cosine function (Ross 1995). Each assigned data pair corresponds to a point in *m*-dimensional space and needs to be described by *m* coordinates. Therefore, the importance and membership value of each element of a model in *m*-dimensional space (*R _{ij}*), in the form of a matrix, can be expressed by a pairwise comparison of two data samples (*x _{i}* and *x _{j}*) by:

$$R_{ij}=\frac{\sum_{k=1}^{m} x_{ik}\,x_{jk}}{\sqrt{\sum_{k=1}^{m} x_{ik}^{2}\ \sum_{k=1}^{m} x_{jk}^{2}}}$$
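The pairwise cosine measure reduces, for each input column against the output, to a short computation (an illustrative sketch with an assumed function name):

```python
import numpy as np

def cam_strength(X, y):
    """Cosine amplitude between each column of X and the output vector y:
    r = sum(x_k * y_k) / sqrt(sum(x_k^2) * sum(y_k^2))."""
    X, y = np.asarray(X, float), np.asarray(y, float)
    num = X.T @ y
    den = np.sqrt((X ** 2).sum(axis=0) * (y ** 2).sum())
    return num / den
```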

The PaD is based on the partial derivatives of the model outputs with respect to the inputs (Dimopoulos *et al.* 1995). The general formulation of PaD, using the output variables (*Y _{j}*) and parameters (*θ _{i}*) for *N _{p}* parameters and *N _{v}* variables (model outputs), is expressed as:

$$S=\left[\frac{\partial Y_{j}}{\partial\theta_{i}}\right],\qquad i=1,\ldots,N_{p};\ \ j=1,\ldots,N_{v}$$

For a network with one hidden layer and a logistic activation function, the derivative of the output *O _{k}* with respect to the input *x _{i}* can be written as:

$$d_{ki}=\frac{\partial O_{k}}{\partial x_{i}}=\sum_{j} w_{jo}\,y_{ij}\left(1-y_{ij}\right)w_{ij}$$

where *y _{ij}* is the output of the *j*th hidden neuron with respect to the *i*th input, and *w _{jo}* and *w _{ij}* are the weights between the *k*th output neuron and the *j*th hidden neuron, and between the *i*th input and the *j*th hidden neuron, respectively. Then the sensitivity over the *p* training samples of the *N* total number of data variables for each input *x _{i}* on the output *O _{k}* is defined as:

$$SSD_{i}=\sum_{p=1}^{N}\left(\frac{\partial O_{k}}{\partial x_{i}}\right)_{p}^{2}$$
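For a one-hidden-layer logistic network, the derivative and the summed squared derivative above can be sketched as follows (an illustration under the stated logistic assumption; the helper names are hypothetical):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pad_sensitivity(X, W_in, b_h, w_out, b_o):
    """PaD sensitivity of a 1-hidden-layer logistic network with one output.
    Per sample: dO/dx_i = O(1-O) * sum_j w_out[j] * h_j(1-h_j) * W_in[i, j];
    the per-input score is the sum of squared derivatives over samples (SSD)."""
    ssd = np.zeros(X.shape[1])
    for x in X:
        h = sigmoid(x @ W_in + b_h)    # hidden activations
        o = sigmoid(h @ w_out + b_o)   # scalar network output
        d = o * (1 - o) * (W_in * (w_out * h * (1 - h))).sum(axis=1)
        ssd += d ** 2
    return ssd
```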

The PM introduced by Lek *et al.* (1996) analyzes the response to a particular input while all other inputs are fixed, dividing the studied range into five equal subintervals (scales) corresponding to the minimum, quarter, half, three-quarters, and maximum. The contribution of each input parameter can then be explained from the profile of the median values against the corresponding subintervals. This procedure should be executed for all inputs to obtain a set of descriptive relative importance curves (Gevrey *et al.* 2003).

The SM examines a step-by-step procedure for adding or rejecting inputs within an iterative loop. In the SM process, the input parameters are blocked one by one and the corresponding *MSE* of the responses is calculated, so that the relative importance of each input variable can be ranked. The parameter with the maximum *MSE* value is considered the most important; it can then either be removed from the model or replaced by its mean value to find the contribution of the other parameters (Gevrey *et al.* 2003). The SM can be organized into forward and backward strategies. In the backward strategy, the *MSE* of each parameter is calculated using constructed ANN models that include all input parameters, which are then blocked one at a time, while the forward strategy works in the reverse way (Sung 1998).

The VPM is a common, straightforward SA technique for ANN-based models, achieved by analyzing the output disturbance due to perturbed inputs. The VPM adjusts the values of one input variable while keeping all the other variables untouched (Gedeon 1997; Montaño & Palmer 2003). In the VPM, a small direct perturbation is applied to each ANN input and the corresponding change in the outputs is measured, whereas the EM and WMAM analyze indirect changes in the ANN weights. Perturbations of the input parameter from 0 to 50% in steps of 5% can be implemented, and the generated outputs can be ranked based on the *MSE* calculated for each perturbed input (Gevrey *et al.* 2003).
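The perturbation loop can be sketched for any trained model exposed as a prediction function (the 5% steps follow the text above; the function name and scoring details are illustrative assumptions):

```python
import numpy as np

def vpm_rank(model, X, levels=np.arange(0.05, 0.55, 0.05)):
    """Perturb one input at a time by 5%..50% and score each input by the
    mean MSE between baseline and perturbed predictions."""
    base = model(X)
    scores = []
    for i in range(X.shape[1]):
        mse = 0.0
        for p in levels:
            Xp = X.copy()
            Xp[:, i] *= (1.0 + p)  # perturb only variable i, keep the rest untouched
            mse += np.mean((model(Xp) - base) ** 2)
        scores.append(mse / len(levels))
    scores = np.asarray(scores)
    return np.argsort(scores)[::-1], scores  # most influential input first
```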

## APPLYING THE SA TO UPDATE ANN MODELS

In this paper, the contributions of the input variables to the predicted suspended and bed loads were found through two developed optimum ANN-based models. The dependency of the optimum network size on internal characteristics (e.g., training algorithm, number of neurons, learning rate, activation function, architecture, regularization) implies that no standard method is accepted either for programmatic network configuration or for preventing the over-fitting problem (Ghaderi *et al.* 2019). To optimize the ANN models, the procedure organized in Figure 3, integrating trial-and-error methods with a developed code based on constructive techniques, was followed. In this process, various training algorithms, including quick propagation (QP), Levenberg–Marquardt (L-M), quasi-Newton (QN), and momentum (MO), were used. As defined in Figure 3, different internal characteristics were applied to numerous generated topologies to avoid the overfitting problem and escape from local minima.

The QP, one of the most popular back propagation training algorithms, is based on the mathematical method of gradient descent and gives appropriate results in most problems (Fahlman 1988). The L-M (Levenberg 1944; Marquardt 1963) is an advanced and fast non-linear optimization algorithm that can solve generic curve-fitting problems. However, it can only be used on networks with a single output unit or on small networks, because its memory requirements are proportional to the square of the number of weights in the network. Moreover, L-M is specifically designed to minimize the sum of squares error and thus cannot be used for other types of network error. The QN (Bertsekas 1995) is a network training algorithm based on Newton's method that avoids storing the computed Hessian matrix during each iteration; it thus requires less memory and can be used for larger networks. The MO, a well-known standard algorithm in the neural network community, is designed to overcome some of the problems associated with the standard back propagation training algorithm and is used to speed up convergence and maintain generalization performance (Swanston *et al.* 1994). The MO is a locally adaptive approach in which each weight remembers its most recent update and is thus able to update independently of the other weights (Wiegerinck *et al.* 1994). Two stopping criteria, the minimum root mean square error (*MRMSE*) and the number of iterations, were employed; the iteration limit takes over when the *MRMSE* cannot be achieved. As presented in Figure 4(a) and 4(b), based on the *MRMSE* of the applied training algorithms subjected to different activation functions against the number of neurons, the optimum numbers of neurons were found to be 11 and 12, which further should be organized into hidden layer(s). According to the defined procedure in Figure 3, numerous models with similar structures but different internal characteristics were examined and investigated by the absolute error (*AE*).
The *AE*, as the deviation between predicted and measured values, corresponds to model quality and indicates the amount of physical error and uncertainty in a measurement (Abbaszadeh Shahri 2016). In Figure 4(c)–4(f), a sample of the procedure carried out to find the optimum topologies and the corresponding calculated *AE*, as well as the model predictability for suspended and bed loads, is presented. The characteristics of the optimum structures using the applied training algorithms are reflected in Table 2. It was observed that the 7-5-6-1 and 8-5-7-1 structures for suspended and bed loads can generate higher predictability than the other tested models (Table 2; Figure 4(a) and 4(b)). The effects of the input parameters on the predicted sediment loads were then identified using the PaD, CAM, RC, and EM sensitivity analysis methods (Figure 5). Despite the observed differences, the ranked parameters follow almost similar trends. Accordingly, *Q*, *V*, *d*, and *A* for suspended load and *Q*, *V*, *D _{50}*, *S*, and *d* for bed load were identified as the most effective factors.

| Training algorithm | Number of neurons | Corresponding structure | *MRMSE* | Activation function (hidden layers) | Activation function (output) |
|---|---|---|---|---|---|
| **Suspended load → Inputs: Q, S, V, d, W, U*, A** | | | | | |
| QP | 14 | 7-6-8-1 | 0.375 | hyperbolic tangent | logistic |
| L-M | 11 | 7-11-1 | 0.361 | logistic | logistic |
| QN | 12 | 7-5-7-1 | 0.350 | hyperbolic tangent | hyperbolic tangent |
| MO | 11 | 7-5-6-1 | 0.317 | hyperbolic tangent | logistic |
| **Bed load → Inputs: Q, S, V, d, W, D_{50}, A, Sus-load** | | | | | |
| QP | 15 | 8-6-9-1 | 0.441 | hyperbolic tangent | logistic |
| L-M | 12 | 8-5-7-1 | 0.383 | logistic | logistic |
| QN | 11 | 8-11-1 | 0.403 | hyperbolic tangent | hyperbolic tangent |
| MO | 14 | 8-9-5-1 | 0.426 | logistic | hyperbolic tangent |


On the basis of the SA results, the least effective factors for the predicted output can be removed. This procedure not only updates the model and reduces the network size but may also increase the accuracy of prediction (Hamby 1994; Saltelli 2002; Gevrey *et al.* 2003; Helton *et al.* 2006; Saltelli *et al.* 2008; Razavi & Gupta 2015; Vu-Bac *et al.* 2016; Abbaszadeh Shahri *et al.* 2019; Asheghi *et al.* 2019). Therefore, the three least effective factors for suspended load (*S*, *U**, *W*) and for bed load (*Sus-load*, *W*, *A*) were ignored. The results of the updated models subjected to the most dominant identified variables (Figure 5) and the same randomized datasets through the defined procedure (Figure 3) are reflected in Table 3 and Figure 6, respectively.

| Load | Model | Topology | Activation function | Training algorithm | *MRMSE* |
|---|---|---|---|---|---|
| Suspended load | optimum | 7-5-6-1 | hidden layer: hyperbolic tangent; output: logistic | MO | 0.317 |
| | updated | 4-6-1 | hidden layer: logistic; output: logistic | MO | 0.198 |
| Bed load | optimum | 8-5-7-1 | hidden layer: logistic; output: logistic | L-M | 0.383 |
| | updated | 5-8-1 | hidden layer: logistic; output: hyperbolic tangent | QN | 0.201 |


## DISCUSSION AND VALIDATION

The results of the applied SA methods should also be verified by means of independent sensitivity indices (e.g., Saltelli *et al.* 2000; Pappenberger *et al.* 2008). In nonlinear models, applying the elementary effects (EE), the first-order Sobol sensitivity index (SI) (Sobol 1993), and the total sensitivity index (TSI) provides valuable information to quantify the sensitivity (Sobol 1993). The TSI measures the contribution of the input factor *X _{i}* to the output variance, including all interactions with any other input variables. The EE, SI, and TSI of a set of variables *X* = {*X*_{1}, *X*_{2}, …, *X _{k}*} on model *Y* are defined as:

$$EE_{i}=\frac{Y\left(X_{1},\ldots,X_{i-1},X_{i}+\Delta,X_{i+1},\ldots,X_{k}\right)-Y\left(X_{1},X_{2},\ldots,X_{k}\right)}{\Delta}$$

$$SI_{i}=\frac{V_{X_{i}}\!\left(E_{X_{\sim i}}\!\left(Y\mid X_{i}\right)\right)}{V(Y)}$$

$$TSI_{i}=1-\frac{V_{X_{\sim i}}\!\left(E_{X_{i}}\!\left(Y\mid X_{\sim i}\right)\right)}{V(Y)}$$

where *EE _{i}* represents the elementary effect of each variable *i*, Δ shows the step change in the discrete variable *X _{i}*, and *Y*(*X*_{1}, *X*_{2}, …, *X _{k}*) is the model output, which should be fixed for each calculated *EE _{i}*. *V* is the variance, *X*_{∼i} denotes all parameters but *X _{i}*, *E* represents the average, and *V*(*Y*) denotes the unconditional variance of the quantity of interest. The term $V_{X_{i}}(E_{X_{\sim i}}(Y\mid X_{i}))$, the variance of the conditional expectation, is the first-order effect of *X _{i}* on *Y*: the variation of the average *Y* when fixing *X _{i}* at different values while varying the other parameters.
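A direct reading of the EE definition can be sketched as follows (an illustration only; the base points stand in for repeated evaluations, and the function names are assumptions):

```python
import numpy as np

def elementary_effects(model, x0, delta=0.1):
    """EE_i = (Y(..., x_i + delta, ...) - Y(x0)) / delta, all other inputs fixed."""
    y0 = model(x0)
    ee = np.empty(len(x0))
    for i in range(len(x0)):
        x = x0.copy()
        x[i] += delta  # step change only in variable i
        ee[i] = (model(x) - y0) / delta
    return ee

def mean_abs_ee(model, base_points, delta=0.1):
    """Average |EE_i| over several base points, usable for ranking inputs."""
    return np.mean([np.abs(elementary_effects(model, x, delta)) for x in base_points], axis=0)
```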

_{i}Each *EE _{i}* using SI then can be characterized by the mean value and its standard deviation whereas high

*EE*values indicate more impact on the model

_{i}*Y*. The lower SI value also shows the less variability in the output

*Y*and consequently is more robust to variations in the model parameters (Gan

*et al.*2014; Song

*et al.*2015; Jin

*et al.*2018). The scatter plots of mean and standard deviation values of SI using validation datasets for suspended and bed load are presented in Figure 7. The closer to the (0, 0) coordinate is interpreted as more robust method to capture the changes of parameters.

The mean absolute elementary effect of the *j*th variable (*MAEE _{j}*) over *r* repetitions can be expressed as:

$$MAEE_{j}=\frac{1}{r}\sum_{i=1}^{r}\left|EE_{i}^{\,j}\right|$$

where $EE_{i}^{\,j}$ denotes the EE of the *j*th variable at the *i*th repetition. A larger value of *MAEE _{j}* shows more influence and contribution of the *j*th input on the output. As presented in Figure 8, the ranked *MAEE*, *SI*, and *TSI* indices (Saltelli *et al.* 2008) for both suspended and bed loads indicate a similar trend to the used SA techniques.

In intelligent models, performance measurement is an essential task. The AUC-ROC (area under the receiver operating characteristic curve) is one of the most important evaluation metrics for illustrating the diagnostic ability of a classifier system: the higher the AUC, the higher the predictability of the model. The ROC curve is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings.

The TPR is the measured percentage of actual positives that are correctly identified. In statistics, when performing multiple comparisons, the false positive ratio is the probability of falsely rejecting the null hypothesis for a particular test, and the FPR usually refers to its expected value. As presented in Figure 9, the increased AUC-ROC of the updated models is an indicator of model improvement. This also implies that, by removing the least effective factors, the predictability of the models has been increased.
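The TPR/FPR construction and the integration of the ROC curve can be sketched in a few lines (a minimal version that assumes no tied scores; not the evaluation code used in the study):

```python
import numpy as np

def roc_auc(y_true, scores):
    """Sort by decreasing score, accumulate TPR and FPR point by point,
    and integrate the ROC curve with the trapezoidal rule."""
    order = np.argsort(scores)[::-1]
    y = np.asarray(y_true, dtype=float)[order]
    tpr = np.concatenate(([0.0], np.cumsum(y) / y.sum()))
    fpr = np.concatenate(([0.0], np.cumsum(1.0 - y) / (1.0 - y).sum()))
    # trapezoidal integration of TPR over FPR
    return float(np.sum((fpr[1:] - fpr[:-1]) * (tpr[1:] + tpr[:-1]) / 2.0))
```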

The capability of the updated and optimum models in covering a new set of feed data points can be interpreted through confidence intervals and prediction bands. These reflect the region of uncertainty in the predicted values, or in a single additional observation, over a range of independent variables. Therefore, aggregation of the data within a higher percentage of these bands indicates better model performance. The results of the updated models for suspended and bed loads subjected to the 95% confidence interval and prediction bands using the validation datasets show higher predictability, and consequently better performance, than the optimum structures (Figure 10(a) and 10(c)). To evaluate the accuracy performance, the calculated residuals (*CR*) (Figure 10(e) and 10(f)), the measured and predicted values (Figure 10(b) and 10(d)), as well as the *MRMSE* and *R ^{2}* (Table 3), were compared with each other. The *CR* is the difference between the measured and predicted values; thus, better performance is found at higher values of *R ^{2}* and lower values of *CR* and *MRMSE* (Figure 10 and Table 3). The decreased tolerances of *CR* and *MRMSE*, as well as the increased *R ^{2}*, are clear evidence of improvement in the predictability level of the updated models (Figure 10(e) and 10(f) and Tables 2 and 3).
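The three accuracy measures compared above can be computed with a small helper (an illustrative sketch, not the authors' code):

```python
import numpy as np

def performance_metrics(measured, predicted):
    """Residuals (CR), root mean square error, and coefficient of determination R^2."""
    m, p = np.asarray(measured, float), np.asarray(predicted, float)
    cr = m - p  # CR: measured minus predicted
    rmse = np.sqrt(np.mean(cr ** 2))
    r2 = 1.0 - np.sum(cr ** 2) / np.sum((m - m.mean()) ** 2)
    return cr, rmse, r2
```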

## CONCLUSION

Modeling of sediment loads, as a very complex nonlinear behavior, is a difficult task in river engineering. In the current paper, two predictive ANN-based models for the suspended and bed loads of three rivers in Idaho, USA were successfully developed and examined. These models, using nine input parameters, covered the channel geometry, geomorphological features, and hydraulic characteristics. To overcome the complexity of the introduced models, four different SA methods, CAM, EM, RC, and PaD, were applied, and two updated models of smaller size using the highest-ranked inputs were introduced. It was observed that the number of neurons in the best-performing structures decreased from 11 and 12 before applying the SA methods to 6 and 8, respectively. Accordingly, the calculated *MRMSE* values for the suspended (0.317) and bed (0.383) loads were reduced after updating to 0.198 and 0.201. This implies decreases of 37.54% and 47.54% in *MRMSE* through the updating process for the suspended and bed load predictions, showing superior performance compared with the optimum models. Furthermore, the decreased *CR* and *AE*, as well as the increased *R ^{2}* values (2.04% in suspended and 3.1% in bed load), exhibited robust improvement in the predictability of the updated models. Accordingly, the interpreted confidence and prediction intervals, owing to the high aggregation of data in a more shrunken region of uncertainty, demonstrated better consistency in the updated models. Furthermore, comparing the performance of the models using the AUC-ROC, one of the most important evaluation metrics, showed 9.39% and 7.56% improvements in the accuracy level of the bed and suspended loads, respectively. Such an increase in the covered AUC-ROC confirmed that the predictability of the updated models can be significantly enhanced by removing the least effective factors.

Although the contributions of the input parameters to the output according to the used SA techniques showed similar trends, the analyses indicated that the results of the PaD and CAM were more reliable than those of the EM and RC. The results of the applied SA methods were then verified using the *MAEE*, *SI*, and *TSI* indices, and similarly *Q*, *V*, *S*, *d*, and *D _{50}* (for bed load) and *Q*, *V*, *A*, and *d* (for suspended load) were recognized as the most effective factors on the transported sediment loads. The influence of *U** and *W* was evaluated as the least effective. To gain better insight into and understanding of the transported sediment process, the effect of the suspended load on the bed load was also considered. The applied SA methods showed that the effect of the suspended load on the bed load is not significant and thus can be ignored in bed load predictions.

The results of this study in distinguishing the critical and effective variables in dynamic nonlinear forecasts will assist decision-makers in knowing which additional information may need to be obtained. Appropriate decisions can thus help in strengthening the model and guide decision-makers to concentrate on the relevant variables.