Abstract
Sediment transport is one of the most important issues in river engineering. In this study, the capability of the Kernel Extreme Learning Machine (KELM) approach for predicting the daily river Suspended Sediment Concentration (SSC) and Suspended Sediment Discharge (SSD) was assessed. Three successive hydrometric stations of the Mississippi river were considered and, based on the sediment and flow characteristics during the period of 2005–2008, several models were developed and tested for SSC and SSD modeling. To improve the efficiency of the applied model, two pre-processing techniques, namely Wavelet Transform (WT) and Ensemble Empirical Mode Decomposition (EEMD), were used. Also, two states of modeling, based on each station's own data (state 1) and on the previous stations' data (state 2), were considered. A comparison of the single and integrated KELM models indicated that the integrated WT-KELM and EEMD-KELM models produced more accurate outcomes. Data processing with WT was more effective than EEMD in increasing the models' efficiency, and data processing enhanced the models' capability by up to 15%. State 1 modeling led to better results; however, with the integrated KELM approaches, the previous stations' data could be applied successfully for SSC and SSD modeling when a station's own data were not available.
HIGHLIGHT
The suspended sediment concentration (SSC) and suspended sediment discharge (SSD) were predicted via an artificial intelligence approach at successive hydrometric stations. The impact of data pre-processing on the models' efficiency was assessed. Sensitivity analysis was used to identify the most effective sub-series obtained from the pre-processing methods.
INTRODUCTION
Sediment transport and the accurate estimation of its rate are significant issues for river engineers and researchers. So far, various complex relationships have been proposed to predict the suspended sediment transport rate, such as equations based on velocity and critical shear stress. However, the complex nature of sediment transport and the lack of validated models make it difficult to model the suspended sediment concentration and suspended sediment discharge carried by rivers. Bhattacharya et al. (2004) stated that it is difficult to express the transport process through a deterministic mathematical framework. Based on laboratory experiments, Vongvisessomjai et al. (2010) studied the transport of non-cohesive sediment in uniform flow at a no-deposition state. Harrington & Harrington (2013) evaluated the efficiency of the Sediment Rating Curve (SRC) method in modeling the suspended sediment load of the Bandon and Owenabue rivers in Ireland. Rajaee et al. (2009) and Chen & Chau (2016) indicated that the sediment rating curve and the auto-regressive integrated moving average model are inadequate to predict SSC under extreme hyperconcentrated flow conditions. Although the mentioned models led to promising results in sediment transport prediction, the importance of sediment transport and its impact on hydraulic structures make it necessary to use other methods with higher efficiency (Lafdani et al. 2013; Rahman & Chakrabarty 2020).
In recent years, artificial intelligence techniques such as Artificial Neural Networks (ANNs), Neuro-Fuzzy models (NF), Genetic Programming (GP), Multivariate Adaptive Regression Splines (MARS), Kernel Extreme Learning Machine (KELM), and Gaussian Process Regression (GPR) have been used in assessing complex hydraulic and hydrological phenomena (Roushangar & Ghasempour 2018), such as estimation of reference evapotranspiration (Yin et al. 2017), daily suspended sediment concentration modeling (Kaveh et al. 2017), side weir discharge coefficient modeling (Azamathulla et al. 2017), prediction of the roughness coefficient in sewer pipes (Roushangar et al. 2020), and modeling the form resistance coefficient of movable bed channels (Saghebian et al. 2020). In artificial intelligence models we are looking for a learning machine capable of finding an accurate approximation of a natural phenomenon, as well as expressing it in the form of an interpretable equation. However, this bias towards interpretability creates several new issues. The computer-generated hypotheses should take advantage of the already existing body of knowledge about the domain in question. However, the method by which we express our knowledge and make it available to a learning machine remains rather unclear (Babovic 2009). Machine learning, a branch of artificial intelligence, deals with representation and generalization using data learning techniques. Representation of data instances and functions evaluated on these instances are part of all machine learning systems. Generalization is the property that the system will perform well on unseen data instances; the conditions under which this can be guaranteed are a key object of study in the subfield of computational learning theory. There is a wide variety of machine learning tasks and successful applications (Mitchell 1997).
In general, the task of a machine learning algorithm can be described as follows: given a set of input variables and the associated output variable(s), the objective is to learn a functional relationship between the inputs and outputs. It should be noted that artificial intelligence models typically do not represent the physics of a modeled process; they are just devices used to capture relationships between the relevant input and output variables. However, when the interrelationships among the relevant variables are poorly understood, finding the size and shape of the ultimate solution is difficult, and conventional mathematical analysis methods do not (or cannot) provide analytical solutions, these methods can predict the variable of interest with greater accuracy.
On the other hand, hybrid models based on signal decomposition can be effective in increasing the time series prediction methods' efficiency (Pachori et al. 2015). Wavelet analysis is one of the commonly used methods for signal decomposition. Additionally, the Empirical Mode Decomposition (EMD) method, which is suitable for nonlinear and non-stationary time series (Huang et al. 1998), has been used recently. Unlike wavelet decomposition, empirical mode decomposition extracts the data oscillatory mode components without a priori determining the basis functions or level of decomposition (Labate et al. 2013).
Therefore, in the current study, the Kernel Extreme Learning Machine (KELM) was used for modeling Suspended Sediment Concentration (SSC) and Suspended Sediment Discharge (SSD) at three successive hydrometric stations. The KELM, as a kernel-based approach based on quadratic optimization of a convex function, can easily switch from linear to nonlinear separation. This is realized by nonlinear mapping using so-called kernel functions. Kernel-based approaches such as KELM are relatively new and important methods that rely on different kernel types. Such models are based on statistical learning theory and are capable of adapting themselves to predict any variable of interest given sufficient inputs. The method trains quickly, has high accuracy, and is less prone to overtraining. Discrete Wavelet Transform (DWT) and EEMD were used as pre-processing methods to improve the models' efficiency. In the integrated pre-processing models, the input data were decomposed into sub-series by Wavelet Transform (WT) and EEMD. Then, these sub-series were used as inputs to the KELM method. In this regard, daily sediment and flow data of the Mississippi river in the period of 2005–2008 were used, and various models were developed under two scenarios. In the first scenario, the intended parameters of each station were estimated using the station's own data, and in the second scenario, the SSC and SSD parameters were estimated using the previous station's data. Also, sensitivity analysis was carried out to select the most effective sub-series obtained from WT and EEMD in the modeling process.
MATERIALS AND METHODS
Study area
The Mississippi river is the second longest river, and the most important river of the second-largest drainage system on the North American continent. From its traditional source of Lake Itasca in northern Minnesota, it flows generally south for 3,730 km to the Mississippi River Delta in the Gulf of Mexico. In the current study, daily data of streamflow and suspended sediment discharge and concentration during the period of 2005–2008 were used. Three consecutive stations, namely station A (7010000), station B (7020500), and station C (7022000), were selected and suspended sediment discharge and suspended sediment concentration were investigated under two scenarios. In the first scenario, modeling was performed based on each station's data and in the second scenario the previous station's data were used. Table 1 shows the statistical characteristics of the selected stations. In this table, parameters Qsc, Qsd, and Qd are suspended sediment concentration, suspended sediment discharge, and flow discharge, respectively. Figure 1 shows the location of the selected stations.
Hydrometric station number | Qsc max (mg/L) | Qsc min (mg/L) | Qd max (ft3/s) | Qd min (ft3/s) | Qsd max (ton/day) | Qsd min (ton/day) |
---|---|---|---|---|---|---|
7010000 | 716,000 | 63,000 | 1,510 | 59.6 | 200,000 | 12,600 |
7020500 | 695,000 | 64,100 | 1,650 | 44.8 | 2,150,000 | 10,900 |
7022000 | 710,000 | 68,300 | 1,260 | 40.3 | 1,740,000 | 11,300 |
Kernel Extreme Learning Machine (KELM)
The matrix K is identified as the target matrix of the hidden layers of the neural network. Huang et al. (2012) also introduced kernel functions in the design of ELM. A number of kernel functions are now used in the design of ELM, such as linear, radial basis, normalized polynomial, and polynomial kernel functions. An ELM designed with kernel functions is known as a Kernel Extreme Learning Machine (KELM). For more details about KELM, readers are referred to Huang et al. (2012).
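Since the governing equations are not reproduced in this section, the following is a minimal sketch of the closed-form KELM solution as formulated by Huang et al. (2012), in which the output weights are obtained from β = (I/C + K)⁻¹T and a prediction is the kernel vector of a new sample multiplied by β. The RBF parametrization exp(−γ‖a − b‖²), the values of γ and C, the function names, and the small Gaussian-elimination solver are illustrative assumptions, not the configuration used in this study.

```python
import math

def rbf_kernel(a, b, gamma):
    """RBF kernel exp(-gamma * ||a - b||^2) (assumed parametrization)."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def kelm_fit(X, y, gamma=1.0, C=100.0):
    """Return output weights beta = (I/C + K)^-1 y for training data X, y."""
    n = len(X)
    K = [[rbf_kernel(X[i], X[j], gamma) for j in range(n)] for i in range(n)]
    for i in range(n):
        K[i][i] += 1.0 / C  # regularization term I/C
    return solve_linear(K, y)

def kelm_predict(X_train, beta, X_new, gamma=1.0):
    """Predict each new sample as sum_i K(x, x_i) * beta_i."""
    return [
        sum(rbf_kernel(x, xi, gamma) * b for xi, b in zip(X_train, beta))
        for x in X_new
    ]
```

Training reduces to a single linear solve, which is consistent with the fast training attributed to the method. In practice a numerical library would replace the hand-written solver.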
Pre-processing approaches
One of the most popular approaches in time series processing is the Wavelet Transform (WT) (Farajzadeh & Alizadeh 2017). The WT uses a flexible window function (mother wavelet) in signal processing. The flexible window function can be changed over time according to the signal shape and compactness (Mehr et al. 2013). After applying the WT, the signal is decomposed into approximation (large-scale or low-frequency) and detail (small-scale or high-frequency) components. An illustration of a three-level WT is shown in Figure 2. In the first level, the original signal (x) is decomposed into an approximation component (cA1) and a detail component (cD1). In the second level, cA1 is again decomposed into approximation (cA2) and detail (cD2) components. Finally, in the third level, cA2 is decomposed into the cA3 approximation and cD3 detail components. The sum of all detail sub-series and the approximation series obtained from the third level reproduces the original signal (i.e. x = cD1 + cD2 + cD3 + cA3).
The other approach for time series processing is Empirical Mode Decomposition (EMD). The EMD method behaves as an effective self-adaptive dyadic filter bank when applied to white noise (a random signal which has equal intensity at different frequencies). By applying this method, each signal can be decomposed into a number of Intrinsic Mode Functions (IMFs), which can be used to process nonlinear and non-stationary signals. One of the advantages of this method is the ability to determine the instantaneous frequency of the signal. At each step of the signal decomposition into its frequency components, the high-frequency components are separated first, and this process continues until only the component with the lowest frequency remains (see Lei et al. (2009) for more details). EEMD was developed based on EMD. The main benefit of EEMD is that it solves the mode mixing problem of EMD by determining the true IMF as the mean of an ensemble of trials (Wu & Huang 2009).
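The additive property x = cD1 + cD2 + cD3 + cA3 of a three-level wavelet decomposition can be illustrated with a short sketch. For brevity it uses the Haar wavelet, for which the level-j approximation at full signal length is simply the mean over blocks of 2^j samples; this is a stand-in chosen for simplicity and not the Daubechies wavelet applied in this study. The function names and the sample series are hypothetical.

```python
def block_mean_approximation(x, block):
    """Haar approximation at full length: each block is replaced by its mean."""
    out = []
    for start in range(0, len(x), block):
        chunk = x[start:start + block]
        mean = sum(chunk) / len(chunk)
        out.extend([mean] * len(chunk))
    return out

def haar_additive_decomposition(x, levels=3):
    """Return ([D1, ..., D_levels], A_levels) so that x = sum(D_j) + A_levels."""
    if len(x) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    details = []
    previous = list(x)  # A0 is the signal itself
    for level in range(1, levels + 1):
        approx = block_mean_approximation(x, 2 ** level)
        # Detail at this level = what the coarser approximation loses.
        details.append([p - a for p, a in zip(previous, approx)])
        previous = approx
    return details, previous

# Hypothetical eight-sample series (not data from this study).
series = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
details, approximation = haar_additive_decomposition(series, levels=3)
reconstructed = [
    sum(d[i] for d in details) + approximation[i] for i in range(len(series))
]
```

Each detail series and the final approximation have the same length as the input, so they can be supplied directly as separate input columns to a regression model.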
For selecting the most effective IMFs and using them as inputs in the modeling process, their energy values can be calculated and the IMFs with higher energy can be used as inputs.
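A minimal sketch of energy-based selection is given below. It assumes that the energy of a sub-series is the sum of its squared values, which is a common definition but is not stated explicitly in the text; the function names and the toy sub-series are hypothetical.

```python
def energy(series):
    """Energy of a sub-series, assumed here to be the sum of squared values."""
    return sum(v * v for v in series)

def select_by_energy(subseries, keep):
    """Return the names of the `keep` sub-series with the highest energy."""
    ranked = sorted(subseries, key=lambda name: energy(subseries[name]), reverse=True)
    return ranked[:keep]

# Hypothetical sub-series, e.g. IMFs and a residual from a decomposition.
example = {
    "IMF1": [0.1, -0.1, 0.1],
    "IMF2": [2.0, -2.0, 1.0],
    "residual": [1.0, 1.0, 1.0],
}
selected = select_by_energy(example, keep=2)
```

Only the selected sub-series would then be passed to the model as inputs, which reduces the input dimension when many IMFs are produced.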
Performance criteria
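The results in this study are reported in terms of the correlation coefficient (R), the determination coefficient (DC), and the root mean square error (RMSE). The defining equations are not reproduced in this section, so the sketch below uses commonly adopted forms as an assumption: Pearson's correlation for R, the Nash–Sutcliffe form 1 − Σ(o − p)²/Σ(o − ō)² for DC, and the usual RMSE. The function names and sample values are hypothetical.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def correlation_coefficient(observed, predicted):
    """Pearson correlation coefficient R (assumed definition)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)

def determination_coefficient(observed, predicted):
    """DC in the Nash-Sutcliffe form 1 - SSE / SST (assumed definition)."""
    mean_o = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_o) ** 2 for o in observed)
    return 1.0 - sse / sst

# Hypothetical observed and predicted values.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.1, 1.9, 3.2, 3.8]
scores = (
    correlation_coefficient(obs, pred),
    determination_coefficient(obs, pred),
    rmse(obs, pred),
)
```

Under these definitions, higher R and DC and lower RMSE indicate a better model, which is how the criteria are read in the results tables.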
Simulation and model development
Output variable | Model | Input variable |
---|---|---|
SSC modeling | | |
QSC(t) | SC(I) | QSC(t–1) |
QSC(t) | SC(II) | QSC(t–1), Qd(t) |
QSC(t) | SC(III) | QSC(t–1), QSC(t–2) |
QSC(t) | SC(IV) | QSC(t–1), Qd(t–1) |
QSC2(t) or QSC3(t) | H(I) | QSC1(t–1) or QSC2(t–1) |
QSC2(t) or QSC3(t) | H(II) | QSC1(t–1), QSd1(t–1) or QSC2(t–1), QSd2(t–1) |
QSC2(t) or QSC3(t) | H(III) | QSC1(t–1), QSC1(t–2) or QSC2(t–1), QSC2(t–2) |
QSC2(t) or QSC3(t) | H(IV) | QSC1(t), QSC1(t–1) or QSC2(t), QSC2(t–1) |
SSD modeling | | |
QSd(t) | SD(I) | QSd(t–1) |
QSd(t) | SD(II) | QSd(t–1), QSd(t–2) |
QSd(t) | SD(III) | QSC(t–1), QSC(t–2), Qd(t–1), Qd(t–2) |
QSd2(t) or QSd3(t) | D(I) | QSd1(t–1) or QSd2(t–1) |
QSd2(t) or QSd3(t) | D(II) | QSd1(t–1), QSd1(t–2) or QSd2(t–1), QSd2(t–2) |
QSd2(t) or QSd3(t) | D(III) | QSd1(t), QSd1(t–1) or QSd2(t), QSd2(t–1) |
QSd3(t) | D(IV) | QSd1,2(t–1), QSd1,2(t–2), QSd1,2(t–3) |
Note: Qsc: suspended sediment concentration, Qsd: suspended sediment discharge, Qd: water discharge.
In parameters Qsci or Qsdi, i shows the station number.
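The input combinations listed above pair lagged values with the current value of the target. A minimal sketch of how such input and target pairs could be assembled from daily series is given below. The function name, the `target` argument, and the sample values are hypothetical; a lag of 0 denotes the current-day value, which appears in the models that use the upstream station's data at time t.

```python
def make_lagged_dataset(series, lags, target=None):
    """Build (inputs, targets) where inputs[k] = [series[t - lag] for lag in lags].

    If `target` is None, the series is predicted from its own past (state 1).
    Otherwise `series` is the input (e.g. upstream) record and `target` the
    record to be predicted (state 2); both must have the same length.
    """
    if target is None:
        target = series
    if len(series) != len(target):
        raise ValueError("input and target series must have the same length")
    start = max(lags)
    inputs, targets = [], []
    for t in range(start, len(series)):
        inputs.append([series[t - lag] for lag in lags])
        targets.append(target[t])
    return inputs, targets

# Model SC(III): QSC(t-1), QSC(t-2) -> QSC(t), with hypothetical values.
qsc = [10.0, 20.0, 30.0, 40.0, 50.0]
X_own, y_own = make_lagged_dataset(qsc, lags=(1, 2))

# Model H(IV): QSC1(t), QSC1(t-1) -> QSC2(t), with hypothetical values.
qsc1 = [1.0, 2.0, 3.0, 4.0]
qsc2 = [5.0, 6.0, 7.0, 8.0]
X_up, y_up = make_lagged_dataset(qsc1, lags=(0, 1), target=qsc2)
```

Models with several input variables, such as SC(II), would concatenate the lagged columns of each variable row by row.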
RESULTS AND DISCUSSION
KELM models development
It should be noted that each artificial intelligence method has its own parameters, whose optimal values must be determined to achieve the desired results. For example, in designing the KELM approach, an appropriate type of kernel function must be selected. Various kernel functions can be used, depending on the nature of the studied phenomenon. In this research, to select the best kernel function, the model SC(III) was used for SSC prediction at station 1 with various kernels. Figure 4(a) shows the statistical parameters obtained with the different kernels for this model. According to Figure 4(a), the RBF kernel function (in which γ is the kernel parameter) was found to be the best kernel function. Figure 4(b) shows the RMSE as a function of γ for the testing set of model SC(III) at station 1, illustrating the impact of the RBF kernel parameter on the performance of the employed algorithm. In this study, γ was optimized by a systematic grid search using cross-validation.
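The grid search with cross-validation can be sketched as follows. To keep the example short and self-contained, the predictor is an RBF-weighted average of the training targets, which is a simple stand-in and not the KELM solver; the candidate γ values, the interleaved fold assignment, the RBF form exp(−γ‖a − b‖²), and all function names are likewise assumptions made for illustration.

```python
import math

def rbf(a, b, gamma):
    """RBF weight exp(-gamma * ||a - b||^2) (assumed parametrization)."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))

def kernel_average_predict(X_train, y_train, x, gamma):
    """Stand-in predictor: RBF-weighted average of training targets."""
    weights = [rbf(x, xi, gamma) for xi in X_train]
    total = sum(weights)
    if total == 0.0:  # all weights underflowed; fall back to the mean
        return sum(y_train) / len(y_train)
    return sum(w * yi for w, yi in zip(weights, y_train)) / total

def cv_rmse(X, y, gamma, folds=5):
    """Cross-validated RMSE for one gamma, using interleaved folds."""
    n = len(X)
    squared_errors = []
    for k in range(folds):
        test_idx = set(range(k, n, folds))
        X_tr = [X[i] for i in range(n) if i not in test_idx]
        y_tr = [y[i] for i in range(n) if i not in test_idx]
        for i in test_idx:
            pred = kernel_average_predict(X_tr, y_tr, X[i], gamma)
            squared_errors.append((pred - y[i]) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def grid_search_gamma(X, y, gammas, folds=5):
    """Return the gamma with the lowest cross-validated RMSE and all scores."""
    scores = {g: cv_rmse(X, y, g, folds) for g in gammas}
    best = min(scores, key=scores.get)
    return best, scores

# Hypothetical smooth series used only to exercise the search.
X_demo = [[i / 10.0] for i in range(30)]
y_demo = [math.sin(v[0]) for v in X_demo]
best_gamma, all_scores = grid_search_gamma(X_demo, y_demo, [0.01, 1.0, 100.0])
```

For strongly autocorrelated daily series, interleaved folds let neighbouring days appear in both training and validation sets, so a blocked or forward-chaining split may give a more conservative estimate.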
The results of SSC modeling
Modeling based on raw data
For evaluating the suspended sediment concentration at the three selected stations, several models were developed based on the suspended sediment concentration, suspended sediment discharge, and flow discharge data. The models were analyzed with the KELM model to carry out the SSC prediction. Table 3 and Figure 5 show the results of the KELM models. From the obtained statistical parameters (RMSE, R and DC), it could be stated that in the first state the model SC(III), with input parameters QSC(t–1) and QSC(t–2), performed better than the others. Based on the results, it could be seen that in the estimation of the SSC, using the previous suspended sediment concentration led to more accurate results and the use of sediment discharge had no significant impact on modeling. In the second state, the model H(IV), with input parameters QSC(t) and QSC(t–1) of the previous station, was selected as the superior model. A comparison between the results of the two states showed that modeling based on each station's own data led to more desirable results. However, using the previous station's data in the modeling process yielded relatively accurate results; therefore, via the KELM kernel-based approach, the previous station's data could be used when the station's own data were unavailable. In fact, the second state investigated whether the sub-basins between the consecutive stations have a noticeable impact on the flow regime of the downstream station. The distance between the stations used in this research was 50 km and, since not much sediment was carried between the stations, a relationship was found between the upstream and downstream flow regimes. However, if there are special conditions between the stations (such as diversion dams, intake structures, etc.), this relationship may be weaker.
Station | Model | Train R | Train DC | Train RMSE | Test R | Test DC | Test RMSE |
---|---|---|---|---|---|---|---|
State 1 | | | | | | | |
1 | SC(I) | 0.92 | 0.85 | 0.047 | 0.91 | 0.81 | 0.068 |
1 | SC(II) | 0.83 | 0.86 | 0.046 | 0.83 | 0.62 | 0.099 |
1 | SC(III) | 0.93 | 0.83 | 0.049 | 0.90 | 0.82 | 0.069 |
1 | SC(IV) | 0.93 | 0.85 | 0.047 | 0.89 | 0.81 | 0.071 |
2 | SC(I) | 0.93 | 0.87 | 0.053 | 0.92 | 0.83 | 0.065 |
2 | SC(II) | 0.94 | 0.88 | 0.051 | 0.88 | 0.69 | 0.089 |
2 | SC(III) | 0.95 | 0.91 | 0.044 | 0.93 | 0.86 | 0.059 |
2 | SC(IV) | 0.93 | 0.87 | 0.052 | 0.88 | 0.75 | 0.081 |
3 | SC(I) | 0.96 | 0.94 | 0.069 | 0.92 | 0.85 | 0.081 |
3 | SC(II) | 0.93 | 0.87 | 0.052 | 0.86 | 0.64 | 0.104 |
3 | SC(III) | 0.90 | 0.89 | 0.042 | 0.89 | 0.87 | 0.056 |
3 | SC(IV) | 0.93 | 0.86 | 0.054 | 0.87 | 0.83 | 0.071 |
State 2 | | | | | | | |
2-1 | H(I) | 0.9 | 0.81 | 0.064 | 0.84 | 0.63 | 0.099 |
2-1 | H(II) | 0.91 | 0.83 | 0.059 | 0.87 | 0.61 | 0.102 |
2-1 | H(III) | 0.83 | 0.69 | 0.082 | 0.75 | 0.55 | 0.109 |
2-1 | H(IV) | 0.92 | 0.87 | 0.058 | 0.88 | 0.71 | 0.087 |
3-2 | H(I) | 0.91 | 0.83 | 0.061 | 0.91 | 0.82 | 0.074 |
3-2 | H(II) | 0.92 | 0.84 | 0.057 | 0.85 | 0.71 | 0.092 |
3-2 | H(III) | 0.88 | 0.77 | 0.071 | 0.88 | 0.69 | 0.096 |
3-2 | H(IV) | 0.92 | 0.86 | 0.06 | 0.91 | 0.82 | 0.072 |
3-1 | H(II) | 0.89 | 0.82 | 0.062 | 0.89 | 0.62 | 0.107 |
3-1 | H(III) | 0.91 | 0.82 | 0.062 | 0.90 | 0.71 | 0.095 |
3-1 | H(IV) | 0.89 | 0.82 | 0.061 | 0.89 | 0.72 | 0.094 |
Note: in state 2, 2-1 means that the SSC values of station 2 are predicted based on station 1's data.
3-1 means that the SSC values of station 3 are predicted based on station 1's data.
3-2 means that the SSC values of station 3 are predicted based on station 2's data.
Modeling based on pre-processing data
In this section, the effect of time series pre-processing on increasing the models' accuracy was investigated. To this end, the time series were decomposed using the WT and EEMD methods. To decompose the time series by WT, a mother wavelet which is more similar to the signal should be selected. In this study, the Daubechies (db2 and db4) and symlet (sym2 and sym4) mother wavelets were tested and it was found that the db4 mother wavelet led to better outcomes. Therefore, the db4 mother wavelet was used for time series decomposition. In the second step, the data were decomposed via EEMD. The principle of EEMD is the decomposition of a signal into different IMFs and one residual signal. The sum of these signals reproduces the original signal. The formation of IMFs is based on subtracting the basic function from the original signal. This process continues until the residual signal remains almost constant. In this study, the time series were decomposed into 10 IMFs and one residual signal. Then, the obtained sub-series were used as inputs to the KELM model to predict the SSC. The results of the integrated pre-processing models are listed in Table 4 and shown in Figure 6. According to the results presented in Tables 3 and 4, it could be inferred that data pre-processing significantly improved the accuracy of the results and that the integrated models were more accurate than the single KELM model. In fact, the use of WT and EEMD led to an improvement in the outcomes. Sadeghpoor (2014) designed and evaluated the efficiency of the wavelet-SVM model for daily SSC forecasting and showed that the integrated models (wavelet-SVM) provide acceptable predictions of the SSC. It was found that the wavelet transform is a powerful tool which has a great ability to extract useful information from time series. Consequently, it increases the SVM models' performance significantly.
In this study, with the integrated pre-processing method, the models' accuracy increased by between 10 and 12% in the training sets and by between 8 and 20% in the testing sets. According to the results, of the two pre-processing methods, EEMD yielded higher RMSE values than WT. Therefore, it could be stated that the WT method was more successful than the EEMD method in enhancing the prediction accuracy. Also, it was found that the model SC(III), with input parameters QSC(t–1) and QSC(t–2), performed most successfully in modeling based on each station's own data, and the model H(IV), with input parameters QSC(t) and QSC(t–1) of the previous station, performed most successfully in modeling based on the previous station's data.
Station | Model | Method | Train R | Train DC | Train RMSE | Test R | Test DC | Test RMSE |
---|---|---|---|---|---|---|---|---|
State 1 | | | | | | | | |
1 | SC(I) | EEMD | 0.91 | 0.88 | 0.047 | 0.87 | 0.84 | 0.067 |
1 | SC(I) | WT | 0.95 | 0.88 | 0.041 | 0.92 | 0.85 | 0.061 |
1 | SC(II) | EEMD | 0.92 | 0.91 | 0.035 | 0.83 | 0.81 | 0.069 |
1 | SC(II) | WT | 0.98 | 0.95 | 0.025 | 0.93 | 0.88 | 0.055 |
1 | SC(III) | EEMD | 0.95 | 0.94 | 0.028 | 0.96 | 0.90 | 0.039 |
1 | SC(III) | WT | 0.96 | 0.97 | 0.023 | 0.97 | 0.93 | 0.037 |
1 | SC(IV) | EEMD | 0.92 | 0.91 | 0.036 | 0.82 | 0.82 | 0.065 |
1 | SC(IV) | WT | 0.96 | 0.92 | 0.034 | 0.92 | 0.85 | 0.062 |
2 | SC(I) | EEMD | 0.91 | 0.9 | 0.045 | 0.88 | 0.85 | 0.062 |
2 | SC(I) | WT | 0.91 | 0.9 | 0.045 | 0.88 | 0.85 | 0.062 |
2 | SC(II) | EEMD | 0.92 | 0.92 | 0.041 | 0.8 | 0.82 | 0.087 |
2 | SC(II) | WT | 0.98 | 0.97 | 0.022 | 0.96 | 0.91 | 0.044 |
2 | SC(III) | EEMD | 0.97 | 0.95 | 0.029 | 0.95 | 0.92 | 0.045 |
2 | SC(III) | WT | 0.98 | 0.96 | 0.028 | 0.98 | 0.94 | 0.041 |
2 | SC(IV) | EEMD | 0.91 | 0.88 | 0.049 | 0.87 | 0.79 | 0.074 |
2 | SC(IV) | WT | 0.96 | 0.91 | 0.042 | 0.93 | 0.85 | 0.062 |
3 | SC(I) | EEMD | 0.96 | 0.93 | 0.049 | 0.88 | 0.82 | 0.064 |
3 | SC(I) | WT | 0.95 | 0.95 | 0.046 | 0.94 | 0.89 | 0.056 |
3 | SC(II) | EEMD | 0.94 | 0.82 | 0.049 | 0.91 | 0.89 | 0.055 |
3 | SC(II) | WT | 0.98 | 0.92 | 0.027 | 0.97 | 0.91 | 0.033 |
3 | SC(III) | EEMD | 0.97 | 0.95 | 0.029 | 0.94 | 0.91 | 0.038 |
3 | SC(III) | WT | 0.97 | 0.96 | 0.027 | 0.96 | 0.93 | 0.033 |
3 | SC(IV) | EEMD | 0.92 | 0.91 | 0.036 | 0.82 | 0.82 | 0.065 |
3 | SC(IV) | WT | 0.95 | 0.9 | 0.045 | 0.93 | 0.84 | 0.064 |
State 2 | | | | | | | | |
2-1 | H(I) | EEMD | 0.91 | 0.85 | 0.059 | 0.85 | 0.72 | 0.088 |
2-1 | H(I) | WT | 0.95 | 0.89 | 0.054 | 0.88 | 0.76 | 0.085 |
2-1 | H(II) | EEMD | 0.92 | 0.87 | 0.054 | 0.88 | 0.70 | 0.091 |
2-1 | H(II) | WT | 0.96 | 0.91 | 0.050 | 0.91 | 0.74 | 0.088 |
2-1 | H(III) | EEMD | 0.84 | 0.72 | 0.075 | 0.76 | 0.63 | 0.097 |
2-1 | H(III) | WT | 0.87 | 0.76 | 0.069 | 0.79 | 0.67 | 0.094 |
2-1 | H(IV) | EEMD | 0.93 | 0.91 | 0.053 | 0.89 | 0.82 | 0.078 |
2-1 | H(IV) | WT | 0.97 | 0.95 | 0.049 | 0.92 | 0.86 | 0.075 |
3-2 | H(I) | EEMD | 0.92 | 0.87 | 0.058 | 0.92 | 0.85 | 0.066 |
3-2 | H(I) | WT | 0.96 | 0.91 | 0.053 | 0.96 | 0.88 | 0.064 |
3-2 | H(II) | EEMD | 0.93 | 0.88 | 0.057 | 0.86 | 0.73 | 0.082 |
3-2 | H(II) | WT | 0.97 | 0.92 | 0.052 | 0.89 | 0.76 | 0.079 |
3-2 | H(III) | EEMD | 0.89 | 0.80 | 0.064 | 0.89 | 0.71 | 0.086 |
3-2 | H(III) | WT | 0.92 | 0.84 | 0.059 | 0.92 | 0.74 | 0.083 |
3-2 | H(IV) | EEMD | 0.93 | 0.90 | 0.055 | 0.92 | 0.85 | 0.064 |
3-2 | H(IV) | WT | 0.97 | 0.94 | 0.051 | 0.96 | 0.88 | 0.062 |
3-1 | H(II) | EEMD | 0.90 | 0.86 | 0.057 | 0.90 | 0.71 | 0.095 |
3-1 | H(II) | WT | 0.93 | 0.90 | 0.052 | 0.93 | 0.75 | 0.092 |
3-1 | H(III) | EEMD | 0.92 | 0.86 | 0.057 | 0.91 | 0.82 | 0.085 |
3-1 | H(III) | WT | 0.96 | 0.90 | 0.052 | 0.95 | 0.86 | 0.082 |
3-1 | H(IV) | EEMD | 0.90 | 0.86 | 0.056 | 0.90 | 0.83 | 0.084 |
3-1 | H(IV) | WT | 0.93 | 0.90 | 0.051 | 0.93 | 0.87 | 0.081 |
Results of SSD modeling
Modeling based on raw data
Accurate prediction of the suspended sediment discharge in rivers or streams is crucial for sustainable water resources and environmental systems. In this study, the suspended sediment discharge at the selected stations was assessed via the KELM kernel-based approach. Previous values of the flow and sediment parameters were used to develop the models. Table 5 and Figure 7 show the results of the KELM models. The obtained results indicated that in modeling based on each station's own data, the model SD(III), with input parameters Qsc(t–1), Qsc(t–2), Qd(t–1), and Qd(t–2), performed more successfully than the other models. Kisi et al. (2012) showed that in suspended sediment modeling via genetic programming, the model whose inputs were the current water discharge, one previous water discharge, and one previous sediment load performed best. The models whose inputs were the current and immediately preceding water discharge and the one and two previous sediment loads, as well as the models whose inputs were the current water discharge and one previous sediment load, were ranked as the second and third best models, respectively. In this study, from the obtained results, it could be stated that the model D(II), with the two input parameters Qsd(t–1) and Qsd(t–2), yielded the desired accuracy. Therefore, the sediment discharge can be predicted using only the sediment discharge of the previous 1 and 2 days.
Station | Model | Train R | Train DC | Train RMSE | Test R | Test DC | Test RMSE |
---|---|---|---|---|---|---|---|
State 1 | | | | | | | |
1 | SD(I) | 0.89 | 0.83 | 0.048 | 0.88 | 0.80 | 0.057 |
1 | SD(II) | 0.90 | 0.86 | 0.038 | 0.89 | 0.83 | 0.056 |
1 | SD(III) | 0.92 | 0.88 | 0.032 | 0.91 | 0.84 | 0.054 |
2 | SD(I) | 0.86 | 0.84 | 0.045 | 0.85 | 0.79 | 0.065 |
2 | SD(II) | 0.88 | 0.85 | 0.031 | 0.87 | 0.82 | 0.058 |
2 | SD(III) | 0.88 | 0.87 | 0.028 | 0.87 | 0.84 | 0.053 |
3 | SD(I) | 0.89 | 0.85 | 0.042 | 0.86 | 0.82 | 0.062 |
3 | SD(II) | 0.92 | 0.88 | 0.035 | 0.88 | 0.85 | 0.052 |
3 | SD(III) | 0.94 | 0.90 | 0.028 | 0.90 | 0.86 | 0.048 |
State 2 | | | | | | | |
2-1 | D(I) | 0.87 | 0.78 | 0.038 | 0.84 | 0.73 | 0.094 |
2-1 | D(II) | 0.87 | 0.81 | 0.038 | 0.83 | 0.74 | 0.092 |
2-1 | D(III) | 0.89 | 0.84 | 0.032 | 0.85 | 0.75 | 0.090 |
3-2 | D(I) | 0.87 | 0.81 | 0.034 | 0.86 | 0.77 | 0.066 |
3-2 | D(II) | 0.88 | 0.83 | 0.032 | 0.86 | 0.80 | 0.062 |
3-2 | D(III) | 0.89 | 0.85 | 0.033 | 0.87 | 0.84 | 0.055 |
3-1 | D(II) | 0.88 | 0.83 | 0.035 | 0.88 | 0.83 | 0.061 |
3-1 | D(III) | 0.87 | 0.81 | 0.038 | 0.87 | 0.80 | 0.066 |
3-2-1 | D(IV) | 0.90 | 0.88 | 0.031 | 0.89 | 0.87 | 0.054 |
Note: in state 2, 3-2-1 means that the SSD values of station 3 are predicted based on data from stations 1 and 2.
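The three criteria reported in the table (R, DC and RMSE) can be computed along the following lines. This is only a minimal sketch: it assumes that DC takes the Nash-Sutcliffe form and that the series are already normalized, neither of which is restated in this section, and the function name and sample values are illustrative.

```python
import numpy as np

def performance_criteria(observed, predicted):
    """Return (R, DC, RMSE) for paired observed and predicted series."""
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    # R: Pearson correlation coefficient between observations and predictions.
    r = np.corrcoef(obs, pred)[0, 1]
    # DC: assumed Nash-Sutcliffe form, 1 - (sum of squared errors / total variance).
    dc = 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)
    # RMSE: root mean square error, in the units of the input series.
    rmse = np.sqrt(np.mean((obs - pred) ** 2))
    return float(r), float(dc), float(rmse)

# Illustrative values only (not data from the study).
obs = [0.10, 0.20, 0.30, 0.40]
pred = [0.12, 0.18, 0.33, 0.37]
r, dc, rmse = performance_criteria(obs, pred)
```

Higher R and DC and lower RMSE indicate a better model, which is how the rows of the table are compared.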
In the second state, among the three stations, SSD modeling of the third station based on data from both the first and second stations performed best. However, modeling based on each station's own data led to better predictions. Artificial intelligence methods are powerful tools that can be applied successfully when the interrelationships among the relevant variables are difficult to understand and conventional mathematical analysis cannot provide analytical solutions. Choubin et al. (2018) evaluated the use of a Classification and Regression Tree (CART) model to estimate SSD based on hydro-meteorological data. They indicated that CART, as an artificial intelligence model, can be a helpful tool in basins where hydro-meteorological data are readily available. The scatter plots of the best KELM model for each state are shown in Figure 7. The term 3–2–1 in this figure means that the SSD values of station 3 are predicted based on the data from stations 1 and 2.
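As a rough illustration of the state 1 setup described in this section (lagged flow and sediment values fed to a kernel ELM), the following sketch builds two-lag inputs and fits a KELM with the standard closed-form solution. The RBF kernel, the values of `C` and `gamma`, the train/test split and the synthetic series are all assumptions made for the example, not the settings or data used in the study.

```python
import numpy as np

def make_lagged(series, lags):
    """Stack series[t - lag] for each lag as columns; rows start at t = max(lags)."""
    series = np.asarray(series, dtype=float)
    start = max(lags)
    return np.column_stack([series[start - lag:len(series) - lag] for lag in lags])

def rbf_kernel(a, b, gamma):
    """Gaussian kernel matrix K[i, j] = exp(-gamma * ||a_i - b_j||^2)."""
    sq = np.sum(a ** 2, axis=1)[:, None] + np.sum(b ** 2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

class KELM:
    """Kernel extreme learning machine with a closed-form output weight vector."""

    def __init__(self, C=100.0, gamma=1.0):
        self.C, self.gamma = C, gamma

    def fit(self, X, y):
        self.X_train = np.asarray(X, dtype=float)
        K = rbf_kernel(self.X_train, self.X_train, self.gamma)
        # Output weights: beta = (K + I / C)^(-1) y
        self.beta = np.linalg.solve(K + np.eye(len(K)) / self.C, np.asarray(y, dtype=float))
        return self

    def predict(self, X):
        return rbf_kernel(np.asarray(X, dtype=float), self.X_train, self.gamma) @ self.beta

# Synthetic, normalized stand-ins for daily flow and sediment discharge.
rng = np.random.default_rng(0)
t = np.arange(200)
flow = 0.5 + 0.3 * np.sin(2 * np.pi * t / 50) + 0.02 * rng.standard_normal(200)
sediment = 0.4 * flow ** 1.5 + 0.01 * rng.standard_normal(200)

# Inputs: sediment(t-1), sediment(t-2), flow(t-1), flow(t-2); target: sediment(t).
X = np.column_stack([make_lagged(sediment, [1, 2]), make_lagged(flow, [1, 2])])
y = sediment[2:]
split = 150  # hypothetical train/test split
model = KELM(C=100.0, gamma=1.0).fit(X[:split], y[:split])
pred = model.predict(X[split:])
```

For state 2, the same construction would apply with the lagged columns taken from the previous station's series instead of the station's own.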
Modeling based on pre-processing data
The impact of data pre-processing on SSD prediction was assessed. The input combinations were decomposed using the WT and EEMD methods. The db4 mother wavelet was found to be more similar to the SSD signals and led to better outcomes; a decomposition level of 5 was used. In the EEMD method, each time series was decomposed into 10 IMFs and one residual signal. The results of the integrated pre-processing models are listed in Table 6 and shown in Figure 8. According to the results, data pre-processing significantly improved the SSD prediction accuracy: the applied pre-processing methods improved the models' efficiency by approximately 8 to 12% in the training sets and by 10 to 18% in the testing sets. When modeling was based on each station's own data, the model SD(III) with input parameters Qsc(t–1), Qsc(t–2), Qd(t–1), Qd(t–2) led to the most accurate results. When the relationship between stations was investigated, modeling the station 2 sediment discharge based on the first station's data in terms of QSd1(t), QSd1(t–1) performed more successfully, whereas for station 3, using data from both stations 1 and 2 led to better predictions. This shows the impact of the previous stations' information on the modeling process. It was also observed that, with pre-processed data, the maximum and minimum values of the time series were predicted more accurately.
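To make the idea of a level-5 decomposition concrete, the sketch below splits a series into one approximation (A5) and five detail sub-series (D1 to D5) whose sum reproduces the original signal. For self-containedness it uses the Haar wavelet rather than the db4 wavelet adopted in the study, so it illustrates the structure of the decomposition only; a wavelet library (for example PyWavelets, via `pywt.wavedec(x, 'db4', level=5)`) would be the usual route to db4 coefficients. The function name and the synthetic series are assumptions.

```python
import numpy as np

def haar_mra(x, level):
    """Haar multiresolution analysis: return (A_level, [D1, ..., D_level]).

    Every returned sub-series has the same length as x, and
    A_level + D1 + ... + D_level equals x.
    """
    x = np.asarray(x, dtype=float)
    if x.size % (2 ** level) != 0:
        raise ValueError("series length must be a multiple of 2**level")
    approx_prev = x
    details = []
    for j in range(1, level + 1):
        block = 2 ** j
        # Haar approximation at level j: mean over dyadic blocks of length 2**j.
        approx = np.repeat(x.reshape(-1, block).mean(axis=1), block)
        # Detail at level j: what is lost when moving from level j-1 to level j.
        details.append(approx_prev - approx)
        approx_prev = approx
    return approx_prev, details

# Synthetic stand-in for a daily SSD series (not data from the study).
rng = np.random.default_rng(1)
ssd = np.cumsum(rng.standard_normal(256))
a5, details = haar_mra(ssd, level=5)
subseries = np.column_stack([a5] + details)  # columns: A5, D1, D2, D3, D4, D5
```

Each column of `subseries` can then be lagged and used as a model input in place of the raw series, which is the sense in which the decomposed inputs feed the integrated models.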
State | Station | Model | Method | Train R | Train DC | Train RMSE | Test R | Test DC | Test RMSE
---|---|---|---|---|---|---|---|---|---
1 | 1 | SD(I) | EEMD | 0.94 | 0.88 | 0.037 | 0.92 | 0.84 | 0.044
1 | 1 | SD(I) | WT | 0.95 | 0.93 | 0.036 | 0.93 | 0.89 | 0.042
1 | 1 | SD(II) | EEMD | 0.95 | 0.92 | 0.031 | 0.93 | 0.88 | 0.042
1 | 1 | SD(II) | WT | 0.96 | 0.94 | 0.030 | 0.94 | 0.91 | 0.040
1 | 1 | SD(III) | EEMD | 0.97 | 0.94 | 0.025 | 0.95 | 0.92 | 0.039
1 | 1 | SD(III) | WT | 0.98 | 0.96 | 0.024 | 0.95 | 0.94 | 0.037
1 | 2 | SD(I) | EEMD | 0.91 | 0.87 | 0.035 | 0.89 | 0.83 | 0.042
1 | 2 | SD(I) | WT | 0.92 | 0.89 | 0.034 | 0.90 | 0.87 | 0.040
1 | 2 | SD(II) | EEMD | 0.93 | 0.91 | 0.025 | 0.91 | 0.87 | 0.037
1 | 2 | SD(II) | WT | 0.93 | 0.92 | 0.024 | 0.92 | 0.90 | 0.031
1 | 2 | SD(III) | EEMD | 0.95 | 0.95 | 0.022 | 0.91 | 0.92 | 0.027
1 | 2 | SD(III) | WT | 0.96 | 0.96 | 0.021 | 0.95 | 0.94 | 0.024
1 | 3 | SD(I) | EEMD | 0.94 | 0.90 | 0.033 | 0.90 | 0.86 | 0.048
1 | 3 | SD(I) | WT | 0.95 | 0.93 | 0.031 | 0.91 | 0.91 | 0.046
1 | 3 | SD(II) | EEMD | 0.96 | 0.94 | 0.029 | 0.92 | 0.91 | 0.040
1 | 3 | SD(II) | WT | 0.98 | 0.95 | 0.028 | 0.95 | 0.93 | 0.038
1 | 3 | SD(III) | EEMD | 0.98 | 0.95 | 0.022 | 0.94 | 0.95 | 0.034
1 | 3 | SD(III) | WT | 0.99 | 0.97 | 0.021 | 0.95 | 0.96 | 0.033
2 | 2-1 | D(I) | EEMD | 0.92 | 0.87 | 0.030 | 0.88 | 0.77 | 0.071
2 | 2-1 | D(I) | WT | 0.93 | 0.89 | 0.028 | 0.89 | 0.81 | 0.067
2 | 2-1 | D(II) | EEMD | 0.92 | 0.91 | 0.031 | 0.87 | 0.79 | 0.069
2 | 2-1 | D(II) | WT | 0.93 | 0.92 | 0.030 | 0.88 | 0.81 | 0.066
2 | 2-1 | D(III) | EEMD | 0.97 | 0.91 | 0.025 | 0.89 | 0.82 | 0.065
2 | 2-1 | D(III) | WT | 0.98 | 0.93 | 0.024 | 0.95 | 0.84 | 0.062
2 | 3-2 | D(I) | EEMD | 0.92 | 0.87 | 0.027 | 0.90 | 0.81 | 0.050
2 | 3-2 | D(I) | WT | 0.93 | 0.89 | 0.025 | 0.91 | 0.85 | 0.047
2 | 3-2 | D(II) | EEMD | 0.93 | 0.91 | 0.025 | 0.90 | 0.86 | 0.048
2 | 3-2 | D(II) | WT | 0.94 | 0.92 | 0.024 | 0.91 | 0.87 | 0.046
2 | 3-2 | D(III) | EEMD | 0.97 | 0.93 | 0.026 | 0.91 | 0.89 | 0.039
2 | 3-2 | D(III) | WT | 0.98 | 0.94 | 0.025 | 0.95 | 0.92 | 0.037
2 | 3-1 | D(II) | EEMD | 0.93 | 0.87 | 0.027 | 0.92 | 0.87 | 0.046
2 | 3-1 | D(II) | WT | 0.94 | 0.89 | 0.026 | 0.93 | 0.88 | 0.044
2 | 3-1 | D(III) | EEMD | 0.92 | 0.91 | 0.031 | 0.91 | 0.85 | 0.051
2 | 3-1 | D(III) | WT | 0.93 | 0.92 | 0.029 | 0.92 | 0.87 | 0.046
2 | 3-2-1 | D(IV) | EEMD | 0.93 | 0.92 | 0.025 | 0.92 | 0.91 | 0.038
2 | 3-2-1 | D(IV) | WT | 0.97 | 0.96 | 0.022 | 0.93 | 0.93 | 0.035
The RMSE criterion was used to graphically compare the performance of the single and integrated KELM models; the results are shown in Figure 9. In both the SSC and SSD modeling processes, the RMSE values were smaller for the integrated methods, and the WT-KELM model gave the most accurate results.
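The size of the gain from pre-processing can be checked directly against the tabulated values. The sketch below takes the testing-set results for station 1, model SD(III), from Table 5 (single KELM) and Table 6 (WT-KELM). Treating "efficiency" as the relative change in DC is an assumption made here for illustration, since the text does not state the exact formula, and the helper name is hypothetical.

```python
def relative_gain(before, after, higher_is_better=True):
    """Percentage improvement when moving from `before` to `after`."""
    change = (after - before) if higher_is_better else (before - after)
    return 100.0 * change / before

# Station 1, SD(III), testing set: KELM (Table 5) versus WT-KELM (Table 6).
dc_gain = relative_gain(0.84, 0.94)                               # DC: higher is better
rmse_gain = relative_gain(0.054, 0.037, higher_is_better=False)   # RMSE: lower is better

print(f"DC gain: {dc_gain:.1f}%, RMSE reduction: {rmse_gain:.1f}%")
# DC gain: 11.9%, RMSE reduction: 31.5%
```

Under this reading, the DC gain for this case falls within the 10 to 18% range reported for the testing sets.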
Uncertainty analysis results
Model | Variable | Station | 95 PPU | d-Factor
---|---|---|---|---
KELM | SSC | 1 | 70.48% | 0.235
KELM | SSC | 2 | 70.48% | 0.211
KELM | SSC | 3 | 70.7% | 0.257
KELM | SSC | 2-1 | 77.7% | 0.214
KELM | SSC | 3-2-1 | 74.3% | 0.129
KELM | SSD | 1 | 76.41% | 0.215
KELM | SSD | 2 | 74.40% | 0.223
KELM | SSD | 3 | 72.7% | 0.224
KELM | SSD | 2-1 | 74.6% | 0.105
KELM | SSD | 3-2-1 | 73.9% | 0.187
WT-KELM | SSC | 1 | 89.2% | 0.105
WT-KELM | SSC | 2 | 80.1% | 0.108
WT-KELM | SSC | 3 | 81.5% | 0.216
WT-KELM | SSC | 2-1 | 84.2% | 0.108
WT-KELM | SSC | 3-2-1 | 80.9% | 0.095
WT-KELM | SSD | 1 | 89.51% | 0.108
WT-KELM | SSD | 2 | 87.1% | 0.102
WT-KELM | SSD | 3 | 83.5% | 0.204
WT-KELM | SSD | 2-1 | 87.32% | 0.115
WT-KELM | SSD | 3-2-1 | 84.9% | 0.109
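The two uncertainty indices in the table can be computed as in the sketch below. It assumes the definitions in common use: 95 PPU as the percentage of observations falling inside the band between the 2.5th and 97.5th percentiles of an ensemble of predictions, and the d-Factor as the mean width of that band divided by the standard deviation of the observations. How the ensemble was generated is not described in this section, so the ensemble here is synthetic and the function name is hypothetical.

```python
import numpy as np

def ppu95_and_d_factor(observed, lower, upper):
    """Return (95 PPU in percent, d-Factor) for an uncertainty band."""
    obs = np.asarray(observed, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    # Share of observations bracketed by the band.
    inside = (obs >= lower) & (obs <= upper)
    ppu95 = 100.0 * inside.mean()
    # Mean band width relative to the (population) standard deviation of the observations.
    d_factor = np.mean(upper - lower) / np.std(obs)
    return float(ppu95), float(d_factor)

# Synthetic ensemble of 500 prediction runs over 100 time steps (illustrative only).
rng = np.random.default_rng(0)
obs = np.sin(np.linspace(0.0, 6.0, 100))
ensemble = obs + 0.1 * rng.standard_normal((500, 100))
lower, upper = np.percentile(ensemble, [2.5, 97.5], axis=0)
ppu, d = ppu95_and_d_factor(obs, lower, upper)
```

A larger 95 PPU together with a smaller d-Factor indicates a narrower band that still brackets most observations, which is the pattern the WT-KELM rows show relative to the single KELM rows.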
Investigating the most effective sub-series
Sensitivity analysis was used to evaluate the effect of each sub-series obtained from WT and EEMD on the modeling process. To evaluate the impact of each sub-series, the model SC(III) for SSC prediction at station 2 was selected and run with all sub-series; then one input sub-series at a time was eliminated and the integrated KELM model was re-run. The DC criterion was used as an indicator of the significance of each sub-series. Figure 10 shows the sensitivity analysis results. Based on the results, IMF9 in the EEMD method and the A5 approximation sub-series in the WT method were the most important sub-series in the prediction process.
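The elimination procedure described above can be sketched as a leave-one-out loop over the input sub-series, ranking each by the drop in DC when it is removed. To keep the example short, an ordinary least-squares regressor stands in for the integrated KELM model, and the data and column labels are synthetic; both are assumptions for illustration only.

```python
import numpy as np

def dc_score(obs, pred):
    """DC, assumed here to take the Nash-Sutcliffe form."""
    return 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)

def fit_predict_lstsq(X_train, y_train, X_test):
    """Stand-in regressor (least squares with intercept); a KELM would be plugged in here."""
    A = np.column_stack([X_train, np.ones(len(X_train))])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([X_test, np.ones(len(X_test))]) @ coef

def leave_one_out_sensitivity(X, y, names, split, fit_predict=fit_predict_lstsq):
    """Return the baseline test DC and the DC drop caused by removing each sub-series."""
    base = dc_score(y[split:], fit_predict(X[:split], y[:split], X[split:]))
    drops = {}
    for j, name in enumerate(names):
        X_red = np.delete(X, j, axis=1)  # eliminate one input sub-series
        dc = dc_score(y[split:], fit_predict(X_red[:split], y[:split], X_red[split:]))
        drops[name] = base - dc
    return base, drops

# Synthetic sub-series; the labels are placeholders, not results from the study.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
y = 2.0 * X[:, 0] + 0.1 * X[:, 1] + 0.01 * rng.standard_normal(200)
names = ["A5", "D1", "D2"]
base_dc, drops = leave_one_out_sensitivity(X, y, names, split=150)
most_important = max(drops, key=drops.get)
```

The sub-series whose removal causes the largest drop in DC is taken as the most influential input.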
CONCLUSIONS
The accurate prediction of the SSC and SSD of rivers is an important factor in improving water management. This study assessed the capability of time series pre-processing methods for SSC and SSD modeling. In the first step, the raw time series (without any data processing) were fed to the KELM model. Then, the time series were decomposed into several sub-series using WT and EEMD, and these sub-series were used as inputs to KELM. The results showed that both the WT and EEMD pre-processing methods increased the model's accuracy. The applied pre-processing methods enhanced the KELM model performance by approximately 10 to 18%. In the estimation of SSC, using the previous suspended sediment concentration led to more accurate results, and the use of sediment discharge had no significant impact on the modeling process. Modeling based on each station's own data led to more desirable results. In this state, the model with inputs QSC(t–1), QSC(t–2) in SSC modeling and the model with inputs QSC(t–1), QSC(t–2), Qd(t–1), Qd(t–2) in SSD modeling were superior. However, using the integrated KELM approaches, the previous stations' data could be used when a station's own data were unavailable. Sensitivity analysis results suggested that IMF9 in the EEMD method and the A5 sub-series in the WT method were the most effective sub-series in the SSC prediction process. It was also found that the maximum and minimum values of the SSC and SSD variables were well predicted using the integrated models. Therefore, integrating the KELM model with pre-processing methods could be a suitable solution for more accurate prediction of hydrological variables such as suspended sediment concentration and suspended sediment discharge.
It should be noted, however, that KELM is a data-driven model and is therefore sensitive to the data used; further studies using data outside the range of this study should be carried out to determine the merits of the applied model in SSC and SSD modeling.
DATA AVAILABILITY STATEMENT
All relevant data are available from https://waterdata.usgs.gov/nwis/sw