Abstract

The estimation of the suspended sediment load in rivers is one of the main issues in hydraulic engineering. Traditional methods such as the sediment rating curve (SRC) can be used to estimate this load, but the SRC suffers from low accuracy and high uncertainty. In this study, the ability of three intelligence models, namely gene expression programming (GEP), artificial neural networks (ANN) and the adaptive neuro fuzzy inference system (ANFIS), was compared with that of the SRC method. The daily flow discharge and sediment discharge recorded at two hydrometric stations on the Kasilian and Telar rivers over the period 1964–2014 were used to develop the intelligence models. All three intelligence models gave reliable estimates of the suspended sediment load and outperformed the SRC method. Moreover, the GEP model, with a higher coefficient of determination (R²) and a lower mean absolute error (MAE), performed better than both the ANN and ANFIS models in estimating the daily suspended sediment load of the Kasilian and Telar sub-basins.

INTRODUCTION

Sedimentation is one of the main problems for hydraulic structures, dam reservoirs and hydroelectric power plants because of its effects on their operation. The correct estimation of the sediment load carried by a river is therefore a crucial issue in water engineering. Many researchers have sought to estimate the amount of transported sediment by relating it to hydraulic parameters (discharge, flow velocity, water depth), geometric parameters of the river (slope, cross-sectional area) and sediment properties (mean diameter, density, type of sediment). However, these equations have not been used widely because of their substantial inaccuracies. Previous research implied that these approaches disrupt projects by overestimating or underestimating the volume of sediment (Aytek & Kişi 2008). Regression models, in which the suspended sediment load is treated as a dependent variable of the flow discharge, are another option. For example, Zhang et al. (2012) used sediment rating curves (SRCs) to investigate the variations in the relationship between water discharge (Q) and suspended sediment concentration (Qs) in three major rivers of the Pearl River Delta. Their results indicate that the parameters of the SRC vary with time. Harrington & Harrington (2013) assessed the ability of the SRC method to estimate the suspended sediment load of the Rivers Bandon and Owenabue in Ireland. They found that the rating curves could provide acceptable estimates of the suspended sediment load in both rivers.

In recent decades, many studies have been carried out on black box models aimed at solving nonlinear problems. For example, the adaptive neuro fuzzy inference system (ANFIS), artificial neural networks (ANN) (such as the radial basis function (RBF) and multi-layer perceptron (MLP) networks), genetic programming (GP), the support vector machine (SVM) and gene expression programming (GEP) have been introduced as alternative approaches to water resource problems. The research implies that these artificial intelligence approaches give reasonable results and are becoming effective tools for solving nonlinear problems in hydraulic engineering and water resource management. They have been used widely in diverse areas of water engineering (Azamathulla & Ahmad 2012; Bahramifara et al. 2013; Haddadchi et al. 2013; Lafdani et al. 2013; Kashi et al. 2014; Ghorbani et al. 2015). For instance, Kisi & Shiri (2012) applied the GEP, ANFIS, ANN and SVM models to estimate the daily suspended sediment load and found that the GEP was superior to the ANFIS, ANN and SVM techniques. Shamaei & Kaedi (2016) used GP and neuro fuzzy systems to predict suspended sediment concentration. Haddadchi et al. (2013) compared the ANN model with suspended load formulae for estimating the suspended load transport rate of gravel bed and sandy bed rivers. They concluded that the ANN model performed significantly better than the traditional suspended load formulae. In this study, the suspended sediment load at two hydrometric stations, Kasilian (on the Kasilian River) and Telar (on the Telar River), was estimated by intelligence models. These stations were chosen because the Kasilian and Telar sub-basins have different catchment areas (342.89 km² and 1,768.6 km²), so the discharge and sediment discharge of the two basins differ in magnitude.
The first objective of this study was to develop and test the GEP, ANN and ANFIS models for estimating the suspended sediment load of these rivers under different conditions of discharge, sediment discharge and physiographic characteristics of the catchment area. The second objective was to compare the performance of these models with each other and with the SRC.

DATA, METHODS AND MODELS

Study area

The Telar basin, with a catchment area of 2,900 km², is located in the north of Iran. Kasilian and Telar are the two sub-basins of this basin (Figure 1). The Great or main Telar River originates in the mountainous areas of Savadkuh and the central and eastern Alborz mountains and passes through the cities of Savadkuh and Ghaemshahr in Mazandaran Province. With a length of 152 km, it is one of the most important rivers in Mazandaran Province and collects the runoff of extensive areas and conveys it to the Caspian Sea (Figure 1). Mazandaran has a variety of climates, including the mild and humid climate of the Caspian shoreline, a moderate climate and the cold climate of the mountainous regions. The study area lies within the cold semi-arid climatic region. Its prevailing climate is known as a local steppe climate and is classified as BSk in the Köppen-Geiger climate classification. The main Telar River is fed by two main tributaries, namely the Kasilian and the Telar. These two rivers meet at Ravat Sar, near the city of Shirgah, and form the main Telar River.

Figure 1

Geographical location of the study area in the Mazandaran Province of Iran.

The catchment area of the Kasilian River sub-basin is 342.89 km². The basin lies between 35° 58′ N and 36° 18′ N and between 52° 56′ E and 53° 42′ E. Its altitude ranges from 240 m to 3,440 m, with an average of 890 m above sea level. The river of this basin, with a length of 50 km, flows from the south to the northwest. The Kasilian basin is situated in mountainous and forested areas, and its bed slope is relatively high (see Table 1). The average annual precipitation in this basin during the period 1964–2014 was 784 mm and the maximum annual precipitation was 1,404.3 mm. In addition, the maximum 24-hour rainfall of this basin was 58 mm, which occurred on May 13, 1990.

Table 1

Physiographic characteristics of the Kasilian and Telar basins

Characteristic           Unit       Kasilian     Telar
Basin area               km²        342.4        1768.6
Basin slope              m/m        0.2896       0.3692
Average overland flow    –          503.64       547.33
Basin length             –          47267.82     68448.26
Perimeter                –          160617.96    340052.61
Shape factor             km²/km²    10.51        4.26
Mean basin elevation     –          993.27       1980.84
Max flow distance        –          60114.69     104003.66
Max flow slope           m/m        0.0488       0.0280
Max stream length        –          58921.18     103020.48
Max stream slope         m/m        0.0363       0.0259

The Telar sub-basin occupies the north-west part of the Telar basin and is situated between 35° 43.0′ N and 36° 18.8′ N and between 52° 36.7′ E and 53° 24.2′ E (Figure 1). It has a catchment area of 1,768.6 km², which is about 5.2 times that of the Kasilian sub-basin. Like the Kasilian sub-basin, it receives precipitation in the form of both rain and snow. The mean annual precipitation of this sub-basin is 577.1 mm. The highest annual rainfall, 752.5 mm, was recorded in 2012 and the lowest, 343.4 mm, in 2010. The area receives its highest monthly rainfall in November (69.7 mm) and its lowest in July (32.1 mm).

The two basins contain several soil types, including loam, loamy clay, clay and silty clay loam. The geological formations covering the largest areas of the basins date from the Mesozoic and consist of thick layers of limestone, sand and tuff, with a Paleozoic core and traces of the Precambrian.

Because the two sub-basins have different catchment areas (342.89 km² and 1,768.6 km²), their discharge and sediment discharge differ in magnitude, and so the data from both rivers (Kasilian and Telar) were used to investigate the ability of the models to estimate the sediment discharge. The daily flow discharge (Qd) and the daily sediment discharge (Qs) recorded at the two hydrometric stations (Kasilian and Telar) over the period 1964–2014 were collected from the regional water authority of Mazandaran and used to develop the GEP, ANN and ANFIS models. The minimum, mean, maximum and standard deviation of the collected data are presented in Table 2. The whole data set covers 50 years (1964–2014) and was divided into two parts: a training set of 35 years (1964–1999) and a testing set of 15 years (1999–2014).

Table 2

Statistical indices of collected data

Station     Variable        Min     Mean       Max          Std. deviation
Kasilian    Qd (m³/s)       0.05    4.82       816.50       34.14
Kasilian    Qs (ton/day)    0.21    262.17     51421.12     2772.73
Telar       Qd (m³/s)       0.76    8.45       143.57       10.39
Telar       Qs (ton/day)    2.60    3422.32    632308.53    30313.71

Intelligence models

Three intelligence models, namely GEP, ANN and ANFIS, were used to estimate the suspended load. The GEP was introduced and developed by Ferreira (2001) as an extension of two earlier evolutionary algorithms, the genetic algorithm (GA) and GP (Ferreira 2001). In this model, individuals are selected from a population according to their fitness, and genetic variation is introduced using one or more genetic operators (Ferreira 2001, 2006). The three algorithms differ fundamentally in the nature of their individuals. In the GA, individuals are chromosomes coded as linear strings of fixed length, while in the GP individuals are nonlinear entities of different sizes and shapes that are expressed as parse trees. In the GEP, individuals are encoded as linear strings of fixed length (the chromosome or genome), which are then expressed as nonlinear entities of different sizes and shapes (expression trees) (Ferreira 2001, 2006). As in other evolutionary algorithms, the first step in the GEP is to create a random population of initial chromosomes. These chromosomes are then evaluated using a fitness function; the mean square error (MSE), relative square error (RSE), root relative square error (RRSE) and root mean square error (RMSE) can all be used for this purpose. The RRSE fitness function was selected in the present study, following previous research (Kisi & Shiri 2012; Emamgholizadeh et al. 2015; Emamgholizadeh et al. 2017).
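
The mapping from a fixed-length linear string to an expression tree can be illustrated with a short Python sketch. It assumes the breadth-first (Karva) reading of a gene described by Ferreira (2001); the single-character symbols, the `FUNCTIONS` table and the helper names `decode` and `evaluate` are illustrative choices and are not taken from the software used in this study.

```python
import math

# Function symbols with their arity and implementation; "Q" stands for square root.
FUNCTIONS = {
    "+": (2, lambda a, b: a + b),
    "-": (2, lambda a, b: a - b),
    "*": (2, lambda a, b: a * b),
    "/": (2, lambda a, b: a / b),
    "Q": (1, math.sqrt),
}

def decode(gene):
    """Build an expression tree from a linear gene, filling it level by level."""
    nodes = [{"symbol": s, "children": []} for s in gene]
    queue = [nodes[0]]          # nodes whose arguments still have to be attached
    next_free = 1               # index of the next unread symbol in the gene
    while queue:
        node = queue.pop(0)
        arity = FUNCTIONS[node["symbol"]][0] if node["symbol"] in FUNCTIONS else 0
        for _ in range(arity):
            child = nodes[next_free]
            next_free += 1
            node["children"].append(child)
            queue.append(child)
    return nodes[0]

def evaluate(node, terminals):
    """Evaluate the tree recursively for a given assignment of terminal values."""
    symbol = node["symbol"]
    if symbol not in FUNCTIONS:
        return terminals[symbol]
    func = FUNCTIONS[symbol][1]
    return func(*(evaluate(child, terminals) for child in node["children"]))

# The gene "Q*+-abcd" encodes sqrt((a + b) * (c - d)).
tree = decode("Q*+-abcd")
print(evaluate(tree, {"a": 1.0, "b": 3.0, "c": 5.0, "d": 1.0}))
```

Symbols left over at the end of a gene are simply not read, which is why strings of equal length can express trees of different sizes and shapes.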

ANNs were first proposed by McCulloch & Pitts (1943), inspired by the human brain. Several types of ANN were subsequently developed (e.g. the multilayer perceptron (MLP), the radial basis function neural network (RBF), the self-organizing network and the fuzzy neural network). The most significant advantages of these networks are their ability to generalize and to learn, their modest information requirements, their short run times and their ease of use (Chang & Chen 2003). The ANN model is composed of simple units called neurons, connected to each other by unidirectional links that carry distinct information (Nagy et al. 2002). Neurons are described mathematically and filter the signals passing through the network (Emamgholizadeh 2012). The MLP is the most common neural network and has been applied successfully to nonlinear problems (Emamgholizadeh et al. 2014). Most neural networks are trained with the back propagation (BP) algorithm proposed by Rumelhart et al. (1988). In the BP algorithm, the network processes the information in processing elements (e.g. neurons, units or nodes). The MLP/BP structure was used in the present study to train the ANN.

The ANFIS was introduced by Jang (1993). This approach combines the ANN with fuzzy logic, so that the learning capability of neural networks is integrated into a fuzzy inference system (FIS) (Emamgholizadeh et al. 2014). Three types of FIS are in common use, distinguished by the form of their if-then inference rules: Tsukamoto's system, Mamdani's system and Sugeno's system (Kişi 2007). The first-order Sugeno system was applied in the present study.
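
To make the idea of a first-order Sugeno system concrete, the sketch below evaluates a two-rule system with a single input. It is a hand-written illustration under stated assumptions: the generalized bell membership function is used in its usual three-parameter form, and the rule parameters, the rule labels and the function names `gbell` and `sugeno_first_order` are hypothetical, not values fitted in this study.

```python
def gbell(x, a, b, c):
    """Generalized bell membership function with width a, shape b and centre c."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))

def sugeno_first_order(x, rules):
    """Weighted average of the linear rule outputs p*x + r (first-order Sugeno)."""
    weights = [gbell(x, *rule["mf"]) for rule in rules]        # firing strengths
    outputs = [rule["p"] * x + rule["r"] for rule in rules]    # linear consequents
    return sum(w * y for w, y in zip(weights, outputs)) / sum(weights)

# Two hypothetical rules: "if discharge is low" and "if discharge is high".
rules = [
    {"mf": (2.0, 2.0, 0.0), "p": 1.0, "r": 0.0},
    {"mf": (2.0, 2.0, 10.0), "p": 3.0, "r": 5.0},
]

# Halfway between the two centres both rules fire equally, so the result
# is the plain average of the two linear outputs.
print(sugeno_first_order(5.0, rules))
```

In an ANFIS, the membership-function parameters and the linear coefficients are the quantities adjusted during training; here they are fixed by hand.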

Model developments (selection of input vectors)

One of the most important steps in model development is identifying suitable input variables. Methods such as trial and error and correlation analysis can be used for this purpose. However, trial and error is time-consuming, and correlation analysis alone does not give the exact lag values. Therefore, in this study, the auto-correlation function (ACF), partial auto-correlation function (PACF) and cross-correlation function (CCF) were used to find the significant lag values of the input variables. The mathematical definitions of these functions and their details can be found in Salas et al. (1980) and Senthil Kumar et al. (2011). To find the best combination of input vectors, a collection of time-lagged Qd (Qd-1, Qd-2, … , Qd-n) and Qs (Qs-1, Qs-2, … , Qs-n) values was considered. The possible scenarios for input combinations are presented in Table 3.

Table 3

Scenarios for input combinations

Scenario    Input parameter(s)               Output
1           Qd                               Qs
2           Qs-1                             Qs
3           Qd, Qd-1                         Qs
4           Qd, Qs-1                         Qs
5           Qs-1, Qs-2                       Qs
6           Qd, Qs-1, Qs-2                   Qs
7           Qd, Qd-1, Qd-2                   Qs
8           Qd, Qd-1, Qd-2, Qs-1             Qs
9           Qd, Qd-1, Qd-2, Qs-1, Qs-2       Qs
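
The input scenarios in Table 3 amount to pairing each day's Qs with same-day and lagged values of Qd and Qs. The following Python sketch shows one way such input/target rows could be assembled; the function name `lagged_rows` and the short series are illustrative assumptions and not the data or code used in the study.

```python
def lagged_rows(qd, qs, qd_lags, qs_lags):
    """Return (inputs, target) pairs for the chosen lags.

    A lag of 0 for qd means the same-day discharge; qs lags must be >= 1
    because the same-day sediment discharge is the target.
    """
    start = max([0, *qd_lags, *qs_lags])   # first day with all lags available
    rows = []
    for t in range(start, len(qs)):
        inputs = [qd[t - k] for k in qd_lags] + [qs[t - k] for k in qs_lags]
        rows.append((inputs, qs[t]))
    return rows

# Made-up daily series, for illustration only.
qd = [1.0, 2.0, 3.0, 4.0, 5.0]
qs = [10.0, 20.0, 30.0, 40.0, 50.0]

# Scenario 6 of Table 3: inputs Qd, Qs-1, Qs-2.
for inputs, target in lagged_rows(qd, qs, qd_lags=(0,), qs_lags=(1, 2)):
    print(inputs, target)
```

The first `max(lags)` days are dropped because their lagged values are not available.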

Modeling performance criteria

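To evaluate the performance of the GEP, ANN and ANFIS models, two statistical parameters were used, namely the coefficient of determination (R²) and the mean absolute error (MAE):

$$R^2 = \left[ \frac{\sum_{i=1}^{N} (O_i - \bar{O})(P_i - \bar{P})}{\sqrt{\sum_{i=1}^{N} (O_i - \bar{O})^2 \sum_{i=1}^{N} (P_i - \bar{P})^2}} \right]^2 \qquad (1)$$

$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| O_i - P_i \right| \qquad (2)$$

where $N$ is the number of data points, $O_i$ is the observed value, $P_i$ is the predicted value and the bar denotes the mean of the variable.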
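
As a quick illustration of the two criteria, the following Python sketch computes the MAE and R² for a small made-up series. It assumes that R² is the squared Pearson correlation between the observed and predicted series, which is one common definition; the function names and the sample values are illustrative only.

```python
def mae(observed, predicted):
    """Mean absolute error, in the units of the data (here ton/day)."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n

def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    o_bar = sum(observed) / n
    p_bar = sum(predicted) / n
    cov = sum((o - o_bar) * (p - p_bar) for o, p in zip(observed, predicted))
    var_o = sum((o - o_bar) ** 2 for o in observed)
    var_p = sum((p - p_bar) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)

# Made-up observed and predicted sediment discharges (ton/day).
obs = [10.0, 20.0, 30.0, 40.0]
pred = [12.0, 18.0, 33.0, 37.0]

print("MAE:", mae(obs, pred))
print("R2:", r_squared(obs, pred))
```

Under this definition R² measures how well the predictions follow the pattern of the observations, while the MAE measures their average distance in ton/day, so a model can score well on one criterion and poorly on the other.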

RESULTS AND DISCUSSION

Input vector selection

This paper uses the daily discharge and daily sediment discharge at two hydrometric stations on the Kasilian and Telar rivers. The whole data set covers 50 years (1964–2014), and was divided into two parts: the training set of 35 years (1964–1999), and the testing set of 15 years (1999–2014). Figure 2 shows the scatter plot between the daily flow discharge (Qd) and daily suspended sediment discharge (Qs) for these two stations.

Figure 2

The scatter plots of log Qd and log Qs for (a) the Kasilian station and (b) the Telar station.

The ACF and PACF of the daily suspended sediment discharge (Qs) for the Kasilian and Telar rivers are presented in Figure 3(a), 3(b), 3(d) and 3(e). The CCF between the daily suspended sediment discharge (Qs) and the daily flow discharge (Qd) is given in Figure 3(c) and 3(f). For the Kasilian River, the auto-correlation and partial auto-correlation coefficients of the suspended sediment discharge were less than 0.135 for all lags except lag 1. For this river, the cross-correlation coefficient between the suspended sediment discharge and the flow discharge was 0.693 at lag 0, higher than at any other lag. For the Telar River, the ACF and PACF of the daily suspended sediment discharge (Qs) for lag 0 was 0.208, and for other lags they were below the confidence limits. The cross-correlation coefficients between the suspended sediment discharge and the flow discharge for lags 0, 1 and 2 were 0.598, 0.270 and 0.182, respectively. Based on the calculated PACF and CCF values of the data series, the following input vectors (Equations (3) and (4)) were selected for training the intelligence models for the Kasilian and Telar rivers, respectively:
formula
(3)
 
formula
(4)
Figure 3

(a) and (d) Autocorrelation of the suspended sediment load for the Kasilian and Telar rivers, respectively; (b) and (e) partial autocorrelation of the suspended sediment load for the Kasilian and Telar rivers; (c) and (f) cross-correlation between suspended sediment load and discharge for the Kasilian and Telar rivers.

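
The sample auto- and cross-correlation coefficients used for input selection can be sketched in a few lines of Python. This is a minimal illustration using the standard sample estimators; the function names and the short series are assumptions, and the PACF, which requires an additional recursion, is not computed here.

```python
def cross_correlation(x, y, lag):
    """Sample correlation between x[t - lag] and y[t] for a lag >= 0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    num = sum((x[t - lag] - x_bar) * (y[t] - y_bar) for t in range(lag, n))
    den = (sum((v - x_bar) ** 2 for v in x) * sum((v - y_bar) ** 2 for v in y)) ** 0.5
    return num / den

def autocorrelation(x, lag):
    """Sample autocorrelation: the cross-correlation of a series with itself."""
    return cross_correlation(x, x, lag)

# Made-up daily series, for illustration only.
qd = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
qs = [2.0, 7.0, 4.0, 12.0, 9.0, 15.0]

for lag in range(3):
    print(lag, autocorrelation(qs, lag), cross_correlation(qd, qs, lag))
```

Lags whose coefficients fall outside the confidence limits are the candidates for inclusion in the input vector.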

GEP development

The first step in the GEP development is to choose the fitness function. In the present study, Equation (5) was used as the fitness function, where RRSE is the root relative square error calculated with Equation (6):

$$f_i = 1000 \cdot \frac{1}{1 + \mathrm{RRSE}_i} \qquad (5)$$

$$\mathrm{RRSE}_i = \sqrt{\frac{\sum_{j=1}^{n} (P_{ij} - T_j)^2}{\sum_{j=1}^{n} (T_j - \bar{T})^2}} \qquad (6)$$

where $f_i$ ranges from 0 to 1,000 (1,000 corresponds to a chromosome with ideal fitness). In Equation (6), $P_{ij}$ is the value predicted by the individual chromosome $i$ for fitness case $j$, $T_j$ is the target value for fitness case $j$, $n$ is the number of fitness cases and the bar denotes the average value (Ferreira 2006).
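
The behaviour of this fitness measure can be checked with a short Python sketch. It assumes the form of the RRSE-based fitness given by Ferreira (2006), that is, 1,000 divided by one plus the RRSE; the function names and the sample values are illustrative.

```python
import math

def rrse(predicted, target):
    """Root relative square error of one chromosome's predictions."""
    t_bar = sum(target) / len(target)
    num = sum((p - t) ** 2 for p, t in zip(predicted, target))
    den = sum((t - t_bar) ** 2 for t in target)
    return math.sqrt(num / den)

def fitness(predicted, target):
    """Fitness between 0 and 1,000; 1,000 means a perfect fit."""
    return 1000.0 / (1.0 + rrse(predicted, target))

target = [1.0, 2.0, 3.0]
print(fitness([1.0, 2.0, 3.0], target))   # perfect predictions
print(fitness([2.0, 2.0, 2.0], target))   # always predicting the mean
```

A chromosome that always predicts the mean of the targets has an RRSE of 1 and therefore a fitness of 500, so values above 500 indicate a model that does better than the mean.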

Next, a set of terminals (T) must be selected for generating genes. In the current study, the time-lagged daily flow discharge (Qd) and daily sediment discharge (Qs) were chosen as the terminal set. Arithmetic and other mathematical functions such as +, −, ×, ÷, sin, tan⁻¹, square root and log were used as the function set. The number of genes and the length of the gene head were then selected. The number of genes determines the number of sub-ETs, and one to three genes are recommended for optimizing the GEP model (Ferreira 2001). The head length was selected by trial and error. The results indicated that the GEP performance did not improve significantly when the head length was increased beyond 8 for the Kasilian station and 7 for the Telar station, so head lengths of 8 and 7 were selected for the Kasilian and Telar stations, respectively. The number of chromosomes was set to 30, which gave the best results. The final step was to select the genetic operators and their rates, which are presented in Table 4.

Table 4

Parameters of the GEP model

Parameter    Description of parameter        Setting of parameter
P1           Function set                    +, −, ×, ÷, √x, sin x, cos x, tan⁻¹ x, eˣ, ln
P2           Mutation rate                   0.044
P3           Inversion rate                  0.1
P4           IS rate                         0.1
P5           RIS rate                        0.1
P6           Gene transposition rate         0.1
P7           One point recombination rate    0.3
P8           Two point recombination rate    0.3
P9           Gene recombination rate         0.1

Finally, the linking function had to be selected. In the present study, addition (+) was selected, as in previous studies such as Hashmi & Shamseldin (2014), Azamathulla & Ahmad (2012), Kisi & Shiri (2012) and Emamgholizadeh et al. (2015). The number of generations was set to 50,000 because the results did not change significantly beyond that point; in other words, the fitness function had converged.

The column diagrams of the GEP performance on the data from both stations are presented in Figure 4 to better compare each input combination performance.

Figure 4

The effect of input combinations on the performance of GEP; (a) the Kasilian station and (b) the Telar station.

Figure 5

Expression trees (ETs) of GEP performance; (a) the Kasilian station and (b) the Telar station.

Based on the coefficient of determination (R²), the sixth input combination (Qd, Qs-1, Qs-2) gave the most accurate results for the Kasilian station and the eighth input combination (Qd, Qd-1, Qd-2, Qs-1) for the Telar station (see Figure 4). This finding agrees with the input vectors proposed on the basis of the PACF and CCF, and also with the results of other studies such as Aytek & Kişi (2008) and Guven & Talu (2010). Another important finding was that the second input combination (Qs-1) gave the weakest results at both stations, which confirms the importance of the flow discharge as an input for estimating the suspended sediment load. The results of the optimized input combinations of the GEP are presented in Table 5.

Table 5

GEP performance results

Station     Input combination        Training R²   Training MAE (ton/day)   Testing R²   Testing MAE (ton/day)   Fitness (training)   Fitness (testing)
Kasilian    Qd, Qs-1, Qs-2           0.992         322.76                   0.942        876.3                   925.25               783.2
Telar       Qd, Qd-1, Qd-2, Qs-1     0.967         1666.6                   0.752        1269.7                  847.48               478.1

The expression trees (ETs) of the GEP model for the Kasilian and the Telar stations are presented in Figure 5. By using the corresponding values, the explicit formulations of the GEP for the suspended sediment load (Qs) as a function of flow discharge (Qd) were obtained as shown in Equations (7) and (8):

  • (a)
    for the Kasilian station: 
    formula
    (7)
  • (b)
    for the Telar station: 
    formula
    (8)
It should be noted that these equations are valid for parameters ranging between the maximum and minimum presented in Table 2.

Artificial neural network

The best ANN results for both stations were obtained by training and testing networks with one hidden layer. Bar charts of the coefficient of determination for each transfer function are presented in Figure 6, and the results of the optimized input combinations of the ANN in Table 6. As seen in Figure 6, the best ANN results were obtained with the sigmoid transfer function. Moreover, Figure 6(a) indicates that the hyperbolic secant transfer function, similar to the other transfer functions, is not able to estimate the suspended sediment load of the Kasilian station. The results in Table 6 show that the sixth and ninth combinations of the data set were the best input combinations for the Kasilian and the Telar stations, respectively.

Figure 6

Coefficient of determination variations versus transfer functions; (a) the Kasilian station and (b) the Telar station.

Table 6

ANN development results

Station     Input combination              Training R²   Training MAE (ton/day)   Testing R²   Testing MAE (ton/day)
Kasilian    Qd, Qs-1, Qs-2                 0.971         678.4                    0.926        1385.12
Telar       Qd, Qd-1, Qd-2, Qs-1, Qs-2     0.828         2034.54                  0.610        3023.45

Adaptive neuro fuzzy inference system

The ANFIS model was developed using 100 epochs and a linear function in the output layer. Line plots of the coefficient of determination for each membership function are presented in Figure 7, and the results of the optimized ANFIS model in Table 7. The statistical results of the ANFIS model (R² and MAE) for the training and testing sets showed that the Gbellmf and Trimf membership functions gave the best results for the Kasilian and Telar stations. Moreover, the best input combination for both stations was the ninth, which consists of Qd, Qd-1, Qd-2, Qs-1 and Qs-2. Overall, the ANFIS model learned to map the nonlinear relationship between the input data and the sediment discharge.

Figure 7

Coefficient of determination variations versus different transfer functions; (a) the Kasilian station and (b) the Telar station.

Table 7

ANFIS development results

Station     Input combination              Training R²   Training MAE (ton/day)   Testing R²   Testing MAE (ton/day)
Kasilian    Qd, Qd-1, Qd-2, Qs-1, Qs-2     0.912         1423.4                   0.875        1875.62
Telar       Qd, Qd-1, Qd-2, Qs-1, Qs-2     0.783         2145.34                  0.510        3245.50

Comparison of SRC, GEP, ANN and ANFIS models

To further assess the capability of the GEP, ANN and ANFIS models to estimate Qs, their results were compared with those of the SRC model. The SRC empirically describes the relationship between the suspended sediment discharge (Qs) and the water discharge (Qd) at a given location (Jansson 1996; Syvitski et al. 2000; Horowitz 2003; Morehead et al. 2003; Harrington & Harrington 2013). The most commonly used SRC is a power function, which can be expressed as follows (Zhang et al. 2012):

$$Q_s = a\,Q_d^{\,b} \qquad (9)$$

where $Q_s$ is the suspended sediment discharge and $Q_d$ is the flow discharge. The constants $a$ and $b$ were calculated from the data by linear regression between log Qs and log Qd. Equations (10) and (11) were obtained for the Kasilian and the Telar stations, respectively:
formula
(10)
 
formula
(11)
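
The log-log regression used to obtain a rating curve of this kind can be sketched in Python as follows. The sketch assumes an ordinary least-squares fit of log10(Qs) against log10(Qd), with no bias correction for the back-transformation; the function names `fit_src` and `predict_src` and the synthetic data are illustrative and do not reproduce the coefficients of Equations (10) and (11).

```python
import math

def fit_src(qd, qs):
    """Fit Qs = a * Qd**b by least squares on log10-transformed data."""
    x = [math.log10(v) for v in qd]
    y = [math.log10(v) for v in qs]
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx                       # slope of the log-log line
    a = 10 ** (y_bar - b * x_bar)       # intercept, transformed back
    return a, b

def predict_src(a, b, qd):
    """Sediment discharge predicted by the fitted rating curve."""
    return [a * v ** b for v in qd]

# Synthetic data generated from Qs = 3 * Qd**1.5, for illustration only.
qd = [1.0, 2.0, 4.0, 8.0]
qs = [3.0 * v ** 1.5 for v in qd]

a, b = fit_src(qd, qs)
print(a, b)
print(predict_src(a, b, [16.0]))
```

Because the curve depends on Qd alone, it cannot reproduce the scatter of Qs at a given discharge, which is consistent with its weaker performance relative to the models that also use lagged inputs.
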
The statistical metrics for this method and for all the intelligence models are given in Table 8 for the Kasilian and Telar stations. As seen in this table, the SRC method, with an R² of 0.52 and 0.39 and an MAE of 4805.58 and 6732.80 ton/day for the Kasilian and the Telar stations, respectively, is not an effective tool for estimating the suspended sediment load.
Table 8

Comparison between the results of SRC, GEP, ANN and ANFIS models for the Kasilian and Telar stations

Model    Kasilian R²   Kasilian MAE (ton/day)   Telar R²   Telar MAE (ton/day)
GEP      0.942         876.30                   0.75       1269.7
ANN      0.926         1385.12                  0.61       3023.45
ANFIS    0.875         1875.62                  0.51       3245.50
SRC      0.520         4805.58                  0.39       6732.80

For the Kasilian station, the results in Table 8 show that the GEP, ANN and ANFIS models all outperformed the SRC approach, and that the GEP estimates were more accurate than those of both the ANN and ANFIS models. The R² of the GEP model was approximately 81.2%, 1.7% and 7.7% higher than that of the SRC equation, ANN and ANFIS models, respectively.

Similarly, for the Telar station, the findings in Table 8 illustrate that the GEP model is much more accurate than the SRC equation, ANN and ANFIS models. According to these results, the R² of the GEP model was approximately 92.8%, 23.3% and 47.5% higher than that of the SRC, ANN and ANFIS models, respectively. The scatter plots of the GEP performance for both stations are presented in Figure 8.

In addition, daily data measured from 2014 to 2016 were used to validate the GEP model. Its performance was assessed using two statistical parameters, the coefficient of determination and the MAE. In the validation phase, the MAE was 954.6 ton/day for the Kasilian River and 1658.6 ton/day for the Telar River, and the R² was 0.935 and 0.676, respectively. The accuracy of the model for both rivers was therefore lower in the validation stage than in the testing stage. However, for water resources planning and management, the efficiency of the GEP model in estimating the suspended sediment discharge (Qs) was fairly acceptable.

Comparing the results for the Kasilian and Telar rivers, which have catchment areas of different sizes, shows that all the models were less capable for the Telar sub-basin than for the Kasilian sub-basin. In other words, as the size of the catchment area increased, the discharge and sediment discharge of the river increased and the capability of all the models to estimate the sediment discharge decreased. It should also be noted that the suspended sediment discharge of the rivers is extremely nonlinear, and as a result the models may not fully capture this nonlinear functional relationship.

Figure 8

The scatter plots of the GEP performance for the Kasilian station (above) and the Telar station (bottom).

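
The relative reduction in MAE achieved by the GEP model can be worked out directly from the testing values in Table 8. The short Python sketch below does this arithmetic; the dictionary layout and the function name `mae_reduction` are illustrative choices.

```python
# Testing-stage MAE values (ton/day) as listed in Table 8.
table8_mae = {
    "Kasilian": {"GEP": 876.30, "ANN": 1385.12, "ANFIS": 1875.62, "SRC": 4805.58},
    "Telar": {"GEP": 1269.7, "ANN": 3023.45, "ANFIS": 3245.50, "SRC": 6732.80},
}

def mae_reduction(gep_mae, other_mae):
    """Percentage by which the GEP lowers the MAE relative to another model."""
    return 100.0 * (other_mae - gep_mae) / other_mae

for station, values in table8_mae.items():
    for model in ("ANN", "ANFIS", "SRC"):
        reduction = mae_reduction(values["GEP"], values[model])
        print(f"{station}: GEP vs {model}: {reduction:.1f}% lower MAE")
```

For the Kasilian station this gives reductions of about 36.7%, 53.3% and 81.8% relative to the ANN, ANFIS and SRC models, and for the Telar station about 58.0%, 60.9% and 81.1%.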

SENSITIVITY ANALYSIS

Sensitivity tests were conducted to determine the relative significance of each input variable on the suspended sediment discharge (Qs). The GEP model was chosen as the best model to estimate Qs, and the importance of the input data variable to this model was also investigated.

Table 9 shows the statistical indices of the GEP models run without a specific input variable, along with those of the best GEP model. For the Kasilian River, the GEP model without Qd has the highest MAE and the lowest R². In other words, the ability of the GEP model to estimate the suspended sediment discharge (Qs) was significantly degraded when the model was run without Qd, which shows that Qd has the most significant impact on the suspended sediment discharge. Overall, the effect of the input variables on the suspended sediment discharge for the Kasilian River can be ranked from highest to lowest as Qd, Qs-1 and Qs-2.

The same sensitivity tests were carried out for the Telar River. As the results in Table 9 show, the ability of the GEP model to estimate the suspended sediment discharge decreased significantly without Qd (R² = 0.145, MAE = 15470.5 ton/day). Compared to the best model, the MAE increased roughly twelvefold (from 1269.7 to 15470.5 ton/day) when the GEP was run without Qd. The results also showed that the ability of the model was reduced when Qd-1 and Qd-2 were eliminated from its input variables. Overall, for both the Kasilian and Telar rivers, Qd had a more significant impact on the performance of the GEP model in estimating the suspended sediment discharge (Qs) than the other variables such as Qd-1, Qd-2 and Qs-1.

Table 9

Sensitivity analysis of the governing variables on suspended sediment discharge (Qs)

| Kasilian: Method | MAE (ton/day) | R2 | Telar: Method | MAE (ton/day) | R2 |
|---|---|---|---|---|---|
| The best GEP | 876.3 | 0.942 | The best GEP | 1269.7 | 0.752 |
| GEP without Qd | 12128.8 | 0.139 | GEP without Qd | 15470.5 | 0.145 |
| GEP without Qs-1 | 9928.1 | 0.297 | GEP without Qd-1 | 12809.6 | 0.221 |
| GEP without Qs-2 | 2010.4 | 0.868 | GEP without Qd-2 | 2962.4 | 0.587 |
| – | – | – | GEP without Qs-1 | 2895.6 | 0.598 |
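The ranking of input variables can be reproduced from the Table 9 values with a short sketch. It assumes that the influence of an input is measured by the ratio of the MAE obtained without that input to the MAE of the best GEP model; this ratio and the names `best`, `without` and `rank_inputs` are illustrative choices, not part of the original analysis.

```python
# Table 9 values: (MAE in ton/day, R2) of the best GEP model per station.
best = {"Kasilian": (876.3, 0.942), "Telar": (1269.7, 0.752)}

# Table 9 values: (MAE in ton/day, R2) of the GEP model with one input removed.
without = {
    "Kasilian": {
        "Qd": (12128.8, 0.139),
        "Qs-1": (9928.1, 0.297),
        "Qs-2": (2010.4, 0.868),
    },
    "Telar": {
        "Qd": (15470.5, 0.145),
        "Qd-1": (12809.6, 0.221),
        "Qd-2": (2962.4, 0.587),
        "Qs-1": (2895.6, 0.598),
    },
}


def rank_inputs(station):
    """Sort inputs from most to least influential by MAE ratio after removal."""
    base_mae, _ = best[station]
    ratios = {name: mae / base_mae for name, (mae, _) in without[station].items()}
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)


for station in best:
    print(station)
    for name, ratio in rank_inputs(station):
        print(f"  without {name}: MAE is {ratio:.2f} times the best model's MAE")
```

Under this measure, removing Qd causes the largest degradation at both stations, followed by the first lagged input (Qs-1 for Kasilian, Qd-1 for Telar).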

CONCLUSION

In this paper, the GEP, ANN and ANFIS models were developed to estimate the suspended sediment load of the Telar and Kasilian Rivers located in the north-east of Iran. The results showed that using time-lagged daily flow discharge (Qd) and sediment discharge (Qs) as input combinations increased the accuracy of the intelligence models. Furthermore, the GEP model provided much more accurate results than the ANN and ANFIS models. For the Kasilian station, the suspended sediment load (Qs) estimated by the GEP, ANN and ANFIS models had an MAE of 876.30 ton/day, 1390.02 ton/day and 1875.62 ton/day, respectively; the corresponding R2 values were 0.942, 0.921 and 0.875. Similarly, for the Telar station, the suspended sediment load (Qs) estimated by the GEP, ANN and ANFIS models had an MAE of 1269.7 ton/day, 3023.45 ton/day and 3245.5 ton/day, respectively; the corresponding R2 values were 0.752, 0.610 and 0.510.

The results of the GEP, ANN and ANFIS models were also compared with those of the SRC equation. The findings showed that all three intelligence models were much more accurate than the SRC method in estimating the suspended sediment load of the Telar and Kasilian Rivers. With the GEP model, the MAE decreased by approximately 36.7%, 53.3% and 81.8% for the Kasilian station and 58.0%, 60.9% and 81.1% for the Telar station compared to the ANN, ANFIS and SRC models, respectively. Overall, the most important finding of this study is that the GEP, ANN and ANFIS models are effective and reliable approaches for estimating the suspended sediment load of rivers.
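The relative MAE reductions quoted above can be checked with a small calculation. The sketch below assumes the reduction is defined as the difference between the reference model's MAE and the GEP MAE, divided by the reference model's MAE; the function name `mae_reduction_percent` is hypothetical. Only the ANN and ANFIS comparisons are computed, because the MAE of the SRC method is not quoted in this section.

```python
# MAE values (ton/day) quoted in the conclusion for each station and model.
mae = {
    "Kasilian": {"GEP": 876.30, "ANN": 1390.02, "ANFIS": 1875.62},
    "Telar": {"GEP": 1269.7, "ANN": 3023.45, "ANFIS": 3245.5},
}


def mae_reduction_percent(reference_mae, model_mae):
    """Percentage by which model_mae is lower than reference_mae."""
    if reference_mae <= 0:
        raise ValueError("reference MAE must be positive")
    return 100.0 * (reference_mae - model_mae) / reference_mae


# Reduction achieved by GEP relative to each of the other two models.
reductions = {
    station: {
        other: mae_reduction_percent(values[other], values["GEP"])
        for other in ("ANN", "ANFIS")
    }
    for station, values in mae.items()
}

for station, row in reductions.items():
    for other, pct in row.items():
        print(f"{station}: GEP MAE is {pct:.1f}% lower than {other}")
```

Under this definition the computed reductions agree with the reported percentages to within a few tenths of a percentage point.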

REFERENCES

Aytek, A. & Kişi, Ö. 2008 A genetic programming approach to suspended sediment modelling. Journal of Hydrology 351 (3), 288–298.
Azamathulla, H. M. & Ahmad, Z. 2012 Gene-expression programming for transverse mixing coefficient. Journal of Hydrology 434, 142–148.
Bahramifar, A., Shirkhani, R. & Mohammadi, M. 2013 An ANFIS-based approach for predicting the Manning roughness coefficient in alluvial channels at the bank-full stage. IJE Trans. B Appl. 26 (2), 177–186.
Chang, F.-J. & Chen, Y.-C. 2003 Estuary water-stage forecasting by using radial basis function neural network. Journal of Hydrology 270 (1), 158–166.
Emamgholizadeh, S. 2012 Neural network modeling of scour cone geometry around outlet in the pressure flushing. Glob. Nest. J. 14, 540–549.
Emamgholizadeh, S., Kashi, H., Marofpoor, I. & Zalaghi, E. 2014 Prediction of water quality parameters of Karoon River (Iran) by artificial intelligence-based models. International Journal of Environmental Science and Technology 11 (3), 645–656.
Emamgolizadeh, S., Bateni, S., Shahsavani, D., Ashrafi, T. & Ghorbani, H. 2015 Estimation of soil cation exchange capacity using genetic expression programming (GEP) and multivariate adaptive regression splines (MARS). Journal of Hydrology 529, 1590–1600.
Emamgholizadeh, S., Bahman, K., Bateni, S. M., Ghorbani, H., Marofpoor, I. & Nielson, J. R. 2017 Estimation of soil dispersivity using soft computing approaches. Neural Computing and Applications 28 (1), 207–216.
Ferreira, C. 2001 Gene expression programming: a new adaptive algorithm for solving problems. Complex Systems 13 (2), 87–129.
Ferreira, C. 2006 Gene Expression Programming: Mathematical Modeling by an Artificial Intelligence. Springer-Verlag, Berlin & Heidelberg, Germany.
Ghorbani, H., Kashi, H., Hafezi Moghadas, N. & Emamgholizadeh, S. 2015 Estimation of soil cation exchange capacity using multiple regression, artificial neural networks, and adaptive neuro-fuzzy inference system models in Golestan Province, Iran. Communications in Soil Science and Plant Analysis 46 (6), 763–780.
Haddadchi, A., Movahedi, N., Vahidi, E., Omid, M. H. & Dehghani, A. A. 2013 Evaluation of suspended load transport rate using transport formulas and artificial neural network models (Case study: Chelchay Catchment). Journal of Hydrodynamics, Ser. B 25 (3), 459–470.
Hashmi, M. Z. & Shamseldin, A. Y. 2014 Use of gene expression programming in regionalization of flow duration curve. Advances in Water Resources 68, 1–12.
Jang, J.-S. 1993 ANFIS: adaptive-network-based fuzzy inference system. IEEE Transactions on Systems, Man, and Cybernetics 23 (3), 665–685.
Kashi, H., Emamgholizadeh, S. & Ghorbani, H. 2014 Estimation of soil infiltration and cation exchange capacity based on multiple regression, ANN (RBF, MLP), and ANFIS models. Communications in Soil Science and Plant Analysis 45 (9), 1195–1213.
Kişi, Ö. 2007 Streamflow forecasting using different artificial neural network algorithms. Journal of Hydrologic Engineering 12 (5), 532–539.
McCulloch, W. S. & Pitts, W. 1943 A logical calculus of the ideas immanent in nervous activity. The Bulletin of Mathematical Biophysics 5 (4), 115–133.
Morehead, M. D., Syvitski, J. P., Hutton, E. W. & Peckham, S. D. 2003 Modeling the temporal variability in the flux of sediment from ungauged river basins. Global and Planetary Change 39 (1), 95–110.
Nagy, H., Watanabe, K. & Hirano, M. 2002 Prediction of sediment load concentration in rivers using artificial neural network model. Journal of Hydraulic Engineering 128 (6), 588–595.
Rumelhart, D. E., Hinton, G. E. & Williams, R. J. 1988 Learning representations by back-propagating errors. Cognitive Modeling 5 (3), 1.
Salas, J. D., Delleur, J. W., Yevjevich, V. & Lane, W. L. 1980 Applied Modelling of Hydrologic Time Series. Water Resources Publications, Littleton.
Senthil Kumar, A., Ojha, C., Goyal, M. K., Singh, R. & Swamee, P. 2011 Modeling of suspended sediment concentration at Kasol in India using ANN, fuzzy logic, and decision tree algorithms. Journal of Hydrologic Engineering 17 (3), 394–404.
Syvitski, J. P., Morehead, M. D., Bahr, D. B. & Mulder, T. 2000 Estimating fluvial sediment transport: the rating parameters. Water Resources Research 36 (9), 2747–2760.
Zhang, W., Wei, X., Jinhai, Z., Yuliang, Z. & Zhang, Y. 2012 Estimating suspended sediment loads in the Pearl River Delta region using sediment rating curves. Continental Shelf Research 38, 35–46.