Abstract
Prediction of suspended sediment concentrations (SSC) in arid and semi-arid areas has attracted increasing interest in recent years because of its primary role in water resources planning and management. Today, given their simplicity and reliability, artificial neural networks (ANNs) and adaptive neuro-fuzzy inference systems (ANFIS) are the most developed and widely used methods for SSC modeling. The main aim of this study is to model suspended sediment concentrations using ANN and ANFIS methods in the five largest basins in eastern Algeria: the Constantinois Coastal, Highlands, Kébir-Rhumel, Seybouse, and Soummam basins, which are characterized by high water erosion and a lack of SSC measurements. The models were applied to historical time series, with liquid flows Ql and solid flows Qs as inputs and daily SSC as output, for the 14 hydrometric stations controlling the entire area. The best models were achieved using a multi-layer perceptron (MLP) feed-forward network (FFN) trained with the Levenberg-Marquardt (LM) algorithm for ANN modeling and a first-order Takagi-Sugeno-Kang (TSK) FFN with a hybrid learning method for ANFIS modeling. The reliability of the created models was evaluated using five validation criteria: determination coefficient R2, Nash-Sutcliffe coefficient NSE, mean square error MSE, root-mean-square error RMSE, and mean absolute error MAE. The ANN and ANFIS models showed high accuracy, confirmed by R2 values ranging from 0.77 to 0.98 and NSE values ranging from 0.67 to 0.97. The errors were low: the MAE varied from 0.004 g/L to 0.028 g/L for both models. The comparison revealed that the ANN models slightly outperformed the ANFIS models, although both achieved high accuracy in SSC prediction.
HIGHLIGHTS
Eastern Algerian basins are prone to water erosion.
The ANN and ANFIS models are new mathematical tools capable of SSC prediction.
Historical time series of liquid flow (Ql), solid flow (Qs), and SSC are used for ANN and ANFIS modeling.
Both the ANN and ANFIS models showed high prediction accuracy.
INTRODUCTION
Water erosion risk is a serious problem in Algeria. This damaging phenomenon affects about 45% of the north of the country and occurs particularly on degraded sloping soils in mountainous areas (Arar & Chenchouni 2014). The importance and severity of this process are apparent in virtually all Algerian watersheds. Applying different water erosion simulation models in watersheds allows researchers to estimate water and sediment yields and to identify vulnerable points within the watershed (De Vente & Poesen 2005). Imprecise suspended sediment load (SSL) modeling and prediction can reduce the amount of water stored by dam reservoirs, which can have an enormous negative impact on domestic and agricultural water supply, and also on dam structures (Zhang et al. 2020). SSL can be measured as the suspended sediment concentration (SSC), which is expressed as the ratio of the sediment mass in a water-sediment mixture to the volume of that mixture (Mustafa et al. 2012). Models based on computational intelligence are another means of modeling nonlinear systems; hence, they are an appropriate alternative to statistical models (Moeeni & Bonakdari 2018). Some studies have predicted SSL at the daily scale using data-driven methods such as machine learning algorithms and soft computing models (Nourani & Andalib 2015; Choubin et al. 2018; Kaveh et al. 2021). Other studies worldwide seeking to enhance the precision of SSL estimation have used machine learning techniques such as the adaptive neuro-fuzzy inference system (ANFIS) (Azamathulla et al. 2012; Kisi & Ay 2012; Choubin et al. 2018), artificial neural network (ANN) (Kisi & Ay 2012; Nourani & Andalib 2015; Wang et al. 2018; Liu et al. 2019), support vector machine (SVM) (Kisi & Ay 2012; Choubin et al. 2018), and multilayer perceptron (MLP) (Gholami et al. 2016; Romano et al. 2018).
ANFIS and ANN models are widely used for predicting hydrological variables given their high potential, high accuracy, and ease of learning for modelers. Furthermore, the extensive capability of soft computing models in other engineering fields motivates their use in this study (Darabi et al. 2021).
Lafdani et al. (2013) investigated the abilities of SVM and ANN models to predict daily SSL in the Doiraj River in the west of Iran. Stream flow discharge and rainfall were the inputs of their model, and SSL was its output. They concluded that the combination of SVM and ANN is a suitable tool for the prediction of SSL. Ebtehaj & Bonakdari (2013) found that the ANN model outperformed the SVM and GP models in forecasting sediment movement by applying the assumptions of self-filtering sewer models. Afan et al. (2015) estimated the daily sediment load using two different ANN algorithms, the feed-forward neural network (FFNN) and the radial basis function (RBF). They showed that ANN algorithms gave an accurate model for predicting daily SSL. In addition, information about the sediment structure obtained by ANN (e.g., concentration of sediment, relationship between sediment and discharge) is sometimes impossible to capture with conventional models.
Furthermore, the ANN approach is capable of long-term modeling (for instance, monthly) of suspended sediment flux with fairly good accuracy. Compared with conventional regression models (e.g., multiple linear regression), ANN can generate a promising fit under similar conditions. This advantage is especially pronounced in predictions of extremely high or low values (Adib & Mahmoodi 2017). Idrees et al. (2021) applied six ML models for daily SSL inflow prediction at Sangju Weir, South Korea, including ANNs, ANFIS, radial basis function neural network (RBFNN), SVM, genetic programming (GP), and deep learning (DL). The ANN model outperformed the other models with correlation coefficient (R2) = 0.821, mean absolute error (MAE) = 4.244 tons/day, percent bias (PBIAS) = 0.055, WI = 0.891, Nash-Sutcliffe efficiency (NSE) = 0.991, root mean square error (RMSE) = 11.692 tons/day, and PCC = 0.826. The models were ranked based on their SSL prediction capabilities as ANN > ANFIS > DL > RBFNN > SVM > GP from best to worst. Hanoon et al. (2022) applied four machine learning techniques, namely gradient boost regression (GBT), random forest (RF), SVM, and ANN, to predict SSL at the Rantau Panjang station on the Johor River basin, Malaysia. RMSE and NSE were used as evaluation criteria. The ANN model showed more reliable results than the other models, with R2 of 0.989, RMSE of 0.011053, and NSE of 0.979. Therefore, the proposed ANN approach was recommended as the most accurate model for SSL prediction.
In spite of the flexibility of ANN in modeling hydrologic time series, it may sometimes be difficult to train an ANN when signal fluctuations are highly non-stationary and the physical hydrologic process operates over a large range of scales, varying from one day to several decades. In such an uncertain situation, the fuzzy inference system (FIS) may be applied to estimate uncertainties in real situations. Developing hybrids of an ANN and a FIS, which can make use of the advantages of both, is a current research focus; these hybrids are called neuro-fuzzy (NF) systems. NF systems are capable of capturing the benefits of both techniques in a single framework and have been applied to a number of problems in water resources and environmental engineering, including ecological status simulation in surface waters (Ocampo-Duque et al. 2007). Najafzadeh et al. (2016) indicated that the ANFIS model performs better than the SVM. Nourani & Andalib (2015) applied different artificial intelligence models, namely SVM, ANFIS, and feed-forward neural network (FFNN), together with one conventional multilinear regression (MLR) model, for modeling SSL in the Katar catchment, Ethiopia. The ANFIS model provided higher efficiency than the other single models developed. Out of all the ensemble models developed, the nonlinear ANFIS model combination method was found to be the most accurate and could increase the efficiency of the SVM, MLR, ANFIS, and FFNN models by 19.02%, 37%, 9.73%, and 16.3%, respectively, at the verification stage. In general, the literature review showed that ANN and ANFIS models are applied as effective approaches to handle nonlinear and noisy data, especially in situations where the relations among physical processes are not fully understood.
The aim of this research is to investigate the efficiency of the ANN model trained with the Levenberg-Marquardt (LM) algorithm and the ANFIS model trained with the hybrid learning algorithm for predicting daily SSC in the five largest Algerian basins located in the north-east of Algeria: Constantine Coastal, Highlands, Kebir-Rhumel, Seybouse, and Soummam. Each basin was considered a homogeneous area, or a similar unit, from a geographical and hydrological point of view. The ANN and ANFIS models were developed using liquid flow Ql, solid flow Qs, and SSC time series provided by the national water resources agency ANRH (Algiers) for the 14 hydrometric stations controlling the study area. The models' adequacy was evaluated by comparing the observed SSC to the predicted SSC using five quantitative statistical metrics: R2, NSE, MSE, RMSE, and MAE. The ANN and ANFIS models are compared in order to identify the best-performing model for daily SSC prediction in each basin.
METHODS
Study area
The study area consists of the five largest basins located in the north-eastern part of Algeria: Constantine Coast 03, Highlands 07, Kebir-Rhumel 10, Seybouse 14, and Soummam 15. The Mediterranean Sea borders this hydrographic region to the north, Tunisia to the east, the Sahara hydrographic basin to the south, and the Algerois-Hodna-Soummam hydrographic basin to the west. It has a total area of 45,531 km2.
The eastern part of Algeria offers two distinct water systems: exorheic flow basins in the north, where the mountainous nature of the Tell and abundant rainfall allow rivers an outlet toward the Mediterranean Sea, and endorheic flow basins in the south, linked to the bowl-shaped topography and the dominant semi-aridity (Mebarki 2005) (Figure 1). The Constantine Coastal is made up of three basins: the West, Center, and East Constantine Coastal. Constantine Coastal streams carry very abundant flows and contrast sharply with endorheic basin rivers (Mebarki 2005). The Bounamoussa and El Kebir-East rivers are the main water sources in the Eastern Constantine Coast. Their confluence flows into the Mafragh River (Zenati & Messadi 2014). The Highlands Basin contains a branched and less dense hydrographic stream system of endorheic type, drained by two large rivers named Chemora-Boulefreis.
The Kebir-Rhumel watercourse is the junction of two major watercourses, the Rhumel watercourse and Endja Wadi, whose confluence gives birth to Kebir-Wadi. The El-Kébir river runs into the Mediterranean Sea (Tamrabet et al. 2019). The Bouhamdene and Cherf rivers are the main tributaries of the Seybouse River, which stretches over 150 kilometers (Guidoum 2017). The Soummam Basin covers a mountainous region in which the Sahel and Boussellem rivers are the main drainage systems of the Soummam watershed (Zenati & Messadi 2014).
Washout is the most serious form of erosion in Algerian basins, accounting for more than 50% of the yearly sediment transport. This most prevalent form is observed in cultivated soils along the Constantine coast, which is especially vulnerable to erosion during the autumn and winter seasons. The annual average specific erosion in this basin ranges from 252 to 10,375 t/km2. The Kébir Rhumel Basin belongs to the Tell area, where erosion is driven by mass movements and washouts that alternate on slopes. Streams evacuate the largest quantities of sediment during the wet season of the year (Zaghouane et al. 2006). According to Tamrabet et al. (2019), the annual average specific erosion in this basin is predicted to be 1030.05 t/km2/year.
In the Highlands Basin, conventional cereal production techniques are blamed for water erosion (Zaghouane et al. 2006). The erosion rate varies between 794 and 2,621 t/km2 in this basin. The river draining the Seybouse Basin carries between 70 and 86% of the annual tonnage during the fall (Zaghouane et al. 2006), with an average loss of 742 t/km2. The rate of sediment load in the Soummam Basin is about 535 t/km2/year. Sheet erosion reaches 41% of the annual yield; washout erosion comes second with 33% of the annual yield. The northern part of the basin is much more affected by mass movement erosion (Iskounen & Bougherara 2019).
Data and software used
The data series consist of liquid flows Ql (m3/s), solid flows Qs (kg/s), and SSC (g/l), measured by the ANRH (National Water Resources Agency) services (Figure 2; Table 1). The ANRH's technique for measuring the SSC begins by sampling a single point at the edge or in the middle of the bed using plastic bottles. During a flood, samples are taken at time intervals that vary depending on the river regime (Marouf & Remini 2011).
Basin | Code | Station | X (km) | Y (km) | Z (km) |
---|---|---|---|---|---|
Constantine Coastal Basin | 030310 | El Mkaceb | 773.52 | 393.11 | 167 |
Constantine Coastal Basin | 030334 | Chedia | 779.75 | 386.70 | 437 |
Constantine Coastal Basin | 030901 | Khemakhem | 878.95 | 370.30 | 628 |
Constantine Coastal Basin | 031101 | Ain Charchar | 909.60 | 393.15 | 278 |
Highlands Basin | 070501 | Chemora | 850.50 | 260.25 | 942 |
Highlands Basin | 070702 | Foum El Gueiss | 885.20 | 247.30 | 1305 |
Kébir-Rhumel Basin | 100109 | Tassaadane | 785.70 | 359.40 | 955 |
Kébir-Rhumel Basin | 100301 | Athmania | 821.99 | 332.07 | *** |
Kébir-Rhumel Basin | 100601 | Grarem | 821.55 | 363.35 | 772 |
Kébir-Rhumel Basin | 100701 | El Ancer | 807.60 | 395.45 | *** |
Kébir-Rhumel Basin | 100702 | El Milia | 819.05 | 391.90 | 386 |
Seybouse Basin | 140301 | Medjez Amar | 912.30 | 358.75 | 785 |
Soummam Basin | 150601 | Farmatou | 742.02 | 329.40 | 1,205 |
Soummam Basin | 150702 | Magraouna | 713.65 | 333 | 1,000 |
Daily SSC data from the five investigated basins (Figures 3–7) reveal large SSC fluctuations from one basin to another, with values reaching around 500 g/l for the Kébir Rhumel Basin. The observed SSC values are used for ANN and ANFIS modeling. The SSC time series graph is a useful tool for describing SSC fluctuations across the entire study region.
Data normalization
MODEL DESCRIPTIONS
FFN
The MLP network is a model with one or more hidden layers that maps sets of inputs to a set of suitable outputs (Choubin et al. 2018). Training algorithms are used to search for the optimum values of the connection weights. Classical training algorithms such as back propagation and gradient descent are widely applied to calibrate the MLP parameters (Darabi et al. 2021). FFNNs have found application in a variety of learning models. One reason is that these networks can use input samples directly to approximate complex nonlinear mappings. Another is that FFNNs can successfully construct models for a vast category of events for which traditional methods cannot perform the learning process efficiently (Saberi-Movahed et al. 2020). A typical FFNN architecture with two hidden layers is shown in Figure 8.
The ANN architecture consists of an MLP with three layers. The input layer receives the input data, Ql and Qs. The hidden layer uses the tansig activation function, and the output layer, which provides the simulated SSC, uses a linear activation function. Two-thirds of the available database (Ql, Qs, and SSC) is reserved for ANN model creation. The remaining third is reserved for the verification of the created model. To avoid overfitting, the two-thirds used for model creation are divided into three subsets using software (MATLAB 2017): the first subset, which includes 70% of the data, is intended for the learning phase (training), 15% is used for the validation phase, and the remaining 15% is reserved for testing the resulting model.
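The forward computation of such a network can be illustrated with a minimal NumPy sketch. It assumes a 2-h-1 layout (two inputs, one tansig hidden layer, one linear output); the weights below are random placeholders, not values obtained by Levenberg-Marquardt training, and the function and variable names are hypothetical.

```python
import numpy as np

def tansig(x):
    # MATLAB's tansig is mathematically equivalent to the hyperbolic tangent.
    return np.tanh(x)

def mlp_forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a 2-h-1 network: tansig hidden layer, linear output."""
    hidden = tansig(inputs @ w_hidden + b_hidden)  # shape (n_samples, h)
    return hidden @ w_out + b_out                  # shape (n_samples, 1)

rng = np.random.default_rng(0)
n_hidden = 23  # illustrative hidden-layer size
w_hidden = rng.normal(size=(2, n_hidden))  # placeholder weights (untrained)
b_hidden = rng.normal(size=n_hidden)
w_out = rng.normal(size=(n_hidden, 1))
b_out = rng.normal(size=1)

# Two hypothetical normalized samples; columns are Ql and Qs.
x = np.array([[0.10, 0.05], [0.80, 0.60]])
ssc_sim = mlp_forward(x, w_hidden, b_hidden, w_out, b_out)
print(ssc_sim.shape)
```

Training would adjust the four weight arrays so that `ssc_sim` approaches the observed SSC; that step is left to the toolbox and is not reproduced here.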
The ‘validation’ and ‘test’ phases are necessary before proceeding to operate the created model. The validation phase consists of an estimation of the model's performance. It is possible to avoid ANN overfitting by comparing the test error with the learning error. When the model is created, 1/3 of the data is provided to the software to check its performance. Here, only the input vectors (Ql, Qs) are transmitted to the created model. Many ANN models have been built and examined; the optimal model is the one providing the best results in performance curves (learning, validation, and testing), as well as the best validation criteria values.
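The data partition described above can be sketched as follows. This is only an illustration under stated assumptions: the records are shuffled at random, rounding is done by truncation, and the function name `split_indices` is hypothetical; the toolbox's own partitioning may differ in detail.

```python
import numpy as np

def split_indices(n_samples, seed=0):
    """Split record indices into train/validation/test plus a held-out third."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)

    # Two-thirds for model creation, one-third for later verification.
    n_build = (2 * n_samples) // 3
    build, verify = order[:n_build], order[n_build:]

    # Inside the model-creation part: 70% training, 15% validation, 15% testing.
    n_train = int(0.70 * n_build)
    n_val = int(0.15 * n_build)
    train = build[:n_train]
    val = build[n_train:n_train + n_val]
    test = build[n_train + n_val:]
    return train, val, test, verify

train, val, test, verify = split_indices(3309)
print(len(train), len(val), len(test), len(verify))
```

Keeping the verification third entirely outside the training loop is what allows it to serve as an independent check of the finished model.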
ANFIS
ANFIS combines the advantages of both neural networks (e.g., learning capabilities, optimization capabilities, and connectionist structures) and FISs (e.g., human-like ‘IF-THEN’ rule thinking and ease of incorporating expert knowledge). The basic idea behind these neuro-adaptive learning techniques is very simple. They provide a methodology for the fuzzy modeling procedure to learn information about a data set, in order to compute the membership function parameters that best allow the associated FIS to track the given input-output data. ANFIS is based on the premise of mapping a FIS into a neural network structure so that the membership functions and consequent part parameters are optimized using a hybrid learning algorithm (Najafzadeh et al. 2016).
In a first-order Sugeno system, a typical rule set has two fuzzy IF-THEN rules. As the MATLAB toolbox provides a graphical user interface for ANFIS models trained with the hybrid and back-propagation (BP) algorithms, these models have been widely used by researchers in the field of water resources and environmental engineering (Kaveh et al. 2017).
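A two-rule first-order Sugeno system can be sketched in a few lines. The sketch assumes the textbook form, in which each rule reads "IF x is A_i AND y is B_i THEN f_i = p_i x + q_i y + r_i", the firing strength is the product of the membership degrees, and the output is the firing-strength-weighted average of the rule outputs. The Gaussian membership functions and all parameter values are hypothetical.

```python
import math

def gaussmf(x, sigma, c):
    """Gaussian membership degree with spread sigma and center c."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))

def sugeno_output(x, y, premise, consequent):
    """First-order Sugeno inference for two inputs.

    premise:    per rule, ((sigma_x, c_x), (sigma_y, c_y))
    consequent: per rule, (p, q, r) so that f = p*x + q*y + r
    """
    strengths, outputs = [], []
    for (mf_x, mf_y), (p, q, r) in zip(premise, consequent):
        strengths.append(gaussmf(x, *mf_x) * gaussmf(y, *mf_y))  # rule firing strength
        outputs.append(p * x + q * y + r)                        # linear rule output
    total = sum(strengths)
    # Normalized firing strengths weight the rule outputs.
    return sum(w * f for w, f in zip(strengths, outputs)) / total

# Hypothetical parameters: a "low" rule and a "high" rule on normalized Ql and Qs.
premise = [((0.2, 0.0), (0.2, 0.0)), ((0.2, 1.0), (0.2, 1.0))]
consequent = [(0.0, 0.0, 1.0), (0.0, 0.0, 3.0)]
print(sugeno_output(0.5, 0.5, premise, consequent))
```

In ANFIS, the premise parameters (here `sigma` and `c`) and the consequent parameters (`p`, `q`, `r`) are the quantities tuned during learning.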
The ‘anfisedit’ tool in the Fuzzy Logic Toolbox of MATLAB 2017 was used for ANFIS modeling. A first-order TSK FIS was applied using a feed-forward network and a hybrid learning method. Grid partitioning can be easily used for problems with fewer than six input variables (Genç et al. 2014). In ANFIS grid partitioning, fuzzy clustering and hybrid learning algorithms are applied to determine the input data structures in combination with the BP gradient descent method (Kisi & Ay 2012). Grid partitioning, which offers eight membership function (MF) types (trimf, trapmf, gbellmf, gaussmf, gauss2mf, pimf, dsigmf, and psigmf) for our FIS modeling, generates m^n fuzzy rules, where n = 2 is the number of input variables (Ql and Qs) and m is the number of MFs per input, which was varied from 2 to 6. The rule base was developed using the OR logical operation by changing the number of rules from two to six, with each rule having several membership parameters. A single SSC output with a constant MF type was employed, that is, 12 (two input MFs × one output MF × six rules). The ANFIS models were created using a simple ANFIS model approach with an error tolerance (RMSE) of 0.005 and a maximum of 100 iterations.
The data were split into three subsets for the ANFIS models, with 60% reserved for training, 20% for validation, and 20% for testing. The ANFIS model was built using the training data set, while the validation phase was used to control overfitting during learning. The testing phase used an independent data set that was unknown to the created model.
Modeling for both ANN and ANFIS continues until the error value is less than the desired value. The adjustable parameters of the ANFIS model include the number of variables (inputs and outputs), the membership function in the grid partition, the iteration number, and the desired RMSE value. The best ANFIS model was obtained by trial, considering the lowest RMSE and the highest determination coefficient R2. The different steps of the study are summarized in the flow chart (Figure 10).
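The size of a grid-partitioned rule base can be checked with a short enumeration. The sketch below assumes that grid partitioning forms one rule for every combination of one MF per input, so that two inputs with m MFs each give m × m rules; the helper name is hypothetical.

```python
from itertools import product

def grid_partition_rules(mfs_per_input):
    """Enumerate rule antecedents as tuples of MF indices, one index per input."""
    return list(product(*(range(m) for m in mfs_per_input)))

# Two inputs (Ql, Qs) with the same number of MFs each, from 2 to 6.
for m in range(2, 7):
    print(m, len(grid_partition_rules([m, m])))

# Unequal partitions are also possible, e.g. 2 MFs on Ql and 3 MFs on Qs.
print(grid_partition_rules([2, 3]))
```

The rule count grows quickly with the number of MFs per input, which is one reason grid partitioning is usually limited to a small number of inputs.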
Implementations and results
where SSCobs and SSCsim represent, respectively, the observed SSC values and the SSC values simulated by the ANN and ANFIS models; the remaining symbols represent, respectively, the average of the observed SSC and the average of the simulated SSC; and N is the total number of data points.
The statistical results of the proposed ANFIS and ANN models for training and testing are presented for each of the five investigated basins. The best fit between inputs and outputs is represented by the regression graph, in which the circles indicate SSC points and the line represents the best fit.
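The five validation criteria can be computed as in the sketch below. It uses the commonly applied definitions, which are assumed here rather than taken from the text: R2 as the squared Pearson correlation between observed and simulated values, and NSE as one minus the ratio of the squared-error sum to the variance sum of the observations. The function name `evaluate` is hypothetical.

```python
import numpy as np

def evaluate(ssc_obs, ssc_sim):
    """Return R2, NSE, MSE, RMSE, and MAE for paired observed/simulated SSC."""
    obs = np.asarray(ssc_obs, dtype=float)
    sim = np.asarray(ssc_sim, dtype=float)
    err = obs - sim
    mse = float(np.mean(err ** 2))
    r = np.corrcoef(obs, sim)[0, 1]  # Pearson correlation coefficient
    nse = 1.0 - np.sum(err ** 2) / np.sum((obs - obs.mean()) ** 2)
    return {
        "R2": float(r ** 2),
        "NSE": float(nse),
        "MSE": mse,
        "RMSE": float(np.sqrt(mse)),
        "MAE": float(np.mean(np.abs(err))),
    }

# Hypothetical example: simulated values carry a constant bias of +1.
print(evaluate([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]))
```

In this example R2 stays at 1 because the two series are perfectly correlated, whereas NSE drops below zero because of the bias; this is why the two criteria can rank models differently.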
The ANN architecture is the same for all the selected models: an MLP with a Levenberg-Marquardt learning algorithm; two inputs, Ql and Qs; one to three hidden layers; and two activation functions, tansig for the hidden layer and purelin for the output layer, the latter providing the simulated SSC. Each model has a different number of neurons, but never more than 23. Beyond 30 neurons, we notice that the model deviates and the magnitude of the errors (MSE, RMSE, and MAE) between the observed and predicted SSC increases, which leads to ANN overfitting. To allow proper modeling and conformity to probabilistic criteria, the observed data were used in random order. To achieve the best accuracy for the ANFIS models, various Gaussian membership function types, numbers of rules, and grid partitions were employed. The number of MFs chosen for each input reflects the complexity of the ANFIS model in terms of the parameters to be selected. The number of MFs in this part ranged from 2 to 4. Following the evaluation of the ANN and ANFIS models, it is necessary to compare the performance criteria (R2, NSE, MSE, RMSE, and MAE) of the two models to identify the best one for daily SSC prediction in each basin.
Constantine Coastal Basin
A total of 3309 data sets were collected from four hydrometric sites in the Constantine Coastal Basin: El Mkaceb, Chedia, Khemakhem, and Ain Charchar, which were used for ANN and ANFIS modeling. The training, validation, and testing curves for the ANN model all show successful performance, with an MSE error of around 0.00013 at epoch 1000 (Figure 11).
Table 3 gives the performance evaluation of the ANN and ANFIS models through the validation criteria R2, NSE, MAE, MSE, and RMSE, with various architectures (number of neurons for ANN and number of MFs for ANFIS) during the training phase for the Constantine Coastal Basin. SSC modeling via ANN and ANFIS for the investigated basin shows excellent concordance between the observed and simulated SSC, indicated by an excellent determination coefficient R2 and a very good NSE (Figures 12 and 13; Table 3). Scatter plots for the ANN and ANFIS models (Figures 12 and 13) allow for a visual comparison of simulated and observed SSC. The disadvantage of scatter plots is that model performance can only be assessed qualitatively. Furthermore, scatter plots can easily be modified to make them look better.
Layers | Node | Function | Input/output parameters |
---|---|---|---|
Layer 1 | Adaptive | Membership function variation for given fuzzy set | Premise parameters |
Layer 2 | Fixed | Fuzzy and node function | Product of all incoming signals |
Layer 3 | Fixed | Firing strength ratio calculation | Normalized firing strength |
Layer 4 | Adaptive | Firing strength normalisation | Consequent parameters |
Layer 5 | Fixed | Summation of incoming signals | Overall output (f1.f2) |
Method | Model | Architecture | Learning method | Iterations | R2 | NSE | MAE | MSE | RMSE |
---|---|---|---|---|---|---|---|---|---|
ANN | SSCsim = 0.98*SSCobs + 0.00063 | FFNN 2-23-1, tansig-purelin | Levenberg-Marquardt algorithm | 1000 | 0.98 | 0.97 | 0.004 | 5.57 × 10⁻⁵ | 0.007 |
ANFIS | SSCsim = 0.84*SSCobs + 0.01 | Input MF type: gaussmf; output MF type: constant; number of MFs: 2-3 | Hybrid learning | 100 | 0.81 | 0.80 | 0.0149 | 5.36 × 10⁻⁴ | 0.049 |
It is evident from Table 3 that the optimal model, the ANN trained with the LM algorithm, has the highest R2 and NSE values (0.98 and 0.97), indicating very good efficiency. Mustafa et al. (2012) trained an MLP neural network model using different algorithms to predict the suspended sediment discharge of a river (Pari River at Silibin) in Peninsular Malaysia. Their results indicated that the Levenberg-Marquardt algorithm could reach fast convergence.
In a general evaluation, the proposed ANN model (2-23-1) predicts SSC more precisely than the comparable ANFIS model (2-3) trained by the hybrid algorithm for the Constantine Coastal Basin. Idrees et al. (2021) applied various artificial intelligence models for daily SSL inflow prediction at Sangju Weir, South Korea. Their ANN model outperformed the ANFIS model, with R2 = 0.821, MAE = 4.244 tons/day, NSE = 0.991, and RMSE = 11.692 tons/day. In the present study, the error values that represent the differences between the observed and predicted SSC are lowest for the ANN model, obtained with 1000 iterations in the training phase: MAE = 0.004, MSE = 5.57 × 10⁻⁵, and RMSE = 0.007. The ANFIS model also gave low error values: MAE = 0.0149, MSE = 5.36 × 10⁻⁴, and RMSE = 0.049.
Highlands Basin
For the Highlands Basin, a total of 2093 data sets were collected from two hydrometric stations: Chemora and Foum El Gueiss. Figures 13 and 14 show the trend of predicted versus observed data, with high agreement for the two models, ANN and ANFIS.
Based on the visualized prediction performance for the daily SSC (Figures 15 and 16), as well as the training performance metrics shown in Table 4 for the Highlands Basin, it can be seen that the ANN model (2-17-1) outperformed the ANFIS model (MFs: 3-2), with the ANN R2 slightly exceeding the ANFIS R2 (R2 ANN = 0.92, R2 ANFIS = 0.90). The NSE was very good for the ANN (0.88) and good for the ANFIS (0.75). The ANN model therefore has more satisfactory accuracy than the ANFIS model.
Method | Model | Architecture | Learning method | Iterations | R2 | NSE | MAE | MSE | RMSE |
---|---|---|---|---|---|---|---|---|---|
ANN | SSCsim = 0.94*SSCobs + 0.0024 | FFNN 2-17-1, tansig-purelin | Levenberg-Marquardt algorithm | 72 | 0.92 | 0.88 | 0.0185 | 0.0024 | 0.048 |
ANFIS | SSCsim = 0.81*SSCobs + 0.0027 | Input MF type: gaussmf; output MF type: constant; number of MFs: 3-2 | Hybrid learning | 100 | 0.90 | 0.75 | 0.0283 | 0.0011 | 0.059 |
Lafdani et al. (2013) used the nu-SVR and ANN models to predict SSL. The reliability of the models was evaluated based on performance criteria such as RMSE, MAE, and R2. The results showed that the ANN models performed better.
The correlation between observed and predicted SSC shows an excellent performance curve, with the best validation performance (MSE = 0.00078 at 72 iterations) in the training phase for the ANN modeling (Figure 14). During the training phase, the scatter plot of predicted against observed data was appropriate and the error values were very good. The ANN model (2-17-1) gave MAE = 0.0185, MSE = 0.0024, and RMSE = 0.048, whereas the ANFIS model (3-2) gave MAE = 0.028, MSE = 0.0011, and RMSE = 0.059. As can be seen, MAE ANN < MAE ANFIS, while MSE ANN > MSE ANFIS. The RMSE tends to give more weight to high values than to low values because errors in high values are usually greater in absolute value than errors in low values (Gan et al. 2009). MAE is more robust to data with outliers. The MAE value is therefore decisive for the selection of the best model.
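The different sensitivity of RMSE and MAE to a few large errors can be illustrated numerically. The error values below are hypothetical and serve only to show the effect of one outlier on each criterion.

```python
import numpy as np

def mae(errors):
    return float(np.mean(np.abs(errors)))

def rmse(errors):
    return float(np.sqrt(np.mean(np.square(errors))))

# Hypothetical prediction errors in g/L: one series without and one with an outlier.
errors_small = np.array([0.01, -0.01, 0.02, -0.02])
errors_outlier = np.array([0.01, -0.01, 0.02, 0.50])

for name, e in [("no outlier", errors_small), ("one outlier", errors_outlier)]:
    print(name, "MAE =", round(mae(e), 4), "RMSE =", round(rmse(e), 4))
```

The single large error raises the RMSE proportionally more than the MAE, because squaring gives large errors extra weight before averaging.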
Kebir-Rhumel Basin
A total of 5539 data sets from five hydrometric stations controlling the investigated basin, including Tassadane, Athmania, Grarem, El Ancer, and El Milia, have been used for ANN and ANFIS modeling. The training, validation, and testing curves for the ANN model all show successful performance, with an MSE error of around 0.00049 at epoch 19 (Figure 17). Scatter plots generated during the training phase show excellent concordance of observed and predicted SSC values for both the ANN and ANFIS models (Figures 18 and 19), as evidenced by significant R2 and NSEs such as R2 = 0.88, NSE = 0.75 for the ANFIS model and R2 = 0.77, NSE = 0.67 for the ANN model.
We therefore opt for the ANFIS model trained with the hybrid algorithm for the combination (2-3), which gives the best performance for SSC prediction in the Kébir-Rhumel Basin. Nourani et al. (2021) showed that the ANFIS model provides higher efficiency than the other artificial intelligence models developed (SVM and FFNN) and than one conventional MLR model for SSL modeling.
The results obtained with the ANFIS model showed an MAE of around 0.0085 and an RMSE of 0.020. These error values are very low, and the observed SSC were very similar to those simulated by the ANFIS model for daily SSC prediction in this basin (Figure 19). Although the ANFIS model is the most efficient overall, the MAE of the ANN model is slightly lower than that of the ANFIS model, as seen in Table 5. The comparison of the two validation criteria R2 and NSE was decisive in this case for the selection of the best model.
. | Model . | Architecture . | Learning method . | Iterations . | R2 . | NSE . | MAE . | MSE . | RMSE . |
---|---|---|---|---|---|---|---|---|---|
ANN | SSCsim = 0.79*SSCobs + 0.0018 | FFFN tansig – purelin 2-3-1 | Levenberg-Marquardt algorithm | 19 | 0.77 | 0.67 | 0.0043 | 5.78 × 10−4 | 0.024 |
ANFIS | SSCsim = 0.91*SSCobs + 0.006 | MFs type Input: gaussmf Output: constant MFs Number 2 3 | Hybrid learning | 100 | 0.88 | 0.75 | 0.0085 | 4.26 × 10−4 | 0.020 |
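As a rough illustration of the ANFIS structure listed in the table (Gaussian input membership functions, constant outputs, and 2 and 3 membership functions on the two inputs, hence 6 rules), the inference step can be sketched as follows. This is only a sketch under stated assumptions: all parameter values are hypothetical rather than the fitted ones, the inputs are assumed to be scaled to [0, 1], and the hybrid training itself is not shown.

```python
import math
from itertools import product

def gaussmf(x, sigma, c):
    """Gaussian membership function exp(-(x - c)^2 / (2 * sigma^2))."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))

def sugeno_predict(x1, x2, mfs1, mfs2, consequents):
    """Sugeno inference with constant rule outputs.

    mfs1, mfs2: lists of (sigma, c) pairs, one per membership function.
    consequents: one constant per rule, ordered like product(mfs1, mfs2).
    """
    # Firing strength of a rule = product of its two membership degrees
    weights = [gaussmf(x1, *a) * gaussmf(x2, *b) for a, b in product(mfs1, mfs2)]
    # Output = firing-strength-weighted average of the rule constants
    return sum(w * k for w, k in zip(weights, consequents)) / sum(weights)

# Hypothetical parameters (not the values fitted in the study)
mfs_ql = [(0.4, 0.0), (0.4, 1.0)]                     # 2 MFs on the first input
mfs_qs = [(0.25, 0.0), (0.25, 0.5), (0.25, 1.0)]      # 3 MFs on the second input
rule_outputs = [0.00, 0.05, 0.10, 0.05, 0.15, 0.30]   # 2 x 3 = 6 rules

ssc = sugeno_predict(0.3, 0.6, mfs_ql, mfs_qs, rule_outputs)
print(ssc)
```

Because the output is a weighted average of the rule constants, the prediction always lies between the smallest and largest constant.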
Seybouse Basin
For the Medjez-Ammar station (5540 observations in total), the only station covering the Seybouse Basin, there was very strong agreement between the observed SSC and those simulated by the ANN and ANFIS models. The best results for the ANN model were obtained at the 169th epoch (bootstrap), with the lowest RMSE of around 0.00036 (Figure 20).
Comparing the validation criteria of the two models (Table 6), the ANN model trained with the LM algorithm using a (2-3-1) architecture had the higher NSE value (0.83) and the lower MAE. Both models achieved high R2 values, 0.90 for ANN and 0.92 for ANFIS (Figures 21 and 22), so the distinction is clearer in the NSE values, which imply that the ANN model outperformed the ANFIS model. The largest disadvantage of the Nash-Sutcliffe efficiency is that it is sensitive to extreme values because of the squared differences. In our case study, the SSC prediction is overestimated during floods and underestimated during low-flow conditions. The two selected models converge well toward minimum errors, which describe the differences between the observed and predicted SSC, with the lowest values obtained in the training phase: MAE ANN < MAE ANFIS and RMSE ANN < RMSE ANFIS (Table 6). The validation criteria confirm that the ANN model can be considered the best model, with effective accuracy. Afan et al. (2017) applied three modeling approaches using a neural network (NN) to predict the daily SSL time series. The performance of the NN, in terms of both accuracy and simplicity, was compared through several prediction and error statistics. The NN model showed the following validation criteria: R = 0.91, MAE = 20.17 tonnes/year, and RMSE = 33.09 tonnes/year.
Model | Regression equation | Architecture | Learning method | Iterations | R2 | NSE | MAE | MSE | RMSE |
---|---|---|---|---|---|---|---|---|---|
ANN | SSCsim = 0.91*SSCobs + 0.0009 | FFFN tansig – purelin 2-3-1 | Levenberg-Marquardt algorithm | 169 | 0.90 | 0.83 | 0.0050 | 8.70 × 10−5 | 0.0093 |
ANFIS | SSCsim = 1.175*SSCobs + 0.0005 | MFs type Input: gaussmf Output: constant MFs number 2 3 | Hybrid learning | 100 | 0.92 | 0.73 | 0.0058 | 5.32 × 10−4 | 0.0169 |
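The ANN architecture in the table (2-3-1 with tansig hidden units and a purelin output) corresponds to a forward pass that can be sketched as below. The weights are hypothetical, not the trained values, and the inputs are assumed to be already normalized; `math.tanh` is used because tansig is mathematically the hyperbolic tangent and purelin is the identity. Levenberg-Marquardt training is not shown.

```python
import math

def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """One-hidden-layer network: tanh (tansig) hidden units, linear (purelin) output."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

# Hypothetical weights for a 2-3-1 network (not the trained values)
w_hidden = [[0.8, -0.2], [0.1, 0.9], [-0.5, 0.4]]  # 3 hidden neurons x 2 inputs
b_hidden = [0.0, -0.1, 0.2]
w_out = [0.3, 0.5, -0.2]
b_out = 0.05

# Number of trainable parameters: hidden weights + hidden biases + output weights + output bias
n_params = sum(len(row) for row in w_hidden) + len(b_hidden) + len(w_out) + 1

ssc = mlp_forward([0.3, 0.6], w_hidden, b_hidden, w_out, b_out)
print(n_params, ssc)
```

With two inputs and one output, a network with H hidden neurons has 4H + 1 trainable parameters, so the parameter count grows linearly with the number of hidden neurons.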
Soummam Basin
The Farmatou and Magroua stations controlling the Soummam Basin provided 1582 observations, which were used for the ANN and ANFIS modeling. The results showed high performance in the scatter plots (Figures 22 and 23). The Levenberg-Marquardt algorithm converges to a low MSE of 0.0014 for the best validation performance, with a minimum number of iterations (43/bootstrap) (Figure 23).
The performance statistics of the two models (Table 7) show clearly that the proposed NF (3-3) and NN models successfully estimate the daily SSC for the Soummam Basin using the antecedents Ql, Qs, and SSC values. Rezaei & Vadiati (2020) compared various methods, namely fuzzy logic (FL), two ANFISs (ANFIS-GP and ANFIS-FCM), an ANN, and a least squares support vector machine (LSSVM), to predict SSL at the Seyra gauging station on the Karaj River in Iran. The results showed that the ANFIS models outperformed the ANN. Overall, the performance of the artificial intelligence models used was satisfactory in predicting the non-linear behaviour of the SSL.
Model | Regression equation | Architecture | Learning method | Iterations | R2 | NSE | MAE | MSE | RMSE |
---|---|---|---|---|---|---|---|---|---|
ANN | SSCsim = 0.85*SSCobs + 0.012 | FFFN tansig – purelin 2-13-1 | Levenberg-Marquardt algorithm | 43 | 0.87 | 0.82 | 0.026 | 0.0047 | 0.068 |
ANFIS | SSCsim = 0.89*SSCobs + 0.002 | MFs type Input: gauss2mf Output: constant MFs number 3-3 | Hybrid learning | 100 | 0.85 | 0.84 | 0.021 | 0.0011 | 0.031 |
According to Table 7 and Figures 24 and 25, there is no significant difference in R2 and NSE between the two selected models: R2 ANN = 0.87, R2 ANFIS = 0.85, NSE ANN = 0.82, and NSE ANFIS = 0.84. Because NSE is more stringent than R2, the ANFIS model appears to predict the daily SSC for the Soummam Basin better than the ANN model. This is further confirmed by its lower error values: MAE = 0.021, MSE = 0.0011, and RMSE = 0.031. Olyaie et al. (2015) compared various soft computing techniques (e.g., ANN, ANFIS, and WANN) for estimating suspended sediment load at the Flathead River and Santa Clara River stations. Their comparison revealed that the ANFIS models slightly outperformed the ANNs, but the differences between the two approaches were not large, and both might be considered as alternative tools for simulating SSL.
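The remark that NSE is more stringent than R2 can be illustrated with a short sketch. It assumes that R2 is computed as the squared Pearson correlation between observed and simulated SSC, which is an assumption about the definition used; the data are hypothetical. A simulation that is perfectly correlated with the observations but biased keeps R2 at 1, while its NSE drops.

```python
def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 - sum of squared errors / sum of squared deviations of obs."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst

def r2(obs, sim):
    """Determination coefficient taken as the squared Pearson correlation (assumption)."""
    n = len(obs)
    mo, ms = sum(obs) / n, sum(sim) / n
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov ** 2 / (var_o * var_s)

# Hypothetical observed SSC in g/L (illustrative only)
obs = [0.05, 0.10, 0.20, 0.40, 0.80]
# Perfectly correlated but biased simulation (hypothetical slope 1.3, offset 0.02)
sim_biased = [1.3 * o + 0.02 for o in obs]

print("R2 :", round(r2(obs, sim_biased), 3))
print("NSE:", round(nse(obs, sim_biased), 3))
```

Under this definition R2 is unchanged by any linear rescaling of the simulated series, whereas NSE penalizes both the slope and the offset errors, which is why the two criteria can rank models differently.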
The daily SSC prediction for the five investigated basins shows excellent results, with high accuracy for both the ANN and ANFIS models. The performance of the selected models was evaluated for each basin by comparing observed and simulated SSC in scatter plots for both models, and through learning curves for the ANN model during the training phase. Positive correlations confirmed the findings, with R2 ranging from 0.77 to 0.98 for the ANN model and from 0.81 to 0.92 for the ANFIS model. The NSE ranged from 0.67 to 0.97 for the ANN model and from 0.73 to 0.84 for the ANFIS model. Comparing the quantitative performance of the ANN and ANFIS models shows that the level of uncertainty of the two predictions, as expressed by the error values, varies: the MAE ranges from 0.004 g/L to 0.026 g/L for the ANN model and from 0.0058 g/L to 0.0283 g/L for the ANFIS model. The MSE varies from 5.57 × 10−5 (g/L)² to 0.0047 (g/L)² for the ANN model and from 4.26 × 10−4 (g/L)² to 0.11 (g/L)² for the ANFIS model.
For the ANN modeling, the Constantine Coastal Basin model (architecture 2-23-1) has the best accuracy of all the models, achieving the highest R2 and NSE values, 0.98 and 0.97, respectively. The best ANFIS model (2-3), for the Seybouse Basin, has an R2 of 0.92 and an NSE of 0.73. Comparing prediction performance between the ANN and ANFIS models shows that the ANN models are the most accurate for three basins: the Constantine Coastal Basin, the Highlands Basin, and the Seybouse Basin. The ANFIS model generated more favorable outcomes than the ANN model for only two basins: the Kébir-Rhumel Basin and the Soummam Basin. Finally, it is concluded that the ANFIS model is slightly more uncertain than the ANN one.
CONCLUSION
The current research applied two artificial intelligence models, ANN and ANFIS, for SSC prediction in the five largest basins in eastern Algeria (the Constantine Coastal, Highlands, Kébir-Rhumel, Seybouse, and Soummam basins). The findings show high accuracy for both the ANN and ANFIS approaches for daily SSC prediction in the five investigated basins (high R2 and NSE, and low error values for MAE, MSE, and RMSE). Appropriate selection of the network architecture plays a major role in achieving the best results. The MLP feed-forward network trained with the Levenberg-Marquardt algorithm showed a high convergence speed, with a minimum number of iterations (bootstrap) and a training time that does not exceed a few minutes. The number of neurons is the most critical parameter of the ANN network; increasing it beyond 30 may lead to overfitting. The best ANFIS models were developed by applying a TSK FIS with a hybrid learning algorithm to achieve the RMSE error tolerance. The type of membership function is the most critical parameter for the ANFIS model. The comparison of ANN and ANFIS accuracy showed that the performance of both models is approximately of the same order. The ANN models slightly outperformed the ANFIS ones: they were more accurate in three basins (the Constantine Coastal Basin, the Highlands Basin, and the Seybouse Basin), whereas the ANFIS model was more accurate in only two basins (the Kébir-Rhumel Basin and the Soummam Basin). The most obvious finding to emerge from this study is that ANN and ANFIS were used successfully as soft computing tools for SSC prediction. They can deal well with the nonlinear complexity of SSC. This study can serve as basic research for SSC regionalization, particularly at ungauged or poorly gauged sites in eastern Algeria.
ACKNOWLEDGEMENTS
The authors are thankful to the ANRH for providing, free of charge, the data necessary for the study reported in this paper, and to the anonymous reviewers for their careful reading and valuable comments.
CONFLICT OF INTERESTS
The authors declare that they have no conflict of interest.
COMPETING INTERESTS
The authors are not affiliated with or involved with any organization or entity with any financial interest or nonfinancial interest in the subject matter or materials discussed in this paper.
FUNDING
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
DATA AVAILABILITY STATEMENT
All relevant data are included in the paper or its Supplementary Information.