Abstract
Soil is one of the main natural resources, and accurate estimation of soil erosion is essential for the optimum development and management of soil resources. Analyzing soil erosion by water on cultivated lands is an important task because of the numerous problems that erosion causes. In this study, the performance of three data-driven approaches, namely the multilayer perceptron artificial neural network (ANN) and the grid partitioning (GP) and subtractive clustering (SC) neuro-fuzzy (NF) models, was evaluated for estimating soil erosion. Land use, slope, soil and upland erosion amount were used as input parameters of the applied models, and the erosion values obtained by the MPSIAC method were considered as the benchmark for evaluating the ANN and NF models. The applied models were assessed using the coefficient of determination (R2), the root mean square error (RMSE), the BIAS, and the variance accounted for (VAF) indices. The results showed that the subtractive NF model gave the most accurate results, with the minimum test RMSE (3.775), followed by the ANN (4.262) and the GP-based NF (6.259) models.
INTRODUCTION
The planning of measures for protection and conservation of soil in watersheds is an important issue in soil conservation, hydrology and optimum watershed management. Study of soil erosion by water on cultivated lands is crucial due to the numerous problems caused by the erosion as well as the significant environmental and economic consequences (Tarolli & Sofia 2016). Soil erosion is a serious environmental problem that negatively influences agricultural productivity, water quality, aquatic ecology, and river morphology (Peter et al. 2010). Land degradation is defined as the loss in potential utility or productivity of land resources in a country that is usually subjected to serious soil erosion. It is a major worldwide environmental problem, especially in arid and semi-arid regions.
Erosion adversely affects climate, soil fertility, vegetation cover, economy and human welfare (Polyakov & Lal 2008). Soil loss may be reduced in cropped soils by optimum soil management and tillage practices (Boix-Fayos et al. 2005; Cerdan et al. 2010; Vanwalleghem et al. 2011). For erosion control, a sustainable land management perspective is required. Watershed management practices are variable, taking into consideration land use and vegetation cover with the intention of rehabilitation of degraded lands and protection of soil and water systems (Alemayehu et al. 2009). However, watershed management approaches need to be adapted to the local situation (Darghouth et al. 2008). There are numerous methods for estimating watershed sediment yield. The PSIAC (Pacific Southwest Inter-Agency Committee) model (PSIAC 1968) is the commonly used method for evaluating soil erosion (Clark 2001).
The PSIAC model was developed primarily for application in arid and semi-arid areas in the southwestern USA, and is believed to be appropriate for the same environmental conditions in other regions, e.g. Iran (Bagherzadeh 1993; Sadeghi 1993). Usually, the evaluation of the erosion processes is carried out by using the conceptual models, which present good results. However, those models are difficult to develop and the calibration of the model parameters is subjective. Alternatively, experimental models as well as data-driven techniques (e.g. artificial neural network (ANN) or neuro-fuzzy (NF) techniques) may be used, which connect inputs and output by means of a mathematical function without an explicit relationship with the catchment specifications (Pandey et al. 2016).
ANN is a computing method that mimics the human brain and nervous system. It has a mathematical structure able to approximate arbitrarily intricate nonlinear processes that relate the inputs and output of any system. ANN has been applied successfully for modeling intricate nonlinear input/output time series relationships, classification, pattern recognition and other problems in a wide diversity of fields (Kisi 2005). The high degrees of empiricism and approximation in the analysis of hydrologic systems make them highly suitable for the application of ANN (Hsu et al. 1995). ANNs are used for simulating hydrologic variables such as stream flow, temperature, snow melting, rainfall-runoff and suspended materials. Among others, ANNs have been successfully employed in estimation of maximum flood (Bodri & Cermak 2000), estimation of flow rate (Dibike & Solomatine 2001), modeling precipitation and runoff (Nagy et al. 2002), reservoir operation (Chang et al. 2005), hydrologic time series modeling and sediment transport prediction (Firat & Gungor 2004; Agarwal et al. 2006; Singh et al. 2012), nutrient concentrations in surface runoff (Kim & Gilley 2008), direct runoff estimation (Dhamge et al. 2012), and prediction of event-based storm-water runoff quantity and quality (He et al. 2011). Moreover, the adaptive NF inference system has been applied to simulate/predict different factors in hydrologic sciences (e.g. Besalatpour et al. 2013). Only limited research has been carried out on the application of ANN and NF techniques for predicting soil erosion rate, for example Metternicht & Gonzalez (2005), Kim & Gilley (2008), and Demirel & Tüzün (2011). Nonetheless, no comparative study has assessed the capabilities of the NF and ANN models for simulating soil erosion in different watersheds. Also, it is of great importance to establish models (e.g. ANN and NF) that can simulate soil erosion with limited input parameters.
The objective of this study is to evaluate the capabilities of NF and ANN (based on limited input parameters) in modeling soil erosion, using the standard MPSIAC erosion values as benchmark.
MATERIALS AND METHODS
Study area
Observed data from the Jooneghan watershed, located in Chaharmahal and Bakhtiari province in the west of Iran, were used here for developing and validating the applied models. The watershed is situated between latitudes 32°29′ and 32°40′ N and longitudes 50°47′ and 50°20′ E. The watershed covers 903.92 km2. Figure 1 shows the location of the studied watershed. The highest elevation in the watershed is 3,580 m. The average annual rainfall in the watershed is 512.21 mm, and about 80% of the annual rainfall occurs between January and February. The mean monthly temperature varies between −0.8°C and 22.2°C. Range condition over the entire watershed is poor and has contributed to excessive runoff and soil erosion. The dominant geology in the watershed consists of Quaternary sediments in the basin outcrops, shale, marl, limestone with shale and marl, sandstone (high- and mid-Asmari), nummulitic white limestone, and clastic conglomerate deposits with a medium to low degree of cementation.
PSIAC model
The PSIAC method estimates total annual sediment yield comprising the sheet and rill erosions (PSIAC 1968). This model was first applied over the watershed of Walnut Gulch in south-east Arizona, United States. Later, considering the modification applied by Johnson & Gebhardt (1982), it was called modified PSIAC (MPSIAC hereafter). The successful applications of this model for estimating the sediment yield of watersheds in semi-arid areas of Iran were reported in several previous studies (Tangestani 2006; Khaledian et al. 2012). The method is based on a review of a few representative points within a given sub-catchment, which are then used to project average values for the entire watershed area. The procedure considers nine factors for erosion estimation: surface geology, soils, climate, runoff, topography, ground cover, land use, channel erosion, and upland erosion. The procedure was developed for sub-catchments in the western United States greater than 30 km2; however, it has also been applied to smaller basins (Noori et al. 2016). Compared with other experimental methods, the MPSIAC model considers the greatest number of factors, so the results are more realistic (Tangestani 2006; Daneshvar & Bagherzadeh 2012). Each factor is subdivided to different categorical classes, and a weighting value is assigned to each class using the model tables based on the degree of impact of each factor class (PSIAC 1968). Calibration in MPSIAC is a process of parameter adjustment (automatic or manual), until catchment and model behavior show a sufficiently (to be specified by the hydrologist) high degree of similarity.
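The way the nine factor scores combine into a sediment yield can be sketched as follows. This is a minimal illustration, not the study's implementation: the function names are hypothetical, and the exponential conversion `Qs = a * exp(b * R)` with `a = 38.77` and `b = 0.0353` (Qs in m3/km2/year, R the sum of the nine scores) is the form commonly quoted for MPSIAC in the literature. It is treated here as an assumption that should be checked against Johnson & Gebhardt (1982) before use.

```python
import math

# The nine MPSIAC factors listed in the text.
FACTORS = (
    "surface geology", "soils", "climate", "runoff", "topography",
    "ground cover", "land use", "channel erosion", "upland erosion",
)


def total_score(scores):
    """Sum the nine factor scores of one homogeneous unit (R)."""
    missing = [name for name in FACTORS if name not in scores]
    if missing:
        raise ValueError(f"missing factor scores: {missing}")
    return sum(scores[name] for name in FACTORS)


def sediment_yield(r, a=38.77, b=0.0353):
    """Convert the total score R to sediment yield (m3/km2/year).

    The coefficients a and b are assumed defaults, not values
    confirmed by the paper.
    """
    return a * math.exp(b * r)
```

Keeping the coefficients as explicit parameters makes it easy to substitute locally calibrated values.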
Topography
Topography is one of the principal factors affecting soil erosion, with the following sub-criteria: slope, vector, size and shape of the basin. The erosion rate changes directly with any change in the length, steepness and shape of the slope, while, apart from the size, the shape of the basin is also important in erosion formation. In addition, five slope classes, expressed in percentages, were constructed in ArcGIS 10.2: flat and gentle (<11%), medium (11–24%), steep (24–38%), very steep (38–57%), and extreme (>57%).
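The five slope classes can be expressed as a simple lookup. The sketch below is illustrative only: the function name is hypothetical, and because the class limits in the text share boundary values (for example 24% closes one class and opens the next), it assumes that a boundary value is assigned to the steeper class.

```python
from bisect import bisect_right

# Upper limits (in percent) separating the five classes given in the text.
SLOPE_BREAKS = (11.0, 24.0, 38.0, 57.0)
SLOPE_CLASSES = ("flat and gentle", "medium", "steep", "very steep", "extreme")


def slope_class(slope_percent):
    """Return the slope class name for a slope given in percent.

    Assumption: a value exactly on a class limit falls in the steeper class.
    """
    if slope_percent < 0:
        raise ValueError("slope must be non-negative")
    return SLOPE_CLASSES[bisect_right(SLOPE_BREAKS, slope_percent)]
```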
Climate
Climatic factors include rainfall and temperature. The duration and intensity of rainfall directly affect erosion. A high temperature increases the breakdown of organic substances, which results in a decline in plant cover and an increase in the erosion rate.
Runoff
The runoff factor in the watershed was calculated as 0.29 of total average runoff and the peak specific discharge. The Soil Conservation Service Curve Number (SCS-CN) model was used to estimate runoff in this area. This model has a long, fruitful application history and is generally referred to as ‘blue collar’ hydrology (Hawkins et al. 2010).
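For readers unfamiliar with the SCS-CN model, its standard textbook form can be sketched as below. The paper does not report the curve numbers or the initial abstraction ratio it used, so the function name, the metric formulation (depths in mm) and the conventional default ratio of 0.2 are assumptions for illustration.

```python
def scs_cn_runoff(p_mm, cn, ia_ratio=0.2):
    """Direct runoff depth (mm) from a rainfall depth (mm) with the SCS-CN method.

    S is the potential maximum retention and Ia the initial abstraction.
    ia_ratio = 0.2 is the conventional default, assumed here.
    """
    if not 0 < cn <= 100:
        raise ValueError("curve number must be in (0, 100]")
    s = 25400.0 / cn - 254.0          # retention S in mm
    ia = ia_ratio * s                 # initial abstraction in mm
    if p_mm <= ia:
        return 0.0                    # all rainfall is abstracted
    return (p_mm - ia) ** 2 / (p_mm - ia + s)
```

With a curve number of 100 (fully impervious surface) the retention is zero and all rainfall becomes runoff, which is a quick sanity check on the formula.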
Channel erosion
This factor indicates the rate of erosion from river and drainage channels. The slope steepness, type of bedrock, and the potential energy of floods are the major factors affecting the channel erosion.
Upland erosion
Upland erosion was obtained based on the method suggested by Bureau of Land Management (BLM) (Johnson & Gebhardt 1982).
Land use and land cover
The surface of the land has a layer that protects it from erosion. As this layer weakens, the risk of erosion increases. The following are determined as the sub-criteria for this factor: (a) plant cover – the land may be covered by forest or an agricultural plant; (b) land use – land can be used for a dense agricultural application, which can harm the soil.
Figure 2 displays the lithology and land use map of the studied region. The study area presents various agricultural and industrial activities and infrastructures (see Figure 3). The predominance of traditional agriculture, together with inappropriate farming techniques and heavy grazing, is marked in the figure. The industrial activities in the region include the extraction of aggregates from the riverbed in the western area.
Geology map
In terms of geomorphology and the general appearance of the land in the area, two types may be identified: mountains in the north, east and west of the basin, and plains in the center and south of the watershed. The area has a high diversity of geological and lithological units (Table 1).
Table 1. Geological units of the study area
Row | Age | Lithology | Formation |
---|---|---|---|
1 | OM(1,2,3) | Gray marl and limestone rocks with layers and layers of sand | Asmari |
2 | K | Limestone | Bangestan |
3 | K5 | Limestone with orbitolinids | Darian |
4 | Q | Quaternary | Flood deposits |
5 | QT | Quaternary | Deposits of alluvial fans |
6 | K1 | Thin-bedded limestone with shale and marl | Sarvak |
7 | JK | Lime and shale | Surmeh |
8 | K8 | Lime thin layer of cream to brown | Fahlian |
9 | E | Sandstone and conglomerate and mudstone and chert with radiolarites | Kashkan |
10 | P1 | Conglomerate and sandstone | Equivalent Bakhtiari |
11 | K2 | Red conglomerate and sandstone | Equivalent Kazhdumi |
12 | K7 | Marl and limestone with Ammonites, Orbitolina | Garu |
Using the percentage cover, the area, and the range of annual sediment yield, the mean sediment yield of the studied area was computed using the PSIAC model (Table 2). The dominant erosion potential categories are low to moderate (80.9% of total area), while the areas with very high erosion potential cover only 3.3% of the sub-catchment area. Results of the model also show sediment yields ranging from 954.67 m3/km2/year for the very low erosion potential category to 862,748.42 m3/km2/year in the regions with very high erosion potential (Figure 4).
Table 2. Sediment yield of study area
Sediment yield (m3/km2) | Mean sediment yield | Area (ha) |
---|---|---|
954.67 | 862,748.42 | 903.71 |
It is noted that the MPSIAC method is based on a review of a few representative points within a given sub-catchment, which are then used to project average values for the entire sub-catchment area (Tangestani 2006). Each obtained erosion value is therefore the average erosion value of a polygon. The region was divided into polygons based on the nine factors affecting erosion in the MPSIAC model. To begin with, all factors were combined and the final MPSIAC score was calculated on the map. Then, the borders between adjacent homogeneous areas with equal MPSIAC scores were removed in order to produce the final polygons. The dissolve tool in GIS is designed for performing this action.
ANN-MLP model
The ANN model has a structure similar to the human neural system: it imitates the structure and operation of the human brain and makes predictions based on repeated training. The structure of an ANN model is flexible, so its input parameters are generally selected from the available data. An ANN is composed of neurons, which are arranged in groups called layers and connected through weights. The neurons in the input layer propagate the input data, combined with randomly initialized weights and biases, through the hidden layers. Once the net sum at a hidden node is determined, an output response is provided at the node using a transfer function. Each neuron receives many inputs from other neurons through weighted connections. These weighted inputs are summed to produce the argument of a transfer function, such as a linear, logistic or hyperbolic tangent function, which in turn produces the final output of the neuron (Talebizadeh et al. 2009). Further theoretical information about ANNs can be found in, for example, Bishop (1995) and Haykin (1999). As ANN does not require detailed information about the physical governing rules of the phenomena, it may be effectively employed for modeling complex hydrological processes. In this study, the multilayer perceptron (MLP) algorithm was applied. The MLP model is a flexible type of ANN composed of one input layer, one or more hidden layers, and one output layer (Rai et al. 2005). The MLP is a network formed by simple neurons called perceptrons. The perceptron calculates a single output from multiple real-valued inputs by forming a linear combination according to its input weights and then, where required, passing the result through a nonlinear transfer function. Using only one hidden layer is recommended because using more layers worsens the problem of local minima (Rai et al. 2005).
For all training algorithms, the tangent sigmoid transfer function was used in the hidden layers, and the purelin (linear) transfer function in the output layer. In this study, different back-propagation algorithms, including Levenberg-Marquardt (lm), gradient descent (gd), gradient descent with adaptive learning rate (gda), gradient descent with momentum (gdm), gradient descent with momentum and adaptive learning rate (gdx), and scaled conjugate gradient (scg), were utilized for erosion prediction. The optimal numbers of neurons were determined by a trial and error procedure. At each training process, 100 networks were examined and the optimum structure of each case (transfer functions) was chosen. The minimum and maximum values of weight decay in the hidden layer were found to be 0.0001 and 0.002.
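The forward pass of such a network, with one hidden layer of tangent sigmoid nodes and a linear output node, can be sketched as follows. This is a minimal illustration in plain Python; the function name and argument layout are hypothetical, and training (for example by back-propagation) is not shown.

```python
import math


def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Output of a single-hidden-layer MLP for one input pattern.

    x        : list of input values (e.g. land use, soil, slope, current erosion)
    w_hidden : one list of input weights per hidden node
    b_hidden : one bias per hidden node
    w_out    : one output weight per hidden node
    b_out    : bias of the single linear output node
    """
    # Hidden layer: weighted sum of the inputs passed through tanh (tangent sigmoid).
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(weights, x)) + bias)
        for weights, bias in zip(w_hidden, b_hidden)
    ]
    # Output layer: linear combination of the hidden activations.
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out
```

A 4-1-1 structure, for instance, corresponds to `w_hidden` holding a single list of four weights and `w_out` holding one weight.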
NF model
NF systems, which are based on rule-based fuzzy systems, use the capability of neural networks' learning algorithms for adapting their rule-base parameters (Jang 1993). NF uses neural network learning algorithms and fuzzy reasoning to map an input space to an output space. The fuzzy decision rules are implemented as membership functions (MFs), and the model has the advantages of both neural networks and fuzzy control systems. More details about NF can be found in Jang (1993) and Jang et al. (1997). A number of methods exist for partitioning the input space and optimizing the NF parameters, e.g. the grid partitioning (GP) and subtractive clustering (SC) methods. In this study, both the GP and SC methods were utilized. The MF may take different types, e.g. the difference between two sigmoidal functions (dsig), the product of two sigmoidal functions (psig) and the generalized bell function (gbellmf), which were evaluated in the present study. For each MF type, different numbers of MFs (2, 3 and 4) were tried, and finally the best function type and number were selected based on the lowest root mean square error (RMSE) (Russell & Campbell 1996). A sensitivity analysis of the sub-model parameters was conducted by varying them in the range of 0 to 1.
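The three MF types mentioned above can be written compactly. The sketch below uses common textbook forms; the function and parameter names are illustrative assumptions, and the exact parameterization in the software used by the authors may differ (for example in how the difference of sigmoids is normalized).

```python
import math


def sigmoid(x, a, c):
    """Sigmoidal curve with slope a and crossover point c."""
    return 1.0 / (1.0 + math.exp(-a * (x - c)))


def gbell(x, a, b, c):
    """Generalized bell MF: width a, shape b, center c."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))


def dsig(x, a1, c1, a2, c2):
    """Difference between two sigmoidal curves."""
    return sigmoid(x, a1, c1) - sigmoid(x, a2, c2)


def psig(x, a1, c1, a2, c2):
    """Product of two sigmoidal curves."""
    return sigmoid(x, a1, c1) * sigmoid(x, a2, c2)
```

The generalized bell function equals 1 at its center and 0.5 at a distance `a` from it, which makes its three parameters easy to interpret during tuning.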
Performance assessment indices
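The four indices named in the abstract (R2, RMSE, BIAS and VAF) can be computed as sketched below. These are commonly used textbook definitions, stated here as assumptions: the exact expressions applied in the study are not reproduced in this text, and their scaling and sign conventions (in particular for BIAS and VAF) may differ from this sketch.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def rmse(obs, sim):
    """Root mean square error."""
    return math.sqrt(_mean([(s - o) ** 2 for o, s in zip(obs, sim)]))


def bias(obs, sim):
    """Mean error; positive values indicate overestimation (assumed sign convention)."""
    return _mean([s - o for o, s in zip(obs, sim)])


def r2(obs, sim):
    """Coefficient of determination as the squared Pearson correlation."""
    mo, ms = _mean(obs), _mean(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov ** 2 / (var_o * var_s)


def vaf(obs, sim):
    """Variance accounted for, in percent: (1 - var(obs - sim) / var(obs)) * 100."""
    resid = [o - s for o, s in zip(obs, sim)]
    mr, mo = _mean(resid), _mean(obs)
    var_r = _mean([(r - mr) ** 2 for r in resid])
    var_o = _mean([(o - mo) ** 2 for o in obs])
    return (1.0 - var_r / var_o) * 100.0
```

Note that a constant offset between observed and simulated values leaves R2 and VAF unchanged but shows up in RMSE and BIAS, which is why the indices are usually reported together.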


APPLICATION AND RESULTS
The first step in establishing the NF- and ANN-based soil erosion models with different influential factors is the selection of the independent variables. Table 3 sums up the correlation values between the soil erosion and the candidate parameters. From the table, it is seen that slope and plant are the most influential parameters on soil erosion, followed by land, lithology, current erosion value and soil. An adequate value of the linear cross correlation for an accurate simulation must be higher than 0.6, as stated by Bechrakis & Sparis (2004). However, to avoid system complexity and instability, the minimum set of input parameters able to produce an accurate estimate of soil erosion was sought. So, taking into account the multi-collinearity between the input parameters, the model inputs were selected as land use, soil, slope, and current erosion data.
Table 3. Correlation matrix of the variables
Variables | Value erosion | Lithology | Soil | Current erosion | Plant | Slope | Land |
---|---|---|---|---|---|---|---|
Value erosion | 1 | 0.919 | 0.782 | 0.915 | 0.988 | 0.990 | 0.980 |
Lithology | | 1 | 0.602 | 0.726 | 0.899 | 0.913 | 0.921 |
Soil | | | 1 | 0.887 | 0.776 | 0.763 | 0.758 |
Current erosion | | | | 1 | 0.900 | 0.896 | 0.871 |
Plant | | | | | 1 | 0.991 | 0.992 |
Slope | | | | | | 1 | 0.991 |
Land | | | | | | | 1 |
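The correlation screening against the 0.6 threshold can be sketched as follows. This is an illustrative helper with hypothetical names; it covers only the first screening step and does not reproduce the subsequent multi-collinearity check used to arrive at the final four inputs.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient between two equal-length series."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def select_inputs(candidates, target, threshold=0.6):
    """Names of candidate variables whose |r| with the target exceeds the threshold."""
    return [
        name for name, values in candidates.items()
        if abs(pearson(values, target)) > threshold
    ]
```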
The comparison of different ANN training algorithms is presented in Table 4. In this table, 4-2-1 indicates an ANN model comprising 4 input, 2 hidden and 1 output nodes. The number of hidden nodes was determined iteratively. For each ANN model, the number of hidden nodes was increased from 1 to 10 and the optimum was chosen based on the RMSE criterion. Training of the lm, gd, gda, gdm, gdx and scg algorithms was stopped after 1,000, 50,000, 50,000, 50,000, 50,000 and 1,000 iterations, respectively, because the increase in accuracy was too small after these epochs. The tangent sigmoid and linear activation functions were applied for the hidden and output nodes, respectively. It is clear from Table 4 that the gradient descent with momentum and adaptive learning rate back-propagation algorithm presents more accurate results than the other algorithms in the test period.
Table 4. Statistical parameters of different ANN training algorithms during the train and test periods
Training algorithm | Model structure | Iteration number | Training RMSE | Training R2 | Training BIAS | Training VAF | Test RMSE | Test R2 | Test BIAS | Test VAF |
---|---|---|---|---|---|---|---|---|---|---|
Levenberg-Marquardt | 4-9-1 | 1,000 | 0.001 | 0.989 | −0.001 | 7.42 × 10⁻⁹ | 6.048 | 0.741 | 1.131 | 0.428 |
Gradient descent | 4-10-1 | 50,000 | 0.387 | 0.980 | 0.001 | 0.0008 | 6.450 | 0.546 | −1.980 | 0.457 |
Gradient descent with adaptive learning rate | 4-1-1 | 50,000 | 1.143 | 0.992 | −0.029 | 0.008 | 4.510 | 0.920 | 3.328 | 0.112 |
Gradient descent with momentum | 4-10-1 | 50,000 | 0.363 | 0.990 | 0.002 | 0.0007 | 7.100 | 0.422 | −1.594 | 0.579 |
Gradient descent with momentum and adaptive learning rate | 4-1-1 | 50,000 | 1.119 | 0.992 | −0.0002 | 0.007 | 4.262 | 0.932 | −1.804 | 0.180 |
Scaled conjugate gradient | 4-1-1 | 1,000 | 1.110 | 0.993 | 0.002 | 0.007 | 5.724 | 0.933 | −2.985 | 0.289 |
Figures in bold indicate the superior models.
Different NF-GP models possessing different types of MFs are compared in Table 5. In this table, 4-2-2-2 indicates an NF-GP model with 4, 2, 2 and 2 MFs for the land use, soil, upland erosion and slope inputs, respectively. The NF-GP method divides the range of each antecedent variable into independent blocks by defining the MFs of all antecedent variables. Fuzzy MFs can take different forms, and the best number of MFs is selected by trial and error. In choosing the number of MFs, large numbers of MFs or parameters should be avoided to save time and computational costs (Kisi & Shiri 2012). So, only small numbers of MFs (two to four per input) were tried in the applied NF models. It is observed from the table that the NF-GP model with generalized bell MFs has the lowest test RMSE (6.259), although the psig model gives a slightly higher test R2 (0.858 against 0.837).
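The computational cost argument can be made concrete: under grid partitioning, one fuzzy rule is generated for every combination of MFs across the inputs, so the rule count is the product of the MF numbers. The helper below is a hypothetical illustration that parses the structure labels used in Table 5.

```python
from math import prod


def rule_count(structure):
    """Number of fuzzy rules for a grid-partitioned structure such as '2-2-2-3'."""
    return prod(int(n) for n in structure.split("-"))


# Structures reported in Table 5.
for label in ("4-2-3-4", "2-3-3-2", "2-2-2-3"):
    print(label, rule_count(label))
```

The 4-2-3-4 structure thus needs four times as many rules as the 2-2-2-3 structure, and the count grows multiplicatively with every additional MF or input.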
Table 5. Statistical parameters of the ANFIS-GP structures during the train and test periods
Type of MFs | Model structure | Training RMSE | Training R2 | Training BIAS | Training VAF | Test RMSE | Test R2 | Test BIAS | Test VAF |
---|---|---|---|---|---|---|---|---|---|
Psig | 4-2-3-4 | 0.0008 | 0.998 | 0.0005 | 3.27e-09 | 10.708 | 0.858 | 7.322 | 0.740 |
Dsig | 2-3-3-2 | 0.072 | 0.997 | 6.44e-05 | 2.97e-05 | 13.560 | 0.467 | 9.524 | 1.130 |
Generalized bell (gbellmf) | 2-2-2-3 | 0.030 | 0.998 | 7.13e-05 | 5e-06 | 6.259 | 0.837 | 3.440 | 0.331 |
Figures in bold indicate superiority of the Generalized bell function
Train and test results of the optimal ANN and NF models are given in Table 6. The table shows that the NF-SC model has the lowest RMSE in both the training and test stages; its test R2 (0.898) is higher than that of the NF-GP model (0.837) but lower than that of the ANN model (0.932). Figure 5 shows the erosion values for the different applied models in comparison to the target values. The graph clearly shows the NF-SC's superiority to the NF-GP and ANN models. Figure 6 illustrates the scatterplots of the observed vs. simulated soil erosion using the ANN, NF-GP and NF-SC methods for the test stage. From this figure, it is clear that all applied models can simulate soil erosion with high accuracy. Analyzing the regression line equation (y = ax + b) shows that the a and b values are close to 1 and 0, respectively, which demonstrates the capability of the models in simulating soil erosion. The observed vs. predicted erosion values of the best models are presented in Figure 7 (plotted on double logarithmic axes for better representation) for the whole data set (comprising train and test patterns). It is clearly seen from the scatterplots that the estimates of the NF-SC model are closer to the exact fit line than those of the NF-GP and ANN models, especially for the peak values. As can be seen from the figures, there is some scatter between the observed and simulated erosion values for larger erosion amounts, while the models show good fit with the corresponding observed values at the rest of the points. Although this might be considered a potential weakness of the applied models in reproducing larger erosion values from the given input variables, a possible reason for such over- and underestimations might be the complex behavior of erosion at higher magnitudes, which needs the involvement of further (possibly not considered here) input variables for simulation.
Table 6. Error statistics of the optimal ANFIS-GP, ANFIS-SC, and ANN models
Model | Training RMSE | Training R2 | Training BIAS | Training VAF | Test RMSE | Test R2 | Test BIAS | Test VAF |
---|---|---|---|---|---|---|---|---|
ANN | 1.119 | 0.992 | −0.0002 | 0.007 | 4.262 | 0.932 | −1.804 | 0.180 |
ANFIS-GP | 0.030 | 0.998 | 7.13e-05 | 5e-06 | 6.259 | 0.837 | 3.440 | 0.331 |
ANFIS-SC | 0.0006 | 0.998 | 6.18e-08 | 2.77e-11 | 3.775 | 0.898 | 0.816 | 0.165 |
Relationships between the predicted and measured logarithm erosion rate values (using all available patterns).
In order to assess the sensitivity of the applied model to each input variable, the NF-SC and ANN models were established using single-input configurations, and the corresponding results are summarized in Table 7 for the best ANN and NF models. From the table it is seen that the single-input ANN and NF-SC models that comprise the land slope as the sole input variable produce the most accurate results, while the models relying on soil as an input variable produce the highest RMSE values, which depict the lowest performance accuracy. This might be explained through comparing the correlation values presented in Table 3, where the slope shows the highest correlation values of 0.990 (positive; increasing effect), while the soil shows the lowest correlation values of 0.782 (positive; increasing effect) with erosion. Although both parameters show the positive correlations with increasing effect on soil erosion, the magnitude of the linear correlation is the highest for slope. Nevertheless, it should be noted that the interrelations between the independent and target parameters are usually nonlinear, and taking into consideration the linear relations might produce partially valid conclusions. However, the outcomes produced by the ANN and NF models confirmed the relations presented by the correlation analysis. Moreover, the higher performance accuracy of the NF model might be linked to its capability in using both the fuzzy inference system and neural network algorithm, which make it easy to simulate complex phenomena such as soil erosion.
Table 7. Error statistics of the single-input models for the test data
Input | ANN RMSE | ANN R2 | ANN BIAS | ANN VAF (%) | ANFIS-SC RMSE | ANFIS-SC R2 | ANFIS-SC BIAS | ANFIS-SC VAF (%) |
---|---|---|---|---|---|---|---|---|
Lithology | 9.587 | 0.594 | 7.486 | 56.25 | 6.991 | 0.592 | 0.9453 | 41.84 |
Soil | 23.947 | 0.607 | 23.199 | 57.30 | 24.165 | 0.605 | 23.462 | 59.394 |
Current erosion | 19.109 | 0.607 | 15.086 | 66.74 | 20.163 | 0.838 | 19.532 | 69.654 |
Plant | 13.943 | 0.418 | 10.326 | 9.23 | 8.788 | 0.970 | −6.328 | 54.925 |
Slope | 3.428 | 0.911 | 0.9778 | 86.91 | 3.367 | 0.915 | 0.8107 | 87.05 |
Land | 5.477 | 0.650 | −0.256 | 63.73 | 12.124 | 0.938 | −7.0179 | 18.473 |
Summarizing, it could be stated that, when relying on suitable input configuration, both the NF and ANN models display ability for mapping the interrelations between soil erosion and its influential parameters. By using these models, one can simulate the erosion magnitudes using limited input variables, which would be of great interest for practical issues. Further studies are needed for strengthening these conclusions using data from other catchments and using different models.
CONCLUSIONS
In this study, the ability of three data-driven methods, ANN, NF-GP and NF-SC, to predict the erosion rate using geographical input data was investigated. Land use, soil, slope, and current erosion were used as input parameters for training and testing the applied models. The erosion values produced by the MPSIAC model were used as benchmark patterns. Different training algorithms were used for the ANN models, and the gradient descent with momentum and adaptive learning rate back-propagation algorithm was found to be better than the other algorithms. Various MFs were also tried in the NF-GP models, and generalized bell MFs gave the best accuracy. The NF-SC model generally performed better than the other models. According to the results, data-driven approaches such as ANN and NF can be a good alternative to the standard PSIAC model. In summary, the data-driven models can perform as well as MPSIAC even if they use only part of the information used as input in the benchmark model. This alternative is not only economically beneficial for studies but can also be used in areas with data scarcity. In the present study, a hold-out strategy with 70% of the data for training and 30% for testing was applied for developing and testing the models, which is common in similar studies. Nevertheless, the conclusions obtained through this strategy might be limited and should be confirmed with a more robust k-fold testing strategy. Moreover, similar studies might be carried out using data from other regions with similar erosion characteristics to build generalized soft computing-based erosion models. These could be subjects for future studies.
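The difference between the hold-out strategy used here and the k-fold strategy suggested for future work can be sketched with index splits. The function names, the shuffling with a fixed seed, and the choice of k = 5 are illustrative assumptions; the paper does not describe how its 70/30 split was drawn.

```python
import random


def holdout_split(n, train_fraction=0.7, seed=0):
    """Single random split of n pattern indices into train and test sets."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    cut = int(round(train_fraction * n))
    return indices[:cut], indices[cut:]


def k_fold_splits(n, k=5, seed=0):
    """Yield (train, test) index lists so that every pattern is tested exactly once."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test
```

Averaging the test errors over the k folds gives an accuracy estimate that depends less on which patterns happen to fall in a single test set.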