Abstract
Recently, the capabilities of artificial neural networks (ANNs) in simulating dynamic systems have been demonstrated. However, the common training algorithms of ANNs (e.g., back-propagation and gradient algorithms) suffer from specific drawbacks, namely slow convergence and possible entrapment in local minima. Alternatively, novel training techniques, e.g., particle swarm optimization (PSO) and differential evolution (DE) algorithms, might be employed to overcome these shortcomings. In this paper, ANN-PSO and ANN-DE models were applied for modeling groundwater qualitative parameters, i.e., SO4 and sodium adsorption ratio (SAR). Three statistical parameters, namely root mean square error (RMSE), mean absolute error (MAE), and the coefficient of determination (R2), were used for assessing the models' capabilities. The results showed that the ANN-DE presents more accurate results than ANN-PSO in modeling SO4 and SAR.
INTRODUCTION
Groundwater is one of the major sources of water supply for domestic as well as agricultural activities. Modeling groundwater quality is needed to develop better strategies for water resources planning and management (Liu et al. 2009; Najah et al. 2014). Traditional water resources management approaches considered surface water and groundwater systems as two separate entities. However, recent developments in land and water resources analysis have demonstrated that these systems can affect each other, from both qualitative and quantitative points of view. Groundwater contamination, whether from anthropogenic activities or from the inherent composition of the aquifer material, reduces groundwater supply capacity or restricts its exploitation.
Meanwhile, agricultural activities, which might include the uncontrolled use of fertilizers and pesticides, contribute to the deterioration of groundwater quality (Yesilnacar et al. 2008). Variations in groundwater quality are reflected in physical and chemical parameters that are strongly influenced by geological formations and anthropogenic activities (Almasri & Kaluarachchi 2005; Subramani et al. 2005; Yesilnacar & Sahinkaya 2012).
SAR, which is determined by the concentrations of solids dissolved in the water, is a significant parameter for analyzing the suitability of irrigation water. Higher SAR values (high Na+ and low Ca2+ and Mg2+ magnitudes) may cause the dispersion of clay particles and destroy the soil structure (Yesilnacar & Sahinkaya 2012). Groundwater sulfate, in turn, may originate from point and non-point sources. The maximum permissible and allowable concentrations of sulfate in drinking water are 200 and 400 mg/L, respectively (WHO 1984). High sulfate concentrations affect the taste of water. Therefore, simulating groundwater SAR and sulfate is an important task in groundwater resources management and planning.
The traditional groundwater quality analysis approach is mainly based on mathematical modeling, e.g., time series analysis and probability statistics, which usually assumes a linear relationship between the dependent and independent variables; consequently, the overall accuracy of such models is limited (Luo et al. 2003). Owing to the existing difficulties in simulating groundwater quality (Omran 2012), novel computational approaches are required.
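The dependence of SAR on Na+, Ca2+, and Mg2+ noted above follows from its conventional definition, SAR = Na / sqrt((Ca + Mg) / 2), with all concentrations expressed in meq/L. The short Python sketch below evaluates this expression. The formula is the commonly used agronomic definition rather than one restated in this paper, and the function name and the sample values are hypothetical.

```python
import math


def sodium_adsorption_ratio(na_meq, ca_meq, mg_meq):
    """Conventional SAR from Na+, Ca2+ and Mg2+ concentrations, all in meq/L."""
    divalent = (ca_meq + mg_meq) / 2.0
    if divalent <= 0:
        raise ValueError("Ca + Mg must be positive to compute SAR")
    return na_meq / math.sqrt(divalent)


# Hypothetical sample: 10 meq/L Na+, 4 meq/L Ca2+, 4 meq/L Mg2+
print(sodium_adsorption_ratio(10.0, 4.0, 4.0))  # 10 / sqrt(4) = 5.0
```

For a fixed sodium concentration, lowering Ca2+ and Mg2+ raises the ratio, which is the situation associated with clay dispersion.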
As an alternative to the traditional statistical approaches, artificial intelligence techniques might be used to solve this problem. Among others, artificial neural networks (ANNs) have been widely applied in numerous disciplines, e.g., qualitative/quantitative groundwater modeling (Cheng et al. 2005; Liu & Chung 2014). Yesilnacar et al. (2008) predicted groundwater nitrate concentration in Harran Plain in Turkey using ANNs. Yesilnacar & Sahinkaya (2012) developed an ANN model for predicting groundwater sulfate (SO4) and SAR concentration. Kuo et al. (2004) utilized back-propagation ANN to predict the variations of groundwater quality (in terms of seawater salinization and arsenic pollutant factors) in Taiwan. Khaki et al. (2015) evaluated the potential of adaptive neuro-fuzzy inference system (ANFIS) and ANN to simulate TDS and electrical conductivity (EC) levels.
Despite ANNs' capability in modeling nonlinear systems, establishing these models with conventional training algorithms may produce non-optimum outcomes because of limitations in finding the best synaptic weights. Alternatively, ANN models might be integrated with evolutionary algorithms (EAs), e.g., differential evolution (DE) and particle swarm optimization (PSO), to optimize the models' structures.
DE is a meta-heuristic population-based algorithm, which can be used for multidimensional real-valued functions. The PSO algorithm evolves a population of particles through an iterative process to find the optimized solution. Unlike most EAs, PSO has low computational costs and its implementation is straightforward. Each potential solution in PSO is represented by a particle, which flies in a multidimensional search space with a velocity dynamically adjusted by the particle's own former information and the experience of the other particles. Numerous applications of PSO have been reported in solving real-world optimization problems (e.g., Liu et al. 2007; Melin et al. 2013; Selakov et al. 2014).
Karterakis et al. (2007) applied DE for the solution of coastal subsurface water management problems. Gaur et al. (2011) applied an analytic element method coupled with PSO for groundwater management and reported that the developed model is efficient in identifying the optimal location and discharge of the pumping wells. Sudheer & Shashi (2012) developed a PSO trained ANN for aquifer parameter estimation. Gaur et al. (2013) applied ANN and PSO for management of groundwater and reported that the ANN-PSO model is capable of identifying the optimal location of wells efficiently. Chiu (2014) applied DE for parameter structure identification in groundwater modeling. Elci & Ayvaz (2014) applied a DE algorithm-based optimization for the site selection of groundwater production wells. Based on a review study, Ketabchi & Ataie-Ashtiani (2015) investigated the literature associated with the application of evolutionary algorithms (e.g., PSO and DE algorithms) in coastal groundwater management problems. Overall, they concluded that the PSO algorithm is among the superior EAs.
In the present study, the capabilities of the PSO and DE algorithms were evaluated in modeling groundwater quality parameters (i.e., SO4 and SAR).
MATERIALS AND METHODS
Site description
This study was conducted in Neyshabur plain, Iran, located between 35°41′ (°N) and 58°20′ (°E) (Figure 1). The average altitude of the region is 1,500 m above mean sea level. Mean annual precipitation and temperature values are 233.7 mm and 14.5 °C, respectively (Mansouri Daneshvar et al. 2013).
Groundwater sampling and measurement
Monthly groundwater records were collected from 60 observational wells during a 16-year period (1997–2013). The geographical coordinates and elevation of each sampling location were recorded using a handheld global positioning system (GPS). A few locations were also cross-checked with a differential GPS. Collected samples were analyzed in the laboratory to measure the concentration of the qualitative parameters using the existing standard procedures (Table 1).
Parameter | Method |
---|---|
Electrical conductivity (μS/cm) | Conductivity bridge (Richards 1954) |
pH | pH meter (Thomas 1996) |
Sodium (mg/L) | Flame photometric (Osborn & Johns 1951) |
Calcium (mg/L) | EDTA titration (Richards 1954) |
Magnesium (mg/L) | EDTA titration (Richards 1954) |
Bicarbonate (mg/L) | Acid titration (Hesse 1971) |
Chloride (mg/L) | Mohr's titration (Hesse 1971) |
Hardness (mg CaCO3/L) | EDTA titration (Richards 1954) |
Total dissolved solids (ppm or mg/L) | Water quality analyzer (APHA 1995) |
EDTA, ethylenediaminetetraacetic acid.
In this study, groundwater qualitative parameters, i.e., SO4 and SAR, were modeled using two different evolutionary neural networks, namely, ANN-PSO and ANN-DE. Calcium, magnesium, sodium, hardness, electrical conductivity, TDS, pH, bicarbonate, and chloride were used as input variables to estimate SO4 and SAR. Table 2 summarizes the statistical parameters of the applied data. The variability class of the coefficient of variation (CV) was obtained based on the criterion presented by Wilding (1983). Based on this criterion, CV values less than 15% denote the low variability class, CV values between 15% and 35% correspond to the medium variability class, and CV values higher than 35% stand for the high variability class. Considering the results presented in Table 2, high variations were observed in the groundwater qualitative parameters (from 41.92% to 123.98%), except pH, which shows low variability with a CV value of 4.72%. For developing the applied models, 50% of the data (1,200 patterns) were used for training, while the remaining 25% and 25% (600 and 600 patterns) were used for validating and testing the models, respectively.
Parameter | Unit | Min | Max | Mean | Std | CV |
---|---|---|---|---|---|---|
EC | μS/cm | 4.40 | 35,200.00 | 2,946.69 | 3,102.26 | 105.28 |
pH | – | 6.30 | 9.50 | 8.00 | 0.37 | 4.72 |
Sodium | mg/L | 0.00 | 127.30 | 18.52 | 18.89 | 101.99 |
Calcium | mg/L | 0.00 | 40.60 | 4.75 | 5.26 | 110.58 |
Magnesium | mg/L | 0.00 | 44.80 | 5.051 | 4.43 | 87.79 |
Bicarbonate | mg/L | 0.00 | 11.00 | 2.93 | 1.23 | 41.92 |
Chloride | mg/L | 0.00 | 142.50 | 17.60 | 21.83 | 123.98 |
TH | mg CaCO3/L | 0.00 | 3,625.00 | 490.47 | 448.69 | 91.48 |
TDS | ppm or mg/L | 2.77 | 22,176.00 | 1,856.42 | 1,954.42 | 105.26 |
SO4 | mg/L | 0.00 | 50.00 | 8.49 | 7.64 | 90.01 |
SAR | – | 0.00 | 37.89 | 8.06 | 6.77 | 84.03 |
Std: standard deviation, CV: coefficient of variation (%).
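The CV-based variability classes described above (Wilding 1983) can be illustrated with a small Python sketch. The thresholds of 15% and 35% come from the text; the function names are hypothetical, and assigning the boundary values themselves to the medium class is an assumption.

```python
def coefficient_of_variation(mean, std):
    """CV in percent: standard deviation relative to the mean."""
    return 100.0 * std / mean


def variability_class(cv_percent):
    """Variability class after Wilding (1983): <15% low, 15-35% medium, >35% high."""
    if cv_percent < 15.0:
        return "low"
    if cv_percent <= 35.0:
        return "medium"
    return "high"


# CV values reported in Table 2 for three of the parameters
for name, cv in [("pH", 4.72), ("Bicarbonate", 41.92), ("Chloride", 123.98)]:
    print(name, variability_class(cv))
```

Applied to the reported values, only pH falls in the low class, while all other parameters exceed the 35% threshold.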
Applied algorithms
Artificial neural networks
ANNs are interconnected groups of artificial neurons (processors) designed for information processing through a computational model. They are generally utilized to simulate the output vectors according to the given input vectors, especially in dynamic systems where the interrelationships between the input and target parameters are non-linear (Omkar & Senthilnath 2011; Balouchi et al. 2015). In an ANN structure, the input and output vectors are placed as the first and last layers. Between these layers, one or more hidden layers with several neurons are placed. In this study, a neural network with one hidden layer was established and the number of neurons in the hidden layer was determined iteratively. The schematic diagram of the applied feed-forward ANN is shown in Figure 2.
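As a rough illustration of the one-hidden-layer feed-forward structure described above, the following Python sketch evaluates such a network from a flat weight vector, which is the form in which a population-based optimizer such as PSO or DE would typically handle the weights. The tanh hidden activation, the linear output, the weight layout, and the function names are assumptions made for illustration; they are not specified in this section.

```python
import math


def n_weights(n_in, n_hidden, n_out=1):
    """Number of weights and biases in a one-hidden-layer network."""
    return n_in * n_hidden + n_hidden + n_hidden * n_out + n_out


def forward(weights, x, n_hidden):
    """Single-output forward pass.

    Assumed flat layout: hidden weights, hidden biases, output weights, output bias.
    """
    n_in = len(x)
    w1_end = n_in * n_hidden      # end of input-to-hidden weights
    b1_end = w1_end + n_hidden    # end of hidden biases
    w2_end = b1_end + n_hidden    # end of hidden-to-output weights
    hidden = []
    for j in range(n_hidden):
        s = weights[w1_end + j]   # bias of hidden neuron j
        for i in range(n_in):
            s += weights[j * n_in + i] * x[i]
        hidden.append(math.tanh(s))  # assumed hidden activation
    out = weights[w2_end]         # output bias
    for j in range(n_hidden):
        out += weights[b1_end + j] * hidden[j]
    return out


# Nine inputs and 18 hidden nodes, as in the largest configuration tried here
print(n_weights(9, 18))
```

Under this layout, a network with nine inputs, 18 hidden nodes, and one output has 199 adjustable parameters, which is the dimension of the search space the training algorithm has to explore.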
Particle swarm optimization
PSO is an evolutionary computation algorithm based on iterative optimization (Kennedy & Eberhart 1995). PSO consists of a group of particles (individuals) which refine their knowledge of the search space. Each particle has two main characteristics: position and velocity. In PSO, an iterative method is used to reach the optimal solution according to the fitness value of each particle, which is determined by the optimization function. Each particle adjusts its trajectory by tracking two pieces of information: (1) its own best visited position (Pbest) and (2) the global extremum attained by the swarm (Gbest) (Assareh et al. 2010). At each generation (iteration) step, each particle is accelerated toward its previous Pbest and the Gbest position. A new velocity is calculated for each particle based on its current velocity and its distance from its previous Pbest and from Gbest. The updated velocity is then used to calculate the next position of the particle in the search space. The iterative process continues for a set number of iterations or until a minimum error is achieved.
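In the standard formulation, the velocity and position of particle i are updated at iteration k as

$$v_i^{k+1} = \omega v_i^k + c_1 r_1 (p_i^k - x_i^k) + c_2 r_2 (g^k - x_i^k), \qquad x_i^{k+1} = x_i^k + v_i^{k+1},$$

where $c_1$ and $c_2$ are acceleration coefficients and $r_1$ and $r_2$ are random numbers uniformly distributed in [0, 1]. The velocity update comprises three terms:
the previous step velocity term, affected by the constant inertia weight, $\omega$;
the cognitive learning term, which is the difference between the particle's best position found so far (called $p_i^k$, local best) and the particle's current position $x_i^k$;
the social learning term, which is the difference between the global best position found thus far in the entire swarm (called $g^k$, global best) and the particle's current position $x_i^k$.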
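The three velocity terms listed above can be sketched in Python as follows. This is a minimal global-best PSO under stated assumptions, not the implementation used in this study: the inertia weight, the acceleration coefficients, the search bounds, the zero initial velocities, and the sphere test function are all illustrative choices. In ANN training, the objective would instead be the network error as a function of the weight vector.

```python
import random


def pso_minimize(f, dim, swarm_size=30, iters=200, w=0.7, c1=1.5, c2=1.5,
                 bound=5.0, seed=0):
    """Minimal global-best PSO; returns (best position, best value)."""
    rng = random.Random(seed)
    x = [[rng.uniform(-bound, bound) for _ in range(dim)]
         for _ in range(swarm_size)]
    v = [[0.0] * dim for _ in range(swarm_size)]
    pbest = [p[:] for p in x]
    pbest_val = [f(p) for p in x]
    g = min(range(swarm_size), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iters):
        for i in range(swarm_size):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v[i][d] = (w * v[i][d]                            # inertia term
                           + c1 * r1 * (pbest[i][d] - x[i][d])    # cognitive term
                           + c2 * r2 * (gbest[d] - x[i][d]))      # social term
                x[i][d] += v[i][d]
            val = f(x[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = x[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = x[i][:], val
    return gbest, gbest_val


def sphere(p):
    """Illustrative objective with its minimum of 0 at the origin."""
    return sum(c * c for c in p)


best, best_val = pso_minimize(sphere, dim=3)
print(best_val)
```

The sketch stops after a fixed number of iterations; a minimum-error criterion, as mentioned in the text, could be added by breaking out of the loop once the best value falls below a tolerance.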
Differential evolution
The conventional ANN models utilize gradient-based algorithms (GBAs) (e.g., back-propagation) for identifying the weights. During the calibration (training) period, GBAs can easily get trapped in a local minimum (Kumar et al. 2002; Sudheer et al. 2003). The evolutionary algorithms (e.g., PSO, DE) are more robust than the existing direct search methods (e.g., GBAs) because they combine stochastic and direct search. Evolutionary algorithms (EAs) search for the global optimum and are less prone to being trapped in local optima than the GBAs (Mantoglou et al. 2004; Karterakis et al. 2007).
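For orientation, a population-based DE search can be sketched in Python as below. The DE variant used in this study is not specified in this section, so the sketch assumes the classic DE/rand/1/bin scheme (random base vector, one difference vector, binomial crossover); the scale factor F, the crossover rate CR, the bounds, and the sphere test function are illustrative assumptions.

```python
import random


def de_minimize(f, dim, pop_size=30, iters=200, F=0.5, CR=0.9,
                bound=5.0, seed=0):
    """Minimal DE/rand/1/bin; returns (best vector, best value)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-bound, bound) for _ in range(dim)]
           for _ in range(pop_size)]
    fit = [f(p) for p in pop]
    for _ in range(iters):
        for i in range(pop_size):
            # three distinct individuals, all different from the target i
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for d in range(dim):
                if rng.random() < CR or d == j_rand:
                    trial.append(pop[a][d] + F * (pop[b][d] - pop[c][d]))
                else:
                    trial.append(pop[i][d])
            t_fit = f(trial)
            if t_fit <= fit[i]:  # greedy selection: keep the better vector
                pop[i], fit[i] = trial, t_fit
    best = min(range(pop_size), key=lambda i: fit[i])
    return pop[best], fit[best]


def sphere(p):
    """Illustrative objective with its minimum of 0 at the origin."""
    return sum(c * c for c in p)


best, best_val = de_minimize(sphere, dim=3)
print(best_val)
```

No gradient of the objective is required, which is what allows such a search to be applied directly to the ANN weight vector with the training error as the fitness.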
Goodness-of-fit of the model
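The three statistics used in this study (RMSE, MAE, and R2) can be computed as in the following Python sketch. RMSE and MAE follow their conventional definitions; R2 is taken here as the squared Pearson correlation between observed and simulated values, which is an assumption, since the exact expression is not reproduced in this section. The function names and the sample data are hypothetical.

```python
import math


def rmse(obs, sim):
    """Root mean square error."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def mae(obs, sim):
    """Mean absolute error."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)


def r_squared(obs, sim):
    """Squared Pearson correlation between observed and simulated values."""
    n = len(obs)
    mean_o, mean_s = sum(obs) / n, sum(sim) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_s = sum((s - mean_s) ** 2 for s in sim)
    return cov * cov / (var_o * var_s)


# Hypothetical series: the simulation is offset from the observations by 1
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [2.0, 3.0, 4.0, 5.0]
print(rmse(observed, simulated), mae(observed, simulated),
      r_squared(observed, simulated))
```

With this definition, a constant bias leaves R2 unaffected while RMSE and MAE increase, so the three statistics are best read together.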
RESULTS AND DISCUSSION
For ANN implementation, first the number of hidden neurons was considered as twice the input numbers, according to Bhattacharyya & Pendharkar (1998). Then, various particle swarm/population sizes were tried. According to Geethanjali et al. (2008), the typical ranges for the number of particles are 20–40, and 10 particles are large enough to get good results for most of the problems.
In the present research, four different swarm sizes, i.e., 10, 20, 30, and 40, were tried with 10,000 iterations for the ANN-PSO models, with a hidden node number of 18 (2 × 9 inputs). Then, the hidden node number was decreased stepwise to the number of inputs (nine nodes). The sensitivity analysis of different ANN-PSO models with respect to hidden node numbers is presented in Table 3. From the table it is seen that the test RMSE values vary between 3.19 mg/L and 9.43 mg/L for SO4 and between 3.87 and 8.03 for SAR. It is clear that the ANN-PSO is very sensitive to the hidden node number. The model with 16 hidden nodes presents the best results in estimating SO4 (the lowest validation and test RMSE values). In the case of SAR, however, the ANN-PSO model with 18 hidden nodes outperforms the other models.
Hidden node number | Training RMSE | Training MAE | Training R2 | Validation RMSE | Validation MAE | Validation R2 | Test RMSE | Test MAE | Test R2 |
---|---|---|---|---|---|---|---|---|---|
SO4 | |||||||||
18 | 2.90 | 1.88 | 0.846 | 3.95 | 2.54 | 0.766 | 3.76 | 2.46 | 0.854 |
17 | 2.64 | 1.77 | 0.873 | 5.66 | 4.84 | 0.852 | 7.38 | 5.86 | 0.847 |
16 | 2.59 | 1.76 | 0.877 | 3.20 | 2.32 | 0.847 | 3.19 | 2.46 | 0.888 |
15 | 3.01 | 2.06 | 0.834 | 10.2 | 9.61 | 0.751 | 9.14 | 8.51 | 0.832 |
14 | 3.00 | 2.02 | 0.835 | 8.52 | 7.92 | 0.798 | 9.43 | 8.59 | 0.797 |
13 | 2.53 | 1.76 | 0.884 | 4.21 | 3.27 | 0.776 | 3.79 | 3.14 | 0.909 |
12 | 2.49 | 1.71 | 0.887 | 4.03 | 2.78 | 0.766 | 3.43 | 2.64 | 0.897 |
11 | 2.73 | 1.85 | 0.863 | 3.62 | 2.42 | 0.812 | 3.31 | 2.11 | 0.874 |
10 | 2.71 | 1.74 | 0.865 | 6.05 | 5.44 | 0.833 | 6.06 | 5.26 | 0.877 |
9 | 2.72 | 1.80 | 0.864 | 5.59 | 5.20 | 0.843 | 4.51 | 3.43 | 0.847 |
SAR | |||||||||
18 | 3.05 | 1.98 | 0.802 | 4.71 | 3.03 | 0.739 | 3.87 | 2.61 | 0.810 |
17 | 3.11 | 2.03 | 0.793 | 10.1 | 9.32 | 0.549 | 6.89 | 6.11 | 0.759 |
16 | 3.67 | 2.48 | 0.712 | 5.20 | 2.95 | 0.512 | 4.55 | 2.95 | 0.709 |
15 | 3.33 | 2.21 | 0.763 | 6.64 | 5.64 | 0.733 | 5.11 | 4.46 | 0.801 |
14 | 3.29 | 2.11 | 0.768 | 4.90 | 3.73 | 0.711 | 4.01 | 3.06 | 0.784 |
13 | 3.06 | 2.02 | 0.801 | 4.81 | 3.25 | 0.719 | 4.12 | 3.31 | 0.830 |
12 | 3.17 | 2.09 | 0.787 | 4.91 | 3.08 | 0.701 | 5.06 | 3.87 | 0.807 |
11 | 3.34 | 2.23 | 0.760 | 9.73 | 8.89 | 0.636 | 8.03 | 7.20 | 0.721 |
10 | 3.12 | 2.01 | 0.796 | 6.13 | 5.18 | 0.695 | 5.24 | 4.45 | 0.781 |
9 | 2.76 | 1.72 | 0.839 | 6.58 | 5.64 | 0.725 | 5.18 | 4.42 | 0.804 |
Similarly to the ANN-PSO models, four different population sizes of 10, 20, 30, and 40 with 10,000 iterations were tried for the ANN-DE models, with the hidden node number set to 18. Then, the hidden node number was decreased stepwise to the number of inputs (nine nodes). The sensitivity analysis of different ANN-DE models with respect to hidden node numbers is presented in Table 4. From the table it is clear that the ANN-DE is not very sensitive to the hidden node number. Similarly to the ANN-PSO, the ANN-DE model comprising 16 hidden nodes presents the best performance in modeling SO4. In estimating SAR, however, the ANN-DE model with 12 hidden nodes performs better than the other models.
Hidden node number | Training RMSE | Training MAE | Training R2 | Validation RMSE | Validation MAE | Validation R2 | Test RMSE | Test MAE | Test R2 |
---|---|---|---|---|---|---|---|---|---|
SO4 | |||||||||
18 | 2.76 | 1.86 | 0.868 | 2.92 | 1.95 | 0.854 | 2.91 | 1.88 | 0.880 |
17 | 2.28 | 1.54 | 0.910 | 3.65 | 1.90 | 0.799 | 2.85 | 1.78 | 0.885 |
16 | 2.11 | 1.43 | 0.923 | 2.88 | 1.72 | 0.872 | 2.22 | 1.64 | 0.932 |
15 | 2.16 | 1.39 | 0.933 | 3.33 | 1.52 | 0.836 | 2.89 | 1.60 | 0.902 |
14 | 2.18 | 1.53 | 0.930 | 4.01 | 2.23 | 0.809 | 2.93 | 1.95 | 0.898 |
13 | 2.28 | 1.45 | 0.905 | 3.53 | 1.93 | 0.790 | 2.90 | 1.78 | 0.878 |
12 | 1.91 | 1.12 | 0.934 | 3.25 | 1.42 | 0.826 | 2.62 | 1.41 | 0.901 |
11 | 1.99 | 1.28 | 0.929 | 2.60 | 1.54 | 0.884 | 2.34 | 1.57 | 0.923 |
10 | 2.51 | 1.89 | 0.892 | 3.22 | 1.95 | 0.818 | 2.85 | 1.99 | 0.885 |
9 | 2.09 | 1.42 | 0.922 | 3.10 | 1.64 | 0.833 | 2.59 | 1.54 | 0.903 |
SAR | |||||||||
18 | 1.98 | 1.48 | 0.920 | 2.55 | 1.62 | 0.865 | 2.22 | 1.60 | 0.896 |
17 | 1.90 | 1.41 | 0.930 | 2.83 | 1.71 | 0.834 | 2.42 | 1.59 | 0.874 |
16 | 2.11 | 1.43 | 0.905 | 3.80 | 1.83 | 0.703 | 2.46 | 1.63 | 0.862 |
15 | 2.15 | 1.55 | 0.918 | 2.82 | 1.47 | 0.831 | 2.27 | 1.42 | 0.883 |
14 | 2.02 | 1.36 | 0.921 | 3.48 | 1.68 | 0.763 | 2.91 | 1.70 | 0.829 |
13 | 2.01 | 1.29 | 0.914 | 3.78 | 1.72 | 0.714 | 2.67 | 1.68 | 0.849 |
12 | 1.84 | 1.29 | 0.935 | 3.18 | 1.62 | 0.786 | 2.11 | 1.35 | 0.902 |
11 | 2.00 | 1.38 | 0.925 | 3.67 | 1.85 | 0.725 | 2.82 | 1.75 | 0.839 |
10 | 1.88 | 1.40 | 0.927 | 2.05 | 1.46 | 0.911 | 2.24 | 1.57 | 0.886 |
9 | 1.75 | 1.23 | 0.937 | 2.20 | 1.48 | 0.898 | 2.32 | 1.70 | 0.886 |
Training, validation, and test results of the ANN-PSO models for different swarm sizes are given in Table 5. It is clear from the table that the training accuracy improves with increasing swarm size, while the validation and test accuracies fluctuate. The error statistics presented in Table 5 show that the ANN-PSO with a swarm size of 30 has the lowest RMSE (3.76 mg/L) and MAE (2.46 mg/L) values in estimating SO4 in the test stage, while the ANN-PSO with a swarm size of 40 produced the most accurate results for estimating SAR.
Swarm size | Training RMSE | Training MAE | Training R2 | Validation RMSE | Validation MAE | Validation R2 | Test RMSE | Test MAE | Test R2 |
---|---|---|---|---|---|---|---|---|---|
SO4 | |||||||||
10 | 3.87 | 2.72 | 0.750 | 6.84 | 6.16 | 0.746 | 5.66 | 4.86 | 0.690 |
20 | 3.21 | 2.19 | 0.813 | 6.93 | 6.23 | 0.794 | 7.75 | 6.75 | 0.801 |
30 | 2.90 | 1.88 | 0.846 | 3.95 | 2.54 | 0.766 | 3.76 | 2.46 | 0.854 |
40 | 2.61 | 1.73 | 0.875 | 6.11 | 5.66 | 0.858 | 7.83 | 7.22 | 0.864 |
SAR | |||||||||
10 | 5.26 | 3.74 | 0.442 | 5.83 | 4.07 | 0.428 | 5.44 | 4.18 | 0.498 |
20 | 4.50 | 3.14 | 0.565 | 10.5 | 9.46 | 0.522 | 9.36 | 8.33 | 0.586 |
30 | 4.38 | 3.02 | 0.588 | 5.42 | 3.29 | 0.403 | 5.50 | 3.92 | 0.590 |
40 | 3.05 | 1.98 | 0.802 | 4.71 | 3.03 | 0.739 | 3.87 | 2.61 | 0.810 |
Table 6 sums up the training, validation, and test results of the ANN-DE models. It is apparent from the table that the ANN-DE presents the lowest RMSE (2.91 mg/L) and MAE (1.88 mg/L) in estimating SO4 in the test period for the 30 population size (PS), while ANN-DE with 40 PS provided the best accuracy in estimating SAR, similar to the ANN-PSO.
Population size | Training RMSE | Training MAE | Training R2 | Validation RMSE | Validation MAE | Validation R2 | Test RMSE | Test MAE | Test R2 |
---|---|---|---|---|---|---|---|---|---|
SO4 | |||||||||
10 | 2.53 | 1.79 | 0.884 | 5.80 | 2.29 | 0.564 | 3.17 | 2.15 | 0.854 |
20 | 2.31 | 1.63 | 0.902 | 3.68 | 1.99 | 0.782 | 3.01 | 1.99 | 0.868 |
30 | 2.76 | 1.86 | 0.868 | 2.92 | 1.95 | 0.854 | 2.91 | 1.88 | 0.880 |
40 | 2.50 | 1.73 | 0.886 | 2.88 | 2.01 | 0.852 | 3.03 | 2.20 | 0.869 |
SAR | |||||||||
10 | 2.05 | 1.39 | 0.912 | 3.27 | 1.73 | 0.780 | 2.44 | 1.54 | 0.867 |
20 | 2.05 | 1.42 | 0.919 | 3.25 | 1.78 | 0.793 | 3.00 | 1.67 | 0.814 |
30 | 1.83 | 1.37 | 0.931 | 2.87 | 1.52 | 0.829 | 2.69 | 1.62 | 0.837 |
40 | 1.98 | 1.48 | 0.920 | 2.55 | 1.62 | 0.865 | 2.22 | 1.60 | 0.896 |
Comparison of Tables 5 and 6 clearly shows that the ANN-DE performs better than the ANN-PSO in estimating SO4 and SAR in all cases. The obtained results also revealed that setting the number of hidden neurons to twice the number of inputs may not give the optimal results; the optimal number should instead be obtained through a trial and error process.
The optimal ANN-PSO and ANN-DE models are compared in Table 7. From the table it is seen that the ANN-DE models give more accurate results than the ANN-PSO models for all training, validation, and test stages.
Model | Hidden node number | Training RMSE | Training MAE | Training R2 | Validation RMSE | Validation MAE | Validation R2 | Test RMSE | Test MAE | Test R2 | Computational cost (iterations) | Run time (s) | Convergence speed (iterations/s) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
SO4 | |||||||||||||
ANN-PSO | 16 | 2.59 | 1.76 | 0.877 | 3.20 | 2.32 | 0.847 | 3.19 | 2.46 | 0.888 | 1,000 | 23 | 43 |
ANN-DE | 16 | 2.11 | 1.43 | 0.923 | 2.88 | 1.72 | 0.872 | 2.22 | 1.64 | 0.932 | 950 | 21 | 45 |
SAR | |||||||||||||
ANN-PSO | 18 | 3.05 | 1.98 | 0.802 | 4.71 | 3.03 | 0.739 | 3.87 | 2.61 | 0.810 | 1,200 | 28 | 43 |
ANN-DE | 12 | 1.84 | 1.29 | 0.935 | 3.18 | 1.62 | 0.786 | 2.11 | 1.35 | 0.902 | 1,100 | 26 | 42 |
Figure 3 compares the time series of the observed and estimated SO4 values obtained by ANN-DE and ANN-PSO models during the test period. It is clear from the figure that the values produced by the ANN-DE model are closer to the observed values than those of the ANN-PSO model. Scatterplots of the observed vs. simulated SO4 values during the test period are also compared in Figure 4. Assuming the fit line equation as y = ax + b, the a and b coefficients of the ANN-DE model are closer to 1 and 0, respectively, with a higher R2 value (0.931), which demonstrates the superiority of the ANN-DE model. Time variation and scatter plots' comparison of the ANN-DE and ANN-PSO in SAR modeling are shown in Figures 5 and 6. Similarly to the SO4 modeling, ANN-DE is superior to the ANN-PSO in simulating SAR.
Further, one-way analysis of variance (ANOVA) was used to verify the robustness of the optimum ANN-DE and ANN-PSO models. Both tests were set at a 95% confidence level. Thus, differences between the observed and simulated SO4 and SAR values were considered significant when the resultant significance level (p) was lower than 0.05 (two-tailed). The test statistics are given in Table 8. The ANN-DE model yields a small F-statistic with a high resultant significance level (p > 0.05) for both SO4 and SAR, i.e., no significant difference between the observed and simulated values, whereas the ANN-PSO model shows significant differences (p < 0.05) in both cases. According to the test results, the ANN-DE appears to be more robust than the ANN-PSO in this case.
Model | SO4 F-statistic | SO4 resultant significance level | SAR F-statistic | SAR resultant significance level |
---|---|---|---|---|
ANN-DE | 0.0159 | 0.8991 | 0.8980 | 0.3431 |
ANN-PSO | 4.579 | 0.0325 | 9.1861 | 0.0024 |
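The F-statistic of a one-way ANOVA, as used above to compare observed and simulated values, can be computed as in the following Python sketch. It implements the textbook ratio of between-group to within-group mean squares; the function name and the sample data are hypothetical, and the p-value (which requires the F distribution) is not computed here.

```python
def one_way_anova_f(*groups):
    """F-statistic of a one-way ANOVA for two or more groups of values."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # variation of the values around their own group mean
    ss_within = sum(sum((v - m) ** 2 for v in g)
                    for g, m in zip(groups, means))
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical observed and simulated series
observed = [1.0, 2.0, 3.0]
simulated = [2.0, 3.0, 4.0]
print(one_way_anova_f(observed, simulated))  # (1.5 / 1) / (4 / 4) = 1.5
```

A small F-statistic indicates that the means of the observed and simulated series are close relative to the scatter within each series. A library routine such as `scipy.stats.f_oneway` returns the same statistic together with its p-value.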
CONCLUSIONS
This paper presents particle swarm optimization (PSO)- and differential evolution (DE)-based ANN approaches for the estimation of groundwater quality parameters (SO4 and SAR). Two powerful bio-inspired algorithms, PSO and DE, were compared in order to determine which one is more suitable for training an ANN. This is important because the training of an ANN is one of the key issues in obtaining good generalization. Application of PSO- and DE-based ANNs to estimate groundwater quality is a novel research area. The findings of this study indicated that both ANN-PSO and ANN-DE are suitable approaches for simulating groundwater quality. However, the DE-based model exhibits better performance than the PSO-based model in the training as well as the validation and test stages. The present study used ANN-PSO and ANN-DE models for estimating SAR and SO4 from other qualitative parameters. Further studies should be carried out using limited inputs to verify the generalization of the developed models. In addition, studies relating SO4 pollution to certain industrial discharges or to rainfall intensity would be of interest.
ACKNOWLEDGEMENTS
This study was supported by The Department of Soil Science, University of Tehran, Iran. The authors thank the editor and anonymous reviewers for their help in improving the quality of the manuscript.