Accurate measurement of groundwater levels is often difficult and involves great uncertainty. Therefore, simulating and predicting the fluctuating behavior of groundwater levels is necessary for water resource planning and management. In this study, radial basis function (RBF) neural networks and support vector machines (SVM) were employed to simulate groundwater level fluctuations. The time series data of precipitation, evaporation, and temperature were used as model inputs. Groundwater level data were collected from 2003 to 2014; the first 10 years were used as the training dataset, while the last 2 years were used as the test dataset. Uncertainties caused by errors in the measurements of the variables or in the outputs were estimated at 95% confidence intervals. The results showed that the SVM model had a superior simulation and prediction capability according to four statistical criteria. Comparisons of the outputs and the confidence intervals of the two models showed that the SVM model was more accurate and had less uncertainty. The results suggest that SVM is an effective method for simulating groundwater levels and analyzing model uncertainties using confidence intervals, and that it can be used to facilitate sustainable groundwater management strategies.

## INTRODUCTION

Groundwater is an important component of the global freshwater supply and a precious natural resource for agricultural, domestic, and industrial purposes in many countries. Water shortages, the over-exploitation of groundwater, and related environmental and geological problems have attracted increasing attention and have become some of the most critical global concerns, especially in arid, semi-arid, and ecologically fragile environments (Adamowski & Chan 2011). Da'an, a semi-arid region with the highest salinization rate in western Jilin Province of China, is located in an ecologically and economically fragile area. The variation in groundwater levels is the main indicator of the amount of groundwater resources, and the irregular fluctuations in groundwater levels result from many complex, interacting factors. Therefore, an accurate and reliable prediction of groundwater levels is essential in determining the resource quantity and allowable exploitation level of groundwater and in avoiding or reducing adverse effects such as the loss of pumpage in water wells, land surface subsidence, and aquifer compaction (Vahid *et al.* 2013; Verma & Singh 2013).

Mathematical models are generally used to improve our understanding of groundwater systems. Many prediction models, such as nonlinear empirical models, mathematical groundwater models, and physically-based models, have been used to simulate and forecast groundwater levels and have been applied to problems ranging from aquifer safe yield analysis to groundwater remediation and quality issues (Sun & Xu 2011; Emamgholizadeh *et al.* 2014). Although conceptual and physically-based models are used to depict hydrological variables and characterize the complex structures of aquifers, they have practical limitations (Nourani *et al.* 2008). These modeling techniques, such as Darcy's law-based differential equation systems of groundwater dynamics, are very data- and labor-intensive (Bense *et al.* 2009; Ge *et al.* 2011). Since data are typically limited in regions under severe environmental conditions, it is difficult to analyze the geological parameters and predict the results accurately in those regions. Therefore, empirical models, such as artificial neural networks (ANN) and support vector machines (SVM), may serve as attractive alternatives, because they can provide useful results using a smaller amount of data, are less labor-intensive and more cost-effective, and are well suited to dynamic nonlinear systems (Emamgholizadeh *et al.* 2014; Chang *et al.* 2015; Gong *et al.* 2016).

In various branches of hydrology, ANNs have been well developed and applied to the prediction of nonlinear problems, such as precipitation (Nastos *et al.* 2014), sediment load (Afan *et al.* 2015), and river flow (He *et al.* 2014). The radial basis function (RBF) neural network is one type of ANN; it outperforms the back-propagation ANN model and has a fast convergence speed. The applications of the RBF technique in hydrology range from real-time modeling to event-based modeling. It has been used for the prediction of rainfall and groundwater levels as well as for the modeling of stream flows and water quality (Garcia & Shigdi 2006; Ghose Dillip *et al.* 2010). SVM is a relatively new approach to modeling nonlinear systems. It is based on structural risk minimization (SRM) instead of the empirical risk minimization of ANN. SRM minimizes the empirical error and model complexity simultaneously, which can improve the generalization ability of SVM for classification or regression problems in many disciplines. SVM has been used to solve hydrogeological problems, such as estimating evapotranspiration in a semi-arid environment (Tabari *et al.* 2012), and predicting groundwater levels in a coastal aquifer (Yoon *et al.* 2011) and stream flow (Noori *et al.* 2011). These studies showed that the two data-driven models could be applied in formal hydrology studies, and that the models could be improved or combined with other models for higher accuracy. However, the results of numerical models are subject to randomness and uncertainty whether the models are combined or not, which makes it difficult to calculate groundwater levels accurately. Thus far, very little research has been conducted on the relationship between the results of numerical models and their uncertainties.

Using the extensive field monitoring data collected from 2003 to 2014 in Da'an, in western Jilin Province of China, this study aims to construct a groundwater level model using the RBF and SVM frameworks, examine the validity of the model, and compare the results of the two frameworks. We investigate and analyze the impacts of uncertainty on the simulated results at a 95% confidence interval. The results provide an important theoretical basis for improving the accuracy of groundwater level simulations and predictions, and thus serve as a reference for the sustainable exploitation, utilization, and protection of groundwater resources.

## METHODS

### RBF

*et al.* 2001), and it also possesses the advantages of the optimal approximation point. In 1985, multivariate interpolation with the RBF method was proposed by Powell. In 1988, the RBF neural network was applied to the design of ANNs, and the method was successfully applied to nonlinear time series prediction. Basically, an RBF network is composed of a large number of simple and highly interconnected artificial neurons organized into several layers, i.e. an input layer, a hidden layer, and an output layer, as shown in Figure 1.

Input layer: An input pattern enters the input layer and is subjected to a direct transfer function. The input layer serves as a distributor to the hidden layer, and the output from the input layer is also subjected to a transfer function. The number of nodes in the input layer is equal to the dimension L of the input vector. For the input element I_{i} (i = 1 to L), the output from the input layer is I_{i}.

Hidden layer: The hidden layer performs most of the processing, and its nodes share a distinctive property: they are radially symmetric. To be radially symmetric, a node must have the following:

(a) A center vector in the input space, made up of cluster centers indexed j = 1 to M, with M ≤ P, where M is the number of center vectors and P is the number of training patterns. The vector is typically stored as weight factors from the input layer to the hidden layer.

Output layer: There are weight factors (k = 1 to N, j = 1 to M) between the *k*th node of the output layer and the *j*th node of the hidden layer, where N is the dimension of the output vector. The output from the output layer is passed through a transfer function such as the log sigmoid or tan sigmoid (Ghose Dillip *et al.* 2010).
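The three layers described above can be illustrated with a minimal forward-pass sketch. It assumes a Gaussian basis function for the hidden nodes and a log-sigmoid output transfer function; the function names, the `widths` parameter, and all numeric values are hypothetical and chosen only for illustration, not taken from the model used in this study.

```python
import math


def gaussian_rbf(x, center, width):
    """Radially symmetric activation: depends only on the distance from x to center.

    The Gaussian form is an assumption made for this sketch.
    """
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-sq_dist / (2.0 * width ** 2))


def log_sigmoid(z):
    """Log-sigmoid transfer function applied at the output layer."""
    return 1.0 / (1.0 + math.exp(-z))


def rbf_forward(x, centers, widths, weights):
    """Forward pass for one input pattern.

    x       : input vector of length L (passed through the input layer unchanged)
    centers : M center vectors, each of length L
    widths  : M spreads, one per hidden node (hypothetical parameter)
    weights : N rows of M weights linking hidden node j to output node k
    """
    hidden = [gaussian_rbf(x, c, s) for c, s in zip(centers, widths)]
    return [log_sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in weights]


# Hypothetical example: L = 3 normalized inputs, M = 2 hidden nodes, N = 1 output.
centers = [[0.2, 0.5, 0.3], [0.8, 0.4, 0.7]]
widths = [0.5, 0.5]
weights = [[1.5, -0.5]]
output = rbf_forward([0.2, 0.5, 0.3], centers, widths, weights)
```

Because each hidden activation depends only on the distance to its center, a pattern that coincides with a center activates that node fully, and the activation decays symmetrically in every direction away from it.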

### SVM

*et al.* 2016). The architecture of SVM is shown in Figure 2.
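The structural risk minimization principle mentioned in the Introduction, which balances empirical error against model complexity, can be sketched for the regression case as follows. The sketch assumes a linear model and the ε-insensitive loss commonly used in support vector regression; the names `svr_regularized_risk`, `C`, and `epsilon` are illustrative assumptions and do not describe the configuration used in this study.

```python
def epsilon_insensitive(residual, epsilon):
    """Loss that ignores residuals smaller than epsilon (assumed loss form)."""
    return max(0.0, abs(residual) - epsilon)


def svr_regularized_risk(w, b, X, y, C, epsilon):
    """Structural risk for a linear model f(x) = w . x + b.

    complexity : 0.5 * ||w||^2, which penalizes model complexity
    empirical  : sum of epsilon-insensitive losses over the training samples
    C          : trade-off between the two terms (hypothetical parameter)
    """
    complexity = 0.5 * sum(wi * wi for wi in w)
    empirical = 0.0
    for xi, yi in zip(X, y):
        prediction = sum(wj * xj for wj, xj in zip(w, xi)) + b
        empirical += epsilon_insensitive(yi - prediction, epsilon)
    return complexity + C * empirical
```

Training an SVM regressor amounts to choosing `w` and `b` to minimize this quantity; a larger `C` places more weight on fitting the training data, while a smaller `C` favors a simpler model.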

### Study area description and data collection

^{2}, is in an ecologically fragile local environment, and belongs to the Songnen plain hinterland (Figure 3). Da'an experiences one of the most serious soil salinization problems in western Jilin Province. Severely saline-alkali land accounts for 60.3% of the total saline-alkali land area. Da'an features semi-arid climatic conditions with dry and windy weather in spring, rainy weather in summer, light precipitation in autumn, and moderate snow in winter. The annual average temperature is 4.8 °C, the annual average precipitation is 422 mm, and the average annual evapotranspiration is 1,681 mm. The main types of groundwater aquifers in the region are phreatic and confined aquifers. The groundwater level is shallow and subsurface runoff is slow. With recent developments in agriculture and urbanization, increased groundwater extraction has altered the natural dynamic equilibrium of groundwater and left water resources issues unresolved.

In this normalization, *Y* is the normalized data, *X* is the time-series data, *X*_{min} is the minimum value of the time-series data, and *X*_{max} is the maximum value of the time-series data (Yoon *et al.* 2011).
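A normalization built from these quantities can be sketched as below. The sketch assumes the common min-max form, Y = (X - X_min) / (X_max - X_min), which maps the series onto [0, 1]; the exact scaling used in this study may differ, and the function name and the optional `x_min` and `x_max` arguments are hypothetical.

```python
def min_max_normalize(series, x_min=None, x_max=None):
    """Scale a time series to [0, 1] using min-max normalization (assumed form).

    If x_min and x_max are not given, they are taken from the series itself.
    """
    if x_min is None:
        x_min = min(series)
    if x_max is None:
        x_max = max(series)
    if x_max == x_min:
        raise ValueError("series is constant; min-max normalization is undefined")
    return [(x - x_min) / (x_max - x_min) for x in series]


# Hypothetical monthly precipitation values in mm.
normalized = min_max_normalize([2.0, 4.0, 6.0])
```

Passing the training-period minimum and maximum explicitly when scaling the test period keeps both periods on the same scale; whether this was done in the present study is not stated.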

### Performance criteria

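*et al.* 2015). The following equations were used for the computation of these parameters:

$$R = \frac{\sum_{t=1}^{n}\left(O_t-\bar{O}\right)\left(P_t-\bar{P}\right)}{\sqrt{\sum_{t=1}^{n}\left(O_t-\bar{O}\right)^2\,\sum_{t=1}^{n}\left(P_t-\bar{P}\right)^2}}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(O_t-P_t\right)^2}$$

$$\mathrm{MAE} = \frac{1}{n}\sum_{t=1}^{n}\left|O_t-P_t\right|$$

$$\mathrm{NS} = 1-\frac{\sum_{t=1}^{n}\left(O_t-P_t\right)^2}{\sum_{t=1}^{n}\left(O_t-\bar{O}\right)^2}$$

where *n* is the number of input samples, $O_t$ and $P_t$ are the observed and predicted groundwater level depths at time *t*, and $\bar{O}$ and $\bar{P}$ are the means of the observed and predicted groundwater level values, respectively. The best fit between the observed and predicted values would have R = 1, RMSE = 0, MAE = 0 and NS = 1.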
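The four criteria can be computed as in the following sketch, which assumes the standard definitions of the Pearson correlation coefficient (R), root mean square error (RMSE), mean absolute error (MAE), and Nash-Sutcliffe efficiency (NS). The function name and the sample values are hypothetical.

```python
import math


def performance_criteria(observed, predicted):
    """Return R, RMSE, MAE and NS for paired observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    errors = [o - p for o, p in zip(observed, predicted)]
    sq_error_sum = sum(e * e for e in errors)
    # Sums of squared deviations from the means (not divided by n).
    dev_o = sum((o - mean_o) ** 2 for o in observed)
    dev_p = sum((p - mean_p) ** 2 for p in predicted)
    co_dev = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    return {
        "R": co_dev / math.sqrt(dev_o * dev_p),
        "RMSE": math.sqrt(sq_error_sum / n),
        "MAE": sum(abs(e) for e in errors) / n,
        "NS": 1.0 - sq_error_sum / dev_o,
    }


# Hypothetical groundwater level depths in m: predictions biased by a constant 1 m.
scores = performance_criteria([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0])
```

The example illustrates why R alone is not sufficient: a constant bias leaves R at 1 while RMSE, MAE, and NS all register the error.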

## RESULTS AND DISCUSSION

### The RBF modeling

Figure 4 shows that the values simulated by the RBF model reasonably match the observed groundwater levels in the training period. The coefficient of determination R^{2} between the RBF model's simulated values and the observed data was 0.8483, which indicates that the RBF model had good fitting accuracy in the training period.

### The SVM modeling

Figure 5 shows that the coefficient of determination R^{2} between the SVM model's simulated values and the observed data was 0.9307 during the training period. Compared with the RBF model, the SVM model had better fitting accuracy in the training period. Thus, the two models can be used to simulate and predict monthly groundwater levels.

### Comparison of RBF and SVM models

The performance of the RBF and SVM models during the training and validation periods is summarized in Table 1 in terms of R, RMSE, MAE and NS.

| Criterion | RBF, training period | RBF, validation period | SVM, training period | SVM, validation period |
|---|---|---|---|---|
| R | 0.839 | 0.905 | 0.964 | 0.919 |
| RMSE | 0.315 | 0.413 | 0.148 | 0.297 |
| MAE | 0.257 | 0.336 | 0.082 | 0.208 |
| NS | 0.679 | 0.676 | 0.928 | 0.849 |

In addition, if the NS and R criteria of a model are equal to 1, then that model produces a perfect estimation. In general, a model can be considered accurate and effective if the NS is higher than 0.8 (Shu & Ouarda 2008). The R values of the two models were over 0.8 in both periods, and the NS values for the SVM model were greater than 0.8 in both the training and validation periods (0.928 and 0.849), whereas those for the RBF model were not (0.679 and 0.676) (Table 1). These values suggest that the SVM model achieved acceptable results but the RBF model did not, and that the SVM model is more capable than the RBF model of capturing the nonlinear relationships with the input data.

### Uncertainty analysis

According to Equations (12) and (13), the values of the d-factor for the SVM and RBF models were 0.91 and 2.16, respectively, indicating that the overall uncertainty in the SVM results was lower than that in the RBF results in this case study. Figures 7 and 8 show the relationship between the observed groundwater levels and the predicted values within the 95% confidence interval for the two models. The 95% confidence interval for the RBF predictions was much wider than that for the SVM predictions. The lower the model uncertainty, the narrower the confidence interval and the more reliable the predicted results. In addition, the majority of the observed groundwater levels fell within the confidence intervals (Figures 7 and 8), which supports the 95% confidence level of the simulation results.
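The two checks discussed here, band width and coverage, can be sketched as follows. The sketch assumes one common definition of the d-factor: the average width of the 95% band divided by the standard deviation of the observations. This is an assumption and may differ in detail from Equations (12) and (13); the use of the sample standard deviation, the function names, and the sample values are likewise hypothetical.

```python
import statistics


def d_factor(lower, upper, observed):
    """Average width of the confidence band relative to the spread of the observations.

    Assumed definition: mean(upper - lower) / sample standard deviation of observed.
    """
    mean_width = sum(u - l for l, u in zip(lower, upper)) / len(observed)
    return mean_width / statistics.stdev(observed)


def coverage(lower, upper, observed):
    """Fraction of observations that fall inside the confidence band."""
    inside = sum(1 for l, u, o in zip(lower, upper, observed) if l <= o <= u)
    return inside / len(observed)


# Hypothetical groundwater level depths in m with a band of constant 1 m width.
observed = [1.0, 2.0, 3.0]
lower = [0.5, 1.5, 2.5]
upper = [1.5, 2.5, 3.5]
band_d_factor = d_factor(lower, upper, observed)
band_coverage = coverage(lower, upper, observed)
```

Under this definition, a smaller d-factor means a narrower band relative to the natural variability of the data, so the two quantities are best read together: a narrow band is only informative if it still contains most of the observations.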

## CONCLUSIONS

The accurate and reliable simulation and prediction of groundwater levels is one of the most important issues in water resources management. In this study, monthly groundwater data were used to assess the ability of SVM and RBF models to simulate and predict groundwater levels in Da'an, in Jilin Province of China. Hydrological variables were used as model inputs and monthly groundwater levels were used as the model output. Four standard statistical criteria, R, MAE, RMSE, and NS, were used for evaluating the performance of these two models.

The overall results showed that both the RBF and SVM models provided a good fit to the observed data. However, the values of the four statistical criteria indicated that the SVM model was more reliable than the RBF model in simulating and predicting groundwater levels during both the training and validation periods. Another advantage of the SVM model over the RBF model, based on the objective criterion (d-factor), was the lower uncertainty (narrower confidence interval) in its results. Thus, the SVM model is considered an effective method for predicting groundwater levels.

Uncertainty quantification is an important aspect of model predictions. Based on the results of deterministic simulation and prediction, the 95% confidence interval is proposed as a way to calculate model uncertainty and to express the predicted results in a probabilistic sense. Additional studies should be conducted to further explore this proposed method, which can improve the accuracy of predictions under varied environmental conditions and facilitate the development of more effective and sustainable groundwater management strategies.

## ACKNOWLEDGEMENTS

The authors would like to thank the National Natural Science Foundation of China (41072255) and the Science Foundation of Jilin Province (20150101116JC) for financially supporting this research. The authors also thank the anonymous reviewers and editors for their contributions to this research.