Accurate prediction of hydrological processes, especially the water budget, is of particular importance in environmental management plans. At the same time, conservation of groundwater, a fundamental resource in arid and semi-arid areas, needs to be treated as a high priority in development plans. The aim of this study was to predict a groundwater budget using artificial intelligence. The Azarshahr Plain aquifer, East Azerbaijan, Iran, was selected because of its heavy dependence on groundwater and the need to know its budget for future planning. The long-term fluctuations of the water table in 13 piezometers were simulated by a wavelet-based artificial neural network (WANN) hybrid model, and the statistical gaps in their records were filled. Then, the modelled water table was predicted for the next 12 months using genetic programming. The results of simulation and prediction were assessed by performance evaluation criteria such as R2, root mean squared error, mean absolute error and Nash–Sutcliffe efficiency. Thiessen polygons were then used to plot the predicted unit hydrograph of the study area. The predicted water table from September 2012 to August 2013 showed a depletion of about 0.12 m. Given the area of the Azarshahr Plain aquifer and its average storage coefficient, the aquifer budget is predicted to decrease by about 0.3557 million cubic metres during this period.
Anticipating future natural events, particularly hydrological processes, is a great challenge for environmental managers, especially hydrologists. Despite the highly stochastic nature of these processes, the development of models capable of describing such complex phenomena is a growing area of research.
Groundwater, as a major source of water supply for domestic, agricultural and industrial users and a main component of the hydrological cycle, has a vital role in arid and semi-arid areas. In several such areas, groundwater is withdrawn at a rate much greater than the recharge rate, leading to damaging environmental effects such as water level depletion, drying up of wells, deterioration of water quality, increased pumping costs and reduced well yields (Adamowski & Chan 2011). Effectively managing groundwater, predicting groundwater level fluctuations, and quantifying these changes are strategic hydrological issues.
Recent years have seen a significant rise in the number of scientific approaches applied to hydrologic modelling and forecasting, including the popular ‘data-based’ or ‘data-driven’ approaches. Such modelling methods involve mathematical equations drawn not from the physical process in the watershed but from an analysis of simultaneous input and output time series (Solomatine & Ostfeld 2008). Meanwhile, it is becoming increasingly difficult to ignore the role of artificial intelligence (AI) in the prediction of hydrological processes, owing to its efficiency in modelling complex physical processes from the data governing the process. Numerous studies have examined groundwater level changes using AI (e.g., Adamowski & Chan 2011; Kisi & Shiri 2012; Fallah-Mehdipour et al. 2013; Maheswaran & Khosa 2013; Moosavi et al. 2013; Nourani et al. 2015; Seo et al. 2015; Sivapragasam et al. 2015).
Although AI is increasingly preferred and many studies have already reported forecasts of groundwater level changes, the groundwater budget has not been forecast directly using this approach. The scarcity of studies on groundwater budget modelling via AI demonstrates the need to address groundwater and related issues.
Among the various conceptual and black box models developed in recent years, hybrid AI-based models have been among the most promising in simulating hydrologic processes (Nourani et al. 2014), and wavelet-AI is one example of these methods. Genetic programming (GP), in turn, is an AI method based on a random iterative search process that seeks an appropriate relationship between input and output. Coupling the wavelet and GP methods can give highly accurate results in the prediction of hydrological processes; for instance, Nourani et al. (2012) linked wavelet analysis to GP in a hybrid model to detect seasonality patterns in rainfall–runoff, and the hybrid model proved useful in forecasting runoff.
In this study, an attempt is made to model the variation of the groundwater budget in the Azarshahr Plain aquifer using a coupled wavelet-based artificial neural network (WANN)-GP model. This work differs from previously reported works in that it emphasizes improving insight into groundwater budget change and links the wavelet and GP methods for groundwater modelling.
The Azarshahr Plain is one of the Urmia Lake sub-basins and is located in East Azerbaijan province, northwest Iran. The study area is densely populated, with 100% of its drinking, domestic and industrial water and 80% of its agricultural water supplied from groundwater resources.
Azarshahr Chay is the main stream in the study area; it originates from Sahand Mountain and rarely discharges into the lake because of percolation and evaporation losses, as well as diversion of water for irrigation. The average annual precipitation of the study area is about 221.2 mm for the long-term period 1982 to 2009, whereas the annual evaporation is about 1,500 mm.
As a contemporary tool of applied mathematics, the wavelet transform (WT) is a signal processing technique that has shown better performance than the Fourier transform (FT) and the short-time FT in examining non-stationary signals. WT analysis, developed in the mathematics community during recent decades, appears to be a more effective tool than the FT for studying non-stationary time series (Partal & Kisi 2007). The principal advantage of WTs is their capacity to provide information on the time, location and frequency of a signal simultaneously, whereas the FT gives only the frequency information of a signal.
The WT is implemented in continuous and discrete forms (CWT and DWT). In the CWT, many scales must be considered and the transform integral is evaluated numerically at each scale, so computing the wavelet coefficients at all scales is time-consuming and produces large amounts of data. In other words, the CWT contains redundant information, which is its weak point (Adamowski 2007; Partal 2009). The DWT avoids these drawbacks and is an efficient alternative for discrete data.
In the DWT, the original time series is passed through high-pass and low-pass filters (digital filtering) to obtain time-scale signals. The results of the digital filtering are detail coefficients and an approximation series, obtained with the wavelet algorithm (Zhang & Li 2001). Each time this procedure is repeated on the approximation, a new approximation and a further detail series are obtained.
Through this transform, the raw data are divided into an approximation (A) and details (D). The approximation consists of the high-scale, low-frequency components of the signal and is obtained from the low-pass filter; the details consist of the low-scale, high-frequency components and are obtained from the high-pass filter.
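The split into an approximation and details can be illustrated with a minimal sketch. It assumes the Haar wavelet, which is chosen here only because its low-pass and high-pass filters reduce to pairwise sums and differences; the text does not state which mother wavelet was used. The function names and the water level values are hypothetical.

```python
import math

def haar_dwt(signal):
    """One level of the Haar DWT: returns (approximation, detail).

    The approximation is the low-pass output (scaled pairwise sums) and the
    detail is the high-pass output (scaled pairwise differences).
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuilds the signal from one approximation/detail pair."""
    scale = math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.append((a + d) / scale)
        signal.append((a - d) / scale)
    return signal

# Hypothetical monthly water table elevations (m), for illustration only.
levels = [1290.4, 1290.1, 1289.8, 1289.9, 1289.5, 1289.2, 1289.0, 1289.3]

a1, d1 = haar_dwt(levels)  # level 1: approximation A1 and detail D1
a2, d2 = haar_dwt(a1)      # level 2: the approximation is decomposed again

print("D1:", d1)
print("D2:", d2)
print("A2:", a2)
```

Each level halves the length of the series, and applying `haar_idwt` to the level-1 outputs recovers the original values up to rounding, which is a convenient check that no information is lost in the decomposition.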
Consequently, the DWT was used to decompose the groundwater level time series for the wavelet-based artificial neural network (WANN) models developed in this study.
The WT is well suited to extracting significant and potentially useful information from the data available in the experimental sciences (prediction, reanalysis, global climate model simulations, etc.). By presenting information in a readable form, it can be applied to analytical, classification or forecasting problems. In a review of the applications of the WT in hydrologic time series modelling, Sang (2013) highlighted the complex information that can be drawn from such analysis: characterization and understanding of the multi-temporal scales of hydrologic series, identification of seasonality and trends, and data de-noising. Consequently, the decomposing ability of the WT supports better interpretation of hydrological processes (Nason & Sachs 1999; Adamowski 2008; Adamowski et al. 2009; Kisi 2010; Mirbagheri et al. 2010; Sang 2012).
Since AI has shown promise in modelling and forecasting non-linear hydrological processes and in handling the strong dynamics and noise concealed in datasets, a hybrid AI model was employed for precise simulation of the water level in the piezometers and for filling the statistical gaps in their long-term records. For this, the WT was linked to an ANN, producing the WANN.
In this study, DWT using Mallat's (1998) algorithm was used for decomposing the time series signal. The time series signal in this study is the water table fluctuation in the piezometers, which serves as the mother signal to be decomposed. The multi-resolution analysis by Mallat's algorithm generates approximations and details for a given time series signal. The general trend of the original signal is captured by the approximation, and its high-frequency components by the details. This breaks the original signal down into lower-resolution constituents. Nourani et al. (2009) proposed L = Int[log(N)] for choosing the number of decomposition levels or DWs, where L is the decomposition level and N is the number of data points in the time series.
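The rule L = Int[log(N)] can be written as a one-line helper. This is a minimal sketch that assumes the logarithm is base 10, which the text does not state explicitly; the function name and the record lengths are hypothetical.

```python
import math

def decomposition_level(n_samples):
    """Decomposition level L = Int[log(N)], with log taken as base 10 (assumption)."""
    if n_samples < 1:
        raise ValueError("n_samples must be a positive integer")
    return int(math.log10(n_samples))

# Hypothetical record lengths in months, for illustration only.
for n_months in (50, 120, 250):
    print(n_months, "->", decomposition_level(n_months))
```

Under this assumption, any monthly record between 100 and 999 values gives two decomposition levels.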
An N-level DWT decomposes a signal x(t) into D1, D2, …, DN and AN, where D1 to DN are details and AN is an approximation. D1, D2, …, DN and AN are used as input to the ANN. The second step corresponds to the training and testing phases using the ANN.
The WANN algorithm is summarized as follows:
Step 1: Multilevel wavelet analysis using DWT decomposes a signal into details (D1, D2, …, DN) and an approximation (AN), where N is the decomposition level. In this study, the water table time series were decomposed into details D1 and D2 and an approximation A1. Decomposition levels were selected according to the number of water table data available for each piezometer and are shown in Table 1. The decomposition level was computed as DL = log(No. Data), following the suggestion of Wang & Ding (2003), Partal & Kisi (2007) and Nourani et al. (2009).
Step 2: ANN is trained and tested using the details and approximation as input and the model performance is evaluated.
a The number of data for each piezometer (number of months).
GP is an AI method based on a random iterative search process that seeks an appropriate relationship between input and output. The usual structure in this method is a tree representing an expression. Variables, functions and operators are placed at the nodes of this tree, which are linked together by branches.
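The tree structure can be made concrete with a small sketch in which an expression is stored as nested tuples and evaluated recursively. The representation, the function set and the example expression are assumptions made for illustration; they are not taken from the software used in the study.

```python
import math
import operator

# Function set: name -> (callable, number of arguments). Illustrative choice.
FUNCTIONS = {
    "+": (operator.add, 2),
    "-": (operator.sub, 2),
    "*": (operator.mul, 2),
    "sin": (math.sin, 1),
}

def evaluate(node, variables):
    """Evaluate an expression tree.

    A node is a tuple (function_name, child, ...), a variable name (str),
    or a numeric constant.
    """
    if isinstance(node, tuple):
        name, *children = node
        func, arity = FUNCTIONS[name]
        if len(children) != arity:
            raise ValueError(f"{name!r} expects {arity} argument(s)")
        return func(*(evaluate(child, variables) for child in children))
    if isinstance(node, str):
        return variables[node]
    return float(node)

# Hypothetical expression: 0.5 * h_lag1 + (h_lag2 - 1.0)
tree = ("+", ("*", 0.5, "h_lag1"), ("-", "h_lag2", 1.0))

value = evaluate(tree, {"h_lag1": 4.0, "h_lag2": 3.0})
print(value)  # 4.0
```

Here the inner tuples are the branches and the strings and numbers are the terminal nodes, so swapping or replacing a sub-tuple changes the expression without breaking its syntax.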
GP is an evolutionary algorithm based on Darwinian theories of natural selection and survival of the fittest. The user must decide a number of GP parameters before applying the algorithm to the data. The algorithm starts from an initial population of randomly generated equations, derived from random combinations of input variables, random numbers and functions. The functions can include arithmetic operators (plus, minus, multiply and divide), mathematical functions (sin, cos, exp, log), etc., which have to be chosen based on some understanding of the process. This population is then subjected to an evolutionary process in which the fitness of the evolved programs is evaluated. The programs that best fit the data are selected to exchange part of their information and produce better programs through ‘crossover’ and ‘mutation’, while the programs that fit the data less well are discarded. This evolution process is repeated over successive generations and is driven towards finding symbolic expressions describing the data, which can be scientifically interpreted to derive knowledge about the process being modelled (Sivapragasam et al. 2015).
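The evolutionary loop described above can be sketched in a few dozen lines. This is a deliberately simplified tree-based GP (truncation selection, a one-point subtree swap as crossover, subtree replacement as mutation and a depth cap), not the algorithm implemented in any particular software package. All names, parameter values and the toy data set are hypothetical.

```python
import random

OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}
MAX_DEPTH = 6  # cap on tree depth to keep expressions small (assumption)

def random_tree(rng, depth=2):
    """Random expression: a leaf ('x' or a constant) or a tuple (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return "x" if rng.random() < 0.5 else round(rng.uniform(-2.0, 2.0), 2)
    op = rng.choice(sorted(OPS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, x):
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree

def depth(tree):
    return 1 + max(depth(tree[1]), depth(tree[2])) if isinstance(tree, tuple) else 0

def error(tree, data):
    """Mean squared error of the expression on (x, y) pairs; lower is fitter."""
    total = 0.0
    for x, y in data:
        diff = evaluate(tree, x) - y
        total += diff * diff
    return total / len(data)

def crossover(rng, a, b):
    """Replace one child subtree of a with a child subtree of b."""
    if not isinstance(a, tuple) or not isinstance(b, tuple):
        return a
    donor = b[rng.choice((1, 2))]
    return (a[0], donor, a[2]) if rng.random() < 0.5 else (a[0], a[1], donor)

def mutate(rng, tree):
    """Walk down a random branch and replace a subtree with a new random one."""
    if isinstance(tree, tuple) and rng.random() < 0.5:
        op, left, right = tree
        if rng.random() < 0.5:
            return (op, mutate(rng, left), right)
        return (op, left, mutate(rng, right))
    return random_tree(rng, depth=1)

def evolve(data, generations=20, pop_size=30, seed=0):
    rng = random.Random(seed)
    population = [random_tree(rng) for _ in range(pop_size)]
    history = []  # best error at the start of each generation
    for _ in range(generations):
        population.sort(key=lambda t: error(t, data))
        history.append(error(population[0], data))
        survivors = population[: pop_size // 2]  # less fit programs are discarded
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            child = mutate(rng, crossover(rng, a, b))
            children.append(child if depth(child) <= MAX_DEPTH else a)
        population = survivors + children
    return min(population, key=lambda t: error(t, data)), history

# Toy data from y = 2x + 1, for illustration only.
data = [(float(x), 2.0 * x + 1.0) for x in range(-3, 4)]
best, history = evolve(data)
print(best, error(best, data))
```

Because the fitter half of the population is carried over unchanged, the best error recorded in `history` can never increase from one generation to the next.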
In this study, the water tables simulated by WANN were used as inputs to the GP model for time series prediction, as shown in the flowchart of the study steps (Figure 1). That is, the simulated water table of each piezometer was entered into the GP model, which then produced the forecast. For this purpose, GeneXproTools was used to train, test and forecast the water tables 12 months ahead.
Groundwater budget calculation
The forecasted water level of the aquifer for each month is the Thiessen polygon area-weighted mean of the forecasted piezometer water levels:
Forecasted water level for each month = [(polygon area of 1st piezometer × 1st piezometer water level) + (polygon area of 2nd piezometer × 2nd piezometer water level) + ··· + (polygon area of Nth piezometer × Nth piezometer water level)]/(sum of polygon areas)
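The weighted mean in the expression above can be sketched as a short function. The function name and the three-piezometer example values are hypothetical and are not taken from Table 3.

```python
def thiessen_mean_level(polygon_areas, water_levels):
    """Area-weighted mean water level over Thiessen polygons.

    polygon_areas: Thiessen polygon area of each piezometer (any consistent unit)
    water_levels:  water level of each piezometer for one month
    """
    if not polygon_areas or len(polygon_areas) != len(water_levels):
        raise ValueError("inputs must be non-empty and of equal length")
    total_area = sum(polygon_areas)
    if total_area <= 0:
        raise ValueError("total polygon area must be positive")
    weighted_sum = sum(a * h for a, h in zip(polygon_areas, water_levels))
    return weighted_sum / total_area

# Hypothetical values for three piezometers, for illustration only.
areas_km2 = [10.0, 30.0, 60.0]
levels_m = [1300.0, 1290.0, 1280.0]

mean_level = thiessen_mean_level(areas_km2, levels_m)
print(mean_level)  # 1285.0
```

Repeating the calculation for each of the 12 forecast months gives the aquifer-wide hydrograph from which the change in water table over the period is read.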
In hydrological studies, the water budget of a groundwater reservoir is often needed where data on the input and output parameters of the study area, such as precipitation and evaporation, are not available. In this situation, the reservoir groundwater budget can be evaluated from the fluctuation of the aquifer water table during the given period (here from 2012 to 2013). Knowing the difference between the water table at the beginning and end of the period, the storage coefficient, S, and the area of the reservoir, A, it is possible to calculate the change in reservoir groundwater volume as the product of these three quantities. This indicates whether the reservoir gained or lost water during the given period.
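As a worked illustration of this relation, the sketch below multiplies the storage coefficient, the aquifer area and the change in water table. The area (81.65 km2), the storage coefficient (about 0.036) and the approximate depletion (about 0.12 m) are the values quoted in this paper; the function name and the unit handling are assumptions.

```python
def storage_change_mcm(storage_coefficient, area_km2, head_change_m):
    """Change in stored groundwater volume, in million cubic metres (MCM).

    Volume change = S * A * (change in water table).
    1 km2 * 1 m = 1e6 m3 = 1 MCM, so no further unit conversion is needed.
    A negative head change (depletion) gives a negative volume change.
    """
    return storage_coefficient * area_km2 * head_change_m

# Rounded values quoted in the text; the depletion is given only as "about 0.12 m".
delta_v = storage_change_mcm(0.036, 81.65, -0.12)
print(round(delta_v, 4))  # -0.3527
```

With the rounded depletion of 0.12 m the loss comes to roughly 0.35 MCM, in line with the reported figure of about 0.3557 MCM; the small difference is consistent with the depletion having been rounded.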
RESULTS AND DISCUSSION
The evaluation of model performance was done by statistical criteria (R2, RMSE, MAE and NSE) and the results are shown in Table 2.
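For reference, the four criteria can be computed as in the sketch below. It uses the standard textbook definitions and takes R2 to be the squared Pearson correlation between observed and simulated values, which is an assumption because the paper's own formulas are not reproduced here. The sample series are hypothetical.

```python
import math

def rmse(observed, simulated):
    """Root mean squared error."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)

def mae(observed, simulated):
    """Mean absolute error."""
    return sum(abs(o - s) for o, s in zip(observed, simulated)) / len(observed)

def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance

def r_squared(observed, simulated):
    """Squared Pearson correlation coefficient (assumed definition of R2)."""
    mean_obs = sum(observed) / len(observed)
    mean_sim = sum(simulated) / len(simulated)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(observed, simulated))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_sim = sum((s - mean_sim) ** 2 for s in simulated)
    return cov * cov / (var_obs * var_sim)

# Hypothetical series: the simulation is offset from the observations by 1 unit.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [2.0, 3.0, 4.0, 5.0]

print(rmse(observed, simulated), mae(observed, simulated))
print(nse(observed, simulated), r_squared(observed, simulated))
```

The example shows why several criteria are reported together: a constant offset leaves the correlation-based R2 at 1 while RMSE, MAE and NSE all register the bias.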
It can be clearly seen that the greater the fluctuation of the water level, the lower the performance of the model. Lower performance is observed at piezometers 8, 11, 12 and 21 and may be explained by their locations. For example, piezometer No. 8 is located near the road and belongs to a tubing manufacturer that uses groundwater for its production, so the water level there oscillates. Piezometers 11 and 12 are near the river and are perhaps affected by river fluctuations. It can thus be inferred that such local disturbances cause the model to perform with lower accuracy and more error at these piezometers. Table 3 also shows the predicted water levels for the 13 piezometers and the Thiessen polygon areas used for computing the groundwater budget.
a No. = piezometer number.
b H1…H12 = water level from 1st month to 12th month.
Using Equation (7), the groundwater budget was computed from the predicted water table, the aquifer domain area (81.65 km2) and the storage coefficient. The storage coefficient, about 0.036, was derived from detailed data on pumping wells and qanat discharges in the Azarshahr Plain collected by the Azerbaijan Territorial Water Association (ATWA) in 2009. The reservoir volume, i.e., the groundwater budget of the study area, is predicted to decrease by about 0.3557 million cubic metres (MCM) during the prediction period (September 2012 to August 2013).
This paper has given an account of the use of AI in the simulation and prediction of environmental processes. Hybrid modelling with wavelet and ANN was applied to simulate the water table, and GP was then used to forecast the future water table, in order to determine the change in the groundwater reservoir of the study area during the forecast period. Performance assessment shows satisfying results, indicating the accuracy of AI hybrid models. Both the WANN hybrid model and the GP model have shown their ability in the simulation and prediction of natural events. The next year's water budget was estimated from the forecasted aquifer water table fluctuation, obtained with the Thiessen polygon method, together with the area of the aquifer domain and its storage coefficient. The groundwater reservoir is predicted to lose about 0.35 MCM of its storage during the forecast period, which indicates the importance of monitoring the groundwater resource; the predicted water budget can be taken into account in future environmental plans.