Reliable temporal modelling of groundwater level is significant for efficient water resources management in hydrological basins and for the prevention of possible desertification effects. In this work we propose a stochastic method of temporal monitoring and prediction that can incorporate auxiliary information. More specifically, we model the temporal (mean annual and biannual) variation of groundwater level by means of a discrete time autoregressive exogenous variable (ARX) model. The ARX model parameters and its predictions are estimated by means of the Kalman filter adaptation algorithm (KFAA) which, to our knowledge, is applied for the first time in hydrology. KFAA is suitable for sparsely monitored basins that do not allow for an independent estimation of the ARX model parameters. We apply KFAA to time series of groundwater level values from the Mires basin in the island of Crete. In addition to precipitation measurements, we use pumping data as exogenous variables. We calibrate the ARX model based on the groundwater level for the years 1981 to 2006 and use it to predict the mean annual and biannual groundwater level for recent years (2007–2010). The predictions are validated with the available annual averages reported by the local authorities.

INTRODUCTION

Groundwater is a significant source for the functions of a watershed. It replenishes the surface waters, supports the ecosystems viability and is a primary source for drinking and agricultural purposes. Groundwater level reduction affects the base flow in rivers, evapotranspiration rates, and water quality, as well as human welfare, agricultural and industrial activities (Famiglietti 2008). Examination and modelling of temporal trends of groundwater monitoring wells or of basin averages provide useful information about the aquifer temporal response to different meteorological (e.g. precipitation extremes) or anthropogenic effects (groundwater over-exploitation). Therefore, the modelling of temporal variations of the groundwater level provides a useful management tool in sparsely monitored basins that helps to assess future trends (Margat & Van der Gun 2013).

In the case of temporal estimation it is desirable to formulate a predictive model of groundwater table level that can incorporate various physical variables that determine the groundwater level, such as meteorological data (e.g. precipitation), evapotranspiration, runoff and water usage. A stochastic model is required due to the considerable uncertainty of certain variables (e.g., evapotranspiration), the spatial variability of conditions within the basin, the sparse nature of sampling in space-time and the inadequacy of available measurements. For example, water usage is estimated based on data from the official boreholes, whereas an unspecified number of unregulated boreholes operate in the area. Any physically-based model becomes stochastic once its inputs, parameters or outputs are treated as random functions (Christakos 2000).

For water management purposes, it is important to have information about water table level and especially about the dynamic relationship between water table level and hydro-meteorological variables that affect the water table level variation. Water table fluctuations have been successfully modelled using empirical time series to describe water table dynamics. In several studies linear transfer-function noise models (Knotters & Van Walsum 1997; Van Geer & Zuur 1997; Von Asmuth & Knotters 2004; Yi & Lee 2004; Von Asmuth et al. 2008; Fabbri et al. 2011) have been applied. Additionally, an approach based on model-predefined impulse response function in continuous time (Von Asmuth et al. 2002) has been used to model the dynamic relationship of precipitation surplus and water table level at basin scale (Manzione et al. 2010, 2012). Yet, an empirical model to simulate daily water table levels based on hydro-meteorological variables was proposed by Morgan & Stolt (2009), while an empirical time series predictive tool that incorporates daily precipitation was used to model the fluctuations of the water table depth in an alluvial aquifer in Italy (Calzolari & Ungaro 2012).

Data driven techniques have been widely developed to deal mainly with complex non-linear physical phenomena. However, most of them ignore the physical base of the problem. And yet, in many cases have been successfully applied in groundwater level forecasting e.g. genetic programming (GP) and neural fuzzy inference system (Fallah-Mehdipour et al. 2013), evolutionary polynomial regression (EPR) (Giustolisi et al. 2008), support vector machines (SVM) and wavelet transforms (Suryanarayana et al. 2014), and self-organizing maps (SOM) (Chen et al. 2010). Trained artificial neural network (ΑΝΝ) models have also been applied to model the water table temporal variability at various catchments, e.g. Trichakis et al. (2009, 2011) and Tapoglou et al. (2014), as well as at a single well at the study basin of this work (Daliakopoulos et al. 2005; Tsanis et al. 2008). ANN models in several occasions have been combined with autoregressive exogenous variable (ARX) functions, applied in this work, to enhance groundwater level predictions (e.g. Giustolisi & Simeone 2006; Izady et al. 2013).

One can find advantages and disadvantages in all data driven techniques depending on the modelled phenomenon and the available data characteristics: data size, extreme values, statistical distribution, data value range and time span. The major advantage of machine learning techniques is their capability to model complex non-linear phenomena. The general disadvantage of all the machine learning techniques is related to the loss of physical interpretation and the complexity of their structure, the a priori requirements and computational complexity (Todini 2007). Other drawbacks of ANN, SOM and GP are the calculation of locally optimal values, constants miscalculation and overfitting. The biggest disadvantages of the SVM and Wavelet Transforms lies in the choice of the right kernel (or wavelet) and on the slow training section (Abe 2005). EPR have provided an improved framework of operation compared to the classic machine learning methods, it searches via genetic algorithms for the best form of the function structure and like the other methods requires the dataset to be as large as possible to capture the behaviour of the modelled phenomenon. Finally, the major disadvantage of machine learning techniques is the requirement of extensive data. Small datasets are a drawback especially for the training section as a significant number of parameters need to be determined (Giustolisi & Savic 2006; Giustolisi et al. 2008; Solomatine & Ostfeld 2008; Anguita et al. 2010; Remesan & Mathew 2015).

ARX models, on the other hand, combine simple predictive models with a linear threshold function without hidden layers and a priori requirements. They are less complex than the machine learning methods and under specific conditions, optimal process interpretation, can be equally accurate. Thus, the application of a model that can physically explain the relationship between the groundwater level and hydrological variables is generally preferred to simulate and predict the groundwater level temporal variability. Such a model is the ARX. Furthermore, in this work the available datasets are small. The ARX model embedded in a Kalman filter has an advantage over other data driven techniques when small datasets are available due to the recursive update of the system parameters and of the predictions, conditionally on observed data, as the data are successively processed.

ARX models proposed by Knotters & Bierkens (2000) have also proved to be useful tools for modeling the dynamic relationship between meteorological variables and water table level (Knotters & Bierkens 2001; Webster & Heuvelink 2006). An extension of ARX models involves the regionalised version of the ARX model (Bierkens 2001; Bierkens et al. 2001; Knotters & Bierkens 2002).

This study aims to model the temporal variability of the mean annual groundwater level in a sparsely monitored basin of great interest for the welfare of the local community by exploiting the available field data and maximizing the possible information gain. We adopt the discrete time ARX model for the variation of the water table level (Knotters & Bierkens 2001, 2002). The term exogenous denotes that the model equations incorporate information from other variables besides water table level. In particular, the ARX discrete-time model herein incorporates information about the time variability based on the available time series of groundwater level, precipitation measurements and abstraction rates. The ARX model is embedded in a Kalman filter which enables the estimation of parameters and the prediction of the optimal groundwater level (Ljung 1999).

Similar works that involve ARX model assume prior knowledge of the model parameters from field monitoring, or estimate the parameters initially from available field data, or from calibration at a nearby location, or a different time period (Bierkens 2001; Bierkens et al. 2001; Webster & Heuvelink 2006; Yuan et al. 2008). In the Kalman filter approach proposed herein the ARX model parameters are estimated and optimized using the time series of the water table level. The difference of this approach from other works is that the ARX model parameters and the output of the state equation are simultaneously estimated by means of the Kalman filter adaptation algorithm (KFAA) (Ljung 1999; Lanzi et al. 2006). The initial values of the ARX model parameters are randomly selected due to lack of field information. To our knowledge, this approach is used for the first time in a hydrological application.

The ARX model has been originally used with a time step equal to one day (Bierkens et al. 2001). Herein we apply the model with a considerably larger time step equal to one year and biannually, since we are limited by the available measurements. Hence, we employ a coarse-grained version of the original model which has the same form due to the linearity of the ARX equation. We also include the impact of anthropogenic activities, i.e., pumping data, in the ARX model. As we show below, the model predictions are in very good agreement with the data after the initial period of parameter adaptation. The temporal variation of the mean annual and biannual groundwater level allows us to assess the aquifer's behavior during the last 30 years and its response to precipitation and pumping.

STUDY AREA

The study area is located in the south central part on the island of Crete in Greece (Figure 1). Mires basin is part of the Mesara catchment which encloses the largest and most productive valley of the island. The catchment's marginal groundwater resources are extensively used for agricultural activities and human consumption. The alluvium basin of the Mesara catchment is not a uniform hydrogeological unit, and therefore it is divided into two sub-basins (Kritsotakis & Tsanis 2009). This study focuses on the Mires basin of the Mesara valley due to the availability of hydro-geological data and to the significant decrease in the groundwater level in recent years (Varouchakis 2016). The Mires basin is a down-faulted plain with an area of 50.3 km2. The plain is filled mainly with Quaternary alluvial sediments which form an inter-bedded sequence of gravels, gravely sands, sands, silts, silty sands and clays. The alluvium aquifer of Mires basin is the most significant of the Mesara catchment. Its thickness decreases from the center of the basin to the south and north. The alluvial basin constitution changes from east to west. At the eastern part, the deposits are coarser with short layers of clay and silt, while at the western part the presence of clay and silt layers are significant. At the greater extent, the surface layer is mainly composed of gravel and sand allowing high infiltration.
Figure 1

Topographic map showing the locations of the 11 monitored wells (triangles) in Mires basin along with the corresponding surface elevation and the river path. The square marks the two telemetric stations that have been placed in the two wells which have been monitored since 2003.

Figure 1

Topographic map showing the locations of the 11 monitored wells (triangles) in Mires basin along with the corresponding surface elevation and the river path. The square marks the two telemetric stations that have been placed in the two wells which have been monitored since 2003.

Porosity decreases with depth below the surface in the range 0.05–0.12 whereas effective porosity, which is the percentage of pores (interconnected) that are available for fluid flow has been determined to be 0.085 in the Mires Basin. High values of hydraulic conductivity are present in the Mires basin, where it varies between 10 and 120 m/day reflecting the presence of several gravel and sand horizons in the alluvial sequence. Lower values occur in the northern side of the Mires basin, where lower Pleistocene rocks are present. The significant groundwater level drop in the past 30 years (>35 masl) has affected the hydraulic characteristics of the aquifer presenting a 25% average reduction in hydraulic conductivity values.

The basin's aquifer is characterized in hydro-geological reports as unconfined. Aquifer storage coefficients, obtained from pumping tests, in conjunction with the behavior of the water table (free surface existence) suggests, that although heterogeneous and locally confined, the aquifer behaves at the regional scale as an unconfined unit. A detailed hydro-geological description of the Mesara catchment including Mires basin is presented in Varouchakis (2016).

The spatial variability of the groundwater level in the basin has been studied using stochastic and deterministic geostatistical methods (Varouchakis & Hristopulos 2013a), applying non-linear data normalization techniques in combination with Kriging (Varouchakis et al. 2012) and by developing spatial trend models based on topographical variables and physical laws in association with Residual Kriging (Varouchakis & Hristopulos 2013b; Varouchakis et al. 2016a).

Overexploitation during the past 30 years has led to a dramatic decrease, in excess of 35 m, in the groundwater level of the Mires basin. Potential future climatic changes in the Mediterranean region, population increase, and extensive agricultural activity threaten the sustainability of the water resources. The accurate estimation of the temporal variability of the hydraulic head is important for integrated groundwater resources management plans that will help reduce the risk of desertification. Ten wells (Figure 1) were monitored between the years 1981 and 2003, whereas several others were monitored for shorter periods. Since 2003, the regular biannual monitoring of the operating boreholes has been replaced by the continuous monitoring of two telemetric stations placed in respective boreholes (one of which belongs in the set of the 10 boreholes). Measurements at four of the ten wells are also available biannually during the 2003–2006 period. The mean annual groundwater level variation in the Mires basin since 1981 is presented in Figure 2 and exhibits overall a declining trend. It is estimated that around 22 Mm3 on average are pumped per year from the basin (Varouchakis 2016). The temporal variability of the annual pumping volume along with the annual rainfall in the Mires basin is also shown in Figure 2. Rainfall data at biannual basis (wet-dry hydrological period) are also available and according to the river basin management report for the island of Crete (Special water secretariat of Greece 2015) for the Mesara area is reported that the 65% of pumping occurs during the dry hydrological season and the 35% during the wet hydrological seasons. Therefore, these rates can be used to develop the biannual pumping volumes that apply in the basin. The hydrological and hydrogeological data have been provided by the Administration of Land Reclamation of the Prefecture of Crete.
Figure 2

Mean annual groundwater level, annual rainfall and annual pumping rate for the time period 1981–2010.

Figure 2

Mean annual groundwater level, annual rainfall and annual pumping rate for the time period 1981–2010.

A correlation analysis, in terms of correlation coefficient R, shows regarding the annual values that groundwater level and precipitation have an R = 0.58, with p-value: 0.0007, groundwater level and pumping have: R = −0.52, p-value: 0.003 while precipitation and pumping: R = −0.70, p-value: 0.000017. The p-value assesses the hypothesis of no correlation. If p-value is small, i.e. less than 0.05, then the correlation coefficient R is significant. R values denote an important correlation for an interval of 0.5–0.70 and a strong correlation for the interval 0.7–0.9 (Tichy 1993). In addition, the correlation analysis of the sub-annual data provide the following results: groundwater level and precipitation have R = 0.76, p-value: 1 × 10−6, groundwater level and pumping have: R = −0.68, p-value: 1 × 10−6 while precipitation and pumping: R = −0.81, p-value: 1 × 10−6.

A hydrogeological tool (Haitjema & Mitchell-Bruker 2005) has been also applied to the basin (Varouchakis et al. 2016a) to examine if the aquifer is recharge or topography controlled. The equation that describes the tool involved in the numerator the average annual recharge or infiltration rate, i (m/d) and the square of the average distance between surface waters, L (m). In the denominator, a factor m between 8 and 16 for aquifers that are strip like or circular in shape respectively, the (horizontal) aquifer hydraulic conductivity, K (m/d), the aquifer thickness, H (m) and the maximum distance between the average surface water levels and the terrain elevation, d (m). If the ratio is greater than one then the aquifer is topography controlled while if less than 1 then it is recharged controlled. Solving the equation using the basin's hydrogeological data (Varouchakis 2016) the calculated ratio is equal to 0.7 (Varouchakis et al. 2016a) and thus less than 1. This result means that the aquifer is recharge controlled. Therefore, precipitation surplus is affecting the water levels by recharging the aquifer.

METHODS

Background of ARX groundwater level model

The ARX model introduced by Bierkens et al. (2001) and Knotters & Bierkens (2001) is a linear time series model that relates explicitly the precipitation surplus with water table level. The parameters of ARX model are physically motivated (Knotters & Bierkens 2001). At locations where sufficiently long time series of water table levels are available, the ARX parameters are obtained from the model calibration process which minimizes the mean square error between the measured and the modelled values.

The ARX function can be incorporated in the Kalman filter algorithm (Ljung 1999; Knotters & Bierkens 2001) to perform the following: (1) recursively determine the model parameters from the dataset; and (2) predict future water table levels conditionally on monitored values. Ideally, auxiliary information such as meteorological variables and water usage can be incorporated in the model. The predictions are updated as new water table level data enrich the time series. Updating is based on Bayes’ theorem. The method also permits estimating the statistics of water table fluctuations. The accuracy of the future predictions is assessed by estimating expected levels for time periods with available observations incorporating meteorological conditions (e.g. precipitation rates) of the monitoring period. The precipitation surplus was used as an exogenous variable in the ARX model. The model parameters are obtained by considering the ARX function as an equation of state inside the Kalman filter. The mean-square error of the filter innovations are subsequently minimized to deliver the optimal parameters. The ARX model is defined by the equation below, 
formula
1
where is the average groundwater level at time , is the time step, i.e. , and is the precipitation surplus over . Precipitation surplus is used as the only exogenous variable, as proposed by Knotters & Bierkens (2001). is defined as the spatial average of the cumulative annual precipitation of the nearby rainfall stations and the respective time interval minus the mean annual actual evapotranspiration, i.e., 
formula
2
where, 
formula
3
and is the average precipitation over the contributing rainfall stations.
The parameters a, b determine the dynamic response of the water table, whereas c determines the average water table level if . The variable εt is a discrete-time white noise process with the following properties (where denotes the expectation operator): 
formula
4
 
formula
5
where is the error variance and is the Kronecker delta defined by if and if .

Equation (1) has a physical background. It describes the water table variations for discrete time steps assuming zero surface runoff and a linear relation between the water table level and drainage from the groundwater zone to the surface water. Besides precipitation surplus, other variables as well can be incorporated in the model and treated as exogenous (e.g. pumping rates, surface water levels) (Bierkens et al. 2001; Knotters & Bierkens 2001). It is assumed that the precipitation surplus is a global, space-invariant variable. This assumption is reasonable for relatively small areas, such as the Mires basin.

ARX groundwater level model for the Mires basin

We apply the ARX model to simulate the mean annual and biannual groundwater level of the basin. We use a recursive discrete time ARX model, which we embed in a discrete-time Kalman filter to estimate the model parameters and predict the optimal mean annual and biannual groundwater level.

Herein we propose and apply an extension of the original model which incorporates the abstraction rate of the wells operating in the basin, 
formula
6
where is an estimate of the annual abstraction rate over a time period and d an additional parameters that determines in combination with a, b and c the dynamic response of the water table. The order of the ARX model is defined by the triplet (1,0,0). The first entry denotes that Equation (6) involves values of the water table level with a maximum delay equal to one time step. The second and the third entry denote that the inputs, i.e., the precipitation surplus and the abstraction rate, do not contain delayed terms.

The actual evapotranspiration is an important variable for the model, because it enters in the estimation of the precipitation surplus. The mean annual actual evapotranspiration over the island of Crete is 70% of the mean annual precipitation (Department of Water Resources Management 2000). In areas where the surface elevation is less than 300 metres above sea level it is 75% and in the Mesara valley has been estimated at 65% (Croke et al. 2000). The Mires basin though belongs to the Mesara watershed and it has elevation lower than 300 m. In addition, according to a recent work 75% evapotranspiration was found appropriate for modelling purposes at Mires basin (Varouchakis et al. 2016b).

Kalman filter identification of ARX model

Below we show how the Kalman filter framework can be used to adaptively estimate the parameters of the ARX model. The Kalman filter comprises two sets of equations: one set predicts the state at the next time step, whereas the other set updates the predictions using available state measurements. The filter can be applied to any system described by a linear, discrete-time state-space equation given by the following general formulation (Van Geer & Van Der Kloet 1985; Eigbe et al. 1998; Ljung 1999): 
formula
7
where is the vector of system state variables at time , is the state transition matrix from time to time , is the control matrix that represents the impact of the external inputs on the state of the system at time , whereas is a Gaussian vector that accounts for random noise. The noise properties are defined by 
formula
8
 
formula
9
where is the Kronecker delta previously defined and is the covariance matrix of estimation errors defined by .
The measurement equation relating the observed state variables and the true state of the system is expressed as follows (Van Geer & Van Der Kloet 1985; Eigbe et al. 1998; Ljung 1999): 
formula
10
where is the vector of observed state variables, is the observation matrix and is the observation noise vector that accounts for measurement errors. If is the observation of the state variable then , whereas . The vector represents multi-dimensional Gaussian white-noise random processes with the following statistical properties: 
formula
11
 
formula
12
where is the covariance matrix of the observation errors. It is usually assumed that the observation errors are independent, which means that is a diagonal matrix whose elements equal the variances of the corresponding observation errors. The observation errors are also independent of the system noise, i.e.: 
formula
13

The calculations can be divided into two steps. First, Equation (7) is used to predict the state at time based on measurements up to time . Secondly, at time as the new measurement becomes available the prediction is corrected by means of Equation (10). This yields the optimal linear estimate for the state at time based on the measurements up to time . The matrices F, B, M, Q and R must be known. The matrices F and B, however, are functions of the system parameters. These parameters are not known a priori for the ARX model in the water table level application. Therefore, the Kalman filter cannot be directly applied. The following set of equations describes the KFAA that estimates recursively the parameters and the state equation output of an ARX model (Ljung 1999).

KFAA of ARX model

Linear model structures such as ARX that are equivalent to linear regressions can be expressed in the following general framework (Ljung 1999; Lanzi et al. 2006): 
formula
14
In the above equation, represents the vector of the true parameters (true description of the system), is the regression vector which is equal to the gradient of the predicted model output, , with respect to the parameter vector, and is the measurement error innovation. The predicted output based on the parameter vector up to is given by 
formula
15

where is the prediction of and is the (n1) regression vector that represents the gradient of with respect to the parameters . Since the true system parameters are unknown a priori it is assumed that . The estimation algorithm minimizes the mean square prediction error, , which means that Equation (15) is solved for all time steps using the parameters .

The general recursive parameter identification equation is: 
formula
16
where is the vector of parameter estimates at time , is the observed output at time , and is the prediction of based on observations up to time . The (n1) vector is the Kalman gain that determines the sensitivity of the current prediction error on the update of the parameters estimate.
The above formulation assumes that the true system parameters follow a random walk described by 
formula
17
where is a Gaussian white noise random vector process with covariance matrix . is the (n×n) covariance matrix of the parameter random changes at each time step .
The following general form of the Kalman gain is derived by means of the least mean square parameter estimation algorithm (Ljung 1999; Lanzi et al. 2006), 
formula
18
where the (nn) covariance matrix is given by, 
formula
19
and 
formula
20

In the above equations, is the (nn) parameter error covariance matrix at , is the residual (innovation) covariance during the parameters update process, and is the scalar variance of the innovations in Equation (14), i.e. .

The (nn) parameter error covariance matrix is updated as follows: 
formula
21

The Kalman filter algorithm is entirely specified by the sequence of data , the gradient , the parameter covariance matrix , the variance of the innovations , the true parameters or an initial guess, and the parameter error covariance matrix . The recursive estimate of the parameters and the output of the ARX model is implemented in the Matlab® programming environment using originally developed code.

For the specific application, is the (41) vector of the parameter estimates (a, b, c and d) of the state Equation (6) at time , is a (41) vector, is the (41) regression vector corresponding to each variable input involved in the state Equation (6) (, unit value for parameter c), the (44) covariance matrix used in the Kalman gain, and is a (44) parameter error covariance matrix.

RESULTS AND DISCUSSION

The modelling of temporal groundwater level fluctuations for the Mires basin aims to analyse the aquifer behaviour in conjunction with physical quantities (precipitation, pumping) that affect the aquifer's water table level. Useful information therefore can be extracted from the relationship of mean annual groundwater level, annual precipitation and annual pumping as well as from biannual groundwater level, precipitation and pumping for future predictions.

The data used for the model calibration span the interval from the hydrological years 1980–1981 to 2005–2006 as during the last periods of the time span the extreme values of the dataset are present. The groundwater level is predicted consecutively for the hydrological years 2006–2007 to 2009–2010 and during time windows of four, three and two years. The model initially is run separately to predict the groundwater level for each year period as the number of time steps is small. Therefore, each time the model is run the optimal parameters of the system at the last time step are calculated. Based on these parameters the groundwater level prediction at the next time is provided considering the corresponding precipitation surplus and pumping. Every prediction is then added consecutively in to the model in order for the estimate at the next time period to be performed. Therefore, the system for each tested time period is updating the optimal parameters. Similarly, the model is run for longer time windows in order for the efficiency of the model to be assessed.

In contrast to the initial ARX model which uses the precipitation surplus as the only exogenous variable (Bierkens et al. 2001), we added a term proportional to the annual abstraction rate (pumping term), Equation (6). We then determined the ARX model parameters using the KFAA without and with the pumping term, first for the annual data and then for the biannual. The groundwater levels, the precipitation, and abstraction rates for the aforementioned time periods are known or can be determined at annual and biannual basis. Therefore, the groundwater levels predicted by the model can be validated against the real values in terms of two well known statistical metrics, the mean absolute error (MAE) (m) and the mean absolute relative error (MARE, %). As shown in Figures 3 and 4, calibration improves as more data are processed. In addition in Tables 1 and 2 the prediction results at different time windows are presented. The one step forward prediction is not optimal to assess the general model efficiency. Therefore the prediction window is increased in order to assess the model's effectiveness. The prediction window covers consecutively the last four years of the available dataset. The reason for such a prediction window is that at the previous steps the higher and lower fluctuations are included and cannot be ignored by the training process.
Table 1

Temporal validation results of the mean annual groundwater level data using MAE, MARE - ARX with precipitation surplus

Period 2007/08–2009/10 2008/09–2009/10 2006/07–2009/10 
3-year window 2-year window single-year window 
MAE (m) 2.86 1.83 0.93 
MARE 0.25 0.16 0.09 
Period 2007/08–2009/10 2008/09–2009/10 2006/07–2009/10 
3-year window 2-year window single-year window 
MAE (m) 2.86 1.83 0.93 
MARE 0.25 0.16 0.09 
Table 2

Temporal validation results of the mean annual groundwater level data using MAE, MARE - ARX with precipitation surplus and pumping

Period 2007/08-2009/10 2008/09-2009/10 2006/07-2009/10 
3-year window 2-year window single-year window 
MAE (m) 1.86 1.65 0.81 
MARE 0.17 0.14 0.07 
Period 2007/08-2009/10 2008/09-2009/10 2006/07-2009/10 
3-year window 2-year window single-year window 
MAE (m) 1.86 1.65 0.81 
MARE 0.17 0.14 0.07 
Figure 3

ARX model results using precipitation surplus as exogenous variable. Circles denote the measurements; blue crosses denote the ARX model output for a three-year prediction window; red boxes denote the ARX model predictions for a two-year prediction window and red stars for a single-year prediction window; VP denotes the validation period. Please refer to the online version of this paper to see this figure in colour: http://dx.doi.org/10.2166/hydro.2016.063.

Figure 3

ARX model results using precipitation surplus as exogenous variable. Circles denote the measurements; blue crosses denote the ARX model output for a three-year prediction window; red boxes denote the ARX model predictions for a two-year prediction window and red stars for a single-year prediction window; VP denotes the validation period. Please refer to the online version of this paper to see this figure in colour: http://dx.doi.org/10.2166/hydro.2016.063.

Figure 4

ARX model results using precipitation surplus and pumping rate as exogenous variables. Circles denote the measurements; blue crosses denote the ARX model output for a three-year prediction window; red boxes denote the ARX model predictions for a two-year prediction window and red stars for a single-year prediction window; VP denote the validation period. Please refer to the online version of this paper to see this figure in colour: http://dx.doi.org/10.2166/hydro.2016.063.

Figure 4

ARX model results using precipitation surplus and pumping rate as exogenous variables. Circles denote the measurements; blue crosses denote the ARX model output for a three-year prediction window; red boxes denote the ARX model predictions for a two-year prediction window and red stars for a single-year prediction window; VP denote the validation period. Please refer to the online version of this paper to see this figure in colour: http://dx.doi.org/10.2166/hydro.2016.063.

The parameters of the last single-year validation period are considered optimal based on the available data and can be used for future predictions. In order to test their efficiency they are inserted as initial parameters in the ARX model. From Figure 5 it can be observed that the model's adaptation is faster and better compared to the model's behaviour presented in Figures 3 and 4, where random values for initial parameters were used. A similar behaviour is observed if the parameters for the three-year prediction window are used as initial parameters, while the predictions are also improved. That similar behaviour is due to the similar parameter values that are obtained after a significant number of time steps. The two-year prediction window also delivers very similar calibration results and predictions in between the three-year and the single-year prediction windows. Therefore, to avoid confusion at the plotted results the outcomes of the two latter time windows only are presented. In addition, as will be explained below the four-year prediction window results are not presented as the observed values are calculated with significant error.
Figure 5

ARX model results using precipitation surplus and pumping rate as exogenous variables. Black circles denote the measurements; blue crosses denote the ARX model output using random initial parameters for a single-year prediction window; red boxes denote the ARX model output using as initial parameters the optimal calculated from the last step during the validation process for the same prediction window; black stars denote the predictions using as initial parameters those calculated for the three-year prediction window; VP denote the validation period. Please refer to the online version of this paper to see this figure in colour: http://dx.doi.org/10.2166/hydro.2016.063.

Figure 5

ARX model results using precipitation surplus and pumping rate as exogenous variables. Black circles denote the measurements; blue crosses denote the ARX model output using random initial parameters for a single-year prediction window; red boxes denote the ARX model output using as initial parameters the optimal calculated from the last step during the validation process for the same prediction window; black stars denote the predictions using as initial parameters those calculated for the three-year prediction window; VP denote the validation period. Please refer to the online version of this paper to see this figure in colour: http://dx.doi.org/10.2166/hydro.2016.063.

In this work, mean annual values have been used mainly because of pumping data shortage. Modelling of groundwater level at annual scale is useful because it can help to combine the proposed model with climate data models that usually provide information at annual scale (Tsanis et al. 2011; IPCC 2014). In addition, many water resources management reports regarding catchment management refer to actions at annual scale (Chen et al. 2002; Barthel 2011; Bozzola & Swanson 2014).

However, biannual groundwater level and precipitation data (including precipitation surplus) are available, while biannual pumping can be calculated as has been previously explained. Therefore, the proposed model can also be set at a biannual basis. The sub-annual model formation provides a more detailed approach to the groundwater level fluctuations at the basin and a clearer assessment of the proposed model efficiency and applicability.

The annual scale model predictions (Tables 1 and 2; Figures 3 and 4) only provide reliable results for smaller than a four-year prediction window. This is due to the extreme low value present at the training set that is not captured appropriately by the model. The parameters calculated at that step support the model to predict the groundwater level at a four-year period, however, only with high prediction error. On the other hand the model predictions at biannual basis are reliable at a four-year (eight biannual periods) prediction window (Tables 3 and 4; Figures 6 and 7) as the biannual values of that time step are captured well by the model. In both model forms (annual and biannual) it is clearly better the model efficiency when the pumping term is used.
Table 3

Temporal validation results of the biannual groundwater level data using MAE, MARE - ARX with precipitation surplus

Period 2006/07–2009/10 2006/07–2009/10 
4-year window single-year window 
MAE (m) 2.65 1.72 
MARE 0.27 0.20 
Period 2006/07–2009/10 2006/07–2009/10 
4-year window single-year window 
MAE (m) 2.65 1.72 
MARE 0.27 0.20 
Table 4

Temporal validation results of the biannual groundwater level data using MAE, MARE - ARX with precipitation surplus and pumping

Period 2006/07–2009/10 2006/07–2009/10 
4-year window single-year window 
MAE (m) 1.45 0.92 
MARE 0.20 0.14 
Period 2006/07–2009/10 2006/07–2009/10 
4-year window single-year window 
MAE (m) 1.45 0.92 
MARE 0.20 0.14 
Figure 6

ARX model results using precipitation surplus as exogenous variable. Circles denote biannual measurements; blue crosses denote the ARX model output for a four-year (eight periods biannual) prediction window; red dots denote the ARX model predictions for a single-period (biannual) prediction window; VP denotes the validation period. Please refer to the online version of this paper to see this figure in colour: http://dx.doi.org/10.2166/hydro.2016.063.

Figure 6

ARX model results using precipitation surplus as exogenous variable. Circles denote biannual measurements; blue crosses denote the ARX model output for a four-year (eight periods biannual) prediction window; red dots denote the ARX model predictions for a single-period (biannual) prediction window; VP denotes the validation period. Please refer to the online version of this paper to see this figure in colour: http://dx.doi.org/10.2166/hydro.2016.063.

Figure 7

ARX model results using precipitation surplus and pumping rate as exogenous variables. Circles denote biannual measurements; blue crosses denote the ARX model output for a four-year (eight periods biannual) prediction window; red dots denote the ARX model predictions for a single-period (biannual) prediction window; VP denote the validation period. Please refer to the online version of this paper to see this figure in colour: http://dx.doi.org/10.2166/hydro.2016.063.

Figure 7

ARX model results using precipitation surplus and pumping rate as exogenous variables. Circles denote biannual measurements; blue crosses denote the ARX model output for a four-year (eight periods biannual) prediction window; red dots denote the ARX model predictions for a single-period (biannual) prediction window; VP denote the validation period. Please refer to the online version of this paper to see this figure in colour: http://dx.doi.org/10.2166/hydro.2016.063.

At an annual basis the differences might be small (Tables 1 and 2) but they are still significant for prediction accuracy. The model form with the abstraction term is more accurate in terms of MAE by 54% for the three-year window predictions, 11% for the two-year window predictions and 14% for the single-year window predictions compared to the original model form. On the other hand, at the biannual basis the calibration period (Figures 6 and 7) has a similar behaviour with the calibration period of mean annual levels. The four-year window predictions (Tables 3 and 4) were improved from the presence of the pumping term in the model by 82% while, the single-year window predictions by 84%. Similarly, this applies for the three- and two-year window predictions where an improvement of 75% occur. As observed in Figures 37, as the time window becomes shorter the predictions also become more accurate. However, the ARX model with the support of KFAA can provide accurate results for longer time windows. According to the results it is clearly obtained that the pumping term addition in the ARX model improves the model's efficiency and the accuracy of results especially in the longer prediction windows and with the biannual data.

The general aim of this paper is to introduce the KFAA in hydrological applications. The specific goal is to model the Mires basin aquifer temporal response. The ARX model Equation (6) used for the mean annual and biannual groundwater level modelling delivers accurate predictions as evidenced by the low validation errors (Tables 14). The model is calibrated for the period 1981–2006 whereas the period 2007–2010 is used as the validating period. The ARX model is embedded in a discrete-time Kalman filter, which is calibrated adaptively up to the period preceding the prediction window. We include in the calibration period of the ARX model the recent extreme groundwater level fluctuations (year 2002–2003) in order to calibrate the model more efficiently. As observed in Figures 37 the accuracy of the estimates improves as more of the training data are processed by the method whereas the recent extremes are captured reasonably well by the model.

According to Equation (6), the mean annual groundwater level value estimate at time depends on groundwater level measured at the previous step, rainfall and/or abstraction volume for the specified prediction period , and the estimated parameters that describe the aquifer dynamic response. The optimal parameter values are adaptively updated as new data values become available. The validation results of the two ARX model forms show that the predictions for the years beyond 2006 follows closely the groundwater level trend providing accurate results for the different time windows assessed. In addition, the predictions can be used in place of measurements to calculate the parameters for future predictions.

ARX estimates will be mostly reliable for short time periods up to a year ahead for future predictions, as long term or even short term predictions of rainfall intensity involve high uncertainty. In addition the abstraction volume rate is also uncertain, because it depends on anthropogenic activities that cannot be accurately predicted for longer than a year. The tables of results and the figures show that the incorporation of the abstraction rate in Equation (6) improves the estimates. The improvement might be small in the single year prediction window of annual data, but considerable for biannual and longer prediction periods. Overall, the model adaptation is better when the abstraction rate is applied.

The initial data analysis showed that the precipitation is primarily correlated to the groundwater level while pumping has a lower correlation. However, the declining trend results from the extensive pumping and the poor recharge, while the effect of pumping along with the precipitation on the groundwater level is combined. Therefore, the proposed model is trained for an initial period that includes the extremes of all the variables involved, in order for the parameters of the model to be appropriately determined after every update so as to follow the best possible primary variable trend. The correlation with the groundwater level might be lower compared to precipitation, but the pumping variable helps the model to better capture the level's trend. If the correlation of the groundwater level was favoured to pumping and the model did not capture well the groundwater level with its original form, then the addition of the pumping variable would have more clearly improved the estimations.

The annual precipitation overall has declined at a rate of around 5.5 mm/yr over the past 30 years, while pumping has an increasing trend of 0.22 Mm3/yr. This means that a groundwater deficit of almost 7 Mm3 has been created in the basin and therefore a declining trend exists. The latter value agrees also with the findings of the recent water resources management report of the region of Crete for the area (Special Water Secretariat of Greece 2015).

The strong negative correlation between rainfall and pumping leads to a similar effect to the groundwater level as both affect the groundwater level in the same direction. Increased rainfall and decreased pumping both lead to groundwater level increase, while the opposite also applies. Thus, the original model was trained first to capture the declining trend according to the rainfall variability and then pumping was added which follows closely the opposite of precipitation trend. Therefore, because of this correlation, similar but improved results (effect) are expected from the additional use of the pumping variable, as indeed have been observed. The use of pumping improves the model results by capturing more accurately the groundwater level trend and this is presented clearer in the modelling results with the biannual data. This is also the reason to include other auxiliary variables at the original equation (Bierkens et al. 2001). The falling trend is already captured by the use of the rainfall and the use of a correlated variable, such as pumping, reinforces the predicting capability of the model.

The root-mean square prediction error (RMSPE) metric is employed to assess the adaptation of the ARX model to the measurement values. RMSPE is described by the following equation, 
formula
22
where is the predicted groundwater level, is the corresponding measured value, and K is the length of the time series. After initial fluctuations (Figure 4) the adaptation of the ARX model with precipitation and abstraction rate is better, leading to m, than for the one ARX model with the precipitation surplus only (Figure 3), which leads to m. The former ARX estimates initially display considerable fluctuations, 9.6 m for the 1981–1995 period, which are then reduced to 5.8 m for the 1996–2006 period, according to Equation (22). Similar occurs for the biannual data approach where for the first approach (Figure 7) m while, for the second (Figure 6) m. Thus, the abstraction rate is significant for the modeling of the groundwater level temporal variability.

Finally, future ARX predictions should involve mainly reliable rainfall estimations from climate models or from statistical approaches. In addition, improvement of ARX model predictions in longer time windows should involve more data on shorter time scales and knowledge of potential pumping activity of unauthorized wells.

CONCLUSIONS

This work was initially motivated by the dramatic decrease in recent years of groundwater levels in Mesara valley due to overexploitation. In light of this development, the sparse monitoring of the basin and the expected adverse effects of climate change on water resources accurate temporal modeling of the groundwater level is necessary.

We use an ARX time series model to relate mean annual and biannual groundwater level to respective precipitation surplus and abstraction rates. The ARX model is embedded in a Kalman filter under the adaptation algorithm functionality to adaptively estimate the model parameters. After initial considerable fluctuations, the ARX model accurately tracks the level's temporal evolution. The ARX model estimates become progressively more accurate as more data are incorporated, and the model parameters are recursively refined during each update. The recursive nature of the parameter inference procedure implies that the model becomes more accurate as the length of the time series increases. As shown, the groundwater level in the Mires basin exhibits a definite declining trend. The model captures this trend accurately for both time series, captures the level's extremes and accurately predicts the groundwater level for the time periods 2007–2010. A reliable prediction of future groundwater levels for the Mires basin can be used to develop sound management plans for the exploitation of the groundwater resources in the area. For example, the impact of different scenarios for pumping activity and precipitation trends can be investigated. Model predictions should be incorporated with realistic estimates of the water needs of the area, leading to management recommendations such as control of pumping rates, especially during years when the groundwater level is expected to drop.

REFERENCES

REFERENCES
Abe
S.
2005
Support Vector Machines for Pattern Classification
.
Springer
,
Dordrecht
,
The Netherlands
.
Anguita
D.
Ghio
A.
Greco
N.
Oneto
L.
Ridella
S.
2010
Model selection for support vector machines: Advantages and disadvantages of the Machine Learning Theory
. In:
The 2010 International Joint Conference on Neural Networks (IJCNN)
,
Barcelona
, pp.
1
8
.
Bierkens
M. F. P.
Knotters
M.
Hoogland
T.
2001
Space-time modeling of water table depth using a regionalized time series model and the Kalman filter
.
Water Resources Research
37
,
1277
1290
.
Chen
Z.
Grasby
S. E.
Osadetz
K. G.
2002
Predicting average annual groundwater levels from climatic variables: an empirical model
.
Journal of Hydrology
260
,
102
117
.
Chen
L.
Chen
C.
Lin
D.
2010
Application of integrated back-propagation network and self-organizing map for groundwater level forecasting
.
Journal of Water Resources Planning and Management
137
,
352
365
.
Christakos
G.
2000
Modern Spatiotemporal Geostatistics
.
Oxford University Press
,
New York
.
Croke
B.
Cleridou
N.
Kolovos
A.
Vardavas
I.
Papamastorakis
J.
2000
Water resources in the desertification-threatened Messara Valley of Crete: estimation of the annual water budget using a rainfall-runoff model
.
Environmental Modelling and Software
15
,
387
402
.
Daliakopoulos
I. N.
Coulibaly
P.
Tsanis
I. K.
2005
Groundwater level forecasting using artificial neural networks
.
Journal of Hydrology
309
,
229
240
.
Department of Water Resources Management
2000
Integrated Water Resources Management of Crete, Prefecture of Crete
.
Planning and Development Section
,
Heraklion
(in Greek)
.
Eigbe
U.
Beck
M. B.
Wheater
H. S.
Hirano
F.
1998
Kalman filtering in groundwater flow modelling: problems and prospects
.
Stochastic Hydrology and Hydraulics
12
,
15
32
.
Fabbri
P.
Gaetan
C.
Zangheri
P.
2011
Transfer function-noise modelling of an aquifer system in NE Italy
.
Hydrological Processes
25
,
194
206
.
Fallah-Mehdipour
E.
Bozorg Haddad
O.
Mariño
M. A.
2013
Prediction and simulation of monthly groundwater levels by genetic programming
.
Journal of Hydro-Environment Research
7
,
253
260
.
Famiglietti
J.
2008
Groundwater. Report Terrestrial ECV 03
.
Food and Agriculture Organization of the United Nations
,
Rome
,
Italy
.
Giustolisi
O.
Savic
D. A.
2006
A symbolic data-driven technique based on evolutionary polynomial regression
.
Journal of Hydroinformatics
8
,
207
222
.
Giustolisi
O.
Doglioni
A.
Savic
D. A.
Di Pierro
F.
2008
An evolutionary multiobjective strategy for the effective management of groundwater resources
.
Water Resources Research
44
.
doi: 10.1029/2006WR005359.
Haitjema
H. M.
Mitchell-Bruker
S.
2005
Are water tables a subdued replica of the topography
?
Groundwater
43
,
781
786
.
IPCC
2014
Climate Change 2014–Impacts, Adaptation and Vulnerability: Regional Aspects
.
Cambridge University Press
,
Cambridge
,
UK
.
Izady
A.
Davary
K.
Alizadeh
A.
Moghaddam Nia
A.
Ziaei
A. N.
Hasheminia
S. M.
2013
Application of NN-ARX model to predict groundwater levels in the Neishaboor Plain, Iran
.
Water Resources Management
27
,
4773
4794
.
Knotters
M.
Bierkens
M. F. P.
2000
Physical basis of time series models for water table depths
.
Water Resources Research
36
,
181
188
.
Knotters
M.
Bierkens
M. F. P.
2002
Accuracy of spatio-temporal RARX model predictions of water table depths
.
Stochastic Environmental Research and Risk Assessment
16
,
112
126
.
Kritsotakis
M.
Tsanis
I.
2009
An integrated approach for sustainable water resources management of Messara basin, Crete, Greece
.
European Water
27
,
15
30
.
Lanzi
P. L.
Loiacono
D.
Wilson
S. W.
Goldberg
D. E.
2006
Prediction update algorithms for XCSF: RLS, Kalman filter, and gain adaptation
. In:
Proceedings of the 8th annual conference-Genetic and Evolutionary Computation Conference (GECCO)
,
ACM
, pp.
1505
1512
.
Ljung
L.
1999
System Identification – Theory for the User
.
Prentice Hall
,
Upper Saddle River, NJ
.
Margat
J.
Van der Gun
J.
2013
Groundwater Around the World: a Geographic Synopsis
.
CRC Press
,
Boca Raton
,
USA
.
Remesan
R.
Mathew
J.
2015
Hydrological Data Driven Modelling
.
Springer
,
Dordrecht
,
The Netherlands
.
Solomatine
D. P.
Ostfeld
A.
2008
Data-driven modelling: some past experiences and new approaches
.
Journal of Hydroinformatics
10
,
3
22
.
Special Water Secretariat of Greece
2015
River basin management report for the water sector of Crete
.
Ministry of Environment & Energy
,
Athens
(in Greek).
Suryanarayana
C.
Sudheer
C.
Mahammood
V.
Panigrahi
B. K.
2014
An integrated wavelet-support vector machine for groundwater level prediction in Visakhapatnam, India
.
Neurocomputing
145
,
324
335
.
Tapoglou
E.
Karatzas
G. P.
Trichakis
I. C.
Varouchakis
E. A.
2014
A spatio-temporal hybrid neural network-Kriging model for groundwater level simulation
.
Journal of Hydrology
519
(
Part D
),
3193
3203
.
Tichy
M.
1993
Applied Methods of Structural Reliability
.
Springer
,
Dordrecht
,
The Netherlands
.
Todini
E.
2007
Hydrological catchment modelling: past, present and future
.
Hydrology and Earth System Sciences
11
,
468
482
.
Trichakis
I. C.
Nikolos
I. K.
Karatzas
G. P.
2011
Artificial neural network (ANN) based modeling for Karstic groundwater level simulation
.
Water Resources Management
25
,
1143
1152
.
Tsanis
I. K.
Koutroulis
A. G.
Daliakopoulos
I. N.
Jacob
D.
2011
Severe climate-induced water shortage and extremes in Crete
.
Climatic Change
106
,
667
677
.
Van Geer
F. C.
Van Der Kloet
P.
1985
Two algorithms for parameter estimation in groundwater flow problems
.
Journal of Hydrology
77
,
361
378
.
Varouchakis
E. A.
2016
Integrated water resources analysis at basin scale: a case study in Greece
.
Journal of Irrigation and Drainage Engineering (ASCE)
142
,
05015012
.
Varouchakis
E. A.
Hristopulos
D. T.
Karatzas
G. P.
2012
Improving Kriging of groundwater level data using nonlinear normalizing transformations-a field application
.
Hydrological Sciences Journal
57
,
1404
1419
.
Varouchakis
E. A.
Kolosionis
K.
Karatzas
G. P.
2016
Spatial variability estimation and risk assessment of the aquifer level at sparsely gauged basins using geostatistical methodologies
.
Earth Science Informatics
9
(
4
),
437
448
.
Varouchakis
E. A.
Spanoudaki
K.
Hristopulos
D. T.
Karatzas
G. P.
Corzo Perez
G. A.
2016b
Stochastic modeling of aquifer level temporal fluctuations based on the conceptual basis of the soil-water balance equation
.
Soil Science
181
,
224
231
.
Von Asmuth
J. R.
Knotters
M.
2004
Characterising groundwater dynamics based on a system identification approach
.
Journal of Hydrology
296
,
118
134
.
Von Asmuth
J. R.
Bierkens
M. F. P.
Maas
K.
2002
Transfer function-noise modeling in continuous time using predefined impulse response functions
.
Water Resources Research
38
,
231
243
.
Von Asmuth
J. R.
Maas
K.
Bakker
M.
Petersen
J.
2008
Modeling time series of ground water head fluctuations subjected to multiple stresses
.
Groundwater
46
,
30
40
.
Webster
R.
Heuvelink
G. B. M.
2006
The Kalman filter for the pedologist's tool kit
.
European Journal of Soil Science
57
,
677
692
.