## Abstract

Spatiotemporal geostatistical analysis of groundwater levels is a significant tool for groundwater resources management. This work presents a valid spatiotemporal geostatistical model for the groundwater level variations of an aquifer in Crete, Greece. The goal of this approach is to accurately map the aquifer level at variable time-steps using joint space–time information. The proposed model applies the space–time ordinary kriging (STOK) methodology using joint space–time covariance functions. A space–time experimental variogram is determined from the monthly groundwater level time-series between the hydrological years 2009 and 2014 at 11 sampling stations. The experimental spatiotemporal variogram is successfully fitted by the product–sum model using a Matérn spatial and temporal function. STOK was used to predict the monthly groundwater level at each sampling station from January to May 2015. Validation results show low prediction errors that range from 0.95 to 1.45 m, while the kriging variance accurately determines the variability of predictions. Maps of groundwater level predictions and uncertainty are developed for significant months of the validation period to assess the aquifer spatiotemporal variability. This work demonstrates that space–time geostatistics can successfully model the spatial dynamic behaviour of an aquifer when the space–time dependencies are appropriately modelled, even for a sparse dataset.

## INTRODUCTION

Geostatistical analyses usually deal with the spatial distribution and variability of measured data. In areas with spatial and temporal data availability, the application of space–time models can improve the reliability of predictions by incorporating space–time correlations instead of purely spatial ones (Lee *et al.* 2010). Spatiotemporal continuity provides a more stable base for exploring dynamic processes that evolve in time and space, since temporal neighbours can be as informative and useful as spatial neighbours. A spatial-only analysis would be less advantageous than a space–time analysis of such processes because it would ignore the spatial and temporal nature of the involved dependencies. A major advantage of space–time distributed data, especially when they are sparse, is that a larger number of data points are applied to support model parameter estimation and prediction.

In a statistical context, these data points can be considered random fields spread out in space and evolving in time (space–time random fields or S/TRF). Space–time geostatistical analysis is based on the joint spatial and temporal dependence between observations. There are two ways to represent space–time random variables (Christakos 1991): (1) full space–time models using separable or non-separable space–time covariance functions or variograms and (2) vectors of temporally correlated spatial random fields (SRFs) , where *T* is the number of temporally correlated SRFs or vectors of spatially correlated time series , where *n* is the number of locations. The representation depends on the domain density (space or time).

Usually, spatiotemporal interpolation is performed by applying the standard kriging algorithms extended in a space–time frame. The space–time kriging method employs the first type of model previously described. The two main tasks of space–time analysis are interpolation and extrapolation. The first refers to the estimation of variable values at unmeasured locations inside the spatial extent of the study area, while the latter extends the estimations ahead of the boundaries of the observations in space or time. The main assumption used in interpolation and extrapolation is that the specific patterns extracted from the available data analysis deliver sufficient information to capture the spatiotemporal dynamics of the observed data (Lee *et al.* 2010).

Space–time geostatistical approaches have been successfully applied to model hydrological data variability. Rouhani & Hall (1989) applied space-time kriging in geohydrology, using intrinsic random functions (polynomial spatiotemporal covariance) for the space–time geostatistical analyses of piezometric data. Rouhani & Myers (1990) discussed potential drawbacks of the space–time geostatistical analysis of geohydrological data (piezometric data). More recently, space-time kriging was applied to estimate the water level of the Queretaro-Obrajuelo aquifer (Mexico) using a product–sum model with spherical components on a large space–time dataset (Júnez-Ferreira & Herrera 2013) and to map the seasonal fluctuation of water-table depths in Dutch nature conservation areas using a metric space–time exponential variogram model (Hoogland *et al.* 2010). Furthermore, the space–time ordinary kriging (STOK) method was used to design rainfall networks and analyse precipitation variations in space and time (Rodriguez-Iturbe & Mejia 1974; Biondi 2013; Raja *et al.* 2016), and it was tested in a comparison study for estimating runoff time series in ungauged locations (Skøien & Blöschl 2007).

Space-time kriging has also been used in a wide range of scientific fields and areas, such as agriculture (Stein *et al.* 1998; Heuvelink & Egmond 2010), atmospheric data (De Iaco *et al.* 2002; Nunes & Soares 2005), soil science/water content (Snepvangers *et al.* 2003; Jost *et al.* 2005), surface temperature data (Hengl *et al.* 2011), wind data (Gneiting 2002), gamma radiation data (Heuvelink & Griffith 2010), epidemiology (Gething *et al.* 2007) and forecasting municipal water demand (Lee *et al.* 2010).

Sparsely monitored watersheds are not regularly monitored through space and time, and therefore data availability is a factor limiting purely spatial or temporal analysis. This problem, and the associated challenges around uncertainty of boundary conditions, means that it is difficult to set up a dynamic numerical model. However, the combination of measured data can create a very useful dataset for spatiotemporal modelling and analysis by incorporating joint spatiotemporal correlations.

Space–time geostatistical analysis could therefore be used as a surrogate to model the aquifer behaviour in space and time and to identify basin locations of significant interest to aid the water resources management of the area. In this work, space–time geostatistical approaches in terms of STOK for the spatiotemporal monitoring and prediction of the groundwater level in a sparsely gauged basin were applied. Space–time dependencies were determined by calculating a space–time experimental variogram from a monthly groundwater level time series between the years 2009 and 2014 at 11 sampling stations that was fitted to an appropriate theoretical spatiotemporal variogram model. The purpose of this work was therefore to model the dynamic behaviour of the aquifer and to predict the water-table spatial variation at selected time steps (January–May 2015) by exploiting the available space–time information. The efficiency of the approach is evaluated in a real case study.

## STUDY AREA AND DATA

The Mires Basin of the Messara Valley is located on the island of Crete in Greece. The basin has an area of almost 50 km^{2}, and is crossed by a river of intermediate flow. The aquifer is unconfined, consisting primarily of homogeneously distributed alluvial deposits. Although it is the largest of the island, it is sparsely monitored. Since 1981, the groundwater resources have been significantly reduced due to over-pumping (Varouchakis 2016a). Between 2003 and 2009 the basin was rarely monitored, but from 2009 to 2015 (until May), the local authorities and owners performed monthly monitoring at 11 wells (Figure 1). However, gaps were present in the time series, and these were substituted by applying an autoregressive exogenous variable (ARX) model to each well that used the precipitation surplus as an exogenous variable (Varouchakis 2016b). The accuracy of the ARX results was evaluated in comparison to reported basin averages to validate the proposed spatiotemporal model.

Several different techniques have previously been applied at the case study site. Groundwater flow modelling and aquifer level estimation were performed using the MODFLOW code focussing on the early 2000s (Kritsotakis & Tsanis 2009). At that time, extensive fieldwork was performed in the area due to the preparation of the island's first water resources management plan. Another study in the area estimated the annual groundwater withdrawal based on the surface and groundwater hydrological balance, providing the groundwater level temporal variation in the basin (Tsanis & Apostolaki 2009). Artificial neural networks have also been applied for the temporal modelling of the basin's groundwater level (Tsanis *et al.* 2008), while for the same purpose but using more recent data, an ARX model using a Kalman filter has been used as well (Varouchakis 2016b). In addition, projections of surface water resources availability have been studied based on climate scenarios predicting a decreasing trend during the next decades, connecting these results to lower recharge rates and, therefore, reduced aquifer levels (Koutroulis *et al.* 2013). Spatial analysis of aquifer levels using geostatistical techniques and auxiliary information providing locations of significant importance for water resources management purposes has also been applied in the area for different hydrological years (Varouchakis & Hristopulos 2013; Varouchakis *et al.* 2016). All the aforementioned studies have provided the aquifer status temporally or spatially. Only the numerical approach that used the MODFLOW model has studied the aquifer in a space–time context but based on hydrogeological data.

This work employs a data-driven approach that combines space–time groundwater level data, estimating their interdependence to predict the aquifer level at unvisited locations and different time steps. This study is important, as by developing an appropriate geostatistical space–time model, future aquifer level spatiotemporal variations can be estimated and used in groundwater resources management plans. In addition, the model can be used to continuously process new space–time data to improve the spatial and temporal predictions.

Using spatiotemporal geostatistics, the groundwater level dataset can be usefully exploited in order to identify the spatiotemporal behaviour of the aquifer and to obtain useful information regarding the space–time data correlations for more accurate space–time predictions. Space–time geostatistical analysis involved the following steps: (1) space–time variogram calculation, (2) application of STOK for prediction, and (3) estimation of prediction accuracy.

## SPATIOTEMPORAL GEOSTATISTICS

The main goal of space–time analysis is to model multiple time series of data at spatial locations where a distinct time series is allocated. The time variable is considered an additional dimension in geostatistical prediction. A spatiotemporal stochastic process can be represented by where the variable of interest of random field *Z* is observed at *N* space–time coordinates , while the optimal prediction of the variable in space and time is based on (Christakos 1991; Cressie & Huang 1999).

### Spatiotemporal two-point function

*d*is the space dimension), and is the temporal domain, with expected value (Myers

*et al.*2002): , and covariance function: where , , . The covariance function depends only on the lag vector and not on location or time, while it must satisfy the positive-definiteness condition in order to be a valid covariance function. Hence, for any , any real , and any positive integer

*N*, must satisfy the following inequality:

*second-order stationary*. Spatial and spatiotemporal geostatistical prediction methodologies generally rely on stationarity (stationary mean and covariance or variogram). In addition, the field is isotropic if: meaning that the covariance function depends only on the length of the lag.

*var*denotes the variance. The function only depends on the lag vector . The quantity is called the semi-variance at lag .

The primary concern when modelling space–time structures is to ensure that the chosen model is valid and that the model is suitable for the data. The space–time kriging estimator can be applied if the space–time covariance function satisfies the positive definiteness condition, , explained above (Cressie & Huang 1999). The model's suitability is ensured by testing a series of available structures on the data. The variogram function must be conditionally negative definite to ensure that the space–time kriging equations have valid unique solutions (Myers *et al.* 2002; De Iaco 2010).

The STOK method is a well-established method for space–time interpolation, and the significant part in the space–time process is the choice of the variogram or covariance model and the estimation of its parameters (Christakos *et al.* 2001; De Cesare *et al.* 2001). Two categories of models are used for variogram or covariance modelling. The first includes separable models whose covariance function is a combination of a spatial and a temporal covariance function; the second includes non-separable models, usually physically based, in which a single function models the spatiotemporal data dependence. Separable models, however, may suffer from unrealistic assumptions and properties (Snepvangers *et al.* 2003; Hengl *et al.* 2011). Both space–time covariance models are valid (Cressie & Huang 1999; De Iaco *et al.* 2001, 2002). In this work, pure non-separable covariance functions are not examined, as those developed so far for space–time use do not concern the topic of groundwater (Cressie & Huang 1999; De Iaco *et al.* 2001; Gneiting 2002; Kolovos *et al.* 2004; Porcu *et al.* 2008).

### Spatiotemporal covariance or variogram models

This work examines the efficiency of two well-known space–time variogram functions in groundwater level data, and a comprehensive description of them follows.

*et al.*2001, 2002) is a generalisation of the product and the sum model, while it constitutes the starting point for its integrated product–sum versions. It is defined as: are purely spatial and temporal covariance models with , and . In terms of the variogram, the above equation is expressed as:

*et al.*2001): where .

### Summary of spatiotemporal models' characteristics

The product model and the product–sum model are produced by separate space and time functions. The main advantage of such models is their ease of use in modelling and estimation. The product model is separable and integrable, while the product–sum model is non-integrable with respect to and and non-separable. The space–time variogram models described above are typically used to model space–time experimental variograms, because an arbitrary space–time metric is not required and the fitting process is similar to that for spatial variograms (Gething *et al.* 2007; De Iaco 2010).

### Spatiotemporal geostatistical analysis and prediction

**r**

*= ||*

_{s}*s*||,

_{i}–s_{j}*r*= |

_{t}*t*| and

_{i}– t_{j}*N*() is the number of pairs in

*N*(). The space–time experimental variogram is estimated as half the mean squared difference between data separated by a given spatial and temporal lag .

*μ*compensating for the uncertainty of the mean value: The prediction is also described in matrix notation below where the system is solved to estimate the spatiotemporal weights : where is the matrix of the spatiotemporal variogram between the observed space–time data locations, are the spatiotemporal weights and

**c**is the matrix of the spatiotemporal variogram between the observed space–time data locations and the space–time estimation location.

Space–time predictions are usually based on a space–time neighbourhood that encloses observations inside a search radius in space and time; the search radii depend on the space and time correlation lengths *x*_{s}, *x*_{t} estimated from the variogram fitting process. For small datasets, the entire dataset is used for predictions.

### Spatiotemporal prediction of groundwater level data in Mires Basin

First, the experimental variogram is determined. Then, it is modelled with theoretical spatiotemporal variogram functions. The product sum and the product models were applied here, as they have been successfully applied in other space–time studies, providing better results than their counterparts (Gething *et al.* 2007; De Iaco 2010). Their main characteristic is their flexibility in modelling and estimation. The Matérn variogram model was chosen to simulate the spatial and temporal continuity of the data within the space–time models. The purely spatial geostatistical analysis of groundwater level data in previous works at different time periods has shown that the Matérn model describes the spatial correlation of the observed data very well (Varouchakis & Hristopulos 2013).

*ξ*is the range parameter, is the smoothness parameter,

*Γ*is the gamma function, Κ is the modified Bessel function of the second kind of order

*,*is the space lag vector and is the time lag. The Matérn variogram is a bounded model that reaches the sill asymptotically. For , the exponential model is recovered, whereas the Gaussian model is obtained for tending to infinity. The case was introduced by Whittle (Pardo-Iguzquiza & Chica-Olmo 2008). The functions above are inserted in the product (8) and product–sum (10) space–time variogram structures.

Next, the STOK methodology is applied, performing cross-validation to examine the efficiency of the proposed model. Cross-validation of the estimates was performed at the 11 observation wells for the first five months of 2015. The geostatistical analysis was performed using originally developed codes in the Matlab^{®} programming environment (Matlab v.7.10), while standardised coordinates in the interval [0,1] were used to avoid numerical instabilities.

## RESULTS AND DISCUSSION

Spatiotemporal geostatistical analysis of Mires Basin groundwater level data was applied in order to identify the monthly spatiotemporal behaviour of the aquifer since 2009 and to undertake predictions based on the space–time data correlations. The space–time experimental variogram was determined from the monthly groundwater level time series at the 11 sampling stations for the period 2009–2014.

The theoretical space–time variogram model fitted using the product–sum and the product variogram structures on the experimental space–time variogram obtained from the observed data are presented in Figure 2. The separate spatial and temporal variograms of the spatial and temporal averages of the data series are also presented. It can be observed that the Matérn function fits the experimental variograms very well, proving that it is appropriate for the joint space–time modelling of the data interdependence. According to Figure 2, the product–sum model clearly provides a better fit, as it completely captures the experimental variogram's trend. The respective parameters for the product–sum variogram type are = 44.22 m^{2}, = 0.25 (≈3 km), = 1.51, ≈ 5 months and = 0.81 with , and .

The prediction involves STOK application to estimate the groundwater level at the specified location and time during the period January–May 2015. The validation results for the spatial monthly average using both the examined space–time variogram models are presented in Tables 1 and 2 in terms of well-known statistical measures, such as the mean error, mean absolute error, mean absolute relative error, root mean square error and coefficient of determination.

Month | MAE (m) | ME/BIAS (m) | MARE | RMSE (m) | R^{2} |
---|---|---|---|---|---|

January | 1.10 | 0.18 | 0.12 | 2.75 | 0.90 |

February | 0.95 | 0.08 | 0.11 | 2.14 | 0.91 |

March | 1.25 | −0.25 | 0.13 | 3.41 | 0.88 |

April | 1.45 | −0.32 | 0.15 | 3.80 | 0.88 |

May | 1.24 | 0.20 | 0.13 | 3.34 | 0.88 |

Month | MAE (m) | ME/BIAS (m) | MARE | RMSE (m) | R^{2} |
---|---|---|---|---|---|

January | 1.10 | 0.18 | 0.12 | 2.75 | 0.90 |

February | 0.95 | 0.08 | 0.11 | 2.14 | 0.91 |

March | 1.25 | −0.25 | 0.13 | 3.41 | 0.88 |

April | 1.45 | −0.32 | 0.15 | 3.80 | 0.88 |

May | 1.24 | 0.20 | 0.13 | 3.34 | 0.88 |

ME, mean error; MAE, mean absolute error; MARE, mean absolute relative error; RMSE, root mean square error; R^{2}, coefficient of determination.

Month | MAE (m) | ME/BIAS (m) | MARE | RMSE (m) | R^{2} |
---|---|---|---|---|---|

January | 1.42 | 0.26 | 0.16 | 3.15 | 0.89 |

February | 1.37 | 0.19 | 0.16 | 3.34 | 0.90 |

March | 1.54 | 0.34 | 0.18 | 3.74 | 0.88 |

April | 1.72 | −0.41 | 0.19 | 4.05 | 0.87 |

May | 1.62 | −0.31 | 0.18 | 3.91 | 0.88 |

Month | MAE (m) | ME/BIAS (m) | MARE | RMSE (m) | R^{2} |
---|---|---|---|---|---|

January | 1.42 | 0.26 | 0.16 | 3.15 | 0.89 |

February | 1.37 | 0.19 | 0.16 | 3.34 | 0.90 |

March | 1.54 | 0.34 | 0.18 | 3.74 | 0.88 |

April | 1.72 | −0.41 | 0.19 | 4.05 | 0.87 |

May | 1.62 | −0.31 | 0.18 | 3.91 | 0.88 |

ME, mean error; MAE, mean absolute error; MARE, mean absolute relative error; RMSE, root mean square error; R^{2}, coefficient of determination.

As presented in Table 1, STOK using the product–sum variogram structure delivers more accurate results overall compared to the product variogram function. Specifically, for the five-month validation period in absolute terms, it delivers 22% less mean absolute prediction error, leading to a closer agreement with the reported values. Another metric to assess the prediction efficiency of STOK in terms of the two variogram models applied is the root mean square standardised error (RMSSE). It is used to assess the adequacy of the kriging variance as an estimate of the prediction uncertainty. The RMSSE is close to one if the kriging variance accurately estimates the variability of the predictions, but if it is greater or less than one, then the prediction variability is underestimated or overestimated, respectively, by the kriging variance. Here, the RMSSE for the five-month validation period when the product–sum variogram type is applied is equal to 0.91, while it is equal to 1.35 when the product type is applied. Thus, STOK with the product–sum model better captures the prediction variability, providing results very close to the measurement values. The RMSSE indicator further supports the prediction efficiency of the product–sum variogram structure compared to the product one.

The aquifer level map is then derived using STOK with the product–sum spatiotemporal variogram structure for February, March, April and May of 2015, the last month of available data and the most recent to date. The January map is very similar to that of February, as at those time periods, the aquifer levels in the basin are usually similar due to the gradual recharge of the aquifer. Thus, to avoid redundancy, only the February map is presented. In March, the highest levels are recorded, because the recharge from the winter period has been completed. In April and May, the pumping period starts, with significant pumping rates in the area. The contour maps of groundwater level spatial variability and uncertainty in physical space are shown in Figures 3 and 4. The maps are constructed using estimates only at points inside the convex hull of the monitoring locations. The estimation uncertainty depends on the data density and the variogram function. For all the maps produced here, the same variogram function and its parameters were used, and the data density was also the same. Therefore, the STOK standard deviation maps (estimation uncertainty) are similar for all the prediction periods. Hence, the uncertainty of estimations for May expresses the uncertainty for other periods as well.

The cross-validation results can be characterised as very satisfactory, as the examined dataset was very small compared to the extensive datasets used in similar works. The delivered cross-validation results and the standard deviation error are directly comparable with those of the two works most closely related to this study (Hoogland *et al.* 2010; Júnez-Ferreira & Herrera 2013). Specifically, the proposed space–time model delivers lower error values compared to the similar studies. The Matérn function has a third shape parameter that leads to the better modelling of the experimental variogram. Thus, the estimated parameters are more reliable, and the space–time kriging estimator provides more accurate results. Compared to similar works (Hoogland *et al.* 2010; Júnez-Ferreira & Herrera 2013), this study extends the applicability of space-time kriging to relatively small space-time datasets without involving auxiliary information. In addition, it addresses a drawback regarding the functions that can be used as spatial and temporal components of a space-time variogram structure to model efficiently both space and time variability (Rouhani & Myers 1990). It shows that if a flexible component is chosen as the Matérn function, the space-time kriging method is reliable to be applied to sparse space-time datasets. The Matérn model controls the degree of smoothness of the examined dataset introducing fitting flexibility (Pardo-Iguzquiza & Chica-Olmo 2008).

Figures 3 and 4 present the spatial variability of the groundwater level for February–May, estimated based on the space–time correlations of the data that consider the dynamic aquifer behaviour. It shows that the groundwater level changes from the eastern part of the basin towards the western part following the ground surface elevation. Figure 4 also presents the uncertainty (STOK standard deviation) of estimations and provides the basin locations where further sampling is needed, mainly at the central eastern part of the aquifer, to obtain more accurate estimations. Estimation variances take the highest values when the density of data is poor and the lowest values when the data distribution is good (Hohn 1999). In addition, the variance range depends on the optimal fitting of the theoretical variogram model to the experimental one (Chiles & Delfiner 1999), as is performed here (Figure 2). Therefore, the spatiotemporal properties of the data are optimally determined, leading to accurate estimates with low error variances.

The scope of this work was to model spatiotemporally the Mires Basin aquifer response since 2009. Reliable modelling provides the grounds for spatiotemporal predictions with the highest possible accuracy. A novel goal of this study was to assess the application of the product–sum variogram model to sparse groundwater level data. The model delivered an excellent variogram fit and very accurate estimates. After the variogram fitting, the spatial correlation length was determined to be almost 3 km, and the temporal length was determined to be almost five months. This explains the accurate estimates, as the spatiotemporal correlations were reliably represented inside the five-month estimation window.

Finally, the aquifer level maps (Figures 3 and 4) show that around a quarter of the examined basin area (western part), which corresponds to a significant part of the productive agricultural land, has a decreased aquifer level compared to the basin's groundwater level presented in previous works during previous hydrological years (Varouchakis & Hristopulos 2013; Varouchakis 2016a; Varouchakis *et al.* 2016).

## CONCLUSIONS

Reliable space–time estimates are important for groundwater resources management. This work examined the spatiotemporal modelling of groundwater levels in a hydrological basin where the groundwater resources had presented a significant reduction in the past 30 years. The spatiotemporal approach involves the application of the product–sum variogram function, which has been applied successfully in other disciplines areas, adapting the flexible Matérn function. This non-separable spatiotemporal structure fits the experimental space–time variogram of the groundwater level very well, capturing the space–time correlations of the available data. The STOK estimates accurately present the groundwater level variability for the examined validation period and provide the spatial distribution of the aquifer level at ungauged locations. The approach is shown to provide a reliable alternative in the spatiotemporal modelling of aquifer levels that requires less data than a numerical model to represent the head field and in less computational time.

## ACKNOWLEDGEMENTS

The author would like to thank the two anonymous reviewers and the editor for their valuable comments and effort to improve the manuscript.

## REFERENCES

*.*

*.*

*.*

*.*

*.*