Abstract

Groundwater level prediction and forecasting using univariate time series models are useful for effective groundwater management under data limiting conditions. The seasonal autoregressive integrated moving average (SARIMA) models are widely used for modeling groundwater level data as the groundwater level signals possess the seasonality pattern. Alternatively, deseasonalized autoregressive and moving average models (Ds-ARMA) can be modeled with deseasonalized groundwater level signals in which the seasonal component is estimated and removed from the raw groundwater level signals. The seasonal component is traditionally estimated by calculating long-term averaging values of the corresponding months in the year. This traditional way of estimating seasonal component may not be appropriate for non-stationary groundwater level signals. Thus, in this study, an improved way of estimating the seasonal component by adopting a 13-month moving average trend and corresponding confidence interval approach has been attempted. To test the proposed approach, two representative observation wells from Adyar basin, India were modeled by both traditional and proposed methods. It was observed from this study that the proposed model prediction performance was better than the traditional model's performance with R2 values of 0.82 and 0.93 for the corresponding wells' groundwater level data.

INTRODUCTION

Hydrological systems are widely studied using models that describe and relate various physical processes. Conceptual or physical models are the key tools to understand and predict the variable responses of the system (Anderson & Woessner 1992). However, most of the time, the absence of adequate physical parameters and poor understanding of the physical system may lead to poor modeling of the system. Hence, stochastic modeling that preserves the statistical relationship within the past observations is often used to predict/forecast the response of a system. A time series model is an empirical model for stochastically simulating and forecasting the behavior of uncertain hydrologic systems (Chatfield & Weigend 1994; Sannasiraj et al. 2004; Kim et al. 2005). This kind of model is often useful to model any hydrological system such as groundwater system for groundwater level fluctuation modeling and forecasting where there is limited availability of additional auxiliary data sets.

In general, stochastic time series models can be classified into two kinds: univariate and multivariate models. Univariate models consist of single variable series which can be modeled by autoregressive moving average (ARMA), autoregressive integrated moving average (ARIMA) group of models (Young 1999; Ahn 2000; Bidwell 2005; Adamowski & Chan 2011; Pektas & Cigizoglu 2013; Bayer et al. 2017), or traditional decomposition-based time series models (Peng & Liu 2000; Yang et al. 2009; Lu et al. 2013; Moeeni et al. 2017). On the other hand, multivariate models involve two or more input variables and their dynamic relationships with the output can be modeled by transfer function-noise (TFN) modeling technique. For example, TFN models predict the groundwater level response with respect to rainfall pulses (Tankersley et al. 1993; Bierkens et al. 1999; Bidwell 2005; Yu & Chu 2012; Shirmohammadi et al. 2013; Mohanasundaram et al. 2017). Multiple linear regression (MLR) techniques for predicting groundwater level involve the process of correlating the groundwater levels with one or more input variables which are causing the groundwater level responses (Sahoo & Jha 2015; Seeboonruang 2015; Seo et al. 2015). The application of artificial neural networks (ANN) and wavelet decomposition and the combination of ANN and wavelet decomposition models are also widely used for predicting groundwater level signals (Coulibaly et al. 2001; Coppola et al. 2003; Daliakopoulos et al. 2005; Kisi 2010; Mohanty et al. 2010; Adamowski & Chan 2011; Sreekanth et al. 2011; Maheswaran & Khosa 2013; Maiti & Tiwari 2014; Khaki et al. 2015; Nourani et al. 2015). However, the accuracy of the prediction and forecasting efficiency of the univariate time series model is mainly dependent on the proper conceptualization, model identification, and parameter estimation. Therefore, time series models for groundwater level signals must be considered for the seasonal component properly as monthly groundwater level signals are mostly influenced by periodicity or seasonality.

Seasonal autoregressive integrated moving average (SARIMA) is one of the varieties of ARIMA models. SARIMA models are especially applied to time series data having a strong seasonality component. Seasonal component is the significant component in the hydrological time series variables such as rainfall, stream flow, and groundwater level time series data. In general, finding suitable ARIMA and SARIMA models for modeling a univariate system is a trial and error process to find the suitable model structure with appropriate model parameters based on Box and Jenkins time series methodology (Box & Jenkins 1976). There have been many studies illustrating the application of SARIMA models for flow modeling and forecasting the flow in river systems (Govindasamy 1991; Mondal & Wasimi 2006; Modarres 2007; Wong et al. 2007; Fernández et al. 2008; Martins et al. 2011). Comparison of ARIMA models against deseasonalized ARMA (DARMA) models for weekly, monthly, and bi-monthly time series of karstic flow showed that DARMA model forecasting performance was better for the bi-monthly and monthly time series flow data, whereas the ARIMA model performed only slightly better than DARMA models for weekly data (Ghanbarpour et al. 2012). Similarly, there have been many studies illustrating the application of SARIMA models to predict and forecast the groundwater level fluctuations (Ahn & Salas 1997; Ahn 2000; Montanari et al. 2000; Adhikary et al. 2012; Shirmohammadi et al. 2013).

Paul (2008) developed a univariate time series model to forecast groundwater levels. Two different multiplicative SARIMA models, SARIMA (2,0,0)(3,1,0)12 and SARIMA ((1,2,10),0,0)(3,1,0)12 were developed. The developed models were compared for their forecasting accuracy in terms of root mean square (RMSE) values. The SARIMA ((1,2,10),0,0)(3,1,0)12 model performed better than the SARIMA (2,0,0)(3,1,0)12 model. Similarly, Adhikary et al. (2012) developed various seasonal ARIMA models and traditional decomposition time series models to fit the weekly groundwater level fluctuation data. Seasonal ARIMA (1,1,1)(1,1,1)52 model was selected among other forms of seasonal ARIMA models since the selected model error was free from autocorrelation process. In addition to that, it was proved in their study that the forecasting performance of the chosen seasonal ARIMA (1,1,1)(1,1,1)52 model was performing better in terms of mean absolute error (MAE), mean square error (MSE), and maximum absolute error (Max AE) when compared to traditional decomposition models. As the seasonal component is significant in the raw groundwater level signals, the coefficient for fitting the seasonal AR and MA components in SARIMA models perhaps have huge uncertainty in reproducing the seasonal variation in the prediction or validation period.

Decomposition-based models are developed based on decomposing the actual time series signal into trend, seasonal, and random components. Decomposition model applications on groundwater level response have been widely studied, for example, Seeboonruang (2014) studied the groundwater level response against climate and anthropogenic activities by decomposing deep groundwater signal into trend, seasonal, and random components. Eight different methods of trend estimation technique were applied to long-term groundwater levels monitored in the lower Chao Phraya basin in Thailand. Fifth-order polynomial interpolation of trendlines significantly matched with groundwater withdrawal pattern. Similarly, the periodic behavior occurring from detrended residuals was compared with the global climatic variability. They concluded that the magnitude response of deep groundwater in the region was much affected by the anthropogenic activities as compared to the global climatic variability. Similarly, Almedeij & Al Ruwaih (2006) investigated the periodic pattern of groundwater level fluctuation for six monitoring wells located in residential areas in Kuwait. Monthly groundwater level measurements obtained from observation wells were correlated with monthly averaged temperature and monthly totaled rainfall time series data over 10 years. The cyclic pattern of groundwater level together with rainfall and temperature was examined in the frequency domain using periodogram analysis. The trend component of groundwater level series was fitted by simple regression method in their study. Furthermore, they estimated the seasonal component by calculating monthly averages of detrended groundwater level series. However, the traditional way of fitting the trend component and estimating the seasonal component might cause a significant error when the mean and variance in the groundwater level time series data is a heteroscedastic process (varying mean and variance). Therefore, there is a need to consider a proper trend and seasonal estimate in the groundwater level time series data for minimizing the error in real-time prediction and forecasting the groundwater levels using decomposition-based time series models.

Traditional deseasonalization technique for monthly groundwater level data is done by calculating long-term monthly average values for the corresponding months from the time series data. The long-term monthly average values are used as the seasonal component in the traditional decomposition-based models. In general, these seasonal values denote the representative average values from the long-term groundwater level fluctuation data. These seasonal values may be representative average values for the stationary groundwater level fluctuation data as mean and variance is almost constant over time. However, these seasonal values might not be representative average values for non-stationary groundwater level fluctuation data as mean and variance varies over time.

Therefore, in this study, we have developed a novel technique to properly estimate the seasonal component and deseasonalize the time series by adopting a 13-month trend and corresponding confidence interval-based approach. The specific objectives of this study are: (1) to construct the confidence interval for irregularly varying groundwater level signal based on 13-month moving average trend values to estimate the seasonal component and (2) to compare the prediction performance of the proposed DARMA model against the traditional DARMA and SARIMA models.

METHODOLOGY

Basic steps in time series modeling

Time series analysis is done by four major steps: (1) model identification, (2) parameter estimation, (3) residual verification, and (4) model application. Before identifying the model structure, the raw time series data were checked for non-stationary unit root process using an augmented Dickey–Fuller (ADF) test. The ADF test tests the null hypothesis that the model has a unit root against an alternative hypothesis of a stationary process. The ADF test is given by: 
formula
(1)
the null hypothesis is,

H0: = 0

and the alternate hypothesis is,

H1: < 0

where is the time series value at time t; is the differencing operator; p is the number of lagged difference terms; is the autoregressive parameter; is the white noise residual; and c is the constant.

Non-stationary data can be corrected by first- or higher-order differencing of the time series data to make them stationary. Once the data are converted to the stationary process, a suitable number of AR and MA model parameters were identified by looking at the autocorrelation function (ACF) and partial autocorrelation function plots (PACF). After identifying the ARMA model structure, the parameter values were estimated by the maximum likelihood technique. Finally, the estimated model residuals were checked for independence, normality, and homoscedastic conditions prior to prediction and forecasting applications during the validation time period.

Deseasonalized ARMA models

Traditional deseasonalization method

Traditionally, the seasonal component was estimated by calculating the long-term monthly average values for each of the months from raw time series data. The deseasonalized time series was obtained by subtracting the estimated seasonal component from raw time series data. The traditional deseasonalized series was modeled with ARMA model parameters, represented here as TDs-ARMA as follows (Hipel & McLeod 1994): 
formula
(2)
 
formula
 
formula

where are non-seasonal autoregressive parameters of order p; are non-seasonal moving average parameters of order q; is the groundwater level value during the month j and year i; is the estimated mean monthly values corresponding to the month j.

Proposed deseasonalization method

The proposed 13-month moving average and corresponding confidence interval-based seasonal estimation and deseasonalization methods are explained in four major steps as follows.

Step 1. Estimating 13-month trend-cycle term: The trend component of the time series data was estimated by cell centered 13-month moving average filter. This rolling 13-month moving average filter calculates the trend component at the center cell based on the adjacent six cells' values on both the sides of the center cell.

Step 2. Constructing confidence interval around the 13-month moving average trend line: The mean and standard deviation values of each 13-cell sliding window were calculated along with the estimated trend values. The confidence interval was constructed around the 13-month trend line using the estimated mean and standard deviation values. The standard equations of 99% confidence interval were used for constructing the confidence interval around the trend line. The upper bound (Ub) and lower bound (Lb) equations for 99% confidence interval are given as follows: 
formula
(3)
 
formula
(4)
where is mean of the sliding window values; is standard deviation of the sliding window values; N is number of sliding window cells (13 numbers).

Step 3. Modified deseasonalizing: Unlike traditional deseasonalizing, the range of the raw time series data used for calculating the seasonal component in modified deseasonalizing were restricted with the bounded regions based on the constructed confidence interval limits (Equations (3) and (4)). As the confidence interval region excludes some of the extremely varying groundwater level values, the smoothed range of the groundwater level data set was only used for calculating the seasonal component. This way of limiting the groundwater level time series data from extreme values appropriately estimates the seasonal component for heteroscedastic groundwater level signals. But, the traditional deseasonalization method considers the entire actual range of groundwater level time series data for calculating the seasonal component, which might be overestimated in the case of highly varying groundwater level conditions. Furthermore, the estimated modified seasonal component was detected from the raw time series data to obtain the modified deseasonalized time series.

Step 4. Fitting the ARMA model for modified deseasonalized time series data: The modified deseasonalized time series data were fitted with proper AR and MA models, represented here as MDs-ARMA as follows: 
formula
(5)
where is the monthly average estimate corresponding to the month j from the modified deseasonalization approach.

SARIMA models

Multiplicative ARIMA model, also called SARIMA model, is an extended version of ARIMA model where seasonal autoregressive (SAR) and seasonal moving average (SMA) terms can be modeled along with non-seasonal AR and MA models. It is represented as follows: 
formula
(6)
where 
formula
 
formula
 
formula
 
formula

where are non-seasonal autoregressive parameters; are non-seasonal moving average parameters; are seasonal autoregressive parameters; are seasonal moving average parameters; is the water level value at time t; is the white noise; , are non-seasonal orders; , are seasonal orders; c is a constant; D is the order of non-seasonal differencing; and is the order of seasonal differencing.

Model information criterion indices

Different forms of SARIMA models can be specified by changing the number of seasonal and non-seasonal parameters for the same time series data. Therefore, an information criterion-based value will help in selecting a better model by compromising the error from the models along with the number of significant model parameters used in the model and the number of data values (n). In this context, the Akaike information criterion (AIC) and Bayesian information criterion (BIC) are used in this study to compare the SARIMA models and choose the best one among them. AIC (Akaike 1974) and BIC (Schwarz 1978) formulations are given as follows: 
formula
(7)
 
formula
(8)
 
formula
(9)
where L is the log likelihood value; m is the number of parameters in the model; n is the number of observations; and is the residual sum square.

Residual test

Residuals of the calibrated time series models were tested for their serial independence. Ljung–Box (LB) test was used to test if the model residual autocorrelation coefficients were statistically different from zero at a given level of significance or not. LB test is given by (Ljung & Box 1978): 
formula
(10)
where n is sample size; m is the lag length; is the autocorrelation coefficient; and is the Chi-square value with m degrees of freedom.

Model performance indices

Root mean square error (RMSE) and coefficient of determination (R2) indices were used to assess the developed time series model performances during the validation data period. The RMSE and R2 formulations are given as follows: 
formula
(11)
 
formula
(12)
where N is the sample size; is the observed value; is the forecasted/predicted value; and is the observed mean value.

Application site: Adyar basin

The Adyar basin is in the northeast coastal part of Tamil Nadu, India (Figure 1). The basin receives rainfall during both southwest (June–September) and northeast monsoons (October–December) with the northeast monsoon contributing more than 65% of the total annual average rainfall. The southwest monsoon rainfall is erratic in nature and summer rainfall is negligible. Long-term annual average rainfall is about 1,315 mm (WRO 2007). Elevation ranges from 183 m in the west to sea level in the eastern part of the basin. Soils in the basin have been classified into clayey, black, red sandy, alluvial, and colluvial soils. Black soils occur in the depressions adjacent to hilly areas in the western part. Alluvial soils occur along the river courses and eastern part of the coastal areas. Major hydrogeology in the basin has been classified as unconsolidated, semi-consolidated, and weathered and fractured rock formations. Groundwater occurs under phreatic and semi-confined conditions in inter-granular pore spaces in sands, sandstones, and in bedding planes and thin fractures in shales (WRO 2007). The Adyar basin river network with the locations of observation wells is shown in Figure 1. Observation wells are mostly concentrated in the eastern part of the Adyar basin.

Figure 1

Study area with observation wells' location map.

Figure 1

Study area with observation wells' location map.

Data used

Twenty-nine observation wells monthly depth to water table data were procured from the Institute for Water Studies (IWS), Taramani. However, for brevity, the proposed time series method, MDs-ARMA, along with TDs-ARMA and SARIMA was tested on two observation wells (well-13012 and well-13235) as shown in Figure 1 (as selected wells). The depth to water table data were converted to groundwater level elevation above mean sea level after subtracting the depth to water table values from digital elevation model values of the corresponding well locations in Adyar basin. The total length of the monthly groundwater level data used in this study is about 240 months (N = 240) from January 1988 to December 2007. Out of 240 months, the first 180 months' values (3/4th of the total data set) were used to calibrate the times series models and the remaining 60 months' values (1/4th of the total data set) were used to validate the models.

RESULTS AND DISCUSSION

Autocorrelation and partial autocorrelation analysis

As the first step in time series modeling, raw time series data were analyzed for the unit root non-stationary process (Figure 2). From the ADF test, it was observed that the null hypothesis for both the wells (well-13012 and well-13235) were accepted with the p values of 0.57 and 0.43 at 95% confidence level, respectively. According to ADF test, both the wells (well-13012 and well-13235) groundwater level times series data were implying non-stationarity due to the existence of the unit root process. Therefore, in SARIMA models, a seasonal differencing was done to change the non-stationary series to a stationary series. However, in the case of deseasonalized models, long-term average monthly values subtracted from raw time series (deseasonalized series) data itself changed the non-stationary data into stationary data. Therefore, no further differencing was done in the deseasonalized data as the deseasonalized time series itself was satisfying the stationarity test.

Figure 2

Groundwater level data for (a) well-13012 and (b) well-13235.

Figure 2

Groundwater level data for (a) well-13012 and (b) well-13235.

ACF and PACF plots of raw time series data for both the wells show the strong seasonality pattern in the monthly groundwater level time series data (Figures 3 and 4). This is obvious from the fact that the groundwater level response is a function of climatic variables such as rainfall and evapotranspiration which are also seasonal in nature. Therefore, the SARIMA models were chosen with seasonal differencing along with SAR and SMA parameters. The partial autocorrelation was significant at the lag of 1 month for both the wells which indicates that the data need to be modeled with simple non-seasonal AR process with one AR and MA parameters for both SARIMA and DARMA models.

Figure 3

ACF plots for (a) well-13012 and (b) well-13235.

Figure 3

ACF plots for (a) well-13012 and (b) well-13235.

Figure 4

PACF plots for (a) well-13012 and (b) well-13235.

Figure 4

PACF plots for (a) well-13012 and (b) well-13235.

Deseasonalized time series using traditional and proposed methods

Groundwater level signal along with 13-month moving average trend and the constructed confidence interval for both the wells are shown in Figures 5(a) and 6(a). The extreme variation in the groundwater level signal out of the confidence interval appears during pre-monsoon (June) and post-monsoon (January) seasons within the year and a repeated pattern was observed for all the years in the calibration data set (Figures 5(a) and 6(a)). The seasonal component was estimated by averaging the corresponding monthly values from January to December in the observed groundwater level signals. The resulting 12-month values were repeated for the remaining years during the calibration period. This is shown in Figures 5(b) and 6(b) as the traditional seasonal component. On the other hand, the modified seasonal component was estimated from January to December by averaging the corresponding months using the time series data in which the extremely varying groundwater level signals beyond the confidence interval were replaced with the corresponding boundary values of the confidence interval. The resulting 12-month values were repeated for the remaining years in the calibration period. This is shown in Figures 5(b) and 6(b) as modified seasonal component. As the extremely varying time series data were smoothed by confidence interval values, the resulting estimated modified seasonal component values were relatively lesser in magnitude when compared to traditionally estimated seasonal values (Figures 5(b) and 6(b)).

Figure 5

Decomposition of the time series data by MDs-ARMA model for (a) trend, (b) seasonal, and (c) deseasonal components of well-13012.

Figure 5

Decomposition of the time series data by MDs-ARMA model for (a) trend, (b) seasonal, and (c) deseasonal components of well-13012.

Figure 6

Decomposition of the time series data by MDs-ARMA model for (a) trend, (b) seasonal, and (c) deseasonal components of well-13012.

Figure 6

Decomposition of the time series data by MDs-ARMA model for (a) trend, (b) seasonal, and (c) deseasonal components of well-13012.

A significant difference among the traditional and modified seasonal component values was observed. For example, the estimated traditional and modified seasonal values ranged from 3.5 m to 7 m and 4.7 m to 6.4 m, respectively, for well-13012. A similar difference in the computed seasonal values among traditional and modified time series methods was observed for the other well (well-13235) also. Furthermore, the deseasonalized time series for TDs-ARMA and MDs-ARMA were obtained by subtracting the respective estimated traditional and modified seasonal components from the raw groundwater level time series signals for both the wells (Figures 5(c) and 6(c)). As there was a difference among traditional and modified seasonal estimates, there were corresponding changes in the calculated deseasonalized time series data (Figures 5(c) and 6(c)).

Model identification and parameter estimation

SARIMA models

The initial model structure can be guessed based on ACF and PACF plots of the time series data with unique combinations of seasonal and non-seasonal parameters for SARIMA models (Figures 3 and 4). Therefore, the three most suitable SARIMA model forms were identified for each well's time series data, from which the best one was selected based on AIC and BIC values (Table 1). The three identified forms of SARIMA models for both the wells are as follows:

Table 1

Model assessment indices of the identified SARIMA models using calibration data set

Well no. Model AIC BIC 
13012 SARIMA(1,0,0)(1,1,0)12 470.6 480.2 
 SARIMA(1,0,1)(1,1,0)12 470.0 482.5 
 SARIMA(1,0,1)(0,1,1)12 429.9 442.7 
13235 SARIMA(1,0,0)(1,1,0)12 585.8 595.4 
 SARIMA(1,0,1)(1,1,0)12 587.6 600.4 
 SARIMA(1,0,1)(0,1,1)12 554.5 567.3 
Well no. Model AIC BIC 
13012 SARIMA(1,0,0)(1,1,0)12 470.6 480.2 
 SARIMA(1,0,1)(1,1,0)12 470.0 482.5 
 SARIMA(1,0,1)(0,1,1)12 429.9 442.7 
13235 SARIMA(1,0,0)(1,1,0)12 585.8 595.4 
 SARIMA(1,0,1)(1,1,0)12 587.6 600.4 
 SARIMA(1,0,1)(0,1,1)12 554.5 567.3 
SARIMA(1,0,0)(1,1,0)12 
formula
(13)
SARIMA(1,0,1)(1,1,0)12 
formula
(14)
SARIMA(1,0,1)(0,1,1)12 
formula
(15)

Table 1 shows that the AIC and BIC values were comparatively less for the third SARIMA model, SARIMA(1,0,1)(0,1,1)12, when compared to the other two SARIMA models for both the wells. Therefore, the best model was chosen as SARIMA(1,0,1)(0,1,1)12 for further parameter estimation and model predictions. The best SARIMA model parameters were estimated using maximum likelihood technique and the estimated values and corresponding statistics are given in Table 2. Statistical t-test and p-value for AR and SMA parameters were highly significant at 0.01 alpha level (99% confidence level), whereas one of the MA parameters for well-13235 was not statistically significant. The constant parameters were also not statistically significant for both wells (Table 2). Therefore, it can be concluded that the SARIMA (1,0,1)(0,1,1)12 process during the calibration time period was mainly driven by AR and SMA parameters rather than MA and constant parameters.

Table 2

Estimated parameters of the selected SARIMA model

Well no. Model Parameter Value Standard error t-value p-value 
13012 SARIMA(1,0,1)(0,1,1)12 Constant −0.02 0.02 −0.77 0.22 
  ϕ1 0.65 0.08 8.41 0.00 
  θ1 0.20 0.08 2.41 0.01 
  ϴ12 −0.79 0.05 −15.00 0.00 
13235 SARIMA(1,0,1)(0,1,1)12 Constant 0.01 0.02 0.38 0.35 
  ϕ1 0.80 0.05 17.04 0.00 
  θ1 0.03 0.07 0.48 0.32 
  ϴ12 −0.77 0.05 −15.99 0.00 
Well no. Model Parameter Value Standard error t-value p-value 
13012 SARIMA(1,0,1)(0,1,1)12 Constant −0.02 0.02 −0.77 0.22 
  ϕ1 0.65 0.08 8.41 0.00 
  θ1 0.20 0.08 2.41 0.01 
  ϴ12 −0.79 0.05 −15.00 0.00 
13235 SARIMA(1,0,1)(0,1,1)12 Constant 0.01 0.02 0.38 0.35 
  ϕ1 0.80 0.05 17.04 0.00 
  θ1 0.03 0.07 0.48 0.32 
  ϴ12 −0.77 0.05 −15.99 0.00 

TDs-ARMA model

TDs-ARMA model parameters were estimated using maximum likelihood method and the corresponding statistics are given in Table 3. The p-value of autoregressive parameters for both the wells' data was highly significant at 0.01 alpha level (99% confidence level). However, the MA parameters for both the wells show relatively less significance than AR parameters during the calibration period.

Table 3

Estimated parameters for TDs-ARMA model

Well no. Model Parameter Value Standard error t-value p-value 
13012 TDs-ARMA(1,1) Constant 0.00 0.06 0.05 0.48 
  ϕ1 0.74 0.07 10.90 0.00 
  θ1 0.15 0.09 1.62 0.05 
13235 TDs-ARMA(1,1) Constant 0.00 0.08 0.03 0.49 
  ϕ1 0.81 0.05 16.79 0.00 
  θ1 0.01 0.07 0.12 0.45 
Well no. Model Parameter Value Standard error t-value p-value 
13012 TDs-ARMA(1,1) Constant 0.00 0.06 0.05 0.48 
  ϕ1 0.74 0.07 10.90 0.00 
  θ1 0.15 0.09 1.62 0.05 
13235 TDs-ARMA(1,1) Constant 0.00 0.08 0.03 0.49 
  ϕ1 0.81 0.05 16.79 0.00 
  θ1 0.01 0.07 0.12 0.45 

MDs-ARMA model

The AR and MA model coefficients for the MDs-ARMA model were estimated using the maximum likelihood estimation technique using MATLAB software. The estimated AR and MA model parameters for MDs-ARMA model are given in Table 4. The AR coefficients for both the wells were significant at alpha 0.01 level (99% confidence level), but the MA coefficients were significant at 0.05 alpha level and 0.01 alpha level for well-13012 and well-13235, respectively (Table 4). This shows that the modified deseasonalized series model yields significant values for both AR and MA coefficients, unlike the traditional deseasonalized time series model.

Table 4

Estimated parameters for MDs-ARMA model

Well no. Model Parameter Value Standard error t-value p-value 
13012 MDs-ARMA(1,1) Constant −0.02 0.08 −0.29 0.39 
  ϕ1 0.71 0.08 8.80 0.00 
  θ1 0.23 0.10 2.33 0.01 
13235 MDs-ARMA(1,1) Constant 0.02 0.11 0.18 0.43 
  ϕ1 0.75 0.06 11.79 0.00 
  θ1 0.17 0.08 2.13 0.02 
Well no. Model Parameter Value Standard error t-value p-value 
13012 MDs-ARMA(1,1) Constant −0.02 0.08 −0.29 0.39 
  ϕ1 0.71 0.08 8.80 0.00 
  θ1 0.23 0.10 2.33 0.01 
13235 MDs-ARMA(1,1) Constant 0.02 0.11 0.18 0.43 
  ϕ1 0.75 0.06 11.79 0.00 
  θ1 0.17 0.08 2.13 0.02 

Residual analysis for the developed models

A diagnostic check for the calibrated SARIMA, TDs-ARMA, and MDs-ARMA models was done using residual analysis for white noise process (zero mean with unit variance). The Ljung and Box statistical test proves that the overall serial correlation of the modeled error was statistically insignificant for both the wells for all three models.

Groundwater level prediction and validation by developed models

One-month rolling ahead prediction of the developed time series models along with the observed groundwater level data during the validation period for both the wells are shown in Figure 7(a) and 7(b). In general, it was observed that the decomposition-based ARMA models (TDs-ARMA and MDs-ARMA) predict the groundwater levels better than the SARIMA model. Although the TDs-ARMA and MDs-ARMA model performances were relatively close to each other, the MDs-ARMA model performs comparatively better than the TDs-ARMA model for both the wells.

Figure 7

One-month rolling ahead prediction of groundwater levels using TDs-ARMA, MDs-ARMA, and SARIMA models for (a) well-13012 and (b) well-13235.

Figure 7

One-month rolling ahead prediction of groundwater levels using TDs-ARMA, MDs-ARMA, and SARIMA models for (a) well-13012 and (b) well-13235.

The improved model performance of the MDs-ARMA model over TDs-ARMA model was mainly due to the improved estimated seasonal component based on 13-month trend and confidence interval approach. However, the traditional way of estimating the seasonal component values were relatively higher than the proposed model thus reducing the overall model performance for the TDs-ARMA model (Figure 7(a) and 7(b)).

SARIMA models are the least performing models for both the wells as they highly over- and under-predict the actual groundwater level signal peaks and troughs during pre-monsoon and post-monsoon seasons (Figure 7). This is mainly due to the fact that the estimated SMA parameters were not able to capture the actual variation in the extreme peaks and troughs of the groundwater level signals. This uncertainty in SMA parameters might have been lower for the homoscedastic (constant mean and constant variance) groundwater level signals as the seasonal variation among the years is perhaps less variable. However, in this study, both the groundwater level signals show a strong heteroscedastic process, the estimated SMA parameters fail to mimic the actual seasonal variation in the groundwater level signals.

Figure 8 shows the predicted and observed groundwater levels on 1:1 diagonal plot for all three models. As well-13012 is located in the highly elevated hard rock terrain part of the Adyar basin, the variation in the groundwater level signal is naturally varying in higher magnitude as the rainfall response in the fractured geological system is highly dynamic. This is the reason for all models predicting the groundwater level signal with comparatively higher deviation from the 1:1 line (Figure 8(a)). However, well-13235 is located in the central part of the basin where the geological condition is highly porous, thus the rainfall response in this region on groundwater level signal is more or less uniform. As a result, the developed time series model predictions were close to the 1:1 line (Figure 8(b)). However, as expected, the MDs-ARMA model predicted values were lying closer to the 1:1 line than the other two models for both the well locations. SARIMA model-predicted values during some periods highly deviated from the diagonal line, thus reducing this model's overall performance significantly compared to TDs-ARMA and MDs-ARMA models (Figure 8).

Figure 8

Predicted and observed groundwater levels on 1:1 scatter line for TDs-ARMA, MDs-ARMA, and SARIMA models for (a) well-13012 and (b) well-13235.

Figure 8

Predicted and observed groundwater levels on 1:1 scatter line for TDs-ARMA, MDs-ARMA, and SARIMA models for (a) well-13012 and (b) well-13235.

The calculated RMSE and R2 values during the validation period for the developed time series models are shown in Table 5. MDs-ARMA model prediction performance during the validation period was always higher when compared to the other two models. The reduction in RMSE values for the MDs-ARMA model over the TDs-ARMA and SARIMA models was 10 cm to 30 cm and 100 cm to 180 cm, respectively, for both wells (Table 5).

Table 5

Performance assessment of the developed time series models during the validation period

Well Models Validation statistics
 
RMSE (m) R2 
13012 SARIMA 1.75 0.13 
 TDs-ARMA 0.90 0.76 
 MDs-ARMA 0.79 0.82 
13235 SARIMA 2.53 0.00 
 TDs-ARMA 0.93 0.86 
 MDs-ARMA 0.64 0.93 
Well Models Validation statistics
 
RMSE (m) R2 
13012 SARIMA 1.75 0.13 
 TDs-ARMA 0.90 0.76 
 MDs-ARMA 0.79 0.82 
13235 SARIMA 2.53 0.00 
 TDs-ARMA 0.93 0.86 
 MDs-ARMA 0.64 0.93 

CONCLUSIONS

A unique deseasonalized ARMA model, MDs-ARMA, was developed to assess its prediction ability over traditional SARIMA and TDs-ARMA models in this study. A novel deseasonalizing method was proposed in this study based on 13-month moving average trend and confidence interval method for estimating the seasonal component. These models were tested for two observation wells' groundwater level signals. The groundwater level signal for both wells shows a strong seasonal pattern which was further identified from the ACF and PACF plots. Based on ACF and PACF plots, suitable SARIMA models with different forms were recommended, from which a best SARIMA model (SARIMA (1,0,1)(0,1,1)12) was selected based on AIC and BIC values. Similarly, the deseasonalized time series models (TDs-ARMA and MDs-ARMA) were modeled with one AR and MA parameter as they were already adjusted for the seasonal component.

Maximum likelihood method was adopted to estimate the developed model parameters. Autoregressive parameters were more highly significant than other parameters in all the developed time series models. The calibrated models were then used to predict the groundwater levels in a validation period and corresponding RMSE and R2 values were calculated for the respective models. It was observed that the proposed method was performing better than traditional deseasonalized ARMA and SARIMA models. The specific conclusions from this study can be summarized as follows:

  • Deseasonalized ARMA models consider less number of parameters when compared to SARIMA models as deseasonalized ARMA models were modeled with one AR and MA parameters, whereas SARIMA models were modeled with one extra seasonal MA parameter along with one nonseasonal AR and MA parameter.

  • Both deseasonalized ARMA models (TDs-ARMA and MDs-ARMA) reproduced the seasonal variation pattern of the actual groundwater level signal during the validation period.

  • The estimated seasonal component by traditional deseasonalizing approach was comparatively higher than the proposed deseasonalizing approach as the actual variation in the groundwater level signal was used to estimate the long-term monthly average values. This reduces the TDs-ARMA model performances when compared to MDs-ARMA model.

  • The proposed MDs-ARMA model consistently performed better with both the wells' groundwater level signals, thus, the application of this model is robust across various geological conditions.

In this study, these time series models were evaluated with 1-month rolling ahead prediction method. However, the multi-time step prediction and forecasting performance of the developed models need to be analyzed for long-term forecasting applications. Although univariate time series models are useful in prediction and forecasting of groundwater level signals under data limiting conditions, they cannot be used to predict the response due to climate change and land use change scenarios. Under such conditions, the time series model and their coefficients need to be related to the physical process and parameters to assess the climatic and land use change cases.

ACKNOWLEDGEMENTS

We would like to acknowledge the State Ground and Surface Water Resources Data Centre, IWS, Taramani, Chennai for providing groundwater level data to carry out this study. We would like to acknowledge the Water Engineering and Management program at Asian Institute of Technology for providing the computing facility. Finally, we would like to thank the reviewers for their valuable comments to improve this manuscript.

REFERENCES

REFERENCES
Adamowski
J.
&
Chan
H. F.
2011
A wavelet neural network conjunction model for groundwater level forecasting
.
Journal of Hydrology
407
(
1
),
28
40
.
Adhikary
S. K.
,
Rahman
M. M.
&
Gupta
A. S.
2012
A stochastic modelling technique for predicting groundwater table fluctuations with time series analysis
.
International Journal of Applied Science and Engineering Research
1
(
2
),
238
249
.
Ahn
H.
&
Salas
J. D.
1997
Groundwater head sampling based on stochastic analysis
.
Water Resources Research
33
(
12
),
2769
2780
.
Akaike
H.
1974
A new look at the statistical model identification
.
IEEE Transactions on Automatic Control
19
(
6
),
716
723
.
Almedeij
J.
&
Al-Ruwaih
F.
2006
Periodic behavior of groundwater level fluctuations in residential areas
.
Journal of Hydrology
328
,
677
684
.
Anderson
M. P.
&
Woessner
W. W.
1992
Applied Groundwater Modeling: Simulation of Flow and Advective Transport
.
Academic Press
,
San Diego, CA
,
USA
.
Bayer
F. M.
,
Bayer
D. M.
&
Pumi
G.
2017
Kumaraswamy autoregressive moving average models for double bounded environmental data
.
Journal of Hydrology
555
,
385
396
.
Bidwell
V. J.
2005
Realistic forecasting of groundwater level, based on the eigenstructure of aquifer dynamics
.
Mathematics and Computers in Simulation
69
,
12
20
.
https://doi.org/10.1016/j.matcom.2005.02.023
.
Bierkens
M. F. P.
,
Knotters
M.
&
van Geer
F. C.
1999
Calibration of transfer function-noise models to sparsely or irregularly observed time series
.
Water Resources Research
35
(
6
),
1741
1750
.
Box
G. E. P.
&
Jenkins
G. W.
1976
Time Series Analysis: Forecasting and Control
,
revised edn
.
Holden-Day
,
San Francisco, CA
,
USA
.
Coulibaly
P.
,
Anctil
F.
,
Aravena
R.
&
Bobée
B.
2001
Artificial neural network modeling of water table depth fluctuations
.
Water Resources Research
37
(
4
),
885
896
.
Daliakopoulos
I. N.
,
Coulibaly
P.
&
Tsanis
I. K.
2005
Groundwater level forecasting using artificial neural networks
.
Journal of Hydrology
309
,
229
240
.
Fernández
C.
,
Vega
J. A.
,
Fonturbel
T.
&
Jiménez
E.
2008
Streamflow drought time series forecasting: a case study in a small watershed in North West Spain
.
Stochastic Environmental Research and Risk Assessment
23
(
8
),
1063
1070
.
Ghanbarpour
R. M.
,
Amiri
M.
,
Zarei
M.
&
Darvari
Z.
2012
Comparison of stream flow predicted in a forest watershed using different modelling procedures: ARMA, ANN, SWRRB, and IHACRES models
.
International Journal of River Basin Management
10
,
281
292
.
Govindasamy
R.
1991
Univariate Box-Jenkins forecasts of water discharge in Missouri river
.
International Journal of Water Resources Developemnt
7
,
168
177
.
Hipel
K. W.
&
Mcleod
A. I.
1994
Time Series Modeling of Water Resources and Environmental Systems
.
Developments in Water Science Series
, Vol.
45
.
Elsevier
,
New York
,
USA
.
Khaki
M.
,
Yusoff
I.
&
Islami
N.
2015
Simulation of groundwater level through artificial intelligence system
.
Environmental Earth Sciences
73
,
8357
8367
.
Ljung
G. M.
&
Box
G. E. P.
1978
On a measure of lack of fit in time series models
.
Biometrika
65
(
2
),
297
303
.
Martins
O. Y.
,
Ahaneku
I. E.
&
Mohanned
S. A.
2011
Parametric linear stochastic modelling of Benue river flow process
.
Open Journal of Marine Science
1
(
3
),
73
81
.
Modarres
R.
2007
Streamflow drought time series forecasting
.
Stochastic Environmental Research and Risk Assessment
21
(
3
),
223
233
.
Mohanasundaram
S.
,
Narasimhan
B.
&
Suresh Kumar
G.
2017
Transfer function noise modelling of groundwater level fluctuation using threshold rainfall-based binary-weighted parameter estimation approach
.
Hydrological Sciences Journal
62
,
36
49
.
Mohanty
S.
,
Jha
M. K.
,
Kumar
A.
&
Sudheer
K. P.
2010
Artificial neural network modeling for groundwater level forecasting in a river island of eastern India
.
Water Resources Management
24
,
1845
1865
.
Mondal
M. S.
&
Wasimi
S. A.
2006
Generating and forecasting monthly flows of the Ganges river with PAR model
.
Journal of Hydrology
323
(
1–4
),
41
56
.
Montanari
A.
,
Rosso
R.
&
Taqqu
M. S.
2000
A seasonal fractional ARIMA Model applied to the Nile River monthly flows at Aswan
.
Water Resources Research
36
(
5
),
1249
1259
.
Nourani
V.
,
Alami
M. T.
&
Vousoughi
F. D.
2015
Wavelet-entropy data pre-processing approach for ANN-based groundwater level modeling
.
Journal of Hydrology
524
,
255
269
.
Paul
P. K.
2008
Ground Water Table Forecasting in A Shallow Aquifer: SARIMA Modeling
.
PhD thesis
,
University of Rajshahi
,
Bangladesh
.
Pektas
A. O.
&
Cigizoglu
H. K.
2013
ANN hybrid model versus ARIMA and ARIMAX models of runoff coefficient
.
Journal of Hydrology
500
,
21
36
.
Peng
M. H.
&
Liu
J. K.
2000
Groundwater Level Forecasting with Time Series Analysis
.
Geospatial World
.
Sahoo
S.
&
Jha
M. K.
2015
On the statistical forecasting of groundwater levels in unconfined aquifer systems
.
Environmental Earth Sciences
73
,
3119
3136
.
Sannasiraj
S. A.
,
Zhang
H.
,
Babovic
V.
&
Chan
E. S.
2004
Enhancing tidal prediction accuracy in a deterministic model using chaos theory
.
Advances in Water Resources
27
(
7
),
761
772
.
Schwarz
G.
1978
Estimating the dimension of a model
.
The Annals of Statistics
6
,
461
464
.
Seeboonruang
U.
2014
An empirical decomposition of deep groundwater time series and possible link to climate variability
.
Global NEST Journal
16
,
87
103
.
Shirmohammadi
B.
,
Vafakhah
M.
,
Moosavi
V.
&
Moghaddamnia
A.
2013
Application of several data-driven techniques for predicting groundwater level
.
Water Resources Management
27
,
419
432
.
Sreekanth
P. D.
,
Sreedevi
P. D.
,
Ahmed
S.
&
Geethanjali
N.
2011
Comparison of FFNN and ANFIS models for estimating groundwater level
.
Environmental Earth Sciences
62
,
1301
1310
.
Tankersley
C. D.
,
Graham
W. D.
&
Hatfield
K.
1993
Comparison of univariate and transfer function models of groundwater fluctuations
.
Water Resources Research
29
(
10
),
3517
3533
.
Wong
H.
,
Ip
W. C.
,
Zhang
R. Q.
&
Xia
J.
2007
Non-parametric time series models for hydrological forecasting
.
Journal of Hydrology
332
(
3–4
),
337
347
.
WRO (Water Resources Organization)
2007
Micro Level Study, Chennai Basin
,
Vol. I
.
Institute of Water Studies
,
Taramani
,
Chennai
,
India
Young
P. C.
1999
Nonstationary time series analysis and forecasting
.
Progress in Environmental Science
1
(
1
),
3
48
.
Yu
H. W.
&
Chu
H. J.
2012
Recharge signal identification based on groundwater level observations
.
Environmental Monitoring and Assessment
184
,
5971
5982
.