## Abstract

Seawater intrusion is one of the most serious issues to threaten coastal aquifers. Tourian aquifer, which is selected as the case study, is located in Qeshm Island, Persian Gulf. In this study, first the vulnerability of the region to seawater intrusion is assessed using chloride ion concentration value, then by using the autoregressive integrated moving average (ARIMA) model, the vulnerability of the region is predicted for 14 wells in 2018. The results show that the Tourian aquifer experiences moderate vulnerability and the area affected by seawater intrusion is wide and is in danger of expanding. It is also found that 0.95 km^{2} of the region is in a state of high vulnerability with Cl concentration being in a dangerous condition. The prediction model shows that ARIMA (2,1,1) is the best model with mean absolute error of 13.3 mg/L and Nash–Sutcliffe value of 0.81. For fitted and predicted data, mean square error is evaluated as 235.3 and 264.3, respectively. The prediction results show that vulnerability is increasing through the years.

## INTRODUCTION

Groundwater resources supply a considerable proportion of water consumption in many countries, hence it is important to identify the sources of contamination and provide approaches to reduce the risk of contamination. Coastal aquifers can be exposed to seawater intrusion due to their geological status, so it is crucial to recognize and manage this phenomenon. Qeshm island is the biggest island located in the Persian Gulf which is affected by seawater intrusion and investigating its current and future status is very important.

Different approaches have been applied by researchers for quantifying seawater intrusion in aquifers such as developing indices. One of the most common indices used in vulnerability assessment is called DRASTIC, which was developed by Aller (1985). DRASTIC considers seven parameters including depth to water, net recharge, aquifer media, soil media, topography, impact of the vadose zone, and hydraulic conductivity. Lobo-Ferreira *et al.* (2005) provided an index for assessing the vulnerability of coastal aquifers to seawater intrusion. The proposed index, called GALDIT, is based on six parameters including groundwater occurrence, aquifer hydraulic conductivity, height of groundwater level above sea elevation, distance from shore, impact magnitude of the existing seawater intrusion, and thickness of the aquifer. Kura *et al.* (2015) assessed the vulnerability of groundwater to seawater intrusion in tropical islands using DRASTIC and GALDIT indices. They also used sensitivity analysis to determine the effect of each parameter on the final value. Trabelsi *et al.* (2016) mapped seawater intrusion vulnerability by using GALDIT and groundwater quality index for seawater intrusion (GQISWI). They determined the trend of groundwater contamination by seawater intrusion in the coastal aquifer in Tunisia. The results show that the coastal area is experiencing seawater intrusion and the northern part is the most affected. One of the newest indices was introduced by Ballesteros *et al.* (2016). They developed an index to assess the vulnerability of the coastal aquifer in Spain called the SITE index, which stands for surface affected, intensity of intrusion, temporality and evolution. The GALDIT method does not consider dynamic features of the region and the SITE index covers this shortcoming. They applied this index for four different coastal aquifers in Spain and compared the results. Kazakis *et al.* (2018) modified the GALDIT method by developing a modified fuzzy multi-criteria categorization method to assess seawater intrusion vulnerability in a coastal aquifer of northern Greece. The modified GALDIT-F method achieved higher discretization of vulnerability zones which is essential to build a rational management plan to prevent seawater intrusion. Baena-Ruiz *et al.* (2018) assessed seawater intrusion vulnerability with 2D conceptual cross sections using Cl concentration. They delivered vulnerability maps considering Cl concentration values. They also analyzed the evolution of historical time series of hydraulic head to assess resiliency of the region.

Other approaches, such as using conceptual estimation, numerical modeling, and predicting the distribution of seawater intrusion for the near future, have been carried out by researchers. Lin *et al.* (2009) developed a numerical model of variable-density groundwater flow and salt transport to investigate the extent of seawater intrusion in the Gulf coast aquifers of Alabama in the USA. They used SEAWAT, a numerical model designed to simulate three-dimensional variable-density groundwater flow, to solve the density-dependent groundwater flow and solute transport governing equations. Using the calibrated model, they ran a predictive 40-year simulation which indicates that further seawater intrusion into the coastal aquifers can occur in the study area. Taormina *et al.* (2012) applied feed forward neural networks (FFNs) for simulating groundwater levels hourly in a coastal aquifer located in Venice, Italy. The results show that the FNN model has high accuracy. They also concluded that the suggested network can be used as a viable alternative to physical-based models to simulate the future status. Saidi *et al.* (2013) prepared groundwater management alternatives within a geographical information system for coastal aquifers by studying their vulnerability and, in particular, the risk of seawater intrusion. They also proposed some agricultural policies aimed at conserving groundwater resources. Gholami *et al.* (2015) used a multilayer perceptron (MLP) to simulate groundwater level fluctuations in alluvial aquifers. The results showed a high degree of accuracy and efficiency in the simulation of groundwater levels. Mirzavand & Ghazavi (2015) used different prediction models in Kashan plain, Isfahan province, Iran, which is an arid and semi-arid environment, to predict groundwater level fluctuations. They applied five time series models including AR, MA, ARMA, ARIMA, and SARIMA for 36 wells. The results showed that the AR model with a two-times lag (AR(2)), shows the best performance with a high accuracy of R^{2} (0.85). Yan & Ma (2016) delivered a hybrid model to predict monthly groundwater level in China for two observation wells. They used the ARIMA model to estimate the linear principal of time series and radial basis function network (RBFN) to predict the nonlinear residuals. They compared predicted values of the hybrid model with the ARIMA model and RBFN model. The result shows that the proposed hybrid model has the advantages of both models. Katimon *et al.* (2018) utilized ARIMA model to predict water quality variables, namely, pH, color (TCU), turbidity (ppm), Al^{3+} (ppm), Fe^{2+} (ppm), NH_{4}^{+} (ppm), and Mn^{2+} (ppm), and salinity for Johor River, Malaysia. Results showed that ARIMA is a suitable model for predicting water quality variables in short-term periods. Rahaman & Kalra (2019) used a seasonal ARIMA model to predict monthly groundwater level in Colorado River Basin. They concluded that the seasonal ARIMA model is able to predict with reasonable accuracy. De Moraes Takafuji *et al.* (2019) predicted groundwater level increase which effects the seawater intrusion in São Paulo State, Brazil using ARIMA (autoregressive integrated moving average) and SGS (sequential Gaussian simulation). The results show that ARIMA has higher accuracy while SGS can consider severe situations such as drought. Kumar & Rathnam (2019) predicted monthly groundwater level trends of Warangal district for 40 wells for 2015 using the ARIMA model. The results showed that some wells experience significant positive trends while others experience significant negative trends. ARIMA (2,0,1), ARIMA (1,1,1), and ARIMA (1,0,1) fitted for most of the wells.

Previous studies concentrated on groundwater fluctuations in coastal aquifers or rivers. In this study, using chloride ion concentration, a vulnerability map is delivered for the region. Then, by utilizing the ARIMA model, the distribution of Cl concentration and vulnerability status of the region are predicted for one year ahead.

## METHODOLOGY

In this study, the vulnerability of seawater intrusion is evaluated by using the SITE index and a Cl concentration map is provided to show the distribution of vulnerability. The SITE index is used in this study based on the following advantages. (1) It is one of the newest indices for applying in seawater intrusion (introduced in 2016). (2) Its simplicity: the formulation and concept is easy and understandable. (3) Required data: it only needs Cl concentration which is common data measured in most observation wells. Some indices need other water quality parameters such as H^{−}CO_{3} ion which is not measured in some stations such as the current case study (Tourian aquifer, Iran). (4) Considering the seasonality and dynamic features of the case study, which is not widely considered in previous studies. (5) Its structure and parameters provide the opportunity to be used in prediction models. Then, by applying the ARIMA model, the status of the region is predicted and is compared with the current condition. A sensitivity analysis is done to determine the importance of each parameter used in the SITE index.

### The SITE index

The SITE index (Ballesteros *et al.* 2016) is one of the recent indices introduced for assessing vulnerability of an aquifer to seawater intrusion. It uses chloride concentration obtained from observation wells as an input in different time periods. Three terms, including Vr, CS, and PS, are used in this approach and are described first: Reference value (Vr) refers to the common concentration of chloride ion in coastal aquifers. Values more than the Vr indicate the presence of saline water in the region. The World Health Organization (1996) suggested a chloride ion concentration threshold for human consumption as 250 mg/L, which is used as the Vr value in this study. Current situation (CS) is the latest four years and preceding situation (PS) indicates the four-year period before CS. The SITE index consists of four main parameters which are described in the following.

#### Surface affected (S)

*S*represents the surface of the aquifer which is affected by seawater and the concentration of chloride ion in that area is greater than Vr. To calculate this parameter, the isochloride map of the area is prepared by plotting isolines for current situation (CS). The isolines are usually defined as 1Vr, 2Vr, and 4Vr. If the chloride concentration is high in the region, more isolines can be drawn, such as 8Vr or 16Vr. The value of the parameter

*S*can be obtained using the following equation:in which

*S*is the total surface of the aquifer and

_{t}*S*is the surface in which the amount of chloride ion concentration exceeds the reference value.

_{a}#### Intensity of the intrusion (I)

*I*indicates the average salinity of the aquifer in the current situation. This parameter is obtained from the isochloride map. To evaluate this parameter, first the area between the isolines (

*S*) (with a value greater than Vr) is calculated and then is multiplied by the mean concentration of chloride in that area (

_{i}*Cl*). The result is divided by surface that exceeds the amount of chloride ion concentration from the reference value (

_{i}*S*):

_{a}#### Temporality (T)

*T*indicate that it is difficult for the region to return to its initial state after seawater intrusion occurred. It is calculated by Equation (3):where

*f(x)max*and

*f(x)min*are the maximum and minimum concentration of Cl, respectively,

*fmax*is the mean maximum value in current situation and

*n*is number of years considered in the current situation state (in this study it is four years,

*n*= 4).

#### Evolution (E)

*E*parameter considers the effect of water quality evolution in two specific periods called CS and PS. Equation (4) shows the formulation:in which is average chloride concentration in CS and is the average chloride concentration in the PS. Table 1 shows the classification and assigned values for each parameter used in the SITE index.

Parameter . | Range . | Characteristic . | Assigned value . |
---|---|---|---|

S | 0.00–0.10 | Very low | 0 |

0.11–0.25 | Low | 1 | |

0.26–0.50 | Moderate | 2 | |

0.51–0.75 | High | 3 | |

0.76–1.00 | Extreme | 4 | |

I | 250 | Very low | 0 |

251–500 | Low | 1 | |

501–1,000 | Moderate | 2 | |

1,001–2,000 | High | 3 | |

2,000 | Extreme | 4 | |

T | 0.00–0.10 | Very low | 0 |

0.11–0.20 | Low | 1 | |

0.21–0.30 | Moderate | 2 | |

0.31–0.40 | High | 3 | |

0.40 | Extreme | 4 | |

E | 0.85 | Significant recovery | −2 |

0.85–0.97 | Moderate recovery | −1 | |

0.98–1.02 | Stable | 0 | |

1.03–1.15 | Moderate deterioration | 1 | |

1.15 | Significant deterioration | 2 |

Parameter . | Range . | Characteristic . | Assigned value . |
---|---|---|---|

S | 0.00–0.10 | Very low | 0 |

0.11–0.25 | Low | 1 | |

0.26–0.50 | Moderate | 2 | |

0.51–0.75 | High | 3 | |

0.76–1.00 | Extreme | 4 | |

I | 250 | Very low | 0 |

251–500 | Low | 1 | |

501–1,000 | Moderate | 2 | |

1,001–2,000 | High | 3 | |

2,000 | Extreme | 4 | |

T | 0.00–0.10 | Very low | 0 |

0.11–0.20 | Low | 1 | |

0.21–0.30 | Moderate | 2 | |

0.31–0.40 | High | 3 | |

0.40 | Extreme | 4 | |

E | 0.85 | Significant recovery | −2 |

0.85–0.97 | Moderate recovery | −1 | |

0.98–1.02 | Stable | 0 | |

1.03–1.15 | Moderate deterioration | 1 | |

1.15 | Significant deterioration | 2 |

The maximum value is 30, hence the equation is divided by 30 for standardization. The SITE index categories are given in Table 2.

The SITE index value . | Status of intrusion . |
---|---|

0.10 | Very low |

0.11–0.25 | Low |

0.26–0.50 | Moderate |

0.51–0.75 | High |

0.76–1.00 | Extreme |

The SITE index value . | Status of intrusion . |
---|---|

0.10 | Very low |

0.11–0.25 | Low |

0.26–0.50 | Moderate |

0.51–0.75 | High |

0.76–1.00 | Extreme |

### Time series modeling (ARIMA)

For predicting seawater intrusion, the autoregressive integrated moving average (ARIMA) method is applied. To achieve this goal, chloride ion concentration is predicted for 2018 based on recorded values between 2010 and 2017. In order to validate the results, the predicted values are compared to the measured values.

The ARIMA model, introduced by Box & Jenkins (1976), is a stochastic model which is one of the most popular linear models in time series prediction. This model is capable of simulating dependent variables that have many predictors and contain some nonstationary degrees. The ARIMA model has some advantages such as predicting capability and richer information on time-related changes (Mishra & Desai 2005). In most time series, there is a serial correlation among the observations. Such a characteristic is efficiently considered by ARIMA.

*p,d,q*) model is as follows:where

*W*is the

_{t}*d*

^{th}difference of the original data and

*ζ*is the white noise. and are the lag polynominals,

*B*is the lag operator, and

*p*are the parameter and order of autoregressive (AR) part, and

*q*are parameter and order of moving average (MA) part, respectively, while

*d*represents the degree of ordinary differencing applied to make the series stationary.

*et al.*1980). The performance of the ARIMA model was evaluated by applying two criteria: Nash–Sutcliffe (NS) and mean absolute error (MAE). The Nash–Sutcliffe criterion is the explained proportion of the variance of the observed data which is shown in Equation (7) (Nash & Sutcliffe 1970). NS value higher than 0.5 indicates that the model is predicting well.where

*y*is the predicted value,

_{i}*o*is the observed value, and is mean of observation values. The mean absolute error is the average of the absolute errors presented as the following equation:

_{i}## STUDY AREA

Qeshm Island is located in the Persian Gulf, Iran (Figure 1). It is the biggest island in the Persian Gulf and has an area of 1,491 km^{2} and is about 135 km long. The mean rainfall and temperature in this region are about 183 mm and 27 °C, respectively. It is also the most populated island of Iran according to the 2016 census, which iterates that the population is about 150,000 (Statistical Center of Iran 2016). Drinking water is supplied by groundwater resources, specifically the Tourian aquifer which has an area of 17.21 km^{2} and is located on the north-east side of the island. A previous study showed that the aquifer is only affected by the northern shore due to the structure and topography of the region (Iran Water Resource Management Company 2010). Previous studies in the region did not focus on seawater intrusion vulnerability and prediction. In this study, these issues are covered.

The chloride ion concentration data for 14 observation wells were obtained from Iran Water Resource Management Company (2019). The values were collected and recorded monthly in the region from 2010 to 2017.

## RESULTS AND DISCUSSION

### The SITE index

First, the Cl distribution map is constructed from the mean value over the time period (2010–2017) (Figure 2). Then, it is categorized into 250 (Vr), 500 (2Vr), and 1,000 (4Vr) mg/L sections for implementation in the SITE index and which is shown in Figure 3. Finally, the SITE index is calculated for an eight-year period (2010–2017), in which the current situation (CS) is considered between 2014 and 2017 and the preceding situation (PS) is between 2010 and 2013.

Figure 2 shows that Cl concentration varies between 104 and 566 mg/L. The highest and lowest value of Cl concentration occurred near the shoreline and southern part of the aquifer, respectively. According to Figure 3, 0.95 km^{2} of the region is in a high vulnerability state with Cl concentration being in a dangerous condition, while 8.54 km^{2} is in alerting state and 7.72 km^{2} of the aquifer experiences no threat as yet.

*S* and *I* parameters are computed based on the area which has Cl concentration value more than reference value (Vr), which is presented in Table 3. *T* and *E* parameters are calculated based on current and preceding chloride concentration in the region and shown in Table 4.

Chloride concentration (mg/L) . | 0–250 . | 250–500 . | 500–1,000 . | Chloride concentration range (mg/L) . | 250–500 . | 500–1,000 . |
---|---|---|---|---|---|---|

S | 7.72 | 8.54 | 0.95 | (mg/L) | 375 | 750 |

S _{t} | 17.21 | 3,558.75 | 7,117.5 | |||

S | 0.552 | I | 390.157 |

Chloride concentration (mg/L) . | 0–250 . | 250–500 . | 500–1,000 . | Chloride concentration range (mg/L) . | 250–500 . | 500–1,000 . |
---|---|---|---|---|---|---|

S | 7.72 | 8.54 | 0.95 | (mg/L) | 375 | 750 |

S _{t} | 17.21 | 3,558.75 | 7,117.5 | |||

S | 0.552 | I | 390.157 |

Maximum annual . | Minimum annual . | Difference Max/Min . | T
. | concentration in 2010–2013 . | concentration in 2014–2017 . | E
. |
---|---|---|---|---|---|---|

800 | 18.5 | 781.5 | 0.962 | 198.35 | 216.44 | 0.916 |

Maximum annual . | Minimum annual . | Difference Max/Min . | T
. | concentration in 2010–2013 . | concentration in 2014–2017 . | E
. |
---|---|---|---|---|---|---|

800 | 18.5 | 781.5 | 0.962 | 198.35 | 216.44 | 0.916 |

The SITE index is computed based on all four aforementioned parameters. The result is shown in Table 5.

Parameters . | S
. | I
. | T
. | E
. | The SITE index . |
---|---|---|---|---|---|

Values | 0.552 | 390.157 | 0.962 | 0.916 | 0.475 |

Assigned value | 3 | 1 | 4 | −1 | |

Characterization | High | Low | Extreme | Moderate recovery | Moderate |

Parameters . | S
. | I
. | T
. | E
. | The SITE index . |
---|---|---|---|---|---|

Values | 0.552 | 390.157 | 0.962 | 0.916 | 0.475 |

Assigned value | 3 | 1 | 4 | −1 | |

Characterization | High | Low | Extreme | Moderate recovery | Moderate |

The results of the SITE index show that the Tourian aquifer experiences moderate vulnerability with a value of 0.475, which is close to extreme category, hence comprehensive planning and management should be considered for this region. The *S* parameter is in the high category, which shows that the area affected by seawater intrusion is broad and in danger of expanding. Intensity of intrusion is still at low level but the seasonality is in an extreme situation that shows high variation of Cl concentration through the year. Also, the *E* parameter indicates moderate recovery from seawater intrusion. It suggests that the aquifer is not in a condition that will easily return to its initial state.

### Time series modeling (ARIMA)

The ARIMA model has three parameters termed *p*, *d*, and *q*, which are lag order, degree of differencing, and order of moving average, respectively. For evaluating d, first the time series of Cl ion concentration is plotted (Figure 4). As can be seen, there is a positive trend in the time series. This is verified by the Mann–Kendall test (Kendall 1975). Based on the Mann–Kendall test, the Z-score is computed as greater than 3.5, which represents a positive trend in the time series.

For removing the trend, one lag time difference is applied and the Z-score is reduced to −0.398. By considering a 95% confidence interval, the time series has become stationary (Figure 5).

Initial values for parameters *p* and *q* are based on autocorrelation function (ACF) and partial autocorrelation function (PACF). As can be seen in Figures 6 and 7, the initial estimate for *p* and *q* is considered 2 and 1, respectively.

Akaike information criterion (AIC) is used to find the ARIMA model with the best simulation performance (Akaike 1974). Based on this criterion, different values for *q* and *p* are considered for different model simulations. The result showed that ARIMA (2,1,1) is the best model and hence it is selected in this study. The chosen ARIMA model is applied using Equation (6) and then the Cl concentration values are predicted for all 14 observation wells in 2018. Then, new isochloride maps are prepared for predicted values and a new SITE index is calculated. Figure 8 shows the comparison between observation and predicted values for observation well 1. It is one of the observation wells in which the Cl concentration is higher than Vr. The mean absolute error (MAE) and Nash–Sutcliffe (NS) for the ARIMA model are 13.43 mg/L and 0.81, respectively. Mean square error (MSE) for fitted and predicted data is 235.3 and 264.3, respectively. According to MAE, NE, and MSE error values, the ARIMA model is predicting well. ARIMA fit residual error line and density is shown in Figures 9 and 10, respectively.

The Cl concentration is predicted for all 14 observation wells and the SITE index is calculated for new predicted values. The results are shown in Figures 11 and 12. Figure 11 shows that NS value for each observation well is more than 0.5, which indicates that the model is predicting well. Figure 12 illustrates that the SITE index value increased from 0.475 to 0.481, which indicates that more areas of the aquifer are shifting to the high status of vulnerability. This is important for preparedness purposes since the model can alert that hazard might be occurring.

### Sensitivity analysis

A sensitivity analysis is also applied to determine the importance of each parameter used in the SITE index. The method used for sensitivity analysis is one factor at a time (OAT), which considers the effect of one parameter on output value while the others are supposed to be constant. Table 6 shows the output variation (the SITE index) in terms of 10, 20, 30, and 40% changes in *S*, *I*, *T*, and *E* parameters.

Parameters . | Percent of variation of parameters . | |||||||
---|---|---|---|---|---|---|---|---|

±10% . | ±20% . | ±30% . | ±40% . | |||||

Lower . | Upper . | Lower . | Upper . | Lower . | Upper . | Lower . | Upper . | |

S | −26.31% | 0% | −26.31% | 0% | −26.31% | 0% | −26.31% | 26.31% |

I | 0% | 0% | 0% | 0% | 0% | 15.79% | −15.79% | 15.79% |

T | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% |

E | −7.02% | 7.02% | −7.02% | 14.04% | −7.02% | 21.05% | −7.02% | 21.05% |

Parameters . | Percent of variation of parameters . | |||||||
---|---|---|---|---|---|---|---|---|

±10% . | ±20% . | ±30% . | ±40% . | |||||

Lower . | Upper . | Lower . | Upper . | Lower . | Upper . | Lower . | Upper . | |

S | −26.31% | 0% | −26.31% | 0% | −26.31% | 0% | −26.31% | 26.31% |

I | 0% | 0% | 0% | 0% | 0% | 15.79% | −15.79% | 15.79% |

T | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% |

E | −7.02% | 7.02% | −7.02% | 14.04% | −7.02% | 21.05% | −7.02% | 21.05% |

The results show that three of the parameters are sensitive to changes. Parameter *S* is sensitive even in lower band changes (10%) while parameter *I* varies when the percentage of change is high (30%). Parameter *E* is also sensitive from lower bands of change (10%) to higher bands (40%) but compared to parameter *S* it has a lower percentage of variation. According to the categorization delivered in Table 1, parameter *T* is divided into five categories: 0–0.1, 0.11–0.2, 0.21–0.3, 0.31–0.4, and greater than 0.4. *T* is evaluated to be 0.962 in this study so it is included in the last category (greater than 0.4). It needs a dramatic change in its value to be included in the third category (0.31–0.4), which is the reason parameter *T* does not vary even with 40% change in the sensitivity analysis.

As can be seen from the study, the ARIMA model showed its advantage in predicting water quality parameters in short time periods.

## SUMMARY AND CONCLUSION

In this study, the SITE index is used to assess seawater intrusion vulnerability of the Tourian aquifer located on Qeshm island, Persian Gulf. The results showed that the aquifer is categorized as moderate vulnerability (the SITE index value of 0.475). In the next step, the ARIMA model is applied to predict the vulnerability of the aquifer for 2018. The results show that Cl concentration is increasing in the following year. The ARIMA model worked with acceptable accuracy (NE value of 0.81 and MAE value of 13.43 mg/L). Finally, a sensitivity analysis was done to determine the importance of each parameter in the region. Parameter *S* (surface affected) was shown to be the most sensitive parameter among them. The results of this study suggest that the residual error of the ARIMA model increases over time and makes the model less applicable. Hence, the ARIMA model can be a good tool for groundwater quality prediction in a short time period. In this study, Cl concentration is used as an indicator for seawater intrusion vulnerability. It is suggested that for future studies other water quality parameters are also considered to evaluate the effects on the results. Also, a comparison can be made between different prediction models to investigate each model accuracy in the case of seawater intrusion.