Joint estimation of states and parameters of two-layer coastal aquifers based on ENKF


 Management of groundwater resources has become a source of heated discussion in coastal hydrogeology. Thus, we introduced an Ensemble Kalman Filter (ENKF) into a two-layer confined groundwater model based on the interactive operation between the MATLAB and GMS to investigate the capability of ENKF under complex conditions and obtain a relatively new forecasting method. ENKF was employed to assimilate and forecast groundwater levels, and invert the hydraulic conductivity (K) of the heterogeneous study area, where the initial values of K were obtained by using trial-and-error based on the two-period groundwater levels. After comparing the efficiencies in forecasting groundwater levels among ENKF, the modified model, and the initial model, four major conclusions could be drawn. ENKF converged fast when forecasting groundwater levels and the accuracy was high. Various convergent results would be represented by ENKF when K in different layers was observed in the same error. ENKF performed better than the initial simulation when monitored data subjected to a certain range of interferences. Forecasting accuracy in the middle of the study area could be enhanced by the large improvement degree of K through ENKF. Therefore, this analytical method could be a theoretical reference for groundwater resources management in coastal areas.

impacts, together with those of excessive human water use have changed the country's water availability structure (Khaki et al. ). At the same time, the spatial and temporal characterizations of groundwater were also altered (Chakraborty et al. ). Therefore, identifying the characterizations of groundwater has become one of the key points in scientifically evaluating and managing groundwater resources. Many methods can be used to predicting the dynamics of groundwater now, such as the evaluation of water balance (Portoghese et al. ), numerical simulation (finite-difference method (Ansarifar et al. ), finite-element method (Huang et al. )), regression analysis (Chenini & Msaddek ), and spectral analysis (Kim et al. ). With the improvement of computing techniques, some calculation methods, such as the artificial neural network (Khedri et al. 2020) and wavelet transform (Rahman et al. ), have been applied to forecast the groundwater level gradually. At present, the relatively traditional methods have been widely used in the analyses of groundwater dynamics. However, they were difficult to integrate realtime data into the model efficiently. The method of sequential data assimilation could utilize the monitored value in time, update the parameters and variables based on historical estimations and the latest data (Tong et al. ). As one of the most popular methods of sequential data assimilation, the Ensemble Kalman Filter (ENKF) was first proposed in 1994 (Evensen ) and was refined by Burgers et al. (). ENKF extended the traditional linear Kalman Filtering (KF) (Zhou et al. ) to the non-linear field, and showed a more stabilized calculation than Extended Kalman Filter (EKF) (Evensen ) in the non-linear field.
ENKF has previously been applied for large-scale nonlinear models in oceanography (Haugen & Evensen ) and hydrology (Margulis et al. ) successfully. The first application of ENKF for flow in porous media was petroleum engineering (Naevdal et al. ). Then, ENKF was introduced into the hydrogeology gradually to acquire the hydraulic conductivity and the migration of pollutant plumes in the synthetic models (Huang et   However, the above-mentioned studies about ENKF were almost based on the virtually synthetic models and rarely applied in the real world. The ENKF was not integrated into the two-layer confined model in the coastal area so far. Thus, conducting a study on the complex groundwater model in the real world is needed (e.g. the groundwater model of the coastal area subjected to strong tidal effects.). This study aims to evaluate the capability of ENKF in forecasting groundwater levels and inverting the hydraulic conductivity of the heterogeneous field in the relatively complex study area. The advantages and drawbacks of ENKF were also been comparatively analyzed. In this study, we introduced ENKF into a petrochemical project in the coastal area of Tianjin city, took the two-layer confined flow model as a prediction function, assimilated the measured groundwater levels into the model, forecast groundwater levels, and inverted the hydraulic conductivity to obtain better prediction results. This paper was structured as follows. Firstly, the operation of ENKF, site description, calculations schemes, and evaluating indexes about the capability of ENKF were described. Then the results of forecasting groundwater levels and second inverse K by ENKF were presented. To show the efficiency of ENKF in forecasting aspects, comparative analyses were depicted as discussions. Finally, conclusions and guidelines for future work were given.

Ensemble Kalman Filter (ENKF)
The Ensemble Kalman Filter (ENKF), a widely used sequential data-assimilation method based on a Monte Carlo approximation, that can update model parameters as well as model variables included in the state vector using various types of serially dynamic observations. ENKF has a better performance than the standard Kalman Filter (KF) due to KF need compute and propagate the error covariance matrix explicitly in time that may pose a significant computation burden for large or nonlinear problems. In contrast, ENKF abandoned the forecasting of the error covariance matrix in the model and directly carried out the multiple integrations based on ensembles via the Monte Carlo method which means it can be calculated based on the continuously updated ensemble of realizations (Chen and Zhang 2006). Therefore, one of the reasons why ENKF widely used was that the number of calculations was greatly reduced. The main procedure for ENKF is as flows: (1) Generation of the initial ensemble where X is an ensemble which concluded some samples; x i is a sample in the ensemble X; m is the size of samples; (2) Forecasting the parameter and state vector where h f k,j is the forecasting value of the sample k at time step j, j stands for the time step, k stands for the number of the sample; h a k,jÀ1 is the analyzed value of the sample k at time step j À 1, the superscripts f and a indicate the forecast and assimilation procedure, respectively; F is a forecast operator, representing the flow equations for our study.
(3) Updating the parameter and state vector where h a k,j is the analyzed value of the sample k at time step j; K k denotes the Kalman gain; d obs,j is the observation vector (monitored-data) at the time step j; H k is the observation operator which represents the relationship between the state vector and the observation vector; C x f k denotes the state error covariance matrix; C d obs,k is the error covariance matrix of the observations.
(4) The assimilated value of ENKF Finally, averaged values 〈h a j 〉 of all samples were set as the assimilated value at time step j.
The four calculated steps mentioned above were the main procedure of ENKF at time step j, then the forecast model will run until new observations become available.
so values at next time step j þ 1 could be forecast by returning the 2nd calculated step, namely Equation (2).
And the forecasting values could fit the monitored-data at the next time step better by assimilating the monitored-data at the last time step into the ENKF model (Shen ). The above steps were realized by an interactive operation between MATLAB and GMS in this study.

Site description
The study area located in the intertidal zone which was reclaimed by the silty sand was about 15 km 2 . According to the hydrogeological survey, the aquifer in the intertidal zone was composed of two confined layers, and the depth of groundwater level was relatively shallow due to the influence of overlying additional pressure and the surrounded seawater. Figure 1 showed the location and the specific environment of the study area. Q1, Q2, Q3, Q4, Q5, Q7, Q8, W4, and W5 were observation wells. Figure 1(b) presented the partitioned blocks of K in layer 2 which was divided into 9 blocks. This was the first inversion of K.
And the first inversion was obtained by using the method of trial-and-error based on the two-period groundwater levels. That is to say, the first inversion K was obtained through changing the value of K and making the predicted groundwater levels fit to monitored levels. The process of changing the values and blocks of K of the groundwater model was the implementation of trial-and-error method.
The K of 9 blocks in layer 2 were named by observation wells, i.e., Q1-K2, Q2-K2, Q3-K2, Q4-K2, and so on. In each block, as shown in Figure 1(b), there was a notation like 'W5-K2(2.8)' which meant the first inverse value of the hydraulic conductivity was 2.8 m À1 d through the method of trial-and-error in the W5 block. Since the hydraulic conductivity of layer 1 was not partitioned (not displayed in the figure), there were 10 inverse blocks of the hydraulic conductivity which would be inverted by ENKF. The first inverse value of partitioned K was utilized as the initial value in ENKF which means the 10 blocks values of K would be inverted by ENKF secondly.

Calculations schemes
ENKF was applied to assimilate and forecast the groundwater level of a two-layer confined model for a petrochemical project in the coastal area in Tianjin. As ] is the source or sink term, which was set zero because of the aquifer was confined in this study; S s [L À1 ] is the specific storage; h 0 (x, y) [L] is the initial groundwater head on the 0th day, of which the depth was about 1.51-5.19 m; τ 1 is the boundary condition of the first type, in this study, the prescribed head was adopted for four sides of the model and the impervious boundary was applied for the bottom of model. The thickness of aquifer was assigned according to the borehole data of hydrogeological survey.
The study area was discretized into squares as shown in Combined with previous studies and considering the area of this study, the 100 samples were adopted, that is, m ¼ 100 On the other hand, considering the accuracy of water level probes and the location of study area is in the intertidal zone, the observation error in the groundwater level was set as 0.01 m and the groundwater level was monitored at each assimilated step during the operation of ENKF (Hu ). The observation error of K was set as 5% of their value and it was only observed in the first time step (i.e., 1.84 d).
The relative observation error of 5% was adopted to study the soil moisture in some previous studies (Shi et al. ; Hu ). In this study, the relative observation error of 5% for K was employed to study whether the same observation error used in both layers could present the same convergent results in ENKF. In reality, it is difficult to define the observation error of K, the value of relative error (5%) was just to test the performance of ENKF in the two-layer confined aquifers.

Evaluating indexes of ENKF
To evaluate the efficiency of assimilation and forecast in ENKF, the RMSE (root-mean-square error) (Naevdal et al.

Results of forecasting groundwater levels
Results of forecasting groundwater levels through ENKF

Results of second inverse K
The 10 blocks values of hydraulic conductivity (K ) in the two-layer confined aquifer were inverted by ENKF secondly of which the initial values were obtained by the first inversion based on the two-period water levels. The second inverse values of K at the final assimilated step, the initial values of K, and the improvement degrees were shown in Table 1. The improvement degrees were calculated by the differences between the second inverse values and the initial values dividing the initial values. It can be seen from Table 1 that the second inverse K of layer 2 smoothly converged to a specific value within 16 assimilation steps, with little difference from initial values. That means when the relative observation error of K was 5%, the assimilation process was convergent for the layer 2, while the second inverse K of the layer 1 was quite different from the initial value, even went beyond the initial scope of interferences. To further explain the differences of inverse K between the two layers in the same model when taking the same observation error, the convergent details of 100 samples in the assimilation process were drawn in Figure 6. To avoid   represented the samples convergences of K in layer 1 for the relative observation error adopted 5%, 4%, 3%, respectively.).

8
X. Huang et al. | Case study of Tianjin city coastal area Water Supply | in press | 2020 Uncorrected Proof change in the mean value of samples after the 8th assimilation step. While the divergence appeared in the assimilated process of ENKF for K1, values of samples could not converge as shown in Figure 6(b). The reason why the divergence appeared was that the noise of model was too large to make the model inaccurate or the observation error was too big to make the Kalman gain matrix K k small, which made the innovation account for a large proportion and lead to the inability to fit the real-field monitored values (Evensen ; Dan et al. ). Because of the same numerical simulation adopted in this paper, the system noise was considered as zero, so we try to make the assimilated process of K1 converge by reducing the observation error and increasing the proportion of the original value. When relative observation error 4% was adopted for the layer 1 in ENKF, although the mean value of K1 did not converge to a specific value within 16 assimilation steps, it was obvious that the convergence of samples appeared as seen in Figure 6(c). While for the relative observation error 3%, ENKF performed better and converged to a specific value in a short time as demonstrated by Figure 6(d).
The observation error had a great influence on the assimilation process as concluded from the above. K2 (the 9 blocks of K in layer 2) could converge with a larger observation error in the ENKF system, while for the K1 (the K in layer 1), a smaller observation error should be applied to make the assimilation converge. It is interesting that ENKF shows a huge difference for the same parameter in the same model. The reason may because the model is two-layer and the location of study area is an intertidal zone. Anthropogenic activities and tidal influences are two main factors that can strongly affect the inversion of K, especially for layer 1 in this study. From section 2.2, the K of layer 1 was not partitioned and the initial inversion based on the twoperiod groundwater levels could have a larger error compared with the K in layer 2. While for the operation of ENKF, the same observation error of 5% was adopted in the two layers, which lead to the divergence of ENKF for K1.
The above analysis showed that even if the same error was adopted in the same model for the same parameter, the performance of ENKF could be different. Especially for the real monitored data subjected to much more external interferences, the inversion of K had a larger error based on the monitored data. If the inverse K with a large error and the inverse K with a small error were inverted by ENKF simultaneously, the possibility of filter divergence was greater.
Thus, observation errors of the same parameter should be considered different in the real-field model, especially for the two or more layers of groundwater models.

DISCUSSION
Comparative analyses between the assimilation before and after Comparative analyses between the modified and the initial simulation As mentioned before, the modified model was obtained by introducing the second inverse K presented in Table 1 into the initial simulation. Relative errors of 9 observation points were displayed in Figure 8. was the highest among all inverse blocks from the Table 1.
The relative large improvement degree in the hydraulic conductivity made Q5 present a large relative error meant that the modified model was greater than the initial model.
That means forecasting accuracy of groundwater levels in the middle of the study area could be enhanced by the large improvement degree of K through ENKF. Figure 8

CONCLUSIONS
In this study, an assimilated system based on ENKF and 2D confined flow model of two layers was constructed. Groundwater levels of the study area were forecast gradually with different time intervals by assimilating the real-field monitored data. And 10 blocks of hydraulic conductivity were inverted by ENKF secondly. Comparative analyses among ENKF, the modified model according to the second inverse K, and the initial model were discussed in this study. The key findings were: (1) ENKF could utilize all the real-field monitored data efficiently, the performance was well acceptable when assimilating and forecasting groundwater levels. However, when monitored data at some observation points were beyond the intensity of interferences accepted by ENKF, the performance was not good. (b) showed the observation points with fluctuated accuracy. Q1-0.059% meant the improvement degree of K2 was 0.059% in the Q1 block.).
(2) Results of the second inverse hydraulic conductivity demonstrated that even if the same parameter in the same model, the relative observation errors of the same parameter could be different, especially for the two-layer or more -layer groundwater model. To improve the accuracy of forecasting, different relative observation errors of the same parameter should be considered in ENKF for the complex groundwater model.